This article appeared in Scientific Computing & Automation in February 1998.

Chromatographic Data Analysis Using Digital Signal Processing Software

Digital signal processing software can be a very attractive alternative to the conventional use of chromatographic integrators and chromatography software.

By Dennis Shelly

Instrumental methods employing separation techniques such as gas, liquid or supercritical fluid chromatography typically generate two types of analytical data. First, the chromatogram is a series of concentration pulses where peak area or amplitude is proportional to the mass of each solute species. For concentration-sensitive detectors, the recorded signal is concentration versus time, and the peak apex is coincident with the maximum concentration. Quantitation of a particular analyte is accomplished by calibrating detector response (as peak area or amplitude) with the concentration (or mass) of the injected solute(s).


Figure 1. Printout of Statistical Moments Worksheet. Signals and functions for each window are listed in Table I.


Figure 2. Baseline-Corrected, Smoothed Chromatographic Signal.

Second, the chromatographic signal represents mass transport of the solute species through the chromatographic column. Here, mass transport refers to bulk flow of the solute in the mobile phase, sorption and desorption of the solute between mobile and stationary phases, and the deleterious effects of molecular diffusion along the chromatographic column. These specific effects are often obscured, but nonetheless integrated into the overall shape of each chromatographic peak. Chromatographic performance can be assessed by analyzing peak shape. Indeed, verifying good chromatographic performance serves to validate the quantitation that is subsequently performed during determination of an unknown component in a mixture.

Typically, chromatographic data analysis consists of peak integration, retention time logging and analyte quantitation using the appropriate calibration. These are usually performed with an electronic integrator or data system using dedicated chromatographic analysis software. Peak start and stop can be determined manually or automatically, with the appropriately selected threshold and time increment (window) values. Fused peaks, sloping baseline and baseline penetration are artifacts that can be accommodated with most instrument/software solutions.

While most integrators perform such calculations in pseudo real time, chromatography data systems generally give comparable results post-run. Chromatographic integrators typically cost $1300 and up, and chromatography software packages start at about $2000. Sophisticated options, such as multiple calibrations, multiple method archiving, instrument control and customizable report formatting, can significantly increase the cost of such software.

A very attractive alternative to such approaches is digital signal processing software. A typical example is DADiSP, by DSP Development Corp. (Cambridge, MA). In this software, worksheets are constructed in spreadsheet fashion using a graphical format. Up to 100 windows may be active at once in the most recent version of the software. Simple mathematical operations and complex transformations alike are easily accomplished with a highly intuitive command structure.

Fundamentally speaking, chromatographic signals are fully compatible with digital signal processing techniques; they are time-varying signals with known ordinate and abscissa units. Macros can be constructed for automatic signal processing applications. The only ancillary instruments required are an analog-to-digital converter board and driver software. Stored chromatograms can be easily imported to DADiSP; thus, chromatographic data analysis by off-line digital signal processing is functionally equivalent to post-run analysis with dedicated chromatography data systems.

Table I. Summary of Statistical Moments Analysis by DADiSP

| Window | Command | Result |
|---|---|---|
| 1 | IMPORT signal | raw data imported |
| 2 | MOVAVG(W1, 3) | 3-point moving average of window 1 signal |
| 3 | W2+2.3 | normalize baseline to 0 by adding the scalar 2.3 |
| 4 | DERIV(DERIV(W3)) | double derivative of window 3 signal |
| 5 | EXTRACT(W3, 102, 9) | extract peak (9 pts.) from window 3 beginning at point 102 |
| 6 | INTEG(W5) | integrate extracted peak to get area (M0); maximum ordinate value is 31.0895 |
| 7 | GLINE(139, 1, 1, 0) | generate a line with 139 pts., time axis spacing of 1, slope of 1 and an amplitude intercept of 1 |
| 8 | EXTRACT(W7, 102, 9) | extract that portion of line which corresponds to the desired peak from window 6 |
| 9 | INTEG(W5*W8) | integrates the product of window 5 and window 8 signals |
| 10 | W9/31.0895 | divides window 9 signal by M0, a scalar; results in the first moment (M1) as the maximum ordinate value, 105.153 |
| 11 | (W8-105.153)^2 | computes the square of the difference between the signal in window 8 and the first moment |
| 12 | INTEG(W11*W5) | integrates the product of window 11 and the extracted peak in window 5 |
| 13 | W12/31.0895 | computes the ratio of the signal in window 12 to the zeroth moment; results in the second moment, M2, as the maximum ordinate value |
| 14 | (W8-105.153)^3 | computes the cube of the difference between the signal in window 8 and the first moment |
| 15 | INTEG(W14*W5) | integrates the product of window 14 and the extracted peak in window 5 |
| 16 | W15/31.0895 | computes the ratio of the signal in window 15 to the zeroth moment; results in the third moment, M3, as the maximum ordinate value |
| 17 | W16/((W13)^1.5) | divides the third moment by the second moment raised to the 1.5 or 3/2 power; results in peak skew as the maximum ordinate value |
| 18 | (W8-105.153)^4 | computes the fourth power of the difference between the signal in window 8 and the first moment |
| 19 | INTEG(W18*W5) | integrates the product of window 18 and the extracted peak in window 5 |
| 20 | W19/31.0895 | computes the ratio of the signal in window 19 to the zeroth moment; results in the fourth moment, M4, as the maximum ordinate value |
| 21 | (W20/(W13)^2)-3 | divides the fourth moment by the second moment squared and subtracts three from the quotient; results in peak excess as the ordinate value at maximum abscissa |
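
For readers without DADiSP, the preprocessing steps in windows 2, 3 and 5 can be approximated in Python with NumPy. This is only a sketch: the helper names `moving_average` and `extract` are hypothetical, the zero-padded edge handling of the moving average and the 1-based point numbering are assumptions, and none of it is claimed to reproduce DADiSP's own MOVAVG or EXTRACT behavior exactly.

```python
import numpy as np

def moving_average(y, n):
    """n-point moving average; edges are zero-padded (an assumption)."""
    kernel = np.ones(n) / n
    return np.convolve(np.asarray(y, dtype=float), kernel, mode="same")

def extract(y, start, count, one_based=True):
    """Return `count` points of y beginning at point `start`.

    Point numbering is assumed to start at 1, as the worksheet table suggests.
    """
    first = start - 1 if one_based else start
    return np.asarray(y)[first:first + count]

# Stand-in for an imported 139-point signal whose value equals its point number.
raw = np.arange(1, 140, dtype=float)

smoothed = moving_average(raw, 3)      # analogous to window 2
corrected = smoothed + 2.3             # analogous to window 3 (baseline offset)
peak = extract(corrected, 102, 9)      # analogous to window 5
```

With this stand-in ramp, the extracted segment covers points 102 through 110, which is the same span the worksheet uses for the peak.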

Worksheet Construction
DADiSP chromatographic signal worksheets are easily constructed. In this paper, command names will be capitalized. The first step is to import the signal. This is done with the IMPORT SIGNAL menus and routines. Some choices to be made include source file type and name, ordinate axis units, abscissa axis units and number of data points. DADiSP has a built-in signal and worksheet file manager, so imported signals and complete worksheets can be easily located. Multidimensional data (time, signal 1, signal 2, etc.) can be imported column-by-column, individually or all simultaneously.

We routinely import the signal values with an assumed constant sampling frequency; importing the time values as well provides a quick check that the sampling rate is constant. The number of data points and the sampling frequency together determine the time at any specific data point. The imported DADiSP signal is assigned a file name and archived by the file manager. Next, a WORKSHEET is initialized by specifying the number of windows to be drawn. One then ENTERS a window and brings in the converted, imported signal. A series of transformations is constructed by moving from window to window, linking them with the appropriate commands.
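
The relationship between point number, sampling frequency and time can be sketched in a few lines of Python. The function names and the zero-based index convention are illustrative assumptions, not part of DADiSP.

```python
def time_at_point(index, sampling_rate_hz, start_time_s=0.0):
    """Time (s) of the data point at a zero-based index, assuming constant sampling."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return start_time_s + index / sampling_rate_hz

def is_uniformly_sampled(times, rel_tol=1e-6):
    """Check a list of imported time values for a constant sampling interval."""
    steps = [b - a for a, b in zip(times, times[1:])]
    if not steps:
        return True
    first = steps[0]
    return all(abs(step - first) <= rel_tol * abs(first) for step in steps)

# Example: at 1 Hz, the point with index 102 falls at 102 s.
t_102 = time_at_point(102, 1.0)
uniform = is_uniformly_sampled([0.0, 1.0, 2.0, 3.0])
irregular = is_uniformly_sampled([0.0, 1.0, 2.5])
```

A check of this kind is what importing the time column makes possible: if the intervals are not constant, the index-to-time conversion above no longer holds.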

Two Experimental Analyses
We report two analyses in this paper: chromatographic performance analysis by statistical moments and gated integration for improvement of signal-to-noise ratio. Raw chromatographic data were obtained using a Micro LC system [1], equipped with a Data Translation DT2805 analog-to-digital converter board and an XT-class computer running custom data acquisition software. The digital signal processing software was DADiSP versions 1.05 and 4.0 by DSP Development Corp. (Cambridge, MA). The worksheets were created using version 1.05 on an XT computer. Publication-quality printouts were obtained with version 4.0 running on a Pentium computer. Complete compatibility was observed between these two versions of the software; i.e., the version 1.05 worksheets were easily imported into version 4.0 with no loss of functionality.

The chromatographic performance analysis was done on a series of polynuclear aromatic hydrocarbon standards obtained in a sixteen-component mixture from Chem Service (West Chester, PA). This mixture was separated on a Spherisorb ODS2 C18 microcolumn (0.25 mm I.D. x 750 mm) using isocratic elution with a 92/8 acetonitrile/water mobile phase.

For the gated integration experiment, a series of C14 to C24 carboxylic acids was derivatized with bromomethyl coumarin, forming fluorescent derivatives. Here, the same column was used with a 96/4 acetonitrile/water mobile phase. Both separations were performed at flow rates corresponding to reduced velocities of three.

Chromatographic Performance Analysis
Chromatographic figures of merit are most readily and reliably obtained by statistical moment analysis [2]. Here, the zeroth moment is the peak area, the first moment is the peak centroid, the second moment gives peak variance, the third moment is used to calculate peak skew and the fourth moment will yield peak excess. Implementing the moments analysis by digital signal processing software is merely an organized sequence of applied mathematical manipulations. Each of the moment calculations was performed according to published equations and format [2].
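
The same sequence of calculations can be sketched outside DADiSP. The Python example below computes the zeroth through fourth moments of a peak by trapezoidal integration and derives skew and excess from them. It is a minimal illustration under stated assumptions: the function names are hypothetical, the test peak is a synthetic Gaussian rather than the article's data, and trapezoidal integration stands in for whatever rule INTEG applies.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal integral of y over t, written out explicitly."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def peak_moments(t, y):
    """Statistical moments and figures of merit for a baseline-corrected peak."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    m0 = trapezoid(y, t)                    # zeroth moment: peak area
    m1 = trapezoid(t * y, t) / m0           # first moment: centroid
    dev = t - m1
    m2 = trapezoid(dev**2 * y, t) / m0      # second central moment: variance
    m3 = trapezoid(dev**3 * y, t) / m0      # third central moment
    m4 = trapezoid(dev**4 * y, t) / m0      # fourth central moment
    return {
        "area": m0,
        "centroid": m1,
        "variance": m2,
        "skew": m3 / m2**1.5,               # moment coefficient of skewness
        "excess": m4 / m2**2 - 3.0,         # zero for a Gaussian peak
    }

# Synthetic Gaussian peak (illustrative values only, not the article's data).
t = np.linspace(95.0, 115.0, 201)
sigma = 1.3
y = 10.0 * np.exp(-0.5 * ((t - 105.0) / sigma) ** 2)
result = peak_moments(t, y)
```

For a well-sampled symmetric Gaussian, the skew and excess come out very close to zero, which gives a convenient sanity check before applying the function to real peaks.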


Figure 3. Printout of Window 10, First Statistical Moment.


Figure 4. Printout of Gated and Continuous Integration Worksheet. Signals and functions for each window are listed in Table III.

The moments analysis was constructed in a 21-window worksheet (Figure 1) with detailed composition shown in Table I. Figure 2 shows the contents of window 3, the smoothed chromatographic signal which was baseline corrected. The zeroth moment is found by multiplying the maximum ordinate value by the abscissa range in window 6, i.e. 31.09 counts times 8 s or 248.72 counts·s. The 8 s time increment was found by noting the zero-crossing points in the second derivative of the signal (window 4). The first moment is the maximum ordinate value (105.153) of the signal contained in window 10, and shown in Figure 3. From window 13 we get the second statistical moment, as the maximum ordinate value for the signal, 1.9 s².

These values are very reasonable, since the peak is only 8 seconds wide (the full width, 6σ, would imply that σ² would be about 1.8 s²) and centered between 100 and 110 s, as shown in Figure 2. In all three cases the appropriate values can be immediately displayed using the ZOOM and CURSOR commands. The third moment is calculated, sequentially, in windows 14, 15 and 16.
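
The consistency check on the variance can be written out explicitly. It assumes, as the parenthetical remark above does, that the 8 s base width of the peak corresponds to six standard deviations of a roughly Gaussian profile:

```latex
6\sigma \approx 8\ \mathrm{s}
\quad\Rightarrow\quad
\sigma \approx \tfrac{8}{6}\ \mathrm{s} \approx 1.33\ \mathrm{s}
\quad\Rightarrow\quad
\sigma^{2} \approx 1.78\ \mathrm{s}^{2} \approx 1.8\ \mathrm{s}^{2}
```

This estimate is close to the second moment of about 1.9 s² read from window 13.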

Note that the third moment (displayed in window 16) is relatively symmetrical about the peak's retention time of 105 s. The peak skew is calculated in window 17, and this function has a curious shape. The maximum ordinate value is about 0.103 and there is a distinct minimum in the curve close to 102 s. This corresponds to the leading edge of the peak and the second data point in this segment. Such a feature is probably due to the very low number of data points in the chromatogram. Besides, the ratio of the third moment to the 3/2 power of the second moment is the moment coefficient of skewness [3], a scalar quantity. Finally, the fourth moment is calculated in windows 18, 19 and 20. The maximum ordinate value in window 20 is the fourth moment. Peak excess is the quotient of the fourth moment and the square of the second moment, minus 3, and is the ordinate value at the maximum of the abscissa range in window 21. The value displayed from DADiSP was -0.1113. The negative value indicates a platykurtic (flat-topped) peak [3]. Window 3 was laser printed and analyzed manually using the equations for each of the chromatographic figures of merit [2].

Table II shows results of DADiSP and manual calculations for the polynuclear aromatic hydrocarbon peak. Agreement is reasonable and would have been better, perhaps, had the sampling frequency been closer to optimum. Only six different commands were used for the entire worksheet, all of which could be displayed simultaneously. Evaluating additional chromatograms simply required importing a new signal in window 1 and modifying the parameters for each command, as appropriate.

Table II. Summary of Chromatographic Figures of Merit Calculations

| Figure (unit) | Method: DADiSP* | Method: Manual |
|---|---|---|
| Area (counts·s) | 31.0895 | 27.93 |
| Centroid (s) | 105.153 | 104.96 |
| Variance (s²) | 1.90873 | 1.860 |
| Peak Skew | 0.1029 | 0.168 |
| Peak Excess | -0.111326 | 0.198 |

*: values not truncated from computer display

Table III. Summary of Gated and Continuous Integration Worksheet

| Window | Command | Result |
|---|---|---|
| 1 | IMPORT Signal | raw data imported |
| 2 | EXTRACT(W1,1500,6527) | a 6527 point portion of window 1 signal is extracted, beginning at point 1500 |
| 3 | INTEG(W2) | window 2 signal is integrated (continuous) |
| 4 | EXTRACT(W1,4525,504) | a 504 point portion of window 1 signal is extracted, beginning at point 4525 |
| 5 | EXTRACT(W1,5029,600) | a 600 point portion of window 1 signal is extracted, beginning at point 5029 |
| 6 | INTEG(W2) | this is an expanded view of the continuously integrated signal |
| 7 | INTEG(W4) | integrate extracted signal, as specified in window 4 |
| 8 | INTEG(W5) | integrate extracted signal, as specified in window 5 |

Enhancing Signal-to-Noise
Digital signal processing can also be used to enhance the signal-to-noise ratio of recorded analytical signals. A number of techniques, such as Savitzky-Golay smoothing, digital filtering and correlation, have been used. Recently, a unique integration technique, gated integration, has been shown to dramatically improve the signal-to-noise ratio of acquired signals [4]. The approach is simple: integrate the signal only over regions where analyte signal is present. In this way, baseline noise is less influential in biasing the integration.
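
The principle can be illustrated with a small simulation in Python. This is a sketch under stated assumptions, not the procedure of reference [4] or of the worksheet described here: the peak is a synthetic Gaussian, the noise is white with unit standard deviation, the gate limits are chosen by hand, and noise is estimated as the spread of the integral over repeated simulated runs.

```python
import numpy as np

def integrate(y, dt, start=0, stop=None):
    """Rectangle-rule integral of y[start:stop]; the whole record if no gate is given."""
    return float(np.sum(y[start:stop]) * dt)

rng = np.random.default_rng(0)
dt = 1.0                      # sampling interval (arbitrary units)
n_points = 2000               # length of the simulated record
gate = (900, 1100)            # hand-picked gate enclosing the peak
t = np.arange(n_points) * dt
peak = 5.0 * np.exp(-0.5 * ((t - 1000.0) / 20.0) ** 2)

trials = 500
cont = np.empty(trials)
gated = np.empty(trials)
for i in range(trials):
    y = peak + rng.normal(0.0, 1.0, n_points)   # peak plus white baseline noise
    cont[i] = integrate(y, dt)                  # continuous: every point contributes noise
    gated[i] = integrate(y, dt, *gate)          # gated: only points inside the gate

snr_cont = cont.mean() / cont.std(ddof=1)
snr_gated = gated.mean() / gated.std(ddof=1)
```

Both integrals recover essentially the same peak area, but the gated integral accumulates noise from far fewer points, so its run-to-run scatter is smaller. For white noise the scatter grows roughly with the square root of the number of points integrated.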


Figure 5. Windows 1 and 2 from Continuous and Gated Integration Worksheet. A. Window 1: expanded view of raw chromatogram; B. Window 2: Extracted fatty acid derivative peaks.

Combining a high-sensitivity analytical instrument, such as laser-induced fluorescence Micro LC, with powerful digital signal processing techniques further enhances detectability. Such is the rationale behind applying gated integration by DADiSP to laser-induced fluorescence Micro LC chromatographic signals.

An eight-window worksheet was constructed as a comparison of gated and continuous integration. Figure 4 is a computer printout of this worksheet. Table III lists the signals and mathematical operations on them.

Figure 5A is a raw chromatographic signal (expanded view) from a derivatized mixture of C14, C16, C18, C20, C22 and C24 fatty acids, separated by reversed phase Micro LC. Note the number of peaks and the fact that the first three are off scale. The first peak appeared in the blank chromatogram and was attributed to the derivatizing reagent. The photon counting detector was set with a counting period of 64 ms. Note the time axis. All components elute within 1200 s, or 20 min.

Figure 5B shows the extracted fatty acid derivative peaks, with the ordinate expansion set to autoscale. Here, the pair of peaks around 1050 s is relatively obscure. Enhancing signal-to-noise with digital signal processing should improve detectability and may even yield new chemical information in terms of the "extra" peaks.

The last two peaks in the chromatogram were analyzed by continuous and gated integration. The extracted fatty acid derivative peaks are continuously integrated in window 3, using the INTEG command. Starting continuous integration at the point of injection would have included the reagent peak, significantly biasing the calculation. Although the reagent peak could have been subtracted away, it is much easier to extract and then integrate only the desired peaks.

From window 3 it is possible to identify the peak start and stop points and to obtain the integration values for the final two peaks at around 1050 s. The peak start and stop points are confirmed by taking the second derivative and noting the zero-crossing points (data not shown). The final two peaks were designated 6a and 6b.
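
Locating second-derivative zero crossings is straightforward to sketch in Python with NumPy. The function name is hypothetical and the test signal is a noiseless synthetic Gaussian. For such an ideal peak the crossings fall at the inflection points, one standard deviation on either side of the apex; how crossings on a real, noisy chromatogram are mapped to peak start and stop points is not spelled out in the text, so this example only finds the crossings.

```python
import numpy as np

def second_derivative_zero_crossings(t, y):
    """Return interpolated times where the numerical second derivative changes sign."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    d2 = np.gradient(np.gradient(y, t), t)            # numerical second derivative
    sign = np.sign(d2)
    idx = np.where(sign[:-1] * sign[1:] < 0)[0]       # strict sign changes only
    # Linear interpolation between the two samples that bracket each crossing.
    frac = -d2[idx] / (d2[idx + 1] - d2[idx])
    return t[idx] + frac * (t[idx + 1] - t[idx])

# Noiseless Gaussian peak (illustrative values only).
t = np.linspace(95.0, 115.0, 201)
sigma = 1.25
y = 10.0 * np.exp(-0.5 * ((t - 105.0) / sigma) ** 2)
crossings = second_derivative_zero_crossings(t, y)
```

On real data the signal would normally be smoothed first, since differentiating twice strongly amplifies high-frequency noise and can produce spurious crossings.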


Figure 6. Continuous Integration Results. A. Integration of extracted fatty acid derivative signal; B. Expanded view of last few peaks. Arrows indicate segment of baseline used for noise calculations.

Peak heights for these two peaks were obtained by examining window 1 using the ZOOM and CURSOR commands. Figure 6 shows the results of continuous integration. Figure 6A shows the continuously integrated fatty acid derivative chromatogram while Figure 6B shows an expanded portion of the last few peaks. The areas, by continuous integration, of peaks 6a and 6b were obtained from this window. The arrows in Figure 6A indicate the segment of baseline which was used to obtain baseline noise values for signal-to-noise calculations. Figure 7 shows extracted portions of the original chromatographic signal (window 1). From the second derivative, peak 6a was determined to extend from 974 s to 1056 s (119 data points) while peak 6b was found from 1056 s to 1145 s (132 data points), as shown by the arrows. If we integrate these peaks only over these regions, peak areas that are relatively immune from baseline noise are generated. Signal values of 1092 and 939 were obtained from these two peaks, respectively, using the ZOOM and CURSOR commands and a little math.

The signal-to-noise ratios were calculated for peaks 6a and 6b using peak height, continuous integration and gated integration data. Peak heights were obtained from window 1 using the ZOOM and CURSOR commands. The peak height noise was obtained from the selected portion of raw baseline (from 1474 to 1520 s) using the STATS command. The continuous integration signal values for peaks 6a and 6b were obtained from window 3 (see Figure 6). Companion noise values were obtained from the selected portion of integrated baseline (see Figure 6A).


Figure 7. Peaks 6a and 6b Extracted From the Original Chromatographic Signal. A. Peak 6a; B. Peak 6b. Arrows and vertical cursors indicate the start of peak 6a and the endpoint of peak 6b, respectively.

The gated integration signal values, for each of these peaks, were obtained from windows 7 and 8, respectively. Likewise, the corresponding noise values were obtained from the selected portion of baseline. Peaks 6a and 6b have different gated integration noise values because their integration was based on different numbers of data points, 390 for peak 6a (504-114) and 424 for peak 6b, as shown in Figure 7.

The results of signal-to-noise calculations for each of these three techniques and two chromatographic peaks are listed in Table IV. There is a 22 percent improvement in signal-to-noise ratio for peak 6a and a 6 percent improvement in signal-to-noise ratio for peak 6b, comparing gated integration to continuous integration.
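
The arithmetic behind these percentages can be reproduced from the signal (S) and noise (N) values listed in Table IV. The short Python sketch below does so; the dictionary layout and function names are illustrative choices. Because the tabulated S and N values are themselves rounded, a recomputed ratio can differ from the printed S/N in the last digit.

```python
# Signal (S) and noise (N) values as listed in Table IV.
table_iv = {
    "Peak Height":            {"6a": (18.39, 2.396), "6b": (12.79, 2.396)},
    "Continuous Integration": {"6a": (1205, 160),    "6b": (1076, 160)},
    "Gated Integration":      {"6a": (1092, 119),    "6b": (939, 132)},
}

def snr(signal, noise):
    """Signal-to-noise ratio."""
    return signal / noise

def percent_improvement(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new / old - 1.0)

ratios = {
    method: {peak: snr(s, n) for peak, (s, n) in peaks.items()}
    for method, peaks in table_iv.items()
}

gain_6a = percent_improvement(ratios["Gated Integration"]["6a"],
                              ratios["Continuous Integration"]["6a"])
gain_6b = percent_improvement(ratios["Gated Integration"]["6b"],
                              ratios["Continuous Integration"]["6b"])
```

Rounded to whole numbers, the two gains agree with the 22 percent and 6 percent figures quoted above.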

Recalling Figure 5 and the two doublets obtained for the C22 and C24 methyl coumarin derivatives, the retention time data for the series of peaks was further analyzed. A plot of log retention time versus carbon number for the first four peaks and for the earlier eluting peaks in the doublets revealed a line with slope of 0.047 and intercept of 1.88 (r = 0.996). The predicted retention time values for C23 and C25 derivatives agreed to within 4 percent. Repeated regression modeling with other assignments to the standard set revealed poorer correlations. Therefore, we can say with some confidence that the standard mixture was contaminated with C23 and C25 fatty acids.
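
The regression line can be turned into a retention time prediction as sketched below in Python. Two assumptions are made that the text does not state: the logarithm is taken to be base 10, and retention time is taken to be in seconds. The function name is hypothetical.

```python
def predicted_retention_time(carbon_number, slope=0.047, intercept=1.88):
    """Retention time from log10(t_R) = slope * carbon_number + intercept.

    Base-10 logarithm and seconds are assumptions, not stated in the article.
    """
    return 10.0 ** (slope * carbon_number + intercept)

# Predictions for the two suspected contaminants.
t_c23 = predicted_retention_time(23)
t_c25 = predicted_retention_time(25)
```

Under these assumptions the predicted values fall within the 1200 s elution window mentioned earlier, which is at least consistent with the chromatogram described.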

Conclusions
Graphical digital signal processing software is an attractive alternative to dedicated chromatography software. Software approaches to digital signal processing have been shown to greatly facilitate the analysis of chemical measurements [5].

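Table IV. Results of Signal-to-Noise Calculations

| Method | Peak 6a S | Peak 6a N | Peak 6a S/N | Peak 6b S | Peak 6b N | Peak 6b S/N |
|---|---|---|---|---|---|---|
| Peak Height | 18.39 | 2.396 | 7.68 | 12.79 | 2.396 | 5.33 |
| Continuous Integration | 1205 | 160 | 7.53 | 1076 | 160 | 6.73 |
| Gated Integration | 1092 | 119 | 9.17 | 939 | 132 | 7.11 |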

The highly intuitive command structure speeds worksheet construction and the graphical interface enables rapid, enhanced interpretation of calculations. An extensive list of built-in functions and commands permits a much wider range of mathematical operations to be performed for more detailed and meaningful analyses. There is complete control of each manipulation as complex calculations are performed in an organized, sequential fashion. Highly relevant, application-specific analyses are performed at the basic signal level, with no intervening layers of code or text to distract the analyst.

The two worksheets, described above, were rapidly and easily constructed with DADiSP. The statistical moments worksheet produced five statistical moments and four chromatographic figures of merit in just 21 windows with minimal operator intervention. The gated integration worksheet required only eight windows, enabled continuous as well as gated integration and yielded improved detectabilities for two components.

Acknowledgements
The author is grateful to T.J. Edkins of Johnson & Johnson Pharmaceutical Research Institute for helpful discussions and interest in the gated integration worksheet. Significant contributions were made by Brad Boring and Ahmad Abbas, two graduate students who helped develop the statistical moments analysis worksheets as part of Analytical Separations Science and Technology, CHEM 5317. Financial support was provided by the National Institutes of Health, under grant IR15 GM40114-01, by MACH I, Inc. under SBIR Phase II contract N 60921-90-C-0157, sponsored by the Naval Surface Warfare Center, and by Westinghouse-Hanford Co. under contract MJG-SVV-334295.

References
1. T.J. Edkins and D.C. Shelly, Anal. Chim. Acta, 246, 151-159 (1991).
2. J.P. Foley and J.G. Dorsey, Anal. Chem., 55, 730-737 (1983).
3. M. Spiegel, Theory and Problems of Statistics, Schaum's Outline Series, McGraw-Hill, NY, 91 (1991).
4. E. Voigtman, Appl. Spectrosc., 45, 237-241 (1991).
5. T. O'Haver, J. Chem. Ed., 68, A147-A150 (1991).

Dennis C. Shelly is an associate professor in the Department of Chemistry and Biochemistry at Texas Tech University. His research interests include multiparameter sensing strategies for bioanalytical instrumentation, novel applications of analytical spectroscopy for materials chemistry studies and developing undergraduate curricula that utilize chemical enterprises as learning organizations. He can be reached at Department of Chemistry and Biochemistry, Texas Tech University, Lubbock, TX 79409-1061. E-mail: kzdcs@ttu.edu.

Scientific Computing & Automation
Copyright 1998 Gordon Publications/Cahners Publishing
