Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Chau Chemometrics From Basics to Wavelet Transform

.pdf
Скачиваний:
119
Добавлен:
15.08.2013
Размер:
2.71 Mб
Скачать

resolution enhancement

199

where X1 includes only chromatographic baseline drift and X2 includes both chromatographic baseline drift and spectral background.

In order to remove the baseline drift and the spectral background from the data matrices X1 and X2, we can apply a WT decomposition to each line of the two data matrices and use the detail coefficients on a definite scale k to combine new data matrices X1d and X2d . In both matrices X1d and X2d , chromatographic baseline should have been corrected. After that, for the matrix X2d , we can further remove the spectral background by subtracting the transformed background from it using the information in zero-component regions. Note that in practice, it is very important to determine the decomposition scale k . Generally, k = 1 or 2 can be used for those signals with a low sampling frequency, but k = 3, 4, or 5 should be used for those signals with a high sampling frequency. In this example, k = 2 was used. Furthermore, different wavelet basis used give different results. There is no general rule for selection of the best wavelet basis for a given signal except trial and error. In this example, Daubechies wavelet filters with length L = 4 were used.

In order to investigate the effect of the method, we can examine the rankmap of the data matrices as it can reveal the factor number of these data matrices. Figure 5.34 shows the rankmaps of the data matrices X, X1, X2 before and after background removal. By comparing the rankmaps before and after background removal, it is clear that the factor corresponding to the chromatographic baseline and the spectral background in panels (a), (b), and (c) disappeared in panels (d), (e) and (f) (of Fig. 5.34). This indicates that both the chromatographic baseline and the spectral background are completely removed.

An experimental HPLC-DAD data matrix was processed by the method in an article published in Chemometrics and Intelligent Laboratory Systems

[37:261--269 (1997)] by Shen et al. Figure 5.35a shows the rankmap of the data matrix of a sample consisting of two isomers of a pharmacologically active drug. It is clear that the factor number is not correct. There are two factors caused by background absorption. But in Figure 5.35b, which shows the rankmap of the matrix after background removal by the method, only two main factors remain. In this work, decomposition scale k = 3 and Daubechies wavelet filters with length L = 4 were adopted.

5.4.RESOLUTION ENHANCEMENT

Poor resolution is a very common problem encountered by chemists in spectral or chromatographic studies. In spectral studies, resolution of a spectrum is very important to ensure a correct assignment of the spectral

200

application of wavelet transform in chemistry

Figure 5.34. Rankmaps of the data matrices x, x1, x2 before (a),(b),(c) and after (d),(e),(f) background removal by WT decomposition.

peaks. The problem of poor resolution is always an obstruction in spectral analysis. Another problem in both spectral and chromatographic studies is peak overlapping. We must resort to mathematical techniques to solve these problems. Usually, these problems are solved by using techniques such as linear or nonlinear regression analysis, curve-fitting procedures, derivative procedures, neural networks, and chemical factor analysis.

5.4.1. Numerical Differentiation Using

Discrete Wavelet Transform

Derivative calculation is a powerful technique used in analytical chemistry to resolve spectra, sharpen peaks, determine potentiometric titration endpoints, carry out quantitative analysis, and perform similar procedures.

resolution enhancement

201

Figure 5.35. Rankmap of an experimental HPLC-DAD data matrix before (a) and after (b) background removal by WT decomposition.

Although derivative calculation is a useful tool for data analysis, it has a major drawback in increasing the noise level in higher-order derivative calculation. In order to improve the signal-to-noise ratio (SNR) of higher-order derivative calculation via conventional numerical differentiation, noise reduction is usually performed before calculating the successive order derivative, which will lead to inconvenience and complication in the calculation.

In practice, the simplest method of derivative calculation is numerical differentiation. Other methods used in chemical studies for derivative calculation include Fourier transform and the Savitsky--Golay method. The former method cannot improve the SNR in its result, and the latter method involves the selection of suitable parameters.

On the basis of the characteristics of the wavelet functions, a method for derivative calculation using DWT with Daubechies wavelet was proposed. The approximate first derivative of a signal x can be computed by the difference between two scale coefficients

x(1) = cj ,D2m cj ,D2m˜

m = m˜

 

 

(5.74)

where x(1) represents the first derivative of the signal x, D

2m

and D

2m˜

rep-

 

 

 

 

resent two Daubechies wavelet functions, and cj ,D2m and cj ,D2m˜ are the

202

application of wavelet transform in chemistry

j th-scale discrete approximations obtained by Equations (5.1) and (5.2) with the two wavelets, respectively. In practice applications, j = 1 is generally used. But for noisy signals, a higher value of j can be used to improve the SNR of the calculated result.

Higher-order derivative computation can be achieved by using the result obtained from the lower-derivative calculation as an input for WT calculation

x

(n)

(n−1)

(n−1)

m = m˜ and n > 1

(5.75)

 

= cj ,D2m

cj ,D2m

 

 

 

˜

 

 

where x(n) represents the nth-order derivative and c(n−1) and c(n−1)

rep-

j ,D2m

j ,D2m˜

 

resent the j th-scale approximation coefficients obtained by WT from the (n 1)th-order derivative with the Daubechies wavelet functions D2m and D2m˜ .

Example 5.10: Approximate Derivative Calculation of Simulated Signals Using DWT. Figure 5.36a shows three types of typical analytical

Figure 5.36. A simulated signal (a) and its first- (a) and second- (c) order derivatives calculated by DWT method.

resolution enhancement

 

 

 

203

signals simulated by Gaussian, Lorentzian, and sigmoid functions

 

 

=

 

W1/2

 

2

 

 

xgauss

 

A0 exp

4 ln(2)

 

t t0

 

 

(5.76)

 

 

 

 

 

 

=

 

 

1

+

W1/2

 

2

−1

xlorentz

 

A0

 

4

t t0

 

 

 

 

 

 

 

xsigmoid

=

 

 

A0

 

 

 

 

 

 

 

1 + ek (t t0)

 

 

 

(5.77)

(5.78)

where A0, t0, and W1/2. are the amplitude, position, and the full width at half maximum of the simulated peak, respectively, and k is a parameter to control the gradient of the curve. Figure 5.37a shows a noisy signal by the curve (a) in Figure 36 plus a random noise at SNR = 100.

Figure 5.37. A simulated noisy signal (a) and its first derivative calculated by DWT with j = 1

(b) and j = 3 (c), respectively.

204 application of wavelet transform in chemistry

Figure 5.36b,c shows the firstand second-order derivatives obtained by Equations (5.74) and (5.75), respectively, using j = 1 and Daubechies N = 18 and 8 as the filters (D2m and D2m˜ ). It can be seen that very good results can be obtained. The only drawback is that the number of data points is reduced to 12 for the first-order derivative and 14 for the second-order derivative.

Figure 5.37b shows the first derivative using the same method and the parameters as in Figure 5.36b. It can be seen that, when the signal is noisy, the SNR of the result will be even worse than that of the signal. In this case, we can use a higher value of j for the calculation. Figure 5.37c is obtained using j = 3, which is an acceptable result. But there is also the problem of the reduction of the data point number. The number of data point in Figure 5.37c is only 18 th of the original length.

Although interpolation can solve the problem of the data point number,

we can also use Equations (5.32) and (5.33) for calculation of the cj ,D2m

(n−1)

(n−1)

in Equation (5.75). The

and cj ,D2m˜ in Equation (5.74) or cj ,D2m

and cj ,D2m

 

˜

 

benefit of using the two equations is that the number of data points in cj does not change. Consequently, the length of the calculated derivatives will remain the same as that of the original signal. Figures 5.38 and 5.39 show the firstand the second-order derivatives of the simulated signals with the same filters as in the DWT method and j = 1 and j = 4, respectively. It can be seen that all the results are acceptable. Only the symmetry of the peaks is slightly distorted because of the effect of the noise in Figure 5.39c.

Computational Details of Example 5.10

1.Generate the signal with 2000 data points (Fig. 5.36a) by using Gaussian, Lorentzian, and sigmoid equations.

2.Extend the data point number to 2048.

3.Make two wavelet filters---Daubechies18 and 8.

4.Set resolution level J = 1.

5.Perform DWT with the two filters, respectively, to obtain the wavelet coefficients.

6.Calculate the first derivative by subtracting the approximate coefficients one from another.

7.Display Figure 5.36b.

8.Perform DWT on the coefficients obtained in step 5 with the two filters, respectively.

9.Calculate the second derivative from the approximate coefficients.

10.Display Figure 5.36c.

resolution enhancement

205

Note: The results in Figure 5.37 were obtained in a similar way with the only difference in J = 3 for the second-derivative calculation, and the results in Figures 5.38 and 5.39 were obtained by using the improved algorithm with J = 1 and 4, respectively.

5.4.2.Numerical Differentiation Using Continuous

Wavelet Transform

CWT with some specific wavelet functions can also be used for approximate derivative calculation of analytical signals. The Haar wavelet function, for instance, is one of the appropriate wavelet bases because of its symmetric

Figure 5.38. A simulated signal (a) and its first- (b) and second- (c) order derivatives calculated by the improved WT algorithm with j = 1.

206

application of wavelet transform in chemistry

Figure 5.39. A simulated noisy signal (a) and its first- (b) and second- (c) order derivatives calculated by the improved WT algorithm with j = 4.

property. Just as in the method using DWT, the approximate nth-order derivative of an analytical signal can be obtained by applying n times of the CWT to the signal. CWT is continuous in terms of scaling and shifting; that is, during computation, the analyzing wavelet can be shifted smoothly over the full domain of the analyzed signal at any scale. Therefore, it possesses much stronger resolving ability than does DWT in both time and frequency domains. In comparison with the other methods, the SNR of the derivatives for noisy signals can be greatly improved using the CWT method with a proper scale parameter a for ψa(t ) in Equation (5.49).

In practice, analytical signals are discrete; thus, a discrete form of the Equation (5.49) is used

| |

 

f (n s )ψ

(n

i ) s

 

 

s

 

 

Wf (a, i s ) =

 

 

 

(5.79)

a

n

 

a

 

 

 

 

 

 

 

resolution enhancement

207

where i and n are indices of the data point, s corresponds to the sampling interval, and ψ(t ) is the Haar wavelet function:

 

 

1

0 ≤ t

< 21

 

 

 

 

 

 

 

 

 

= −

 

1

 

 

 

ψ(t )

 

1

 

t

< 1

(5.80)

2

 

 

 

 

 

 

 

 

 

 

 

0

other

 

 

 

 

 

 

It is obvious that because the Haar wavelet function is of a ladder shape, the result from Equation (5.79) should correspond to the first derivative of f (n s ), and the nth-order derivative can be obtained by applying CWT to the (n − 1)th-order derivative.

Example 5.11: Approximate Derivative Calculation of Simulated Signals Using CWT. Figures 5.40 and 5.41 show the firstand second-order derivatives of the two simulated signals used in the DWT method above. They are computed by using the CWT method with different values of the scale parameter a. It can be seen that, for the signal without noise in Figure 5.40, all the results at different values of a are satisfactory. Only the width of the peaks in the calculated derivatives becomes increasingly broad with increase in the value of a. But for the noisy signal in Figure 5.41, it can be seen that the effect of noise can be eliminated with a relative large value of a.

Computational Details of Example 5.11

1.Generate the signal with 2000 data points (Fig. 5.36a) using Gaussian, Lorentzian, and sigmoid equations.

2.Extend the data point number to 2048.

3.Perform CWT with Haar wavelet and the scale parameter a = 1, 20, and 50 to obtain the firstand second-order derivatives, and display them in Figure 5.40 and 5.41, respectively, for the simulated signal with and without noise.

Taking the Gaussian peak as an example, we can investigate the effect of the scale parameter a on the peak width of its derivative. Figure 5.42 shows the relationship between increase in peak width and the value of a. The increase is represented in percent of the peak width in the derivative by the conventional numerical method. The values of the circled points connected by the dotted line are obtained by manual measurement, and the solid line represents the fitted curves. It is clear that the peak width

208

application of wavelet transform in chemistry

Figure 5.40. The first- (left column) and second- (right column) order derivatives of the simulated signal without noise calculated by CWT with Haar wavelet and the scale parameter a = 1 (a), 20 (b), and 50 (c).

increases with the value of a, but the increase is less than 10% when the parameter a is less than 100.

Figure 5.43, curve (a) shows a photoacoustic spectrum of a rareearth compound. There are three overlapping absorption peaks in the range of 400--500 nm. It is evident that high magnitude noise exists in the spectrum, which makes the position of peaks hard to determine. If the spectral data are directly processed by numerical differentiation methods, the SNR of the derivatives will be too low to see anything but noise. Figure 5.43, curves (b) and (c) are respectively the first and second derivatives computed by the CWT method with a = 100. It can be seen that the derivatives are smooth and clean, and the SNR is even better than that of the original signal. Although there is still a small-magnitude noise in the derivatives, it does not affect determination of the position of peaks, because the overlapping peaks are clearly resolved in the derivatives.

Соседние файлы в предмете Химия