
Chau Chemometrics From Basics to Wavelet Transform


data denoising and smoothing


5.2.3. Denoising and Smoothing Using Wavelet Packet Transform

As discussed in Section 5.1.2, WPT is a development of WT. The only difference between the two methods is the decomposition tree. In WT, only the approximation coefficients at each scale are used for further decomposition, but in WPT, the further decomposition is applied to both the approximation and the detail coefficients. The procedures of data denoising and smoothing using WPT can be summarized as follows:

1. Apply a WPT to the original signal $w_0^0 = f_{noise}$ up to a predefined scale $J$ to obtain the coefficients $w_j^p$ for $j = 1, \ldots, J$ and $p = 0, \ldots, 2^{j-1} - 1$ using Equations (5.7) and (5.8).

2. Find the best basis to express the original signal.

3. For denoising, use a thresholding method to suppress the coefficients in the best basis that are attributed to noise; for smoothing, remove the coefficients that are thought to represent high-frequency fluctuation.

4. Reconstruct the denoised or the smoothed signal by the inverse WPT using Equation (5.11).

As for the thresholding operation in step 3, all the methods used in WT denoising and smoothing can be employed.
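The four steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: it uses the Haar filter instead of the filters discussed in the text, it thresholds the full scale-$J$ packet basis rather than a best basis (step 2 is skipped), and the function names (`wpt`, `iwpt`, `wpt_denoise`) and the threshold value are hypothetical. The signal length must be divisible by $2^J$.

```python
import math

SQRT2 = math.sqrt(2.0)

def split(node):
    """One Haar analysis step: return (low-pass half, high-pass half)."""
    low = [(node[i] + node[i + 1]) / SQRT2 for i in range(0, len(node), 2)]
    high = [(node[i] - node[i + 1]) / SQRT2 for i in range(0, len(node), 2)]
    return low, high

def merge(low, high):
    """Inverse of split(): rebuild the parent node from its two children."""
    parent = []
    for a, d in zip(low, high):
        parent.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return parent

def wpt(signal, J):
    """Full packet decomposition: node p at scale j-1 yields nodes 2p and 2p+1."""
    nodes = [list(signal)]
    for _ in range(J):
        nodes = [child for node in nodes for child in split(node)]
    return nodes

def iwpt(nodes):
    """Inverse of wpt(): merge sibling nodes pairwise back to the signal."""
    while len(nodes) > 1:
        nodes = [merge(nodes[i], nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

def hard_threshold(nodes, eps):
    """Keep coefficients with magnitude above eps; set the rest to zero."""
    return [[w if abs(w) > eps else 0.0 for w in node] for node in nodes]

def wpt_denoise(signal, J, eps):
    """Decompose, threshold every scale-J node, and reconstruct."""
    return iwpt(hard_threshold(wpt(signal, J), eps))

# Illustrative call on a short made-up signal (values are not from the text).
noisy = [0.0, 0.1, 4.0, 4.1, -0.1, 0.0, 0.1, 0.0]
denoised = wpt_denoise(noisy, J=2, eps=0.2)
```

With `eps = 0` the round trip returns the input up to rounding error, which is a convenient check that decomposition and reconstruction match.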

The improved algorithm given by Equations (5.32)-(5.34) can also be used for WPT smoothing. Decomposition and reconstruction can be implemented by

$$w_{j,k}^{2p} = \sum_{m=1}^{L} h_{j,m}\, w_{j-1,\,k-m}^{p} \tag{5.42}$$

$$w_{j,k}^{2p+1} = \sum_{m=1}^{L} g_{j,m}\, w_{j-1,\,k-m}^{p} \tag{5.43}$$

$$w_{j,k}^{p} = \sum_{m=1}^{L} h_{j,m}\, w_{j+1,\,k-m}^{2p} + \sum_{m=1}^{L} g_{j,m}\, w_{j+1,\,k-m}^{2p+1} \tag{5.44}$$

As in WT, the frequency of the coefficients $w_j^0$ will be lower than $2^{-j} f_c$, and $w_j^p$ for $p = 1, \ldots, 2^j - 1$ will be in the range from $p \times 2^{-j} f_c$ to $(p + 1) \times 2^{-j} f_c$. Compared with WT smoothing, it is more important for WPT smoothing to develop a scale threshold estimation method because the components of the coefficients are much more complicated. It is impossible to examine all the $w_j^p$ by trial and error. Unfortunately, it is difficult to derive a simple equation for the determination of both $j$ and $p$ as a threshold. In practical applications, we can simply use Equation (5.41) to estimate a minimum scale $j_{th}$ and use $p_{th} = 0$ as the threshold, that is, decompose the signal with $J = j_{th}$ to obtain the coefficients $\{w_j^p\}$ and set all the $w_j^p$ with $p > 0$ to zero in the reconstruction calculation. It is easy to prove that the result of such an approach will be exactly the same as that of WT smoothing with scale threshold $j_{th}$. Therefore, if a satisfactory result can be obtained with the threshold $j_{th}$ and $p_{th} = 0$, WPT is not a good choice; it would be better to use WT. We have mentioned that the difference between WT and WPT is the decomposition tree, and WPT gives a finer decomposition than WT. Therefore, if we cannot obtain a satisfactory result with the threshold $j_{th}$ and $p_{th} = 0$, we should perform the WPT decomposition with $J = j_{th} + 2$, $j_{th} + 3$, or an even higher scale, and then estimate an approximate threshold $p_{th}$ by

$$\min(f_{w_j^p}) = (p + 1) \times 2^{-j} f_c = (p + 1) \times 2^{-j} \frac{1}{2S} = f_{cut} = \frac{1}{2W_{th}} \tag{5.45}$$

i.e.,

$$p_{th} = 2^{j} \frac{S}{W_{th}} - 1 \tag{5.46}$$

With such an approach, some trials may still be needed to determine the parameter $p_{th}$ in practical applications. It is clear that only those $w_j^p$ with $j = J$ and $p$ around $p_{th}$, instead of all the $w_j^p$, need be examined. (Note: The best-basis selection step is not necessary if we use the scale threshold in smoothing.)
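As a small worked illustration of Equations (5.45) and (5.46), the sketch below evaluates the threshold $p_{th}$ and the corresponding cutoff frequency. It assumes that $S$ and $W_{th}$ are positive numbers in the same units, as introduced with Equation (5.41); the function names and the numerical values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def p_threshold(j, S, W_th):
    """Equation (5.46): p_th = 2**j * S / W_th - 1."""
    return 2 ** j * S / W_th - 1

def f_cut(j, p, f_c=1.0):
    """Cutoff of Equation (5.45), (p + 1) * 2**(-j) * f_c, in units of f_c."""
    return (p + 1) * 2.0 ** (-j) * f_c

# Illustrative values only: S = 1 and W_th = 16 (same units).
S, W_th = 1.0, 16.0
thresholds = {j: p_threshold(j, S, W_th) for j in (4, 5, 6, 7)}
```

For these values the threshold is 0 at $j = 4$ and 7 at $j = 7$, and `f_cut(4, 0)` equals `f_cut(7, 7)`: a deeper decomposition keeps the same cutoff while offering finer steps in $p$ around it.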

Example 5.7: Denoising and Smoothing of Simulated and Experimental Chromatograms Using WPT. Figure 5.19 shows the denoised and the smoothed results of the simulated chromatogram in Figure 5.14, curve (b), using the WPT. The denoised curve (a) in Figure 5.19 was obtained with the Symmlet4 filter and $J = 10$ by hard thresholding. The best basis was selected using the Coifman-Wickerhauser entropy method, and the threshold value $\varepsilon = 0.008 \times [\max(w) - \min(w)]$ was determined by trial and error. The smoothed curve (b) in Figure 5.19 was obtained with the same filter and $J = 4$; $j_{th} = 4$ and $p_{th} = 0$ were used as the scale threshold. Both the denoising and the smoothing give a satisfactory result. Comparing the two, the smoothed result is superior to the denoised result.


Figure 5.19. Comparison of the denoised (a) and smoothed (b) results by WPT of the simulated noisy chromatogram.

Computational Details of Example 5.7

Denoising: Figure 5.19 (a):

1. Generate the original signal with 1024 data points using the Gaussian equation.

2. Make a wavelet filter (Symmlet4).

3. Normalize the data to noise level 1.

4. Set decomposition level D = 10 (dyadic length of the signal).

5. Perform WPT to obtain the WP coefficients.

6. Find the best basis according to the entropy criterion.

7. Apply hard thresholding to the WP coefficients of the best basis.

8. Perform inverse WPT with the WP coefficients after thresholding to obtain the denoised signal.

9. Display Figure 5.19, curve (a).
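Step 6 of the list above, the best-basis search, can be sketched as a bottom-up pruning of the packet tree. The sketch makes several assumptions that are not taken from the text: it uses the Haar filter rather than Symmlet4, it uses the additive form of the entropy cost, $-\sum w^2 \ln w^2$, on a signal scaled to unit energy, and all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def split(node):
    """One Haar analysis step: return (low-pass half, high-pass half)."""
    low = [(node[i] + node[i + 1]) / SQRT2 for i in range(0, len(node), 2)]
    high = [(node[i] - node[i + 1]) / SQRT2 for i in range(0, len(node), 2)]
    return low, high

def entropy_cost(node):
    """Additive entropy cost -sum(w^2 * ln(w^2)); zero coefficients cost nothing."""
    return -sum(w * w * math.log(w * w) for w in node if w != 0.0)

def best_basis(signal, J):
    """Return [(j, p, coefficients)] for the minimum-entropy packet basis."""
    norm = math.sqrt(sum(x * x for x in signal)) or 1.0
    tree = [[[x / norm for x in signal]]]  # tree[j][p], unit-energy input
    for _ in range(J):
        tree.append([child for node in tree[-1] for child in split(node)])
    # Start from every node at the deepest scale, then prune upward:
    # a parent replaces its subtree when it is at least as cheap.
    best = [(entropy_cost(n), [(J, p, n)]) for p, n in enumerate(tree[J])]
    for j in range(J - 1, -1, -1):
        merged = []
        for p, node in enumerate(tree[j]):
            kids_cost = best[2 * p][0] + best[2 * p + 1][0]
            own_cost = entropy_cost(node)
            if own_cost <= kids_cost:
                merged.append((own_cost, [(j, p, node)]))
            else:
                merged.append((kids_cost, best[2 * p][1] + best[2 * p + 1][1]))
        best = merged
    return best[0][1]

# A single spike is already sparse, so the undecomposed signal is selected.
spike_basis = best_basis([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], J=3)
```

Whatever nodes are selected, they always hold as many coefficients in total as the signal has points, so thresholding and inverse transformation can proceed node by node.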

Smoothing: Figure 5.19 (b):

1. Generate the original signal with 1024 data points using the Gaussian equation.

2. Make a wavelet filter (Symmlet4).

3. Set scale threshold $j_{th} = 4$ and $p_{th} = 0$, and set decomposition level $J = j_{th}$.

4. Perform WPT to obtain the WP coefficients.

5. Perform inverse WPT with the WP coefficients within the scale threshold to obtain the smoothed signal.

6. Display Figure 5.19, curve (b).
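The smoothing steps above can be sketched as follows, together with a check of the statement that WPT smoothing with $p_{th} = 0$ coincides with WT smoothing at the same scale. The sketch assumes the Haar filter instead of Symmlet4 and a made-up Gaussian peak with made-up noise; the function names are hypothetical. Only $p_{th} = 0$ is exercised here, because for larger $p_{th}$ the natural node order of this decimated Haar tree does not follow frequency order.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def split(node):
    """One Haar analysis step: return (low-pass half, high-pass half)."""
    low = [(node[i] + node[i + 1]) / SQRT2 for i in range(0, len(node), 2)]
    high = [(node[i] - node[i + 1]) / SQRT2 for i in range(0, len(node), 2)]
    return low, high

def merge(low, high):
    """Inverse of split(): rebuild the parent node from its two children."""
    parent = []
    for a, d in zip(low, high):
        parent.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return parent

def wpt_smooth(signal, J, p_th=0):
    """Decompose to scale J, zero every node with p > p_th, reconstruct."""
    nodes = [list(signal)]
    for _ in range(J):
        nodes = [child for node in nodes for child in split(node)]
    nodes = [n if p <= p_th else [0.0] * len(n) for p, n in enumerate(nodes)]
    while len(nodes) > 1:
        nodes = [merge(nodes[i], nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

def wt_smooth(signal, J):
    """Ordinary WT smoothing: keep the scale-J approximation, drop all details."""
    approx, detail_lengths = list(signal), []
    for _ in range(J):
        approx, detail = split(approx)
        detail_lengths.append(len(detail))
    for n in reversed(detail_lengths):
        approx = merge(approx, [0.0] * n)
    return approx

# Simulated chromatogram: one Gaussian peak, 1024 points, additive noise.
random.seed(0)
clean = [math.exp(-((i - 512) / 40.0) ** 2) for i in range(1024)]
noisy = [c + random.gauss(0.0, 0.05) for c in clean]

a = wpt_smooth(noisy, J=4, p_th=0)
b = wt_smooth(noisy, J=4)
max_diff = max(abs(x - y) for x, y in zip(a, b))
```

Here `max_diff` is zero up to rounding, which mirrors the remark that WT is the simpler choice whenever $p_{th} = 0$ already gives a satisfactory result.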


application of wavelet transform in chemistry

Figure 5.20. Plots of the smoothed chromatograms by WPT with scale threshold of $j_{th} = 7$ and $p_{th} = 2$ (a), 3 (b), 4 (c), 5 (d), 6 (e), and 7 (f).

As mentioned above, if we smooth the experimental chromatogram in Figure 5.17 using the improved algorithm of WPT with scale threshold $j_{th} = 4$ and $p_{th} = 0$, the smoothed result should be the same as the result in Figure 5.18, curve (b). If we want to smooth out the small fluctuations remaining in that curve, we can perform a further smoothing using the WPT with scale threshold $j_{th} = 7$ and $p_{th} < 7$, because WPT further decomposes $w_4^0$ into $w_7^0$ to $w_7^7$, and the $f_{cut}$ for $j_{th} = 4$ and $p_{th} = 0$, namely $1 \times 2^{-4} f_c$, is equal to that for $j_{th} = 7$ and $p_{th} = 7$, namely $8 \times 2^{-7} f_c$.

Figure 5.20 shows the smoothed results with scale threshold $j_{th} = 7$ and $p_{th} = 2$ to 7. It is evident that the threshold $j_{th} = 7$ and $p_{th} = 7$ gives the same result as that in Figure 5.18 (b). Going from curve (f) to curve (a), it is clear that the curves become increasingly smooth as $p_{th}$ decreases. Therefore, we can conveniently choose a curve as our smoothed result.

5.2.4. Comparison between Wavelet Transform and Conventional Methods

Figure 5.21. A simulated noisy chromatogram (SNR = 20) (a) and the smoothed results with the use of moving-average (b), Savitzky-Golay (c), and FFT (d) filtering.

In order to compare WT or WPT denoising and smoothing with the conventional methods, the simulated and the experimental chromatograms are smoothed by the moving-average, Savitzky-Golay, and FFT filtering methods. Figures 5.21 and 5.22 show the results. The smoothing window or filter width for the three methods is, respectively, 25, 13, and 13 points for the simulated chromatogram in Figure 5.21 and 25, 17, and 17 points for the experimental chromatogram in Figure 5.22. By comparing these curves with those by WT or WPT, it can be found that, for the simulated signal, all the methods give similar results, but for the experimental signal, WT and WPT give more satisfactory results.
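For reference, the simplest of the conventional methods, the moving average, can be sketched as below. The window of 25 points is the value quoted in the text; the treatment of the first and last points (averaging over whatever part of the window lies inside the data) is an assumption, since the text does not say how the edges were handled, and the function name is hypothetical.

```python
def moving_average(y, window):
    """Centered moving average with an odd window, shrinking it at the edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        smoothed.append(sum(y[lo:hi]) / (hi - lo))
    return smoothed

# Illustrative call with the 25-point window on a made-up ramp.
ramp = [float(i) for i in range(100)]
smoothed_ramp = moving_average(ramp, 25)
```

A wider window suppresses more noise but also flattens narrow peaks, which is the trade-off that the scale threshold controls in the wavelet methods.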

Figure 5.22. The experimental chromatogram (a) and the smoothed results with the use of moving-average (b), Savitzky-Golay (c), and FFT (d) filtering.

5.3. BASELINE/BACKGROUND REMOVAL

In many cases, the baseline drift or background in an analytical signal behaves like noise and often increases the difficulty of further processing. The baseline drift mainly induces errors in the determination of the peak height and peak area, which are very important parameters for signal analysis. The background always blurs the analytical signals. It is difficult or even impossible to analyze a signal with a strong background. In practice, an artificial baseline or background is usually drawn beneath the peak, although the error cannot be completely eliminated by this method. Therefore, most chemists prefer to find the exact shape of the baseline or background and then subtract it from the original signal. But it is not easy to obtain the "true" baseline or background because they are generally represented by curves instead of linear functions.

5.3.1. Principle and Algorithm

Baseline drift or background interference can be classified as long-term noise. It differs from common noise in that the frequency of the drifting baseline or background is always much lower than that of the signals to be analyzed. In wavelet decomposition, the baseline or background component of an analytical signal should therefore be easy to separate from the drifting signal. The removal procedure is similar to WT denoising and smoothing. The only difference is that the coefficients representing the lower-frequency components are suppressed. The following two methods are generally used.

Method A

1. Decompose the experimental data into discrete approximations $c_j$ and discrete details $d_j$ by Equations (5.1) and (5.2).

2. Examine the $c_j$ by visual inspection to find a $c_j$ that resembles the drifting baseline or background, and denote the scale $j$ as $j_{max}$. The $j_{max}$ can also be determined by examination of the $d_j$, because there should be no information of the signal in $d_{j_{max}+1}$.

3. Reconstruct the signal by Equation (5.5) from only those $d_j$ with $j \le j_{max}$, that is, set $c_{j_{max}}$ to zero.

Method B

1. Decompose the experimental data into discrete approximations $c_j$ and discrete details $d_j$ by Equations (5.32) and (5.33).

2. Examine the $c_j$ by visual inspection to find a $c_j$ (denote the scale $j$ as $j_{max}$) that resembles the drifting baseline or background.

3. Subtract the selected $c_{j_{max}}$ from the original signal by $c_0 - c_{j_{max}}$ or sometimes $c_0 - f \times c_{j_{max}}$, where $f$ is an arbitrary factor.
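Method A can be sketched with an ordinary decimated transform. The example below is a minimal illustration that assumes the Haar filter (not the filters used in the text) and takes $j_{max}$ as an input, since the text chooses it by visual inspection; the function name `remove_baseline` is hypothetical. Method B relies on the improved algorithm of Equations (5.32) and (5.33) and is not reproduced here.

```python
import math

SQRT2 = math.sqrt(2.0)

def split(node):
    """One Haar analysis step: return (approximation, detail)."""
    low = [(node[i] + node[i + 1]) / SQRT2 for i in range(0, len(node), 2)]
    high = [(node[i] - node[i + 1]) / SQRT2 for i in range(0, len(node), 2)]
    return low, high

def merge(low, high):
    """Inverse of split(): rebuild the finer approximation."""
    parent = []
    for a, d in zip(low, high):
        parent.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return parent

def remove_baseline(signal, j_max):
    """Method A: decompose to scale j_max, zero c_jmax, rebuild from the details."""
    approx, details = list(signal), []
    for _ in range(j_max):
        approx, detail = split(approx)
        details.append(detail)
    approx = [0.0] * len(approx)  # suppress the low-frequency component
    for detail in reversed(details):
        approx = merge(approx, detail)
    return approx

# A fast oscillation of amplitude 1 sitting on a constant offset of 5.
raw = [5.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(8)]
corrected = remove_baseline(raw, j_max=1)
```

In this toy case the offset is removed and the oscillation is returned unchanged up to rounding; with real data the outcome depends on choosing $j_{max}$ so that $c_{j_{max}}$ contains the baseline but none of the peaks.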


5.3.2. Background Removal

Background removal is a universal problem in spectral studies, such as absorption spectroscopy and reflection spectroscopy. Example 5.8 describes a procedure to remove the background absorption from an EXAFS spectrum using the methods A and B, respectively.

Example 5.8: Background Removal of an EXAFS Spectrum. Extended X-ray absorption fine structure (EXAFS) is an X-ray absorption spectrum. A typical experimental absorption spectrum of copper recorded with a synchrotron radiation light source is shown in Figure 5.23a. To analyze the EXAFS spectrum, we must first separate the useful information (oscillation part) from the total raw spectrum. The oscillation part is then converted into $k$ space by using the equation $k = [0.263(E - E_0)]^{1/2}$, where $E_0$ is the absorption edge (the $E_0$ for Cu is 8393.5 eV). Finally, the oscillation signal is filtered by FFT and then fitted by the theoretical equation

$$\chi(k) = \sum_j \frac{N_j}{k r_j^2}\, f_j(k)\, e^{-2k^2\sigma_j^2}\, e^{-2r_j/\lambda} \times \sin\left[2kr_j + \phi(k) + \frac{0.2625\, r_j\, \Delta E_0}{k}\right] \tag{5.47}$$

where $\chi(k)$ is the filtered oscillation signal multiplied by a factor $k^3$, $j$ indexes the coordination shells, $f_j(k)$ is the amplitude value that can be obtained from a handbook, $\phi(k)$ is the phase displacement of scattering, $\Delta E_0$ is the difference between the theoretical value and the experimental value of $E_0$, $N$ is the coordination number, $r$ is the coordination distance, $\sigma$ is the Debye-Waller factor, and $\lambda$ is the electron mean-free path. The aim is to obtain the structural parameters $N$, $r$, $\sigma$, and $\lambda$. Generally, the cubic spline interpolation method is used for the background removal and the least-squares method is used for the curve fitting. In our experience, it will take several hours to analyze one spectrum.
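The conversion to $k$ space used in this example is a one-line calculation. The sketch below assumes energies in eV and $k$ in reciprocal angstroms, which is what the constant 0.263 implies; the function name and the sample energies are hypothetical, while the edge value is the one quoted in the text.

```python
import math

def energy_to_k(E, E0):
    """k = [0.263 (E - E0)]**0.5, defined for energies at or above the edge."""
    if E < E0:
        raise ValueError("E must not be below the absorption edge E0")
    return math.sqrt(0.263 * (E - E0))

# Edge energy as quoted in the text; the sample offsets are illustrative.
E0 = 8393.5
k_values = [energy_to_k(E0 + dE, E0) for dE in (0.0, 50.0, 100.0, 200.0)]
```

Because $k$ grows with the square root of $E - E_0$, equally spaced energies are not equally spaced in $k$; data intended for a wavelet or FFT step are usually resampled onto a uniform $k$ grid first.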

We can use method A for the background removal of an EXAFS spectrum. Figure 5.24a shows an experimental spectrum of a Cu sample, and panel (b) shows the spectrum in $k$ space. In order to decompose the spectrum into its approximation and details, we can perform a WT on the $k$-space spectrum with Equations (5.1) and (5.2). Figure 5.25 shows the decomposed results, $\{c_4, d_4, d_3, d_2, d_1\}$, obtained with the Daubechies8 ($L = 8$) filter and $J = 4$. It is clear that the information on the EXAFS oscillation is decomposed into the $d_j$ and that $c_4$ represents the smooth background absorption. Therefore, if we reconstruct the spectrum from the $d_j$ components only, we will obtain the spectrum without the background absorption. The dotted line in Figure 5.24c shows the reconstructed spectrum. From the result, it is obvious that the background is removed. The solid line in Figure 5.24c shows the result obtained by the conventional cubic spline curve fitting after many trials. By comparing the two curves, it can be seen that their main shapes are almost the same, but the result by the WT method is superior to that of the conventional method in both shape and noise level in the high-$k$ region.

Figure 5.23. General procedures for analyzing an EXAFS spectrum: (a) raw spectrum; (b) oscillation part; (c) filtered oscillation signal.

Figure 5.24. An experimental EXAFS spectrum of Cu sample (a), the converted spectrum in $k$ space (b), and background-removed results obtained by conventional (c-solid) and WT (c-dot) methods.

Method B can also be used for removing the background in this example. Figure 5.26 shows the decomposed approximations obtained by using Equations (5.32) and (5.33) of the improved algorithm. It can be seen that the oscillation signal is gradually removed from $c_0$ to $c_4$, but the smoothed background remains. Therefore, if we subtract $c_4$ from $c_0$, the oscillation part can be obtained, which is shown in Figure 5.27, curve (a), by the dotted line. The solid line shows the results obtained by the conventional cubic spline interpolation method for comparison. Comparing the two curves in Figure 5.27, curve (a), we again find no significant difference between them, except that the results obtained by the WT method are superior to those obtained by the conventional method in both shape and noise level in the high-$k$ region.


Figure 5.25. Plots of the decomposed approximation ($c_4$) and details ($d_1$, $d_2$, $d_3$, $d_4$) obtained by applying WT to the $k$-space EXAFS spectrum with the MRSD algorithm.

Computational Details of Example 5.8

Method A:

1. Load the experimental spectrum (Fig. 5.24a) and the spectrum in $k$ space (Fig. 5.24b).

2. Extend the spectral data to an integer power of 2.

3. Make a wavelet filter (Daubechies4).

4. Set resolution level J = 4 (8 − 4).
