Chau Chemometrics From Basics to Wavelet Transform

Добавил:

fench Опубликованный материал нарушает ваши авторские права? Сообщите нам.

Вуз:

Казанский национальный исследовательский технологический университет

Предмет:

Химия

Файл:

εB = min{rk } is used, depending on the parameters e and p deﬁned by

.pdf

Скачиваний:

119

Добавлен:

15.08.2013

Размер:

2.71 Mб

Скачать

☆

<<< < Предыдущая 7 8 9 10 11 12 13 14 15 16 17 1819 / 3319 20 21 22 23 24 25 26 27 28 29 30 31 > Следующая >>>

data denoising and smoothing

169

last element is 0. A risk value for every coefﬁcient, rk , is then computed by

r	k =	(nj − 1) + bk + ak ck	(5.26)
		nj

The coefﬁcient that has the minimum rk is selected as the threshold value for the scale j :

ε = min{rk }

(5.27)

It should be noted that the absolute value of the coefﬁcient is used for the thresholding operation.

2. Visually Calibrated Adaptive Smoothing Method. This method is generally called the VISU method. The threshold value is computed

simply by
ε = (2 log n)1/2	(5.28)

where n is the number of the elements in the coefﬁcient vector. This value can be used for hardor soft-thresholding operation. For methods 1 and 2, the thresholding sometimes is performed only on the coefﬁcients in the index interval [2J + 1, n] (the wavelet or detail coefﬁcients part) where J is the predeﬁned largest decomposition scale. J must be smaller than log2 (N) when the length of the signal is N.

3. Hybrid Thresholding Method. This method is called HYBRID method because it is a soft-thresholding method where in some cases the VISU threshold εA = (2 log nj )1/2 is used, and in other cases SURE threshold

where J is the largest decomposition scale, wj and nj are the coefﬁcients of the scale j and its length, respectively. If e < p, then soft VISU thresholding is performed, if not, the smaller value in εA and εB is used to perform the soft-thresholding operation. In this method, the thresholding is performed on the coefﬁcients of each scale as in the SURE method.

4.Median Absolute Deviation (MAD) Method. The MAD method is a soft-thresholding method. It is also applied to the individual scales. The threshold used for each scale j is the same as in VISU method,

i.e., ε = (2 log n)1/2, but the coefﬁcients must be normalized by wj /sj , where sj = median(|wj |)/0.6745, before the thresholding is performed.

5.MINMAX Thresholding Method. This method ﬁnds the optimum threshold for each scale according to the mean-square error between the

170	application of wavelet transform in chemistry

original signal and the denoised signal. A set of threshold values is used that satisfy ε ≤ (2 log n)1/2. When the coefﬁcient number n is very large, the optimum threshold ε will approach (2 log n)1/2.

In practice uses, we can use the thresholds provided in the literature. For example, we can use the following threshold suggested by Donoho et al. in the WaveLab (http://www-stat.stanford.edu/ wavelab/):

{εj }j =1,... ,16 = {0, 0, 0, 0, 0, 1.27, 1.47, 1.67, 1.86, 2.05, 2.23,

(5.30)

2.41, 2.6, 2.77, 2.95, 3.13}

It should be noted that, for most of these above methods, the threshold must be applied to the data vector normalized to the noise level 1. It would be a good idea to normalize the signal f by f/s before performing the denoising or smoothing, where s can be computed by Equation (5.21).

Example 5.4: Denoising of a Simulated Chromatogram. Curves (a) in Figures 5.13 and 5.14 are simulated chromatograms with 1024 data point by Gaussian function

		j	cj exp				t − t0,j	2
V		n			4 ln (2)		t − t0,j		(5.31)
V	=	=		−	4 ln (2)	W1/2,j			(5.31)
	=	=		−		W1/2,j
		1

Figure 5.13. Plots of a simulated clean chromatogram (a), noisy chromatogram with random noise of SNR = 100 (b), and the denoised results by WT with hard- (c), soft- (d), SURE (e), VISU (f ), HYBRID (g), and MINMAX (h) thresholding methods, respectively.

data denoising and smoothing

171

Figure 5.14. Plots of a simulated clean chromatogram (a), noisy chromatogram with random noise of SNR = 20 (b), and the denoised results by hard- (c), soft- (d), SURE (e), VISU (f ), HYBRID (g), and MINMAX (h) thresholding methods, respectively.

where V is the simulated chromatogram, t is retention time, n is the number of peaks, and cj , t0,j , and W1/2,j are the concentration, position, and full width at half height of the peak j , respectively. Curve (b) in Figure 5.13 shows a noisy chromatogram generated by adding random noise in curve (a). The SNR calculated by the ratio of the maximum of the signal and the maximum of the noise is 100. Curve (b) in Figure 5.14 is generated in the same way, but the SNR is 20.

Curves (c)--(h) in Figures 5.13 and 5.14 are the denoised results achieved by different thresholding methods. The Symmlet4 (L = 8) ﬁlter and the decomposition level J = 5 were adopted in the calculation. Curves

(c)and (d) in Figure 5.13 are obtained, respectively, by hardand softthresholding methods with the threshold ε = 0.003 × [max (w) − min (w)], which is determined by trial and error. In Figure 5.14, curves (c) and

(d)are obtained in the same way as in Figure 5.13, but the threshold ε = 0.015 × [max (w) − min (w)] is adopted. The thresholds for all the other thresholding methods are determined by the abovementioned equations. Soft thresholding is adopted for the HYBRID method, while hard thresholding operation is adopted for the other methods. In order to compare the denoised results with the original simulated signal, RMS values between curves (a) and (c)--(h) are calculated, respectively. These values are 0.0064, 0.0081, 0.0055, 0.0065, 0.0092, and 0.0060 for the results in

172	application of wavelet transform in chemistry

Figure 5.13 (SNR = 100), and 0.0288, 0.0325, 0.0287, 0.0289, 0.0355, and 0.0304 for the results in Figure 5.14 (SNR = 20). It can be seen that all the methods can give acceptable results, but the distortion will increase with the noise level.

Computational Details of Example 5.4

1.Generate the noisy signal with 1024 data points; see curve (b), Figure 5.13 using the Gaussian equation.

2.Make a wavelet ﬁlter---Symmlet4.

3.Normalize the data to noise level 1.

4.Set resolution level J = 5 (10 − 5).

5.Perform forward WT to obtain the wavelet coefﬁcients.

6.Perform thresholding to the wavelet coefﬁcients to suppress the smaller coefﬁcients, construct the signal by applying inverse WT to the suppressed coefﬁcients, and display the denoised results, respectively, by

6.1.Hard thresholding with manually determined threshold [Fig. 5.13, (c)].

6.2.Soft thresholding with manually determined threshold [Fig. 5.13, (d)].

6.3.SURE (hard) thresholding [Fig. 5.13, (e)].

6.4.HYBRID (soft) thresholding [Fig. 5.13, (f)].

6.5.VISU (hard) tresholding [Fig. 5.13, (g)].

6.6.MINMAX (hard) thresholding [Fig. 5.13, (h)].

Note: The only difference between Figures 5.13 and 5.14 is the SNR of the noisy signal.

From the denoised results curves (c) and (d) in both Figures 5.13 and 5.14, it can be seen that both the hard and soft thresholding by manually determined threshold ε can give satisfactory results. But these results are obtained by trial and error, which is tedious and time-consuming. Among the results of SURE, VISU, HYBRID, and MINMAX, whose thresholds are determined automatically by the abovementioned equations, the SURE method result is the best. It is clear that curves (e) in both Figure 5.13 and 5.14 are clean and the least distorted. Although curves (f) and (g) are also clean enough, there are obvious distortions, especially for the last small peak in the shoulder. From curve (h) in the two ﬁgures, it is clear that the MINMAX is not a good method for denoising of the chromatogram, because

data denoising and smoothing

173

there is still noise in the denoised result, especially when the SNR is low in Figure 5.14.

5.2.2.Smoothing

According to the principle of the WT smoothing, the procedure of smoothing with WT can be summarized as follows:

1.Apply a WT to noisy signal fnoisy and obtain the wavelet coefﬁcients vector w.

2.Remove those elements that are thought to represent high-frequency ﬂuctuation in w.

3.Apply the inverse transform to the suppressed w to obtain the smoothed signal fsmoothed.

It can be seen that the only difference between denoising and smoothing is the second step. For denoising, coefﬁcients thought to be small enough are suppressed, but for smoothing, the coefﬁcients representing high-frequency information are removed. According to the principles of WT, the detail coefﬁcients at low scale are generally attributed to high-frequency components.

In order to remove the coefﬁcients corresponding to the high-frequency component, a threshold for cutting off those coefﬁcients must be provided for smoothing. Unfortunately, it is not so easy to determine a frequency threshold as in denoising. Therefore, the threshold value is again determined by trial and error.

As indicated in Figure 5.3, 5.7, and 5.9, it is difﬁcult to determine a cutoff position from the ﬁgure. Therefore, an improved algorithm was proposed for DWT calculation. It can be described by

1
			L
cj ,k =	√		m		1 hj ,m cj −1,k −m	(5.32)
		2		=

1
			L
dj ,k =	√		m		1 gj ,m cj −1,k −m	(5.33)
		2		=

where the hj and gj are obtained by inserting 2j −1 − 1 zeros into the every adjacent element of h and g in Equation (5.1) and (5.2). Subsequently, the length of the ﬁlter L will be doubled. By this algorithm, both cj = {cj ,k }

174	application of wavelet transform in chemistry
and dj	= {dj ,k } keep the same length with cj −1 = {cj −1,k }. Therefore, we

can plot cj and dj , respectively, to inspect their frequency. Furthermore, the algorithm can be used to decompose the signals c0 with any length without using the special techniques such as coefﬁcient position retaining (CPR) method.

The corresponding reconstruction algorithm can be described by
						1
cj ,k =	1 L				1 hj ,m cj +1,k −m +			L		1 gj ,m dj +1,k −m	(5.34)
	√		m			√		m
		2		=			2		=

Example 5.5: Data Smoothing of a Simulated Chromatogram. Figure 5.15 shows the cj and dj obtained by Equations (5.32) and (5.33) with a Daubechies4 (L = 4) ﬁlter and J = 5 from the simulated signal of curve

(b) in Figure 5.14. It is clear that the d1 d4 are attributed to the highfrequency components (noise). Therefore, the smoothed results can be obtained by reconstruction by Equation (5.34) where d1 d4 is set to zeros, which is shown in Figure 5.16. By comparison of the smoothed result with the simulated signal in Figure 5.16, it can be seen that there is almost no distortion after the smoothing. The RMS between the two curves is only 0.0028, which is smaller than any of the denoising methods discussed above.

Figure 5.15. Plot of the wavelet coefﬁcients obtained by WT with the improved algorithm.

data denoising and smoothing

175

Figure 5.16. Comparison of the simulated clean chromatogram (a) and the smoothed result by WT with the improved algorithm (b).

Computational Details of Example 5.5

1.Generate the noisy signal with 1024 data points [Fig. 5.14 (a),(b)] using the Gaussian equation.

2.Make a wavelet ﬁlter---Daubechies4.

3.Set resolution level J = 5.

4.Perform WT to obtain the c and d components with the improved algorithm.

5.Display Figure 5.15.

6.Perform smoothing by replacing the d1 − d4 with zeros and constructing the signal with inverse WT.

7.Display Figure 5.16.

8.Display the RMS between the original signal and the smoothed signal.

We can also propose a method to estimate an approximate scale threshold for deciding the coefﬁcients at which scale should be set to zero using the characteristics of Equations (5.32) and (5.33) and the concept of the Nyquist critical frequency in sampling theory. The Nyquist critical frequency is deﬁned by

fc =	1	(5.35)
	2 S

where S is sampling time interval. For a signal with the Nyquist critical frequency fc , the frequency of the cj and dj calculated by Equations (5.32)

176	application of wavelet transform in chemistry

and (5.33) will match the following inequality:

2−j fc ≤ fdj ≤ 2−j +1fc (5.36)

fcj > 2−j fc

Then, if we take the peak of an analytical signal (e.g., a chromatogram) as a periodic signal, following the deﬁnition of the Nyquist critical frequency fc , the frequency of the peak can be deﬁned by

fp =	1	(5.37)
	2Wb

where Wb denotes the width of the peak. Because the aim of the smoothing is to remove high-frequency ﬂuctuation and retain the chromatographic signal, the frequency of the coefﬁcients to be removed, fcut, should be much higher than that of the analytical peak:

where Wth denotes the estimated maximal width of noise, then rearrange the Equation (5.38) and perform a logarithm on both sides, we obtain

−j + log2				1		= log2		1
−j + log2				2 S		= log2		2Wth	(5.40)
W	th	W
	th		b
Therefore, if we denote the j		in Equation (5.40) as j th, we have
	j th = log2					Wth
	j th = log2						S		(5.41)
	W		th		W
			th			b

We can estimate an approximate scale threshold j th by Equation (5.41); all the detail coefﬁcients at the scale lower than the j th should be set to zero in the reconstruction calculation. The difﬁculty in using the method is that the parameter Wth must be estimated by experience.

Example 5.6: Data Smoothing of an Experimental Chromatogram.

Figure 5.17 shows an experimental chromatogram measured by an reverse-phase HPLC and postcolumn reaction detection with arsenazo III. The sample is composed of six rare-earth ions (Lu, Yb, Tm, Er, Ho, and

data denoising and smoothing

177

Figure 5.17. Experimental chromatogram with great level noise of a mixed rare-earth solution sample.

Yb). The chromatogram recorded between 1.505 and 11.811 min sampled every 0.005 min is shown. The data length is 2048. The noise is caused by the strong absorption of the postcolumn reaction agent and the pulse of the mobile phase. Therefore, the noise is an oscillation as shown in the enlarged part of the ﬁgure. It is impossible to denoise such a chromatogram by other commonly used denoising methods. But we can easily smooth the chromatogram by using the WT smoothing method mentioned above.

At ﬁrst, we can determine a scale threshold by examination of the enlarged part of the chromatogram. It is easy to ﬁnd that the maximal width

Figure 5.18. Comparison of the smoothed results of the noisy chromatogram by WT with general (a) and the improved (b) algorithms.

178	application of wavelet transform in chemistry

of the noise is 16 points (0.08 min). By Equation (5.41), we can calculate the scale threshold j th = log2 (0.008/0.005) = log2 (16) = 4. Therefore, if we perform a WT decomposition with J = 4 and Dabechies4 ﬁlter, and then reconstruct the signal with all the detail coefﬁcients (d1 d4) set to zero, we can obtained the smoothed result, which is shown in Figure 5.18. It should be noted that the scale threshold by Equation (5.41) can be used for both the general algorithm by Equations (5.1), (5.2), and (5.5) and the improved algorithm by Equations (5.32)--(5.34). Curve (a) in Figure 5.18 is obtained by the former algorithm, and curve (b) is obtained by the later one. It can be seen that they are almost the same.

Computational Details of Example 5.6

Mallat algorithm: Figure 5.18 (a):

1.Load the experimental chromatographic signal (Fig. 5.17).

2.Make a wavelet ﬁlter---Symmlet4.

3.Set resolution level J = 5 (10 − 5).

4.Perform WT to obtain the wavelet coefﬁcients with the Mallat algorithm.

5.Perform smoothing by replacing the detail coefﬁcients of last Four scales with zeros and constructing the signal with inverse WT.

6.Display Figures 5.17 and 5.18 (a).

7.Display the RMS between the original signal and the smoothed signal.

Improved algorithm: Figure 5.18 (b):

1.Load the experimental chromatographic signal (Fig. 5.17).

2.Make a wavelet ﬁlter---Daubechies4.

3.Set resolution level J = 4.

4.Perform WT to obtain the c and d components with the improved algorithm.

5.Perform smoothing by replacing the d1 − d4 with zeros and constructing the signal with inverse WT.

6.Display Figures 5.17 and 5.18 (b).

7.Display the RMS between the original signal and the smoothed signal.

It should be noted that, in Examples 5.4, 5.5, and 5.6, the ﬁlters used are not optimized, better results may be obtained if we use better ﬁlters.

<<< < Предыдущая 7 8 9 10 11 12 13 14 15 16 17 1819 / 3319 20 21 22 23 24 25 26 27 28 29 30 31 > Следующая >>>

Соседние файлы в предмете Химия

#
15.08.201321.36 Mб49Carey F.A. - Organic Chemistry (2004)(en).djvu
#
15.08.201321.36 Mб41Carey F.A. Advanced organic chemistry 5ed., MGH, 2004.djvu
#
15.08.201311.62 Mб29Carey F.A. Advanced organic chemistry. Part A structure and mechanisms 1938.djvu
#
15.08.20138.77 Mб20Carey F.A. Advanced organic chemistry. Part B reaction and synthesis 1938.djvu
#
15.08.20139.85 Mб27Cazes J., Scott R.P.W. Chromatography theory New York 2002.djvu
#
15.08.20132.71 Mб119Chau Chemometrics From Basics to Wavelet Transform.pdf
#
15.08.20135.81 Mб280Chemiluminescence in Analytical Chemistry.pdf
#
15.08.20133.98 Mб19Chen The electron capture detector.pdf
#
15.08.20133 Mб86Chivers T. - A Guide to Chalcogen-Nitrogen Chemistry (2005)(en).pdf
#
15.08.201332.51 Mб64Clayden J. - Organic chemistry (Oxford, 2000).pdf
#
15.08.20136.12 Mб50Cohen M.F., Wallace J.R. - Radiosity and realistic image synthesis (1995)(en).pdf