
Chau Chemometrics From Basics to Wavelet Transform


digital smoothing and filtering methods

% The parameter win_num is the window size, which can be chosen
% from the values 7, 9, 11, 13, 15, and 17.
% The parameter poly_order is the polynomial order, which can be
% chosen as 2 or 3 (quadratic/cubic) or as 4 or 5 (quartic/quintic).
[m1,n1]=size(x);
y=zeros(size(x));

% Orders 2 and 3 share one set of weights; any other order selects
% the quartic/quintic set, as in the original branching.
low_order=(poly_order==2 || poly_order==3);

switch win_num
  case 7
    if low_order
      coef1=[-2 3 6 7 6 3 -2]/21;
    else
      coef1=[5 -30 75 131 75 -30 5]/231;
    end
  case 9
    if low_order
      coef1=[-21 14 39 54 59 54 39 14 -21]/231;
    else
      coef1=[15 -55 30 135 179 135 30 -55 15]/429;
    end
  case 11
    if low_order
      coef1=[-36 9 44 69 84 89 84 69 44 9 -36]/429;
    else
      coef1=[18 -45 -10 60 120 143 120 60 -10 -45 18]/429;
    end
  case 13
    if low_order
      coef1=[-11 0 9 16 21 24 25 24 21 16 9 0 -11]/143;
    else
      coef1=[110 -198 -135 110 390 600 677 600 390 110 -135 -198 110]/2431;
    end
  case 15
    if low_order
      coef1=[-78 -13 42 87 122 147 162 167 162 147 122 87 42 -13 -78]/1105;
    else
      coef1=[2145 -2860 -2937 -165 3755 7500 10125 11063 10125 7500 3755 -165 -2937 -2860 2145]/46189;
    end
  case 17
    if low_order
      coef1=[-21 -6 7 18 27 34 39 42 43 42 39 34 27 18 7 -6 -21]/323;
    else
      coef1=[195 -195 -260 -117 135 415 660 825 883 825 660 415 135 -117 -260 -195 195]/4199;
    end
  otherwise
    error('win_num must be 7, 9, 11, 13, 15, or 17');
end

% Weighted sum over the window centered at row i, for every column j.
% Rows closer than half_win to either end of x are left at zero.
half_win=(win_num-1)/2;
for j=1:n1
  for i=half_win+1:m1-half_win
    y(i,j)=coef1*x(i-half_win:i+half_win,j);
  end
end
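The interior loop in the listing above amounts to convolving each column of x with the weight vector. As a minimal sketch, assuming the 7-point quadratic/cubic weights and a small synthetic matrix (x_demo, y_loop, y_conv, and half_win are illustrative names, not part of the original listing), the explicit loop can be compared with a single conv2 call:

```matlab
% Synthetic two-column test signal with a deterministic disturbance
% (illustrative data only, not taken from the book).
t = (1:50)';
disturbance = 0.05*(-1).^t;
x_demo = [sin(t/5) + disturbance, cos(t/7) - disturbance];

coef1 = [-2 3 6 7 6 3 -2]/21;      % 7-point quadratic/cubic weights
half_win = 3;
[m1, n1] = size(x_demo);

% Explicit loop: weighted sum over the window centered at row i.
y_loop = zeros(m1, n1);
for j = 1:n1
    for i = half_win+1:m1-half_win
        y_loop(i,j) = coef1*x_demo(i-half_win:i+half_win, j);
    end
end

% Same result by convolving every column with the weights as a column kernel.
% 'valid' keeps only rows where the whole window fits inside the data.
y_conv = zeros(m1, n1);
y_conv(half_win+1:m1-half_win, :) = conv2(x_demo, coef1(:), 'valid');

max_diff = max(abs(y_loop(:) - y_conv(:)));
fprintf('largest difference between loop and conv2: %.2e\n', max_diff);
```

Because the weights are symmetric, the kernel flip performed by conv2 has no effect, so max_diff should be at round-off level.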

one-dimensional signal processing techniques in chemistry

Table 2.2. Weights of the Savitzky-Golay Filter for Smoothing Based on a Quartic/Quintic Polynomial

Points       25       23       21       19       17       15       13       11        9        7
−12       1,265
−11        −345       95
−10      −1,122      −38   11,628
−9       −1,255      −95   −6,460      340
−8         −915      −95  −13,005     −255      195
−7         −255      −55  −11,220     −420     −195    2,145
−6          590       10   −3,940     −290     −260   −2,860      110
−5        1,503       87    6,378       18     −117   −2,937     −198       18
−4        2,385      165   17,655      405      135     −165     −135      −45       15
−3        3,155      235   28,190      790      415    3,755      110      −10      −55        5
−2        3,750      290   36,660    1,110      660    7,500      390       60       30      −30
−1        4,125      325   42,120    1,320      825   10,125      600      120      135       75
0         4,253      337   44,003    1,393      883   11,063      677      143      179      131
1         4,125      325   42,120    1,320      825   10,125      600      120      135       75
2         3,750      290   36,660    1,110      660    7,500      390       60       30      −30
3         3,155      235   28,190      790      415    3,755      110      −10      −55        5
4         2,385      165   17,655      405      135     −165     −135      −45       15
5         1,503       87    6,378       18     −117   −2,937     −198       18
6           590       10   −3,940     −290     −260   −2,860      110
7          −255      −55  −11,220     −420     −195    2,145
8          −915      −95  −13,005     −255      195
9        −1,255      −95   −6,460      340
10       −1,122      −38   11,628
11         −345       95
12        1,265
Norm     30,015    2,185  260,015    7,429    4,199   46,189    2,431      429      429      231
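The tabulated weights can be cross-checked numerically. The sketch below assumes the usual least-squares derivation of the Savitzky-Golay filter: a polynomial of order p is fitted to the window positions −m, ..., m, and the smoothing weights are the row of the pseudoinverse that yields the fitted value at the window center. The helper names (design, first_row, sg_weights) are hypothetical and serve only this illustration.

```matlab
% Design matrix with columns t.^0, t.^1, ..., t.^p for window positions t.
design = @(t, p) bsxfun(@power, t, 0:p);
% First row of a matrix: the row that gives the fitted value at t = 0.
first_row = @(M) M(1, :);
% Smoothing weights for window half-width m and polynomial order p.
sg_weights = @(m, p) first_row(pinv(design((-m:m)', p)));

w_quad  = sg_weights(3, 2);   % 7 points, quadratic/cubic
w_quart = sg_weights(3, 4);   % 7 points, quartic/quintic

% Compare with the integer weights divided by their normalization constants.
err_quad  = max(abs(w_quad  - [-2 3 6 7 6 3 -2]/21));
err_quart = max(abs(w_quart - [5 -30 75 131 75 -30 5]/231));
fprintf('max deviation: quadratic %.2e, quartic %.2e\n', err_quad, err_quart);
```

Since a smoothing filter must reproduce a constant signal, each set of weights sums to 1, which gives a quick consistency check on any column of the table.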

2.1.3. Kalman Filtering

Kalman filtering is an optimal linear recursive estimation method. It is fast and requires relatively little memory for computation. Kalman filtering has been used extensively in engineering, especially in space technology. Recursive operation is the key feature of the method, so we first introduce recursive operation before discussing Kalman filtering in detail.

Figure 2.3. Smoothing results obtained by the Savitzky-Golay filter with different window sizes. Each of the four plots (x axis: signal point; y axis: signal intensity) shows the original curve (solid line), the raw noisy signal (cross line), and the smoothed curve (dashed line) for a window size of 7 (a), 11 (b), 13 (c), and 17 (d).

The basic idea of recursive operation is to reuse the results obtained previously together with the newly acquired information, so that unnecessary repeated calculation is avoided. Let us first look at the basic feature of recursive operation through a simple example. The mean value is usually evaluated using the following formula

$$\bar{x} = \frac{\sum x_i}{n} \qquad (2.11)$$

where $\sum x_i$ denotes the sum of the n observations $x_i$ (i = 1, ..., n). When a new observation $x_{n+1}$ is measured, the mean has to be calculated again using Equation (2.11). Hence, all n observations obtained before must be stored in the computer for future use. With recursive operation, however, the new mean can be evaluated through the following formula without using all the observations:

$$\bar{x}_{n+1} = \bar{x}_n + \frac{x_{n+1} - \bar{x}_n}{n + 1} \qquad (2.12)$$

Comparing this formula with Equation (2.11), one can see that the recursive operation needs only the previous mean $\bar{x}_n$ and the new observation $x_{n+1}$, so it is faster and more efficient. This is the attractive feature of Kalman filtering.
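As a small numerical illustration of Equations (2.11) and (2.12), the following sketch updates the mean one observation at a time and compares it with the batch result. The data values and the names x_obs, mean_rec, and mean_batch are made up for this example.

```matlab
% Illustrative observations (arbitrary values).
x_obs = [2.1 1.9 2.4 2.0 2.2 1.8];

% Recursive mean, Eq. (2.12): only the previous mean and the new
% observation are needed at each step.
mean_rec = 0;
for n = 0:numel(x_obs)-1
    mean_rec = mean_rec + (x_obs(n+1) - mean_rec)/(n + 1);
end

% Batch mean, Eq. (2.11): needs all observations at once.
mean_batch = sum(x_obs)/numel(x_obs);

fprintf('recursive: %.6f, batch: %.6f\n', mean_rec, mean_batch);
```

The two results agree to round-off, while the recursive version never has to revisit earlier observations.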


The Kalman filter is based on a dynamic system model

$$x(k) = F(k, k-1)\,x(k-1) + w(k) \qquad (2.13)$$

and a measurement model

$$y(k) = h(k)^{t}\,x(k-1) + e(k) \qquad (2.14)$$

where x(k), y(k), and h(k) denote the state vector, the measurement, and the measurement function vector, respectively. The variable k represents a measurement point, which can be time, wavelength, or another index. F(k, k − 1) is the system transition matrix, which describes how the system moves from state (k − 1) to state k. Very often, it is an identity matrix for smoothing purposes. w(k) denotes the dynamic system noise and can be taken as approximately a zero vector, because the smoothing filter can be regarded as a static model. e(k) is the measurement noise, which can be treated as a Gaussian stochastic variable with zero mean and constant variance.

The core recursive state estimate update in Kalman filtering is given by the following equation

$$x(k) = x(k-1) + g(k)\,[y(k) - h(k)^{t}\,x(k-1)] \qquad (2.15)$$

where the vector g(k) is called the Kalman gain. Comparing this equation with Equation (2.12), one can easily see the similarity between the two. The Kalman gain g(k) corresponds to 1/(n + 1) in Equation (2.12) and scales the correction from x(k − 1) to x(k) through the measurement difference $[y(k) - h(k)^{t}\,x(k-1)]$. Equation (2.15) also shows that the state estimate update is based only on the newly measured y(k) and the state vector x(k − 1) obtained before, which is what makes efficient recursive operation possible.

The Kalman gain can be determined by the following formula

$$g(k) = P(k-1)\,h(k)\,[h(k)^{t}\,P(k-1)\,h(k) + r(k)]^{-1} \qquad (2.16)$$

where r(k) represents the variance of the measurement noise e(k). P(k − 1) is the covariance matrix of the system estimated from the (k − 1) observations obtained before; it is updated through

$$P(k) = [I - g(k)\,h(k)^{t}]\,P(k-1)\,[I - g(k)\,h(k)^{t}]^{t} + g(k)\,r(k)\,g(k)^{t} \qquad (2.17)$$

where I is an identity matrix.


From the discussion above, it can be seen that the Kalman gain vector can be computed through Equation (2.16) once the initial values of x(k) and P(k), say x(0) and P(0), are known. Then, the next x(k) and P(k) can be computed through Equations (2.15) and (2.17), and the cycle is repeated until convergence is attained.

In summary, the procedure of Kalman filtering can be carried out via the following steps:

1. Setting the initial values:

$$x(0) = 0, \quad P(0) = \gamma^{2} I \qquad (2.18)$$

where $\gamma^{2}$ is an initial estimate of the variance of the measurement noise, which might be given by the following empirical formula

$$\gamma^{2} = a\,\frac{r(1)}{[h(1)^{t}\,h(1)]^{1/2}} \qquad (2.19)$$

The factor a influences the calculation accuracy and can take values from 10 to 100. Note that the initial value of P(0) is crucial for the estimation. If it is too small, the estimate can be biased; if it is too large, it is difficult for the computation to converge to the desired value.

2. Recursive calculation loop:

$$g(k) = P(k-1)\,h(k)\,[h(k)^{t}\,P(k-1)\,h(k) + r(k)]^{-1}$$

$$x(k) = x(k-1) + g(k)\,[y(k) - h(k)^{t}\,x(k-1)]$$

$$P(k) = [I - g(k)\,h(k)^{t}]\,P(k-1)\,[I - g(k)\,h(k)^{t}]^{t} + g(k)\,r(k)\,g(k)^{t}$$

where r(k) is the variance of the measurement noise, which can be determined from the variance of the real noise. This loop is repeated until the estimates become stable.

In the Kalman filtering algorithm, the innovation series is very important and can indicate whether the results obtained are reliable. The innovation series is obtained by the following equation:

$$v(k) = y(k) - h(k)^{t}\,x(k-1) \qquad (2.20)$$

The series is the difference between the measurement and its estimate and can be regarded as the residual at point k. If the filtering model used is correct, the innovation series should be white noise with zero mean. Otherwise, the results obtained are not reliable.
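The two steps and the innovation series can be sketched together as follows. This is a minimal illustration, not the authors' implementation: it assumes a static two-component model (F is the identity and w(k) is zero), Gaussian-shaped responses s1 and s2 as hypothetical measurement functions, a deterministic stand-in for the measurement noise so that the run is reproducible, and the covariance update with the current gain g(k) and a transposed second factor. All variable names are illustrative.

```matlab
% Hypothetical two-component system measured at 60 points.
n_pts  = 60;
k_axis = (1:n_pts)';
s1 = exp(-((k_axis - 20)/8).^2);     % assumed response of component 1
s2 = exp(-((k_axis - 35)/10).^2);    % assumed response of component 2
x_true = [2; 0.5];                   % amounts to be estimated
e = 0.01*sin(37*k_axis);             % deterministic stand-in for e(k)
y = s1*x_true(1) + s2*x_true(2) + e; % measurements y(k)

% Step 1: initial values, Eqs. (2.18) and (2.19).
r = 1e-4;                            % assumed noise variance r(k)
a = 10;                              % empirical factor (10 to 100)
h1 = [s1(1); s2(1)];
gamma2 = a*r/sqrt(h1'*h1);
x_est = zeros(2, 1);
P = gamma2*eye(2);

% Step 2: recursive loop, Eqs. (2.16), (2.15), (2.17), with the
% innovation v(k) of Eq. (2.20) stored at every point.
v = zeros(n_pts, 1);
I2 = eye(2);
for k = 1:n_pts
    h = [s1(k); s2(k)];
    g = P*h/(h'*P*h + r);                        % Kalman gain
    v(k) = y(k) - h'*x_est;                      % innovation
    x_est = x_est + g*v(k);                      % state update
    P = (I2 - g*h')*P*(I2 - g*h')' + g*r*g';     % covariance update
end

fprintf('estimate: [%.4f %.4f], last innovation: %.4f\n', ...
        x_est(1), x_est(2), v(end));
```

With real data, the stored innovations v can be inspected after the run; a clear trend or nonzero mean would suggest that the assumed model does not describe the measurements.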


Kalman filtering can be applied for filtering, smoothing, and prediction. Its most common application is in multicomponent analysis.

2.1.4. Spline Smoothing

In addition to the smoothing methods based on digital filters discussed previously, spline functions are also widely used in signal processing. The main advantage of spline functions is their differentiability over the entire measurement domain.

Among the various spline functions, the cubic spline function is the most common one and is defined as follows

$$y = S(x) = A_k (x - x_k)^3 + B_k (x - x_k)^2 + C_k (x - x_k) + D_k \qquad (2.21)$$

where $A_k$, $B_k$, $C_k$, and $D_k$ are the spline coefficients at data point k. The cubic spline function S(x), or y, for observations on the abscissa intervals $x_1 < x_2 < \cdots < x_n$ satisfies the following conditions:

1. The points $x_k$ that delimit the intervals are called knots. The knots may be identical with the index points on the x axis (abscissa).

2. At the knots, S(x) obeys the continuity constraint on the function and on its first two derivatives.

3. S(x) is a cubic function in each subrange $[x_k, x_{k+1}]$ considered, for k = 1, ..., n − 1.

4. Outside the range from $x_1$ to $x_n$, S(x) is a straight line.

For a fixed interval between the data points $x_k$ and $x_{k+1}$, the following relationships are valid for the signal values and their derivatives:

$$\begin{aligned}
y_k &= D_k \\
y_{k+1} &= A_k (x_{k+1} - x_k)^3 + B_k (x_{k+1} - x_k)^2 + C_k (x_{k+1} - x_k) + D_k \\
y'_k &= S'(x_k) = C_k \\
y'_{k+1} &= 3A_k (x_{k+1} - x_k)^2 + 2B_k (x_{k+1} - x_k) + C_k \\
y''_k &= S''(x_k) = 2B_k \\
y''_{k+1} &= 6A_k (x_{k+1} - x_k) + 2B_k
\end{aligned}$$
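These relationships can be verified numerically for a single spline piece. The sketch below uses arbitrary coefficient values and knot positions (all numbers and variable names are illustrative) and evaluates the piece and its derivatives with polyval and polyder, treating S as a polynomial in the local variable (x − x_k).

```matlab
% Arbitrary coefficients of one cubic piece and its two knots.
A_k = 0.5;  B_k = -1;  C_k = 2;  D_k = 3;
x_k = 1.0;  x_k1 = 1.5;
step = x_k1 - x_k;                 % (x_{k+1} - x_k)

S   = [A_k B_k C_k D_k];           % polynomial in (x - x_k), highest power first
dS  = polyder(S);                  % first derivative:  [3*A_k 2*B_k C_k]
d2S = polyder(dS);                 % second derivative: [6*A_k 2*B_k]

y_k    = polyval(S, 0);            % equals D_k
y_k1   = polyval(S, step);         % value at the right knot
dy_k   = polyval(dS, 0);           % equals C_k
dy_k1  = polyval(dS, step);        % slope at the right knot
d2y_k  = polyval(d2S, 0);          % equals 2*B_k
d2y_k1 = polyval(d2S, step);       % equals 6*A_k*step + 2*B_k

fprintf('y: %.4f %.4f  dy: %.4f %.4f  d2y: %.4f %.4f\n', ...
        y_k, y_k1, dy_k, dy_k1, d2y_k, d2y_k1);
```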

The spline coefficients can be determined by a method that smooths the data under study at the same time. The ordinate values $\hat{y}_k$ are calculated such that their differences from the observed values are positively proportional to the jumps $r_k$ in the third derivative at point $x_k$:

$$r_k = S'''(x_k) - S'''(x_{k+1}) \qquad (2.22)$$

$$r_k = p_k\,(y_k - \hat{y}_k) \qquad (2.23)$$

The proportionality factors $p_k$ are determined by cross-validation. In contrast with polynomials, spline functions may be applied to approximate and smooth any kind of curve shape. It should be mentioned that many more coefficients must be estimated and stored than with the polynomial filters, because different coefficients apply in each interval. A disadvantage of smoothing splines is that the parameter estimates are biased. Therefore, it is more difficult to describe the statistical properties of spline functions than those of linear regression.

In MATLAB, there is a cubic smoothing spline function named csaps. The call csaps(X, Y, p, X) returns a smoothed version of the input data (X, Y) obtained with a cubic smoothing spline, and the result depends on the value of the smoothing parameter p (from 0 to 1). For p = 0, the smoothing spline corresponds to the least-squares straight-line fit to the data, while at the other extreme, with p = 1, it is the "natural" or variational cubic spline interpolant. The transition region between these two extremes is usually only a rather small range of values for p, and its optimal value strongly depends on the nature of the data. Figure 2.4 shows an example of smoothing by a cubic spline smoother with different p values. From the plots in the figure, one can see that the choice of the right value for the parameter p is crucial. The smoothing results are satisfactory if one makes a good choice, as depicted in Figure 2.4c. To make it easier for readers to understand the smoothing procedure using the cubic spline smoother, MATLAB source code is given in the following frame:

% Noisy cosine curve smoothed by csaps with three values of p
xi=0:.05:1.5;
yi=cos(xi);
ybad=yi+.2*(rand(size(xi))-.5);   % uniform noise in [-0.1, 0.1]
figure(2)

% (a) original curve and noisy data
subplot(221), plot(xi,yi,'k:',xi,ybad,'kx'), grid on
title('Original curve: dashed line; Noisy data: cross')
axis([0 1.5 0 1.2])
xlabel('Variable (x)')
ylabel('Signal, (y)')

% (b) p = 0.9981
yy1=csaps(xi,ybad,.9981,xi);
subplot(222), plot(xi,yi,'k:',xi,ybad,'kx',xi,yy1,'k'), grid on
title('Smoothed curve: solid line with p=0.9981')
axis([0 1.5 0 1.2])
xlabel('Variable (x)')
ylabel('Signal, (y)')

% (c) p = 0.9756
yy2=csaps(xi,ybad,.9756,xi);
subplot(223), plot(xi,yi,'k:',xi,ybad,'kx',xi,yy2,'k'), grid on
title('Smoothed curve: solid line with p=0.9756')
axis([0 1.5 0 1.2])
xlabel('Variable (x)')
ylabel('Signal, (y)')

% (d) p = 0.7856
yy3=csaps(xi,ybad,.7856,xi);
subplot(224), plot(xi,yi,'k:',xi,ybad,'kx',xi,yy3,'k'), grid on
title('Smoothed curve: solid line with p=0.7856')
axis([0 1.5 0 1.2])
xlabel('Variable (x)')
ylabel('Signal, (y)')

Usually, it is difficult to choose the best value for the parameter p without experimentation. If one has difficulty in doing this but has an idea of the noise level in Y, the MATLAB command spaps(X, Y, tol) may help. Select

Figure 2.4. Smoothing results obtained by a cubic spline smoother with different values of the parameter p (x axis: variable (x); y axis: signal (y)): (a) the original curve (dashed line) and the raw noisy signals (crosses); (b) the smoothed curve (solid line) with p = 0.9981; (c) the smoothed curve with p = 0.9756; (d) the smoothed curve with p = 0.7856.
