- Question 1 Structure of a Speech Coding System
- Question 3 Desirable Properties of a Speech Coder
- Question 4 About Coding Delay
- Question 5 Classification of Speech Coders
- Question 6 Origin of Speech Signals
- Question 7 Structure of the Human Auditory System
- Question 8 Absolute Threshold
- Question 9 Speech Coding Standards
- Question 10 Pitch Period Estimation
- Question 11 Linear Prediction
- Question 12 Error Minimization
- Question 13/14 Prediction Schemes
- Question 15 Long-Term Linear Prediction
- Question 16/17 Linear Predictive Coding (LPC)
- 16. Speech Encoding. LPC Encoder
- Overview
- LPC Coefficient Representations
- Applications
- 20/21. Speech Encoding. CELP Coder
- 22/23. Speech Encoding. LD-CELP Coder
- 14.1 Strategies to Achieve Low Delay
- 24/25. Speech Encoding. ACELP (G.729) Coder
- 35. JPEG2000 in Video Compression (MJPEG)
- 36. Coding for High-Quality Moving Pictures (MPEG-2)
Question 12 Error Minimization
The system identification problem consists of the estimation of the AR parameters â_i from s[n], with the estimates being the LPCs. To perform the estimation, a criterion must be established. In the present case, the mean-squared prediction error

$$J = E\{e^2[n]\} = E\left\{\left(s[n] + \sum_{i=1}^{M} a_i s[n-i]\right)^{2}\right\} \qquad (4.3)$$

is minimized by selecting the appropriate LPCs. Note that the cost function J is precisely a second-order function of the LPCs. Consequently, we may visualize the dependence of the cost function J on the estimates a_1, a_2, ..., a_M as a bowl-shaped (M + 1)-dimensional surface with M degrees of freedom. This surface is characterized by a unique minimum. The optimal LPCs can be found by setting the partial derivatives of J with respect to a_k to zero; that is,
$$\frac{\partial J}{\partial a_k} = 2E\left\{\left(s[n] + \sum_{i=1}^{M} a_i s[n-i]\right) s[n-k]\right\} = 0 \qquad (4.4)$$
for k = 1, 2, ..., M. At this point, it is maintained without proof that when (4.4) is satisfied, then a_i = â_i; that is, the LPCs are equal to the AR parameters. Justification of this claim appears at the end of the section. Thus, when the LPCs are found, the system used to generate the AR signal (AR process synthesizer) is uniquely identified.
Normal Equation
Equation (4.4) can be rearranged to give

$$E\{s[n]s[n-k]\} + \sum_{i=1}^{M} a_i E\{s[n-i]s[n-k]\} = 0 \qquad (4.5)$$

or

$$\sum_{i=1}^{M} a_i R_s[i-k] = -R_s[k] \qquad (4.6)$$

for k = 1, 2, ..., M, where

$$R_s[i-k] = E\{s[n-i]s[n-k]\}, \qquad (4.7)$$

$$R_s[k] = E\{s[n]s[n-k]\}. \qquad (4.8)$$
Equation (4.6) defines the optimal LPCs in terms of the autocorrelation R_s[l] of the signal s[n]. In matrix form, it can be written as

$$\mathbf{R}_s \mathbf{a} = -\mathbf{r}_s, \qquad (4.9)$$

where

$$\mathbf{R}_s = \begin{pmatrix} R_s[0] & R_s[1] & \cdots & R_s[M-1] \\ R_s[1] & R_s[0] & \cdots & R_s[M-2] \\ \vdots & \vdots & \ddots & \vdots \\ R_s[M-1] & R_s[M-2] & \cdots & R_s[0] \end{pmatrix}, \qquad (4.10)$$

$$\mathbf{a} = [a_1 \; a_2 \; \cdots \; a_M]^T, \qquad (4.11)$$

$$\mathbf{r}_s = [R_s[1] \; R_s[2] \; \cdots \; R_s[M]]^T. \qquad (4.12)$$

Equation (4.9) is known as the normal equation. Assuming that the inverse of the correlation matrix R_s exists, the optimal LPC vector is obtained with

$$\mathbf{a} = -\mathbf{R}_s^{-1}\mathbf{r}_s. \qquad (4.13)$$

Equation (4.13) allows the finding of the LPCs if the autocorrelation values of s[n] are known from l = 0 to M.
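In code, (4.13) amounts to estimating the autocorrelations, building the Toeplitz matrix R_s, and solving one linear system. A minimal NumPy sketch, where the AR(2) parameters, seed, and record length are illustrative assumptions:

```python
import numpy as np

# Hypothetical AR(2) synthesizer for illustration:
# s[n] = x[n] - a1*s[n-1] - a2*s[n-2], with assumed parameters below.
true_a = np.array([-0.9, 0.5])
M = len(true_a)

rng = np.random.default_rng(0)
x = rng.standard_normal(50_000)              # white-noise excitation x[n]
s = np.zeros_like(x)
for n in range(len(x)):
    acc = x[n]
    for i in range(1, M + 1):
        if n - i >= 0:
            acc -= true_a[i - 1] * s[n - i]
    s[n] = acc

# Time-average estimates of the autocorrelation Rs[l] for l = 0..M
N = len(s)
R = np.array([s[: N - l] @ s[l:] / N for l in range(M + 1)])

# Normal equation (4.9): Rs a = -rs, with Rs Toeplitz, entries Rs[|i-k|]
Rs = np.array([[R[abs(i - k)] for k in range(M)] for i in range(M)])
rs = R[1 : M + 1]
a = -np.linalg.solve(Rs, rs)                 # optimal LPC vector, Eq. (4.13)
print(a)  # close to the true AR parameters for a long enough record
```

A general-purpose solver is used here for clarity; in practice the Toeplitz structure of R_s is exploited by the Levinson-Durbin recursion, which solves the same system in O(M^2).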
Prediction Gain
The prediction gain of a predictor is given by

$$PG = 10\log_{10}\left(\frac{\sigma_s^2}{\sigma_e^2}\right) = 10\log_{10}\left(\frac{E\{s^2[n]\}}{E\{e^2[n]\}}\right) \qquad (4.14)$$

and is the ratio between the variance of the input signal and the variance of the prediction error in decibels (dB). Prediction gain is a measure of the predictor's performance. A better predictor is capable of generating lower prediction error, leading to a higher gain.
POSSIBLE ADDITION TO QUESTION 12!!
Minimum Mean-Squared Prediction Error
From Figure 4.1 we can see that when a_i = â_i, e[n] = x[n]; that is, the prediction error is the same as the white noise used to generate the AR signal s[n]. Indeed, this is the optimal situation where the mean-squared error is minimized, with

$$J_{\min} = \sigma_x^2, \qquad (4.15)$$

or equivalently, the prediction gain is maximized.
The optimal condition can be reached when the order of the predictor is equal to or higher than the order of the AR process synthesizer. In practice, M is usually unknown. A simple method to estimate M from a signal source is by plotting the prediction gain as a function of the prediction order. In this way it is possible to determine the prediction order for which the gain saturates; that is, further increasing the prediction order from a certain critical point will not provide additional gain. The value of the predictor order at the mentioned critical point represents a good estimate of the order of the AR signal under consideration.
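The order-estimation heuristic above is easy to try numerically: fit predictors of increasing order and watch the gain saturate. A sketch under the assumption of an AR(2) source, so the gain should stop growing at M = 2; a direct np.linalg.solve stands in for the usual Levinson-Durbin recursion:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(20_000)              # unit-variance excitation
s = np.zeros_like(x)
for n in range(len(x)):                      # assumed AR(2) source
    s1 = s[n - 1] if n >= 1 else 0.0
    s2 = s[n - 2] if n >= 2 else 0.0
    s[n] = x[n] + 0.9 * s1 - 0.5 * s2

N = len(s)

def lpc(order):
    """Solve the normal equation (4.9) for the given prediction order."""
    R = np.array([s[: N - l] @ s[l:] / N for l in range(order + 1)])
    Rs = np.array([[R[abs(i - k)] for k in range(order)] for i in range(order)])
    return -np.linalg.solve(Rs, R[1 : order + 1])

def prediction_gain(order):
    """PG of (4.14): 10*log10(var(s)/var(e)), e[n] = s[n] + sum a_i s[n-i]."""
    a = lpc(order)
    e = s.copy()
    for i in range(1, order + 1):
        e[i:] += a[i - 1] * s[:-i]
    return 10 * np.log10(np.var(s) / np.var(e))

gains = [prediction_gain(M) for M in range(1, 6)]
# gains rise up to M = 2, then flatten: higher orders add almost nothing
```

Printing `gains` shows a clear jump from M = 1 to M = 2 and an essentially flat curve afterward, which is exactly the saturation point the text uses to estimate the AR order.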
As was explained before, the cost function J in (4.3) is characterized by a unique minimum. If the prediction order M is known, J is minimized when a_i = â_i, leading to e[n] = x[n]; that is, the prediction error is equal to the excitation signal of the AR process synthesizer. This is a reasonable result, since the best that the prediction-error filter can do is to "whiten" the AR signal s[n]. Thus, the maximum prediction gain is given by the ratio between the variance of s[n] and the variance of x[n] in decibels.
Taking into account the AR parameters used to generate the signal s[n], we have

$$\sigma_x^2 = R_s[0] + \sum_{i=1}^{M} a_i R_s[i], \qquad (4.16)$$

which was already derived in Chapter 3. The above equation can be combined with (4.9) to give

$$\begin{pmatrix} R_s[0] & \mathbf{r}_s^T \\ \mathbf{r}_s & \mathbf{R}_s \end{pmatrix}\begin{pmatrix} 1 \\ \mathbf{a} \end{pmatrix} = \begin{pmatrix} J_{\min} \\ \mathbf{0} \end{pmatrix} \qquad (4.17)$$

and is known as the augmented normal equation, with 0 the M × 1 zero vector. Equation (4.17) can also be written as
$$\sum_{i=0}^{M} a_i R_s[i-k] = 0, \quad k = 1, 2, \ldots, M, \qquad (4.18)$$

where a_0 = 1.
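The augmented system (4.17)/(4.18) can be checked numerically: prepend a_0 = 1 to the LPC vector from the normal equation, then the rows k = 1, ..., M vanish and the k = 0 row returns J_min, which for an ideal fit equals σ_x². A sketch with an assumed AR(2) source driven by unit-variance noise:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(40_000)              # excitation with sigma_x^2 = 1
s = np.zeros_like(x)
for n in range(len(x)):                      # assumed AR(2) synthesizer
    s1 = s[n - 1] if n >= 1 else 0.0
    s2 = s[n - 2] if n >= 2 else 0.0
    s[n] = x[n] + 0.9 * s1 - 0.5 * s2

M, N = 2, len(s)
R = np.array([s[: N - l] @ s[l:] / N for l in range(M + 1)])
Rs = np.array([[R[abs(i - k)] for k in range(M)] for i in range(M)])
a = -np.linalg.solve(Rs, R[1 : M + 1])       # normal equation, Eq. (4.13)

a_aug = np.concatenate(([1.0], a))           # a0 = 1, as in (4.18)
rows = [sum(a_aug[i] * R[abs(i - k)] for i in range(M + 1))
        for k in range(M + 1)]
jmin = rows[0]                               # = Rs[0] + sum a_i Rs[i], Eq. (4.16)
# rows[1..M] are zero (to solver precision); jmin is close to sigma_x^2
```

Rows 1 through M vanish by construction, since the LPCs solve (4.6) with the same autocorrelation estimates; the value of jmin approaching 1 illustrates that the prediction-error filter recovers the excitation variance.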