
20/21. Speech encoding. CELP coder

CODE-EXCITED LINEAR PREDICTION

The CELP coder relies on the long-term and short-term linear prediction models. Figure 11.1 shows the block diagram of the speech production model, where an excitation sequence is extracted from the codebook through an index. The extracted excitation is scaled to the appropriate level and filtered by the cascade connection of pitch synthesis filter and formant synthesis filter to yield the synthetic speech. The pitch synthesis filter creates periodicity in the signal associated with the fundamental pitch frequency, and the formant synthesis filter generates the spectral envelope.
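The production model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the structure of any standard coder: the one-tap pitch filter y[n] = x[n] + b*y[n - lag], the sign convention A(z) = 1 + a1*z^-1 + ... for the formant filter, the toy two-entry codebook, and all parameter values are hypothetical choices made for the example.

```python
def pitch_synthesis(x, lag, b):
    """Assumed one-tap long-term filter: y[n] = x[n] + b * y[n - lag]."""
    y = []
    for n, xn in enumerate(x):
        past = y[n - lag] if n >= lag else 0.0  # zero initial state
        y.append(xn + b * past)
    return y


def formant_synthesis(x, a):
    """Assumed all-pole short-term filter 1/A(z), A(z) = 1 + sum_i a[i] z^-(i+1).

    Difference equation: y[n] = x[n] - sum_i a[i] * y[n - 1 - i].
    """
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for i, ai in enumerate(a):
            if n - 1 - i >= 0:
                acc -= ai * y[n - 1 - i]
        y.append(acc)
    return y


def celp_synthesize(codebook, index, gain, lag, b, a):
    """Codebook lookup -> gain scaling -> pitch filter -> formant filter."""
    excitation = [gain * v for v in codebook[index]]
    return formant_synthesis(pitch_synthesis(excitation, lag, b), a)


# Toy codebook with two 8-sample codevectors (hypothetical values).
codebook = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0],
]

speech = celp_synthesize(codebook, index=0, gain=2.0, lag=4, b=0.5, a=[-0.5])
print(speech)
```

With these values the pitch filter repeats the scaled pulse four samples later at half the amplitude, and the formant filter then smears each pulse into a decaying tail, which is the "periodicity plus spectral envelope" behaviour the text describes.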

Encoder Operation

A block diagram of a generic CELP encoder is shown in Figure 11.9. This encoder is highly simplified and serves only as an illustration. Subsequent chapters contain the details of operation of different standard CELP coders. The encoder works as follows:

The input speech signal is segmented into frames and subframes. As explained in Chapter 4, the scheme of four subframes in one frame is a popular choice. The frame length is usually around 20 to 30 ms, while the subframe length is in the range of 5 to 7.5 ms.

Short-term LP analysis is performed on each frame to yield the linear prediction coefficients (LPC). Afterward, long-term LP analysis is applied to each subframe (Chapter 4). The input to short-term LP analysis is normally the original speech or the preemphasized speech; the input to long-term LP analysis is often the (short-term) prediction error. The coefficients of the perceptual weighting filter, pitch synthesis filter, and modified formant synthesis filter are known after this step.

The excitation sequence can now be determined. The length of each excitation codevector is equal to that of the subframe; thus, an excitation codebook search is performed once every subframe. The search procedure begins with the generation of an ensemble of filtered excitation sequences with the corresponding gains; the mean-squared error (or sum of squared errors) is computed for each sequence, and the codevector and gain associated with the lowest error are selected.

The index of the excitation codebook, the gain, the long-term LP parameters, and the LPC are encoded, packed, and transmitted as the CELP bit-stream.
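The subframe segmentation and the excitation search in the steps above can be illustrated with a short Python sketch. It rests on several simplifying assumptions that are not taken from the text: an 8 kHz sampling rate (so a 20 ms frame holds 160 samples), a single all-pole synthesis filter standing in for the cascade of weighting, pitch, and formant filters, zero initial filter state, and a per-codevector gain computed as the least-squares optimum. The function names and the toy codebook are hypothetical.

```python
def split_subframes(frame, count=4):
    """Split one frame into `count` equal subframes (four is the popular choice)."""
    size = len(frame) // count
    return [frame[i * size:(i + 1) * size] for i in range(count)]


def synthesis_filter(x, a):
    """Assumed all-pole filter: y[n] = x[n] - sum_i a[i] * y[n - 1 - i]."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for i, ai in enumerate(a):
            if n - 1 - i >= 0:
                acc -= ai * y[n - 1 - i]
        y.append(acc)
    return y


def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the codevector with the lowest squared error.

    Each codevector is filtered, its least-squares gain is computed as
    <target, y> / <y, y>, and the sum of squared errors is evaluated.
    """
    best = None
    for index, codevector in enumerate(codebook):
        y = synthesis_filter(codevector, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero filtered sequence cannot match anything
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# A 20 ms frame at the assumed 8 kHz rate: 160 samples, four 40-sample subframes.
subframes = split_subframes(list(range(160)))

# Toy search: the target is 3 times the filtered third codevector.
a = [-0.5]
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
]
target = [3.0 * v for v in synthesis_filter(codebook[2], a)]
index, gain, error = search_codebook(target, codebook, a)
print(index, gain, error)
```

An exhaustive loop like this is the simplest form of the search; standard coders structure their codebooks so that the filtering and error computation can be done far more cheaply.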

Decoder Operation

A block diagram of the CELP decoder is shown in Figure 11.10. It basically unpacks and decodes the various parameters from the bit-stream, which are directed to the corresponding blocks so as to synthesize the speech. A postfilter is added at the end to enhance the quality of the resultant signal; the structure of this filter is described in Section 11.5.

22/23. Speech encoding. LD-CELP coder

LOW-DELAY CELP

In the process of speech encoding and decoding, delay is inevitably introduced. Loosely defined, delay is the amount of time shift between the speech signal at the input of the encoder with respect to the synthetic speech at the output of the decoder, when the output of the encoder is directly connected to the input of the decoder. For schemes such as PCM and ADPCM (Chapter 6), the speech signal is encoded on a sample-by-sample basis: a few bits are found for each sample with the result transmitted immediately; the delay associated with these schemes is negligible