University: Одесская национальная академия связи им. А.С. Попова

Subject: [UNSORTED]

File: ga2ap.doc

Added: 10.02.2016

Size: 7.06 MB


Question 6. Origin of Speech Signals

The speech waveform is a sound pressure wave originating from controlled movements of the anatomical structures that make up the human speech production system. A simplified structural view is shown in Figure 1.7. Speech is basically generated as an acoustic wave that is radiated from the nostrils and the mouth when air is expelled from the lungs, with the resulting flow of air perturbed by the constrictions inside the body. It is useful to interpret speech production in terms of acoustic filtering. The three main cavities of the speech production system (nasal, oral, and pharyngeal) form the main acoustic filter. The filter is excited by the air from the lungs and is loaded at its main output by a radiation impedance associated with the lips.

Figure 1.7 Diagram of the human speech production system.

The vocal tract refers to the pharyngeal and oral cavities grouped together. The nasal tract begins at the velum and ends at the nostrils of the nose. When the velum is lowered, the nasal tract is acoustically coupled to the vocal tract to produce the nasal sounds of speech.

The form and shape of the vocal and nasal tracts change continuously with time, creating an acoustic filter with a time-varying frequency response. As air from the lungs travels through the tracts, the frequency spectrum is shaped by the frequency selectivity of these tracts. The resonance frequencies of the vocal tract tube are called formant frequencies or simply formants, which depend on the shape and dimensions of the vocal tract.
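To see how formants depend on the dimensions of the tract, one common textbook idealization (an assumption here, not something the text above states) treats the vocal tract as a uniform lossless tube, closed at the vocal cords and open at the lips. Its resonances then fall at odd multiples of c / (4L). The function name, the 17 cm length, and the 340 m/s speed of sound are illustrative choices only.

```python
def uniform_tube_formants(length_m=0.17, speed_of_sound=340.0, count=3):
    """Resonances of an idealized uniform tube, closed at one end and open at the other.

    F_k = (2k - 1) * c / (4 * L) for k = 1, 2, ...
    A real vocal tract is not uniform, so this is only a rough guide.
    """
    return [
        (2 * k - 1) * speed_of_sound / (4.0 * length_m)
        for k in range(1, count + 1)
    ]


if __name__ == "__main__":
    for k, freq in enumerate(uniform_tube_formants(), start=1):
        print(f"F{k} is roughly {freq:.0f} Hz")
```

Under these assumptions the first three resonances lie near 500, 1500, and 2500 Hz, and a shorter tube moves all of them upward.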

Inside the larynx is one of the most important components of the speech production system: the vocal cords. The cords are located at the height of the "Adam's apple", the protrusion in the front of the neck for most adult males. Vocal cords are a pair of elastic bands of muscle and mucous membrane that open and close rapidly during speech production. The rate at which the cords open and close is unique for each individual and defines the feature and personality of the particular voice.

Modeling the Speech Production System

In general terms, a model is a simplified representation of the real world. It is designed to help us better understand the world in which we live and, ultimately, duplicate many of the behaviors and characteristics of real-life phenomena. However, it is incorrect to assume that the model and the real world that it represents are identical in every way. In order for the model to be successful, it must be able to replicate partially or completely the behaviors of the particular object or fact that it intends to capture or simulate. The model may be a physical one (e.g., a model airplane) or it may be a mathematical one, such as a formula.

The human speech production system can be modeled using a rather simple structure: the lungs, which generate the air or energy to excite the vocal tract, are represented by a white noise source. The acoustic path inside the body with all its components is associated with a time-varying filter. The concept is illustrated in Figure 1.9. This simple model is indeed the core structure of many speech coding algorithms, as can be seen later in this book. By using a system identification technique called linear prediction (Chapter 4), it is possible to estimate the parameters of the time-varying filter from the observed signal.

[Figure 1.9 blocks: white noise generator → time-varying filter → output speech. Anatomical counterparts: lungs, trachea, pharyngeal cavity, nasal cavity, oral cavity, nostril, mouth.]

Figure 1.9 Correspondence between the human speech production system and a simplified system based on a time-varying filter.
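The white-noise-plus-filter structure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the book's implementation: the filter is taken to be all-pole, its coefficients are held fixed within a frame and switched between frames, and the `resonator` helper, the 8 kHz sampling rate, and the pole radius are hypothetical choices.

```python
import math
import random


def resonator(freq_hz, radius=0.95, fs=8000.0):
    """Coefficients [a1, a2] of a two-pole resonance at freq_hz (hypothetical helper)."""
    theta = 2.0 * math.pi * freq_hz / fs
    return [-2.0 * radius * math.cos(theta), radius * radius]


def synthesize(frames, frame_len=160, seed=0):
    """Pass white Gaussian noise through an all-pole filter that changes per frame.

    frames: list of coefficient lists [a1, ..., aM]; within a frame the output is
            y[n] = x[n] - a1*y[n-1] - ... - aM*y[n-M]
    """
    rng = random.Random(seed)
    order = max(len(coeffs) for coeffs in frames)
    history = [0.0] * order              # past outputs, most recent first
    out = []
    for coeffs in frames:                # the filter changes from frame to frame
        for _ in range(frame_len):
            x = rng.gauss(0.0, 1.0)      # white noise excitation (the "lungs")
            y = x - sum(a * h for a, h in zip(coeffs, history))
            history = [y] + history[:-1]
            out.append(y)
    return out


def zero_crossings(samples):
    """Count sign changes, a crude indicator of where the spectrum is concentrated."""
    return sum(1 for p, q in zip(samples, samples[1:]) if (p < 0.0) != (q < 0.0))


if __name__ == "__main__":
    signal = synthesize([resonator(500.0), resonator(1500.0)], frame_len=800)
    print(zero_crossings(signal[:800]), zero_crossings(signal[800:]))
```

The excitation is the same kind of noise in both frames; only the filter differs, and the second frame still oscillates faster because its resonance sits higher.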

The model assumes that the energy distribution of the speech signal in the frequency domain is entirely due to the time-varying filter, with the lungs producing a white noise excitation signal that has a flat spectrum. This model is rather efficient, and many analytical tools have already been developed around the concept.
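Because the excitation is assumed to be spectrally flat, the filter can in principle be recovered from the output signal alone. The sketch below uses the autocorrelation method with the Levinson-Durbin recursion, one common way to carry out linear prediction; the text leaves the details to Chapter 4, so the choice of method, the function names, and the test coefficients are assumptions made for illustration.

```python
import random


def ar_process(coeffs, n, seed=1):
    """White noise through a fixed all-pole filter: y[n] = x[n] - sum(a_i * y[n-i])."""
    rng = random.Random(seed)
    history = [0.0] * len(coeffs)
    out = []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0) - sum(a * h for a, h in zip(coeffs, history))
        history = [y] + history[:-1]
        out.append(y)
    return out


def autocorrelation(signal, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i - lag] for i in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + a1 z^-1 + ... + aM z^-M.

    Returns ([a1, ..., aM], prediction error energy).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                   # reflection coefficient
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


if __name__ == "__main__":
    true_coeffs = [-1.2, 0.8]            # illustrative stable filter
    observed = ar_process(true_coeffs, 8000)
    estimate, _ = levinson_durbin(autocorrelation(observed, 2), 2)
    print(estimate)
```

With enough samples the estimated coefficients land close to the ones used to generate the signal, which is what estimating the filter parameters from the observed signal means in practice.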

NOT A QUESTION. THE DIAGRAM MAY COME UP!!!

General Structure of a Speech Coder

Figure 1.12 shows the generic block diagrams of a speech encoder and decoder. For the encoder, the input speech is processed and analyzed so as to extract a number of parameters representing the frame under consideration. These parameters are encoded or quantized, with the binary indices sent as the compressed bit-stream (see Chapter 5 for concepts of quantization). As we can see, the indices are packed together to form the bit-stream; that is, they are placed according to a certain predetermined order and transmitted to the decoder.

[Figure 1.12 blocks. Encoder: input PCM speech → analysis and processing → extract and encode parameters 1, 2, …, N → indices 1, 2, …, N → pack → bit-stream. Decoder: bit-stream → unpack → indices 1, 2, …, N → decode parameters 1, 2, …, N → combine and processing → synthetic speech.]

Figure 1.12 General structure of a speech coder. Top: Encoder. Bottom: Decoder.

The speech decoder unpacks the bit-stream, where the recovered binary indices are directed to the corresponding parameter decoder so as to obtain the quantized parameters. These decoded parameters are combined and processed to generate the synthetic speech.
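The encode, pack, unpack, decode chain described above can be sketched as follows. Everything specific here is hypothetical: the three parameter names, their ranges and bit widths in `FRAME_LAYOUT`, and the use of uniform scalar quantization are illustrative assumptions, not the design of any particular coder.

```python
# Hypothetical frame layout: (name, lowest value, highest value, bits per index).
FRAME_LAYOUT = [
    ("gain", 0.0, 1.0, 5),
    ("pitch", 50.0, 400.0, 7),
    ("tilt", -1.0, 1.0, 4),
]


def quantize(value, lo, hi, bits):
    """Uniform scalar quantizer: map a value in [lo, hi] to an integer index."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def dequantize(index, lo, hi, bits):
    """Map an index back to the quantized parameter value."""
    levels = (1 << bits) - 1
    return lo + index / levels * (hi - lo)


def pack(indices, widths):
    """Place the indices in a fixed order, first field in the most significant bits."""
    word = 0
    for index, width in zip(indices, widths):
        if not 0 <= index < (1 << width):
            raise ValueError("index does not fit its field")
        word = (word << width) | index
    return word.to_bytes((sum(widths) + 7) // 8, "big")


def unpack(data, widths):
    """Recover the indices by reading the fields back in the same fixed order."""
    word = int.from_bytes(data, "big")
    indices = []
    for width in reversed(widths):
        indices.append(word & ((1 << width) - 1))
        word >>= width
    return indices[::-1]


def encode_frame(params):
    indices = [quantize(params[n], lo, hi, b) for n, lo, hi, b in FRAME_LAYOUT]
    return pack(indices, [b for _, _, _, b in FRAME_LAYOUT])


def decode_frame(data):
    indices = unpack(data, [b for _, _, _, b in FRAME_LAYOUT])
    return {
        n: dequantize(i, lo, hi, b)
        for i, (n, lo, hi, b) in zip(indices, FRAME_LAYOUT)
    }


if __name__ == "__main__":
    bitstream = encode_frame({"gain": 0.37, "pitch": 120.0, "tilt": -0.2})
    print(len(bitstream), decode_frame(bitstream))
```

Encoder and decoder share only the layout table, which plays the role of the predetermined order; the decoded values differ from the inputs by at most half a quantization step.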

Block diagrams similar to those in Figure 1.12 will be encountered many times in later chapters. It is the responsibility of the algorithm designer to decide the functionality and features of the various processing, analysis, and quantization blocks. These choices will determine the performance and characteristics of the speech coder.
