
grouped together as aspects of another feature, Aperture. This grouping reflects the fact that the four features Stop, Fricative, Approximant, and Vowel all depend on the degree of closure of the articulators. In some older feature systems, these possibilities are split into two groups, but it is now thought better to recognize that they form a continuum. The changes in the pronunciation of Peter Ladefoged's name, for example, fall on this continuum. The name is of Danish origin. Peter pronounced it [ ˈlædɪfoʊɡɪd ] in English, with consonants as they once were in Danish. These stops first became fricatives, which later became approximants in Danish [ ˈlæːðəfoːvəð ], later [ ˈlæːð̞əfoːv̞əð ], and now perhaps more like [ ˈlæːð̞foːwð ], making it apparent that there is a continuum going from [stop] through [fricative] to [approximant]. (Note the use of the diacritic [ ̞ ], meaning "more open," turning the fricative symbols into symbols for approximants.) The name is simply two Danish words put together, lade, a barn, and foged, something like a steward or bailiff; so Ladefoged = Barnkeeper. Spanish also has a process whereby stops first become fricatives and then approximants.

The manner category Stop has only one possible value, [stop], but Fricative has two: [sibilant] and [nonsibilant]. The possible values for Approximant and Vowel will be discussed in the next paragraph, but first we should note that there are two other Manner features, Trill and Tap, each of which has only a single possible value, respectively [trill] and [tap]. The further relationships among all the Manner features are beyond the scope of this book.

As shown in Figure 11.6, Approximant and Vowel dominate other features. There are five principal features, the first of which, Height, has five possible values: [high], [mid-high], [mid], [mid-low], and [low]. As far as we know, no language distinguishes more than five vowel heights. Backness has only three values: [front], [center], and [back]. As we saw in Chapter 9, when discussing Japanese [ ɯ ], there are two kinds of Rounding: Protrusion, with possible values [protruded] and [retracted], and Compression, with possible values [compressed] and [separated]. The feature Tongue Root has two possible values: [+ATR] and [−ATR]. Pharyngealized sounds may be classified as having the opposite of an advanced tongue root and are therefore [−ATR]. The feature Rhotic has only one possible value, [rhotacized].

Separate figures have not been drawn for the other two Supra-Laryngeal features, Nasality and Laterality, as each of them is itself a terminal feature. Nasality has the possible values [nasal] and [oral]; Laterality has the possible values [lateral] and [central].

Figure 11.6 Features dominated by the features Vowel and Approximant.

Figure 11.7 Features dominated by the feature Laryngeal.

The Laryngeal possibilities, shown in Figure 11.7, involve three features. Glottal Stricture specifies how far apart the vocal folds are. Languages make use of five possibilities: [voiceless]; [breathy voice], as we saw in languages such as Hindi; [modal voice], which is the regular voicing used in every language; [creaky voice] in languages such as Hausa; and [closed], forming a glottal stop. Many in-between possibilities occur, but if we are simply providing categories for the degrees of glottal opening that are used distinctively, these five are sufficient. A separate feature, Glottal Timing, is used to specify voiceless aspirated stops and breathy voiced aspirated stops. A third feature, Glottal Movement, is also included among the Laryngeal features to allow for the specification of implosives and ejectives. Some books, including previous editions of this book, prefer to consider these sounds as simply involving a different airstream mechanism. This is the way we began describing them at the beginning of Chapter 6. At the end of that chapter, we included them in the summary of actions of the glottis. As pointed out there, they interact with other Laryngeal features, and are accordingly put at this point in the hierarchy.

Figure 11.8 Features dominated by the feature Airstream.

The arrangement in Figure 11.7 leaves the Airstream feature, shown in Figure 11.8, dominating only two features, Pulmonic and Velaric. Both of these have only one value. In a more elaborate arrangement, it would be appropriate to consider whether the pulmonic airstream mechanism varied in force, but this possibility will not be considered here.

The figures in this chapter provide a hierarchical arrangement of the features required to describe nearly all the sounds of the world’s languages. Try working through this hierarchy from the top down so that you get a complete specification of a variety of sounds. Table 11.1 gives a partial specification of a number of English segments.
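As an illustration of working top-down through the hierarchy, here is a minimal Python sketch, our own and not part of the feature theory itself, that encodes a fragment of Table 11.1 as a nested dictionary and reads off a segment's partial specification:

```python
# A sketch (not from the book) of a fragment of Table 11.1 as a nested
# dictionary: branch -> node -> terminal feature value -> segments.
FEATURES = {
    "Place": {
        "Labial":  {"[bilabial]": "p b m", "[labiodental]": "f v"},
        "Coronal": {"[dental]": "θ ð", "[alveolar]": "t d n l s z",
                    "[post-alveolar]": "r", "[palato-alveolar]": "ʃ ʒ"},
        "Dorsal":  {"[velar]": "k ɡ ŋ"},
    },
    "Aperture": {
        "Stop":      {"[stop]": "p t k b d ɡ m n"},
        "Fricative": {"[sibilant]": "s ʃ z ʒ", "[nonsibilant]": "f θ v ð"},
    },
}

def features_of(segment):
    """Walk the hierarchy top-down, collecting every terminal value
    whose segment list contains the given segment."""
    return [
        (branch, node, value)
        for branch, nodes in FEATURES.items()
        for node, values in nodes.items()
        for value, segments in values.items()
        if segment in segments.split()
    ]

print(features_of("s"))
# [('Place', 'Coronal', '[alveolar]'), ('Aperture', 'Fricative', '[sibilant]')]
```

Reading off [ s ], for example, yields Coronal [alveolar] for Place and Fricative [sibilant] for Aperture, matching its row entries in Table 11.1.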

A PROBLEM WITH LINGUISTIC EXPLANATIONS

We now turn to a discussion of the phonetics of the individual. Current phonetic research and theory focus to a large extent on topics such as speech motor control, the representation of speech in memory, and the interaction of speech perception and production in language change (the topics of the next three sections), because it is in topics such as these that we find explanations for language sound patterns. For example, we have given a name to the phenomenon of "assimilation" and can describe it by saying that adjacent sounds come to share some phonetic properties. But if we restrict ourselves to the terminology and knowledge base of linguistic phonetics, we are restricted to descriptions of sound patterns and not their explanation. In fact, explanations that are restricted in this way often fall into the fallacy of reification—acting as if abstract things are concrete. Here's an explanation that falls into this trap: Assimilation happens because there is a tendency in pronunciation for adjacent sounds to share phonetic properties. This "explanation" is even more impressive if we state it as a formal constraint on sequences of sounds:

AGREE(x): Adjacent output segments have the same value of the feature x.

The problem is that the explanation is just a restatement of the description. Assimilation is when adjacent segments share features, so the "explanation" says nothing more than that when we look at language we see that assimilation happens. The explanation has this form: the tendency to assimilate (a cross-linguistic generalization) exists because there is a tendency to assimilate (reified as a specific "explanatory principle"). In the following sections, we look to the private phonetic knowledge of the individual for a more satisfying way to explain language sound patterns.
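To see the circularity, it helps to write AGREE(x) down as a checkable predicate. In the minimal sketch below, with segments represented as hypothetical feature dictionaries of our own devising, the constraint can count disagreements but contains nothing that would explain why assimilation occurs:

```python
# A sketch of AGREE(x) as a violation counter. The representation
# (segments as feature dictionaries) is illustrative, not standard.
def agree_violations(segments, feature):
    """Count adjacent pairs that disagree on the given feature."""
    return sum(
        1 for left, right in zip(segments, segments[1:])
        if left[feature] != right[feature]
    )

# /np/ disagrees in Place; assimilated /mp/ satisfies AGREE(Place).
np_cluster = [{"Place": "alveolar"}, {"Place": "bilabial"}]
mp_cluster = [{"Place": "bilabial"}, {"Place": "bilabial"}]
print(agree_violations(np_cluster, "Place"))  # 1
print(agree_violations(mp_cluster, "Place"))  # 0
```

The function encodes the description exactly; no explanatory content has been added anywhere.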


TABLE 11.1 A partial feature specification of some English segments (vowels that may not occur in all accents are omitted).

Place       Labial              [bilabial]         p, b, m
                                [labiodental]      f, v
            Coronal             [dental]           θ, ð
                                [alveolar]         t, d, n, l, s, z
                                [post-alveolar]    r
                                [palato-alveolar]  ʃ, ʒ
            Dorsal              [velar]            k, ɡ, ŋ

Aperture    Stop                [stop]             p, t, k, b, d, ɡ, m, n
            Fricative           [sibilant]         s, ʃ, z, ʒ
                                [nonsibilant]      f, θ, v, ð
            Approximant,
            Vowel               Height
                                  [high]           i, u
                                  [mid-high]       ɪ, ʊ, eɪ, oʊ
                                  [mid]            ə, ɝ
                                  [mid-low]        ɛ, ɔ
                                  [low]            æ, ɑ
                                Backness
                                  [front]          i, ɪ, eɪ, ɛ, æ
                                  [back]           ɑ, ɔ, oʊ, ʊ, u
                                Rounding
                                  [rounded]        ɔ, oʊ, ʊ, u
                                  [unrounded]      i, ɪ, eɪ, ɛ, æ, ɑ

Nasality                        [nasal]            m, n, ŋ
                                [oral]             (all others)

Laterality                      [lateral]          l
                                [central]          (all others)

Laryngeal   Glottal Stricture   [voiceless]        p, t, k, f, θ, s, ʃ
                                [(modal) voice]    (all others)

CONTROLLING ARTICULATORY MOVEMENTS

Underlying our linguistic description of [ p ], to take one simple sound as an example of speech motor control, is a dizzying array of muscular complexity involving dozens of muscles in the chest, abdomen, larynx, tongue, throat, and face. And all of these must be contracted with varying degrees of tension in a specific sequence and duration of contraction. For example, in producing a lip closure movement, there are two main muscles (depressor labii inferior and incisivus inferior) that depress the lower lip, that is, pull the lower lip away from the upper lip. These muscles must relax so as not to oppose the lip closure motion of [ p ] too much. There are also two main muscles that when contracted will move the lower lip toward the upper lip (orbicularis oris inferior, mentalis). These two must be given enough tension to overcome the tension of the lip-depressing muscles. As the formulations above "too much" and "enough tension" imply, the actual degree of tension needed for lip closure cannot be specified in absolute terms but depends on the tension of the opposing muscles. Furthermore, the tension of the cooperating muscles must be coordinated. For instance, orbicularis oris inferior (OOI) and mentalis must trade off with each other so that if mentalis is not very tense for a particular [ p ], the OOI will compensate with greater tension.

So coordination of the four main lower lip muscles is complicated and can't be specified with predetermined target "tension" levels because the actual degree of muscle fiber activation for raising the lower lip in [ p ] depends on the tension of the other lip muscles. But the situation is even more complex than this because the lower lip moves up and down as the jaw moves up and down. So the muscles that depress the jaw (geniohyoid, mylohyoid, and digastricus) must also be coordinated with the muscles that raise the jaw (masseter and temporalis). The activation of these jaw muscles depends on the tension of the lip muscles and, just as the muscles within the lower lip may trade off with each other, so also the jaw muscles may trade off with the lip muscles so that in one [ p ], there is more jaw movement, while in another, there is more lower lip movement, and in yet a third [ p ], the upper lip does more work. Movement of the jaw also depends on its starting location. If the jaw is already relatively closed, such as in the utterance [ ipi ], there may be no need for it to move as part of the [ p ] production, while in [ apa ], the jaw-closing muscles might be quite active.

The number of free parameters (separate muscle activations) that must be controlled in speech has been called a degrees of freedom problem because a flat control structure, in which the tension of each muscle is separately controlled, presents a control problem of exceeding complexity. The solution to the degrees of freedom problem that has been achieved in the speech motor control system (and in most other motor control systems, such as swallowing, walking, reaching, and looking) is to organize the control system hierarchically in goal-oriented coordinative structures.

For example, one of the gestures involved in saying [ p ] is lip closure. The coordinative structure for lip closure is illustrated in Figure 11.9. This structure specifies an overall task "close the lips" at the top node, and subtasks such as "raise the lower lip" and "lower the upper lip" are coordinated with each other to accomplish the overall task. Some subtasks also require further reduction of the goal into smaller subtasks. For example, "raise the lower lip" is present twice in the figure—first as a specification of the absolute position of the lower lip (which is coordinated with the upper lip in accomplishing the task "close the lips") and second as a specification of the relative position of the lower lip (which is coordinated with the jaw in accomplishing the task "raise the (absolute position of the) lower lip"). The idea with coordinative structures is that each gesture is defined in terms of subtasks, and thus there is no direct control from a task like "close the lips" to the muscles. Instead, the muscles have more limited goals, and the degrees of freedom problem in coordination is solved by dividing the overall task into smaller, simpler coordination problems.
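The following minimal Python sketch mimics the nested-subtask structure of Figure 11.9. The millimeter values and the simple splitting rules are invented for illustration; real coordinative structures are dynamical systems, not arithmetic.

```python
# A sketch of a Figure 11.9-like coordinative structure as nested tasks.
# All numbers and the 50/50 split rule are invented; the point is the
# compensation between subtasks, not the values.
def raise_lower_lip(target_mm, jaw_available_mm):
    """Subtask: split lower-lip raising between the jaw and the
    lip-within-jaw; the lip compensates for whatever the jaw
    cannot contribute (motor equivalence)."""
    jaw = min(target_mm * 0.5, jaw_available_mm)
    return {"jaw": jaw, "lip_in_jaw": target_mm - jaw}

def close_lips(gap_mm, jaw_available_mm=5.0, upper_share=0.3):
    """Top task: divide the lip gap between the upper-lip and
    lower-lip subtasks, then recurse into the lower-lip subtask."""
    upper = gap_mm * upper_share
    plan = {"upper_lip": upper}
    plan.update(raise_lower_lip(gap_mm - upper, jaw_available_mm))
    return plan

print(close_lips(10.0))                        # jaw free: shares the work
print(close_lips(10.0, jaw_available_mm=0.0))  # jaw fixed: lip compensates
```

Note that the top task never addresses a muscle directly; each level only sets a goal for the level below, which is the sense in which the hierarchy reduces the degrees of freedom.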

One way that we know about the coordinative structures for speech is by looking at how articulators and muscles may trade off with each other. For example, when we track the locations of the upper and lower lips in a sequence of [ pɑpɑpɑpɑpɑ ], we find that on some instances of [ p ], the lower lip may rise more than it does on other instances of [ p ]. In those instances where the lower lip doesn't reach as far toward the upper lip, we find that the upper lip compensates with a greater magnitude of movement toward the lower lip. Similar patterns of compensation are seen for all of the subtasks illustrated in Figure 11.9—the jaw and lower lip compensate for each other, and the OOI and mentalis compensate for each other. These patterns of compensation, or trading relations in speech, are motor equivalences—different motor activation patterns producing the same result.

For further simple examples of coordinative structures, we will consider the production of vowels. As you can see quite easily for yourself, it is possible to produce the same vowel with many different jaw positions. Try to say [ i ] with your teeth almost together, and then with them fairly far apart. There need be very little, if any, difference in the sounds you produce. The same is true of many other vowels. In fact, it is possible to produce a complete set of English vowels with your teeth almost together or held apart by a wedge such as a small coin. Obviously, the motor activity must be very different in these two circumstances.

Figure 11.9 Part of the coordinative structure involved in lip closure. (The top node, "close the lips," divides into "lower the upper lip" and "raise the lower lip"; the latter divides into "raise the jaw" and "raise the lower lip" within the jaw, which is carried out by the muscles OOI and mentalis.)


When the teeth are held far apart, you can feel the muscles of the tongue raising it up in the jaw when you say [ i ]. When the teeth are close together, the raising of the jaw itself contributes greatly to the lifting of the tongue for [ i ]. You can also observe the results of the motor equivalence of different gestures that people use when making vowels by watching them say the words heed, hid, head, had. You will probably be able to see that some people lower the tongue by lowering the jaw as they say this series of words. But others keep the jaw comparatively steady and simply lower the tongue within the jaw.

Motor control nearly always involves considering speech production in more detail than is necessary for the description of differences in meaning and is open to much variation within and across speakers. This is why we say that speech motor control involves private phonetic knowledge. In each of the cases we have been considering, it is possible to produce the sounds in a variety of specific physiological or articulatory ways. Thus, if two [ p ] sounds have the same lip closure, they are linguistically equivalent, irrespective of the pattern of jaw and lip coordination used to produce the closure. Similarly, the different jaw positions in vowels will not affect the position of the highest point of the tongue or its shape relative to the upper surface of the vocal tract.

Although there is quite a bit more to speech motor control than this, we can nonetheless see how investigation of speech motor control may offer some additional insight into phonological patterns like assimilation. We are closer to an explanation of assimilation by being able to note that the production of speech is accomplished by ensembles of gestures that in essence compete for control of the muscles of the vocal tract. One segment requires that the lower lip-raising muscles be more active than the lower lip depressors, while an adjacent segment requires the opposite pattern of activation. Because muscle activations come on and off gradually over time, we have a good start toward an explanation of the tendency for adjacent sounds to become like each other from patterns of gestural activation (and motor control principles generally).

MEMORY FOR SPEECH

As we have seen, the speech to which we are exposed is quite diverse. Different speakers of the same language will have somewhat different productions depending on vocal tract physiology and their own habits of speech motor coordination. We are also exposed to a variety of speech styles ranging from very careful pronunciations in various types of public speaking to the quite casual style that is typical between friends.

This “lack of phonetic invariance” has posed an important problem for phonetic theory as we try to reconcile the fact that shared phonetic knowledge can be described using IPA symbols and phonological features with the fact that the individual phonetic forms that speakers produce and hear on a daily basis span a very great range. The lack of invariance problem also has great practical significance for engineers who try to get computers to produce and recognize speech.


One way to account for phonetic variability across languages is to posit language-specific phonetic implementation rules. This approach assumes a universal set of phonetic features such as we find in the IPA, coupled with a language-specific set of statements to specify the phonetic targets for each phonetic feature. For example, both Navajo and Mandarin have voiceless aspirated stops, but as we saw in Chapter 6, the VOT of the Navajo aspirated stops is much longer than the VOT of the Mandarin aspirated stops. The implementation approach says that there is one feature [+spread glottis], and that it is implemented differently in Navajo and Mandarin.
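To make this division of labor concrete, here is a minimal Python sketch of the implementation idea: one universal feature, per-language phonetic targets. The VOT numbers are invented placeholders, not the measured values from Chapter 6.

```python
# A sketch of language-specific phonetic implementation: a universal
# feature is mapped to per-language targets. VOT values in milliseconds
# are rough inventions that only preserve the Navajo > Mandarin ordering.
VOT_TARGET_MS = {
    ("navajo",   "[+spread glottis]"): 150,  # much longer aspiration
    ("mandarin", "[+spread glottis]"): 80,
    ("navajo",   "[-spread glottis]"): 10,
    ("mandarin", "[-spread glottis]"): 10,
}

def implement(language, feature):
    """Map a universal feature value to a language-specific target."""
    return VOT_TARGET_MS[(language, feature)]

print(implement("navajo", "[+spread glottis]"))    # 150
print(implement("mandarin", "[+spread glottis]"))  # 80
```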

The phonetic implementation approach becomes more complicated when we try to account for stylistic pronunciation variation. Part of the complication comes from the fact that it is not plausible to assume that all languages have the same set of reduction processes mapping careful speech into casual speech. For example, as we discussed in a previous chapter, vowel devoicing in Japanese is usually described as affecting the high vowels [ i ] and [ u ] only, with a statement like "high vowels devoice between voiceless consonants." The problem is that mid vowels also devoice in Japanese—but this devoicing process is not categorical. The mid vowel devoicing rule is something like "mid vowels devoice sometimes between voiceless consonants, with increasing probability of devoicing as speech rate increases." Devoicing is a phonetic reduction process, in which contrastive phonetic information is lost or neutralized as a function of speech rate or style. And although vowel devoicing does occur in other languages (see if you get it in potato), it is by no means universal or uniform in character across languages. Thus, each language needs its own set of phonetic implementation rules to account for stylistic variation.
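The contrast between a categorical and a gradient rule can be sketched the same way. The probability function below is invented; only the shape of the rule, with high vowels (near-)categorical and mid vowels increasingly likely to devoice as rate increases, follows the description above.

```python
import random

# A sketch of categorical vs. gradient devoicing rules for Japanese.
# The voiceless-consonant context is assumed, and the 0.6 * rate
# probability is invented for illustration.
def devoice(vowel, rate):
    """rate runs from 0.0 (very careful) to 1.0 (very fast)."""
    if vowel in ("i", "u"):   # high vowels: categorical rule
        return True
    if vowel in ("e", "o"):   # mid vowels: probability grows with rate
        return random.random() < 0.6 * rate
    return False

random.seed(0)
print(devoice("u", 0.1))                       # True: always devoiced
print([devoice("o", 0.9) for _ in range(5)])   # sometimes devoiced
```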

The prognosis for the phonetic implementation approach becomes even more dire when we look at individual differences among speakers. Anatomical differences between speakers, whether they be large differences such as those between children and adults, or small differences related to the size and shape of the palate within an otherwise homogeneous group, have an impact on speech motor control. In response to variability of this sort, the phonetic implementation approach must hope that these sources of individual phonetic variation are quite small relative to the larger—and presumably more rule-governed—sources of variation. Experience in automatic speech recognition, which is still troublingly unreliable for large-vocabulary multiple-talker systems, suggests that individual variation is a substantial problem for the implementation model.

In this section, we have been discussing phonetic variability (across languages, styles of speech, and different speakers of the same language), yet the section is entitled "Memory for Speech." This is because the main alternative to the phonetic implementation approach is a theory that focuses on how experiences are encoded in memory. It is worth noting that the phonetic implementation view assumes that words are stored in memory in their most basic phonetic form, from which we calculate phonetic variation using phonetic implementation rules. Given the problems of the phonetic implementation approach, an alternative theory—that many instances of each word are stored in memory—is suggested. This exemplar theory of phonetics holds that variability is memorized rather than computed. Figure 11.10 illustrates a phonetic category (for example, the vowel [ u ]) in this theory. The axes of the figure stand for two phonetic dimensions, perhaps F1 and F2, or alternatively, the location and degree of tongue-body constriction. Obviously, real phonetic spaces have many more dimensions than this. Rather than posit the existence of an abstract phonetic entity [ u ] from which each exemplar must be derived, in exemplar theory, the representation of [ u ] is the set of exemplars. By cross-classifying each exemplar as also an exemplar of citation speech or casual speech, the model also provides a representation of these speech styles. As you can see, exemplar theory relies heavily on stored exemplars, using processes of selection and storage rather than processes of transformation to define the range of variability found in speech.
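A toy version of the exemplar category in Figure 11.10, with made-up F1/F2 values: the category is nothing but its stored tokens, and recognizing a new token is nearest-exemplar comparison, with no abstract prototype anywhere in the model.

```python
import math

# A sketch of exemplar "clouds": each category is a list of stored
# tokens in a two-dimensional (F1, F2) space. The values in Hz are
# invented for illustration.
clouds = {
    "u": [(300, 870), (320, 950), (340, 1100), (310, 1300)],
    "i": [(280, 2250), (300, 2100), (320, 1900)],
}

def classify(token):
    """Label a new token by its nearest stored exemplar."""
    return min(
        (math.dist(token, exemplar), label)
        for label, exemplars in clouds.items()
        for exemplar in exemplars
    )[1]

print(classify((315, 1000)))      # 'u'
clouds["u"].append((315, 1000))   # storing the token updates the category
```

Note the last line: perceiving and remembering a token changes the category itself, which is how the model accommodates new variability without transformation rules.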

Exemplar theories (as of this writing, there are many competing proposals regarding the details of exemplar theory) offer a shift in perspective on several core concepts in phonetics.

Language universal features Broad phonetic classes (e.g., aspirated vs. unaspirated) derive from physiological constraints on speaking or hearing, but detailed phonetic definitions are arbitrary—a matter of community norms. This theory tends to disfavor cognitive universals and sees instead a role for physiological or physical universals.

Speaking styles No one style is basic (from which others are derived), because all are stored in memory. Bidialectal speakers store two dialects, and all speakers control a range of speaking styles. Listeners may learn to recognize new varieties of speech—regional dialects, or computer-mangled synthesis—by storing exemplars of them.

Figure 11.10 A hypothetical cloud of [ u ] exemplars, cross-classified into citation forms and casual forms.


Generalization and productivity Interestingly, productivity—the hallmark of linguistic knowledge in the phonetic implementation approach—is the least developed aspect of exemplar theory.

Sound change The Neogrammarians (around the turn of the twentieth century) argued that sound change is phonetically gradual and operates across the whole lexicon. This conception fits naturally in an exemplar theory, where sound change is a gradual shift of the exemplar "cloud" as new instances are added. Note that in the phonetic implementation model, phonetically gradual sound change requires two distinct yet logically independent mechanisms—change in phonetic implementation rules and then, after a big enough shift, change in a feature value.
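That picture of gradual change is easy to mimic in a few lines of Python. The bias, noise, and decay rule below are all invented for illustration; the point is only that biased production plus storage makes the cloud mean drift with no rule change anywhere.

```python
import random

# A sketch of Neogrammarian-style gradual change in an exemplar model:
# sample a stored token, add production noise plus a small articulatory
# bias (say, slight fronting of [u]), store the result, let old tokens
# decay. All numbers are invented.
random.seed(1)
cloud = [1000.0] * 50  # F2 of stored [u] exemplars, in Hz

for _ in range(200):
    base = random.choice(cloud)              # imitate a stored token
    token = base + random.gauss(5.0, 20.0)   # noise + fronting bias
    cloud.append(token)                      # store the new exemplar
    cloud.pop(0)                             # oldest exemplar decays

print(sum(cloud) / len(cloud))  # mean F2 has drifted upward
```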

THE BALANCE BETWEEN PHONETIC FORCES

When we consider how sounds pattern within a language, we must take into account both the speaker's point of view and the listener's point of view. Speakers often like to convey their meaning with the least possible articulatory effort. Except when they are trying to produce very clear speech, they will tend to produce utterances with a large number of assimilations, with some segments left out, and with the differences between other segments reduced to a minimum. Producing utterances in this way allows a speaker to follow a principle of ease of articulation. The main way to reduce articulatory effort is by using coarticulations between sounds. As a result of coarticulations, languages change. For example, in an earlier form of English, words such as nation, station contained [ s ], so that they were pronounced [ ˈnasion ] and [ ˈstasion ]. As a result of gesture overlap in some exemplars, the blade of the tongue became raised during the fricative, in anticipation of the position needed for the following high front vowel. Thus, the [ s ] became [ ʃ ], [ i ] was lost, and the unstressed [ o ] became [ ə ]. (The t was never pronounced in English. It was introduced into the spelling by scholars who were influenced by Latin.)

Further examples are not hard to find. Coarticulations involving a change in the place of the nasal and the following stop occurred in words such as improper and impossible before these words came into English through Norman French. In words such as these, the [ n ] that occurs in the prefix in- (as in intolerable and indecent) has changed to [ m ]. These changes are even reflected in the spelling. There are also coarticulations involving the state of the glottis. Words such as resist and result are pronounced as [ rəˈzɪst ] and [ rəˈzʌlt ], with a voiced consonant between the two vowels. The stems in these words originally began with the voiceless consonant [ s ], as they still do in words such as consist and consult, in which the [ s ] is not intervocalic. In all these and in many similar historical changes, one or more segments are affected by adjacent segments so that there is an economy of articulation. These are historical cases of the phenomenon of assimilation, which we discussed at the beginning of Chapter 5.

Ease of articulation cannot be carried too far. Listeners need to be able to understand the meaning with the least possible effort on their part. They would
