A Phonetic Study of Ningbo Dialect Vowels

Chapter One: Introduction

1.1 General theoretical background of vowel production

Since Bell (1867) introduced it in the mid-nineteenth century, the tradition has been to describe vowels in terms of a few basic articulatory features: high-low, front-back, and rounded-unrounded. This descriptive model assigns each vowel a unique tongue position, defined by the height and backness of the tongue arch in the vocal tract. The model is widely accepted in the field and forms an essential basis both for the IPA vowel chart and for the distinctive features of phonological theory (Chomsky & Halle, 1968). In the British phonetic tradition, vowel height and backness are explicitly described in terms of the highest point of the tongue (Jones, 1909). For instance, according to Jones (1956), Cardinal Vowels 1, 2, 3, and 4 ([i e ɛ a]) are “vowels of the ‘front’ series” and Cardinal Vowels 5, 6, 7, and 8 ([ɑ ɔ o u]) are “vowels of the ‘back’ series”. In purely articulatory terms, he defines Cardinal Vowel 1 ([i]) as “the sound in which the raising of the tongue is as far forward as possible and as high as possible consistently with its being a vowel” and Cardinal Vowel 5 ([ɑ]) as “a sound in which the back of the tongue is lowered as far as possible and retracted as far as possible consistently with the sound being a vowel” (p. 31). From Cardinal Vowel 1 through 2 and 3 to 4, “the tongue is lowered through approximately equal intervals”; similarly, from Cardinal Vowel 5 through 6 and 7 to 8, “the tongue is raised through approximately equal though smaller intervals” (p. 32).

However, this classical model of vowel description has never been validated by empirical articulatory data. In fact, it has been contradicted by a large body of lingual articulation data obtained with a variety of techniques. For example, as early as 1910, Meyer found on the basis of plastopalatogram data that [ɪ] has an unexpectedly lower tongue position than [e]. More convincing is the massive X-ray study by Russell (1928), which conclusively discredited the traditional height/backness articulatory description of vowels. Based on a large amount of X-ray data from American English speakers, he found that the tongue position was often not as Bell’s model predicted. For example, [ɪ] might be lower than [e]; [u] was possibly a front vowel; and the place of articulation for [ɑ], [ɔ] and [ɒ] was instead in the pharyngeal cavity.

Russell’s work did not attract much attention from the phoneticians of his time. Several decades later, the problem was highlighted by Ladefoged and his colleagues (Ladefoged, 1967, 1971, 1975, 1976; Ladefoged, DeClerk, Lindau & Papçun, 1972). Like Russell, Ladefoged pointed out that the traditional descriptive model actually described the auditory impressions of vowels and merely translated them into physiological terms: the traditional articulatory dimensions such as high-low and front-back did not correspond to physiological reality but were instead correlated with the acoustic features of vowels. For instance, Ladefoged et al. (1972) found that the tongue contour is almost identical for the vowels [ɪ e ɛ] in some American English speakers. Furthermore, Ladefoged (1975, 1976) pointed out that the highest points of the tongue in a set of Cardinal Vowels produced by Jones “are not at all similar to the descriptions that have been given of these vowels”. Moreover, “there is not even a good separation between front vowels and back vowels”; “and the relative heights are wildly wrong” (1976:10-12). Similar contradictions between the traditional descriptive model and actual vowel articulation were also found in the cineradiographic data of Perkell (1969) (see Lieberman, 1976). The psycho-acoustical experiment in Ladefoged (1967) showed that trained phoneticians could make fairly accurate judgments of vowel height and backness even when working from recorded data. This led him to believe that vowel height and backness have nothing to do with what is done with the tongue; rather, the judgments were possible simply because these trained phoneticians “could label the formant structure of the vowels they heard and imitated” (Fromkin, 1985:5). As illustrated in Ladefoged (1975, 1976), the traditional Cardinal Vowel chart correlates well with an auditorily scaled acoustic F1/F2 (or F1/F2-F1) vowel chart.
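The idea of an auditorily scaled F1/F2-F1 chart can be made concrete with a minimal Python sketch. It assumes the Bark scale as the auditory scale and Traunmüller's (1990) approximation for the conversion; neither choice is taken from the studies cited above, and the formant values are illustrative round numbers rather than measured data.

```python
def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark.

    Assumption: Traunmüller's (1990) approximation is used here; the
    source text does not say which auditory scale was applied.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def chart_coordinates(f1_hz, f2_hz):
    """Return (x, y) for an auditorily scaled vowel chart.

    x = Bark(F2) - Bark(F1), roughly corresponding to front-back
    y = Bark(F1), roughly corresponding to high-low (larger = lower vowel)
    """
    z1 = hz_to_bark(f1_hz)
    z2 = hz_to_bark(f2_hz)
    return z2 - z1, z1


# Hypothetical (F1, F2) values in Hz, for illustration only.
EXAMPLE_FORMANTS = {"i": (280, 2250), "a": (750, 1200), "u": (310, 870)}

if __name__ == "__main__":
    for vowel, (f1, f2) in EXAMPLE_FORMANTS.items():
        x, y = chart_coordinates(f1, f2)
        print(f"[{vowel}]  F2-F1 = {x:.2f} Bark, F1 = {y:.2f} Bark")
```

Plotting x on a reversed horizontal axis and y on a reversed vertical axis places the [i]-like point at the upper left and the [a]-like point at the bottom, which is the layout that resembles the traditional vowel quadrilateral.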

Although some phoneticians regarded the criticism of the traditional descriptive system as exaggerated (e.g., Catford, 1981; Fischer-Jørgensen, 1985), the acoustic or acoustic-auditory proposal for vowel description was widely accepted, and there was a broad consensus that traditional vowel parameters such as height and backness are better understood as acoustically or perceptually defined than as articulatorily defined (e.g., Jakobson et al., 1952; Ladefoged et al., 1972; Lieberman, 1976; Lindau, 1978[1]; Nearey, 1978; Wood, 1982). The apparent inconsistencies found in vowel articulation led researchers to reconsider the articulatory dimensions used in vowel description. As Wood (1979:25) stated, the confusing pictures obtained from X-ray studies of vowel articulation were due to “the wrong articulatory variables (height and fronting) rather than to articulatory irregularity”. In their modeling study of vowel production, Stevens & House (1955) rejected height and backness and replaced them with a new parameterization, namely the degree of vocal tract constriction and the location of the constriction relative to the glottis. These dimensions are claimed to be more directly related to vowel acoustics, because the constriction location determines the sizes of the front and back cavities in the vocal tract, and the degree of constriction affects the coupling between the two cavities.
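As a rough illustration of why constriction location matters acoustically, the following Python sketch treats the vocal tract as two uncoupled uniform tubes separated by a very narrow constriction. This is a textbook idealization, not the three-parameter model of Stevens & House (1955): it ignores the length of the constriction, the coupling between the cavities, and the low-frequency Helmholtz resonance. The tract length of 17 cm, the speed of sound of 35,000 cm/s, and the function name are assumptions made for illustration.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed value for warm, moist air


def cavity_resonances(constriction_from_glottis_cm,
                      tract_length_cm=17.0,
                      n_modes=2):
    """Resonances (Hz) of the back and front cavities, treated as uncoupled.

    Back cavity: closed at the glottis and (approximately) closed at the
    narrow constriction, so it resonates at half-wavelength frequencies.
    Front cavity: closed at the constriction and open at the lips, so it
    resonates at quarter-wavelength frequencies.
    """
    if not 0.0 < constriction_from_glottis_cm < tract_length_cm:
        raise ValueError("constriction must lie strictly inside the tract")

    back_len = constriction_from_glottis_cm
    front_len = tract_length_cm - constriction_from_glottis_cm
    c = SPEED_OF_SOUND_CM_S

    back = [n * c / (2.0 * back_len) for n in range(1, n_modes + 1)]
    front = [(2 * n - 1) * c / (4.0 * front_len) for n in range(1, n_modes + 1)]
    return back, front


if __name__ == "__main__":
    for location in (10.0, 12.0):
        back, front = cavity_resonances(location)
        print(f"constriction at {location} cm: "
              f"back {back[0]:.0f} Hz, front {front[0]:.0f} Hz")
```

In this idealization, moving the constriction from 10 cm to 12 cm above the glottis lengthens the back cavity and shortens the front cavity, so the lowest back-cavity resonance falls and the lowest front-cavity resonance rises. Widening the constriction would couple the two cavities and shift these values, which the sketch does not model.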

The proposed articulatory dimensions have become the basis of mainstream research on speech production, and the proposal has fundamentally changed the understanding of vowel articulation. While general studies of speech production treated the length of the vocal tract as a continuum of constriction locations (Stevens & House, 1955; Fant, 1960; Lindblom & Sundberg, 1971), efforts have also been made to identify specific constriction locations in vowel production. In the quantal theory of speech production, Stevens (1972, 1989) identified three constriction regions where vowel spectra are relatively insensitive to moderate variations of constriction location: at the palate for [i], at the velum for [u], and in the pharynx for [ɑ]. Based on both empirical radiographic analysis of many languages and the theoretical three-parameter nomograms of Stevens and House (1955), Wood (1979) found that there are four constriction locations for vowels: along the hard palate for [i-ɛ]-like and [y-ø]-like vowels, along the soft palate for [u-ʊ-ɨ]-like vowels, in the upper pharynx for [o-ɔ]-like and [ɤ]-like vowels, and in the lower pharynx for [ɑ-a-æ]-like vowels (see also Gunnilstam, 1974, for a purely theoretical inference of constriction locations from Fant’s (1960) three-parameter nomograms). Wood stated that these four narrowly constricted locations are the places where F1 and F2 are least sensitive to variations of constriction location, and that this finding therefore supports Stevens’s proposal of the quantal nature of vowels. The new constriction location/degree model provides new dimensions for investigating speech production and sheds light on previously uninvestigated aspects of the tongue mechanism and even of the underlying speech motor control system. Over the past several decades, the development of new techniques such as X-ray microbeam and electromagnetic articulography has made it possible to collect large-scale articulatory data, against which theoretical articulatory models can be tested empirically. The pilot studies of Perkell & Nelson (1982, 1985) and Perkell & Cohen (1989), using articulatory data obtained by X-ray microbeam, tested the articulatory stability of the production of the three quantal vowels [i a u]. Their results basically supported Stevens’s quantal theory. More recently, following the pilot studies of Perkell and his coworkers, Beckman et al. (1994), using a more quantifiable method, showed that many vowels, not just the point vowels [i u ɑ], “have target articulatory constrictions and are more or less quantal, in the sense that constriction degree seems to be controlled more precisely than constriction location” (p. 489). The palatal vowel [i] has the smallest variability, the velar vowel [u] shows the next smallest variability, and the other vowels [ɑ æ ɚ] are less quantal.