Phorum 2008

SCHEDULE OF TALKS: Fall 2008 

SEPTEMBER 8 - SAM TILSEN, UNIVERSITY OF CALIFORNIA, BERKELEY: EVIDENCE FOR COVARIABILITY OF RHYTHMIC AND INTERGESTURAL TIMING

It is well known that temporal patterns in speech occur on multiple timescales, but we understand less clearly how patterns on different timescales interact with each other. Speech gestures normally occur on a relatively fast timescale. Previous work has shown that there is a gestural c-center effect in complex syllable onsets (e.g. [spa]), whereby the initiations of the tongue-blade and lip movements associated with [s] and [p] are equally displaced in opposite directions from the initiation of the tongue-body movement associated with the vowel [Browman, C. and Goldstein, L., Phonetica 45, 140-155 (1988)]. In contrast, speech rhythms occur on a slower timescale. In metronome-driven phrase repetition tasks, rhythmic timing is more variable for higher-order target ratios of intervals between stressed syllables and phrases [Cummins, F. and Port, R., Journal of Phonetics 26, 145-171 (1998)].
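
To make the timing relation concrete, here is a small sketch in Python; the gesture-initiation times are invented for illustration and are not measurements from Browman and Goldstein's study:

    # Hypothetical gesture-initiation times (ms) for the onset of [spa];
    # invented for illustration, not measurements from the cited study.
    onset_s = 0.0       # tongue-blade gesture for [s]
    onset_p = 60.0      # lip gesture for [p]
    onset_vowel = 90.0  # tongue-body gesture for the vowel

    # The c-center is the mean of the consonant-gesture initiations.
    c_center = (onset_s + onset_p) / 2

    # The two consonants are displaced equally in opposite directions
    # from the c-center, and the c-center-to-vowel interval is what
    # stays stable across onsets of different complexity.
    print(c_center - onset_s, onset_p - c_center)  # 30.0 30.0
    print(onset_vowel - c_center)                  # 60.0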

An experiment was conducted to investigate how the above gestural and rhythmic patterns interact. Gestural kinematics were recorded using electromagnetic articulometry during a repetition task in which subjects uttered the phrase "take on a spa" to two-beat metronome rhythms of varying difficulty. Significantly greater variability in the relative timing of the tongue and lower-lip movements associated with [s] and [p] was observed with the more difficult and variable target rhythms, which demonstrates that rhythmic and gestural systems interact in a non-trivial way. By analogy: the "speech box step" is performed with more temporal variability during the more difficult "speech waltz" than during the "speech rumba".

A dynamical model will be presented that is capable of simulating the observed covariability patterns. Building upon previous work from other researchers, the model treats phrase, foot, syllable, and gestural systems as limit-cycle oscillators, which synchronize through multi-frequency phase-coupling. In the presence of noise, stronger coupling between rhythmic systems results in lower intergestural variability. The model shows how hierarchical temporal patterns involving prosodic and gestural structure can be usefully conceptualized with dynamical systems.
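
As a rough sketch of this kind of model (not the presenter's actual implementation), the fragment below reduces the limit-cycle oscillators to phase oscillators at assumed 1:2:4 phrase:foot:syllable frequency ratios, couples them with multi-frequency phase-coupling terms, and shows that stronger coupling yields lower relative-phase variability in the presence of noise:

    import numpy as np

    def relative_phase_sd(coupling, steps=20000, dt=0.001, noise=2.0, seed=0):
        """Phrase, foot, and syllable phase oscillators at assumed 1:2:4
        frequency ratios, synchronized by multi-frequency phase coupling."""
        rng = np.random.default_rng(seed)
        freqs = 2 * np.pi * np.array([1.0, 2.0, 4.0])   # rad/s
        theta = rng.uniform(0, 2 * np.pi, 3)
        rel = []
        for _ in range(steps):
            dtheta = freqs.copy()
            # Coupling terms pull each faster oscillator toward a 2:1
            # phase lock with the next slower one.
            dtheta[1] += coupling * np.sin(2 * theta[0] - theta[1])
            dtheta[2] += coupling * np.sin(2 * theta[1] - theta[2])
            noise_term = np.sqrt(dt) * noise * rng.standard_normal(3)
            theta = theta + dt * dtheta + noise_term
            rel.append(np.cos(2 * theta[1] - theta[2]))  # foot-syllable phase
        return np.std(rel)

    # Stronger coupling between rhythmic systems -> lower variability.
    print(relative_phase_sd(coupling=1.0))    # weakly coupled: high variability
    print(relative_phase_sd(coupling=20.0))   # strongly coupled: low variability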

SEPTEMBER 15 - HONG-YING ZHENG, THE CHINESE UNIVERSITY OF HONG KONG: TEMPORAL DYNAMICS OF LEXICAL TONE PROCESSING IN THE BRAIN INDEXED BY ERP

The talk presents a study investigating the electrophysiological signature of Categorical Perception (CP) of lexical tones. Mandarin level and rising tones and their nonspeech pitch counterparts are used. Two time regions of MisMatch Negativity (MMN), corresponding to the level and rising portions of the pitch contours, are identified. Only the first MMN region is modulated by deviant type. It is concluded that (1) a CP effect for lexical tones is observed at the pre-attentive stage, deriving mainly from the level portion; (2) at the same stage, the brain responds differently to lexical tones than to nonspeech pitches; and (3) the differences observed are attributed to language experience, though further studies are needed.

SEPTEMBER 22 - KEITH JOHNSON, UNIVERSITY OF CALIFORNIA, BERKELEY: THE LINGUISTIC BASIS OF PHONETIC COHERENCE

Drawing on evidence from (1) listeners' tolerance for wide phonetic variability in speech perception, (2) the gestural nonidentity of speech sounds, and (3) the language specificity of linguistic phonetics, I will argue that phonetic coherence is based on the linguistic equivalence of speech sounds rather than on a (hypothetical) gestural or auditory equivalence. Further, consideration of talkers' and listeners' sensitivity to context suggests that the linguistic frame of reference for speech is highly adaptable and sensitive to the contexts in which speech is produced.

SEPTEMBER 29 - JESSICA MORRISON, UNIVERSITY OF CALIFORNIA, BERKELEY: PHONOLOGICAL FORM INFLUENCES MEMORY FOR FORM-MEANING MAPPINGS IN ADULT SECOND-LANGUAGE LEARNERS

Previous studies have examined the role that L2 word form plays in production and recall abilities (de Groot, 2006; Ellis and Beaton, 1993). This talk will focus on whether phonological form also affects L2 learners' ability to learn meanings. In particular, we ask whether hard-to-pronounce words, defined as those containing phones or phone combinations not present in the learner's native language, are more difficult to learn meanings for, and, further, whether any learnability differences are due to interference from difficulty with production or to more general representational difficulties.

We exposed participants to Polish word-novel object pairs, some easy- and some hard-to-pronounce, and tested their ability to match words with their meanings. Results showed that upon initial testing, only participants who repeated words aloud showed more difficulty with hard-to-pronounce words. Further experiments showed that this effect may result from forcing learners to attend to the difficulty of these word forms, as the effect could be reproduced with two other means of drawing attention to the words, subvocal repetition and hearing another English speaker repeat the words. In a final experiment, participants were engaged in an articulatory suppression task during learning; these participants also showed more difficulty learning meanings for hard-to-pronounce words, demonstrating that this effect cannot simply be attributed to interference from producing difficult words. The results of this study suggest that more difficult phonological forms lead to weaker representations of the word form, which is more difficult to link with meaning in memory.

OCTOBER 6 - RONALD L. SPROUSE (UNIVERSITY OF CALIFORNIA, BERKELEY), MARIA-JOSEP SOLÉ (UNIVERSITAT AUTÒNOMA DE BARCELONA), JOHN J. OHALA (UNIVERSITY OF CALIFORNIA, BERKELEY): ORAL CAVITY ENLARGEMENT IN RETROFLEX STOPS. EXPERIMENTAL DATA AND PHONOLOGICAL PATTERNS.

It is generally recognized that front-articulated stops are more compatible with voicing than back-articulated stops because the larger oral volume and compliance of front stops allow for longer trans-glottal airflow. Phonological patterns suggest that retroflex stops may be an exception to this pattern and may be more compatible with voicing than their dental/alveolar counterparts. An experiment was conducted in which an artificial leak was created in the vocal tracts of three speakers, and the effect of the abrupt closure of this leak on voicing was measured for [b d ɖ g]. The results show that [b d g] follow the pattern of longer voicing for more front articulations. The retroflex was an exception: for two speakers voicing for [ɖ] persisted longer than for [d], and for one speaker [ɖ] voicing persisted longer than for [b d]. We believe that the greater surface area presented by the concave shape of the tongue during retroflexes, as compared to dentals/alveolars, allows for greater passive cavity expansion (i.e. compliance) and is a possible explanation of the observed pattern.

In addition to the differences found during the prolonged steady-state stops, another factor may favor a longer voicing in natural connected speech: some forward movement of the tongue apex during the retroflex articulation. That is, the movement over time of retroflex sounds may by itself change vocal tract volume during the stop closure.

OCTOBER 13 - ALEX BRATKIEVICH (UNIVERSITY OF CALIFORNIA, BERKELEY), RAMON ESCAMILLA (UNIVERSITY OF CALIFORNIA, BERKELEY), MARILOLA PEREZ (UNIVERSITY OF CALIFORNIA, BERKELEY), HANNAH PRITCHETT (UNIVERSITY OF CALIFORNIA, BERKELEY), RUSSELL RHODES (UNIVERSITY OF CALIFORNIA, BERKELEY): DISCUSSION OF AO PHONOLOGY

Ao is a Tibeto-Burman language spoken in Nagaland (far north-eastern India). In addition to presenting our current hypotheses about the segment inventory, phonological processes, and the tonal system, we will discuss the following issues in more depth: the status of word-final glottal stop, VOT distinctions, syllable structure, and variability in the pronunciation of vowels.

OCTOBER 20 - MOLLY BABEL, UNIVERSITY OF CALIFORNIA, BERKELEY: CONVERGENCE AND DIVERGENCE IN NEW ZEALAND ENGLISH

A strict reading of exemplar-based models of speech perception and production (e.g. Goldinger 1998) assumes that phonetic convergence is a natural cognitive reflex. At first pass, Pickering and Garrod (2004) also adopt the view that convergent speech behavior at all levels (phonetic, syntactic, lexical) is automatic and without social consideration. Research supporting the automatic view has found that phonetic convergence interacts with speaker-independent factors like response latency and word frequency (Goldinger 1998). Recently, Nielsen (2007) found an interaction between convergence and abstract phonological knowledge. On the other hand, Communication Accommodation Theory (CAT; Giles & Coupland 1991) has always maintained that convergent and divergent speech behavior arises from talkers wanting to reinforce socially meaningful differences. Bourhis and Giles (1977) find that convergent and divergent speech patterns predictably occur in certain social situations. Within a more experimental tradition, Namy et al. (2002) find that female participants converged more than male participants in a shadowing experiment. Pardo (2006), however, found male participants to converge more than female participants in a collaborative map task. These results suggest that some aspects of phonetic convergence are affected by social factors.

In this talk I report on an experiment designed to replicate Bourhis & Giles (1977) looking solely at phonetic behavior. Participants who were native speakers of New Zealand English were either insulted (the negative condition) or flattered (the positive condition) by a speaker of Australian English in the midst of completing a word repetition task. The stimuli consisted of words involved in the ear/air merger and monophthongs from the lexical sets KIT, DRESS, TRAP, BARN, THOUGHT, and STRUT. After the speech production task, participants completed an implicit association task (IAT; Greenwald et al., 1998) that examined implicit biases towards Australia. Thus far, the speech behavior in the monophthongs has been analyzed. Talkers imitated the Australian only in the DRESS and TRAP vowels in the shadowing block. These two vowel sets involve some of the greatest differences between Australian and New Zealand English, but do not correspond to the most salient accent differences (Bayard, 2000). The experimental condition did not influence the results. Participants' scores on the IAT did, however, correlate with the degree of convergence. Participants were most likely to imitate the Australian talker if their IAT score reflected a positive association with Australia. This correlation was apparent in both the shadowed and post-task blocks. The results of this experiment have implications for how social cognition should be reflected in exemplar theories of speech production and perception.

OCTOBER 27 - OLGA DMITRIEVA, STANFORD UNIVERSITY: EXPLAINING UNIVERSALS IN GEMINATE DISTRIBUTION AND TYPOLOGY: EXPERIMENTAL EVIDENCE FROM RUSSIAN

Cross-linguistically, geminates tend to occur in certain kinds of phonetic environments, e.g. intervocalically and after a short stressed vowel (Thurgood 1993). They also show cross-linguistic preferences in manner of articulation: low-sonority and voiceless geminates are encountered more frequently than voiced and sonorant geminates (Podesva 2000, 2002). These principles are categorical in some languages, while in others they emerge as statistical tendencies. In Russian, geminates adjacent to stressed vowels are less likely to degeminate than those adjacent to unstressed vowels. Among geminates adjacent to stressed vowels, the ones preceded by stress are less likely to degeminate than the ones followed by stress. Consonants degeminate less often if they are word-initial compared to word-final. Intervocalic geminates are more protected from degemination than pre-consonantal geminates. Among different manners of articulation, stops and fricatives degeminate less often than nasals and liquids (Kasatkin and Choj 1999, Dmitrieva 2007).

Such uniformity in geminate distribution and typology across languages warrants an explanation. In this study, I explore the hypothesis that certain phonetic environments and manners of articulation provide articulatory and perceptual advantages for the realization and preservation of the contrast between geminates and singletons. Previous researchers have suggested that a lower ratio between geminate and singleton duration may lead to potential contrast neutralization (Aoyama and Reid 2006, Blevins 2004). I propose that it is not only the relative durations of geminates and singletons that determine the quality of the contrast, but also the location of the perceptual boundary between the two categories. Based on experimental evidence from Russian, I show that just as the durations of geminates and singletons vary depending on the phonetic environment and the type of consonant, so does the location of the perceptual boundary. Moreover, the placement of the perceptual boundary in relation to the average durations of geminates and singletons is crucial in predicting the amount of articulatory effort necessary to realize the contrast and the likelihood of a perceptually driven neutralization of the contrast. The prediction that geminates that are favored cross-linguistically will emerge as the easiest to maintain and the least susceptible to neutralization was supported in the majority of cases. The results show that this method of evaluating the quality of the contrast is more reliable than the previously proposed geminate/singleton ratio and can provide an explanation for certain cross-linguistic universals in geminate distribution and typology.
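
A toy calculation illustrates why boundary location matters independently of the duration ratio; all numbers below are invented, not data from the study:

    # Hypothetical closure durations (ms); the geminate/singleton ratio
    # is 2.0 in both scenarios, but the boundary sits differently.
    singleton_mean, geminate_mean = 80.0, 160.0

    for boundary in (110.0, 140.0):
        # Room each category has before tokens cross the boundary:
        print(boundary - singleton_mean, geminate_mean - boundary)
    # With the boundary at 110 ms the geminate keeps a 50 ms margin; at
    # 140 ms it keeps only 20 ms, so ordinary articulatory undershoot is
    # more likely to be heard as a singleton, inviting neutralization,
    # even though the duration ratio is identical in both cases.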

NOVEMBER 3 - ABBY KAPLAN, UNIVERSITY OF CALIFORNIA, SANTA CRUZ: PERCEPTUAL, ARTICULATORY, AND SYSTEMIC INFLUENCES ON LENITION

Lenition is commonly understood as an articulatorily driven phenomenon, with lenited forms involving less articulatory effort than unlenited forms. Although intuitively plausible, this line of reasoning remains somewhat speculative because of the lack of experimental methods that are able to test articulatory effort directly. In addition, other potential sources of phonetic grounding for lenition (such as perception and neutralization avoidance) have been neglected. This paper examines the spirantization of intervocalic voiced stops in the light of these latter two types of phonetic pressure.

The perceptual experiment reported here tests whether the relative perceptibility of the segments involved in spirantization can account for the typology of lenition, given the framework of the P-map. The results show that perception is able to explain some, but not all, of the typological facts. Data from a cross-linguistic survey of lenition show that a language's segment inventory interacts with its potential to spirantize, but that the effects of inventory are not enough to fill the gaps left by a perceptual account. I conclude that articulation may very well play a role in lenition, especially in the tendencies left unexplained by perceptual or systemic pressures. However, to the extent that perception *does* explain patterns of lenition, we must obtain more direct evidence of the articulatory effort involved before positing it as an explanation for the same facts.

NOVEMBER 10 - JOHN HOUDE, UNIVERSITY OF CALIFORNIA, SAN FRANCISCO: A MODEL OF SPEECH MOTOR CONTROL BASED ON STATE FEEDBACK CONTROL.

What is the role of the higher CNS (cortex, cerebellum, thalamus, basal ganglia) in controlling speech? Recent models of non-speech motor control suggest that the higher CNS monitors and controls the dynamic state of the articulators, i.e., articulator positions, velocities, or any other information needed to predict their future behavior in the current task. Such state information is not directly or instantly available from sensory feedback; it is instead hypothesized to be estimated within the higher CNS by a prediction/correction process that (a) predicts the articulators' next state based on efference copy of the previous articulatory controls, (b) compares incoming sensory feedback with the feedback expected from the predicted state, and (c) uses the difference to correct the state prediction. It then generates new articulator controls based on this corrected state estimate. Models based in this way on state feedback control (SFC) have been especially successful at predicting the ways people flexibly optimize their movements for different tasks, but they have received little attention thus far in speech motor control research. In this talk, I will describe an SFC model of speech motor control and compare it with Guenther's well-known DIVA model of speech motor control.
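
The predict/compare/correct loop can be sketched as a simple discrete-time observer; the dynamics and gains below are illustrative assumptions, not parameters of Houde's model:

    import numpy as np

    # Minimal SFC sketch for one articulator (position + velocity),
    # driven toward a target position of 0.  All matrices and gains
    # are invented for illustration.
    dt = 0.01
    A = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
    B = np.array([[0.0], [dt]])             # effect of the control signal
    C = np.array([[1.0, 0.0]])              # sensory mapping (position only)
    L = np.array([[0.3], [0.5]])            # observer correction gain
    K = np.array([[8.0, 2.0]])              # state feedback control gain

    rng = np.random.default_rng(0)
    x = np.array([[0.5], [0.0]])            # true state, off target
    x_hat = np.zeros((2, 1))                # internal state estimate

    for _ in range(2000):
        u = -K @ x_hat                                   # control from estimate
        x = A @ x + B @ u                                # articulator evolves
        y = C @ x + 0.01 * rng.standard_normal((1, 1))   # noisy sensory feedback
        x_pred = A @ x_hat + B @ u    # (a) predict via efference copy
        err = y - C @ x_pred          # (b) compare with expected feedback
        x_hat = x_pred + L @ err      # (c) correct the state estimate

    print(x[0, 0])   # position has been driven close to the target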

NOVEMBER 17 - YAO YAO, UNIVERSITY OF CALIFORNIA, BERKELEY: TO LEARN OR NOT TO LEARN: THE GROWING PATHS OF CHILDREN'S PHONOLOGICAL NEIGHBORHOODS

In this study, speech data from two children are examined for the development of phonological neighborhoods in the third year of life. The analysis shows that neighborhood density increases over time, but it is not necessarily the case that children first acquire words from neighborhoods that are dense in the adult lexicon. Moreover, in the initial stage of acquisition, words that are added to the lexicon come from denser neighborhoods than words that are already acquired, but after a certain stage, as the backbone of the lexicon is formed, the trend is reversed. Our analysis suggests that several different forces are at play in early acquisition, leading the development of phonological neighborhoods in different directions.

NOVEMBER 24 - SHIRA KATSEFF, UNIVERSITY OF CALIFORNIA, BERKELEY: COMPENSATION FOR CHANGES IN AUDITORY FEEDBACK

This presentation investigates the contribution of phonetic information to motor planning through a speech adaptation experiment. As they spoke, subjects heard their voices through a pair of headphones. The experiment proceeded in four stages: (1) subjects' auditory feedback was unaltered; (2) the feedback was slowly altered up to a set maximum; (3) the feedback alteration was held at that maximum shift; and (4) the feedback was returned to normal. Previous work demonstrates that when auditory feedback is shifted, subjects change their speech to oppose the feedback shift (e.g., Burnett, Freedland, Larson & Hain, 1998; Houde & Jordan, 2002; Purcell & Munhall, 2006).
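
The four stages can be pictured as a per-trial schedule of feedback shifts; the trial counts and the 100 Hz maximum below are placeholders, not the experiment's actual parameters:

    import numpy as np

    # Four-stage feedback schedule (all numbers are placeholders).
    baseline, ramp, hold, washout = 20, 30, 40, 20
    max_shift_hz = 100.0

    shift_per_trial = np.concatenate([
        np.zeros(baseline),                   # (1) unaltered feedback
        np.linspace(0, max_shift_hz, ramp),   # (2) slowly altered
        np.full(hold, max_shift_hz),          # (3) held at the maximum
        np.zeros(washout),                    # (4) returned to normal
    ])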

Here, we explore the relative importance of F0, F1, and F2 to the representation of 'head'. Subjects participated in three speech adaptation experiments, one for each formant, on three different days. Formants were altered in randomized order, and all data were analyzed relative to a control condition in which the subject went through the experiment without auditory feedback alteration.

Several trends emerge from the data. First, compensation is never complete: a subject might lower his or her F1 by 50 Hz in response to a total F1 feedback increase of 100 Hz. Second, compensation is more complete for small feedback shifts than for large feedback shifts, even when the shifts are smaller than the baseline vowel space. Third, compensation is more complete for F1 shifts than for F2 shifts. Fourth, speakers appear to be tracking formant ratios rather than absolute formant values.
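
In those terms, the degree of compensation is just the proportion of the shift that the speaker's response cancels; here is a worked version of the quoted figures (the baseline F1 is invented):

    # Figures from the text: feedback F1 raised 100 Hz, speaker lowers
    # produced F1 by 50 Hz.
    shift_hz = 100.0
    response_hz = -50.0
    compensation = -response_hz / shift_hz
    print(compensation)            # 0.5 -> compensation is only partial

    # What the speaker hears is baseline + shift + response:
    baseline_f1 = 550.0            # hypothetical baseline F1 for "head"
    heard_f1 = baseline_f1 + shift_hz + response_hz
    print(heard_f1 - baseline_f1)  # 50 Hz of the perturbation remains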

These results support theories of speech production that incorporate both acoustic and sensorimotor feedback. Because auditory feedback is altered while motor feedback is not, feedback from these two sources can conflict. For small shifts in auditory feedback, the amount of potential conflict is small and the normal motor feedback does not affect compensation. But for large shifts in auditory feedback, the amount of conflict is large. Abnormal acoustic feedback pushes the articulatory system to compensate, and normal motor feedback pushes the articulatory system to remain in its current configuration, damping the compensatory response.

DECEMBER 1 - CHARLES CHANG, UNIVERSITY OF CALIFORNIA, BERKELEY: RESTRUCTURING PHONETIC SPACE IN SECOND LANGUAGE ACQUISITION: PERCEPTION VS. ABSTRACTION

Many previous researchers have examined factors influencing the way in which phonological categories from a first (L1) and second (L2) language are linked, particularly in loanword adaptation and L2 acquisition (e.g. Hyman 1970, Flege 1987, Best 1994, Kang 2003, Kenstowicz 2003, Peperkamp and Dupoux 2003, Broselow 2004, LaCharité and Paradis 2005, Yip 2006, Heffernan 2007). One question that remains is how cross-linguistic "equivalence classification" between categories occurs when it is not just one L2 sound, but two L2 sounds that stand to be assimilated to an L1 sound. One L2 sound can be assimilated, but what about the other one? In this case learners must create at least one new category if they are to preserve a contrast between the two L2 sounds.

I address the question of novel L2 category formation by examining the acquisition of the three-way Korean laryngeal contrast among lenis, fortis, and aspirated stops by 27 native speakers of American English having no previous experience with the language. Here I report results from an imitation experiment in which learners repeated a two-dimensional continuum of Korean syllables differing in the primary cues to the Korean laryngeal contrast (cf. Kim 2004, among many others): voice onset time (VOT) and fundamental frequency (f0) onset. How quickly do learners develop the extra category they need to effect a three-way laryngeal contrast?

Acoustic analyses of speakers' productions indicate that after three weeks (= 39 hours) of immersion instruction, novice learners generally still command only two categories (short-lag VOT and long-lag VOT), while native speakers make use of three categories: short-lag VOT (fortis); medium/long-lag VOT + low f0 onset (lenis); and long-lag VOT + high f0 onset (aspirated). However, the most striking difference between these two groups is that for novice learners, productions within each cluster of responses show much better correspondence to the parameters of the stimuli than is the case for native speakers: without strong categorical scaffolding for L2, novice learners continue to attend to L2 speech at a phonetically detailed level, while native speakers are predisposed to simply categorize what they hear and disregard phonemically irrelevant aspects of the signal.

These results are consistent with one of the fundamental postulates of Flege's (1995) Speech Learning Model – that adults retain, rather than lose, the perceptual mechanisms used in learning their L1 sound system. Where novice learners in this study seem to come up short is in the move from perception of details to abstraction to categories. These findings suggest that, left to its own devices, L2 category development is rather slow, and that explicit phonetic instruction is probably required to achieve the formation of a new L2 category that is similar in structure to that of native speakers (cf. Catford and Pisoni 1970).

DECEMBER 8 - PARTICIPANTS IN LINGUISTICS 210: A REPORT ON THE STATE OF THE ART IN DESCRIPTIVE PHONETICS

Participants in Linguistics 210 will present articles from the July 2008 issue of the Journal of Phonetics, a special issue titled "Phonetic Studies of North American Indigenous Languages", edited by Joyce McDonough and Doug Whalen.


SCHEDULE OF TALKS: Spring 2008 

JANUARY 28 - GABRIELA CABALLERO, UNIVERSITY OF CALIFORNIA, BERKELEY: "VARIABLE AFFIX ORDERING IN CHOGUITA RARA'MURI (TARAHUMARA): PARSABILITY, SEMANTIC SCOPE, AND SELECTIONAL RESTRICTIONS IN AN AGGLUTINATING LANGUAGE"

Research on the principles underlying affix ordering has provided solutions for the analysis of typologically diverse languages. It has been proposed, for instance, that semantic compositionality (scope) and templatic restrictions interact in different ways to shape affix sequences in Athabaskan (Rice 2000) and Bantu languages (Hyman 2002). Patterns of "free" variable affix order, on the other hand, have only recently been documented, and in very few languages, such as the Kiranti (Sino-Tibetan) (Bickel et al. 2007) and Totonacan (Totonaco-Tepehua) languages (McFarland 2005, Beck 2007). This paper makes an empirical contribution by documenting another case of "free" variable affix order in Choguita Rara'muri (Tarahumara), an endangered Uto-Aztecan language, which also features other affix ordering and exponence phenomena that are independent of syntactic/semantic principles. Based on the analysis of morphologically complex constructions of this agglutinating language, it will be argued that although scope determines some of the attested suffix ordering patterns, it is phonological subcategorization and parsability, not templatic constraints, that underlie Choguita Rara'muri suffix combinatorics and exponence.

FEBRUARY 4 - CHRISTIAN DICANIO (UNIVERSITY OF CALIFORNIA, BERKELEY), KEITH JOHNSON (UNIVERSITY OF CALIFORNIA, BERKELEY), LAUREL MACKENZIE (UNIVERSITY OF PENNSYLVANIA): "PHONETIC EXPLANATIONS FOR NASAL RESTORATION"

One common historical development in languages with distinctively nasalized vowels is the excrescence of coda velar nasals in place of nasalized vowels. For example, the dialect of French spoken in the southwestern part of France (Midi French) is characterized by words ending in the velar nasal [ŋ] where Parisian French has nasalized vowels and no final nasal consonant ([savɔ̃] ~ [savɔŋ] "soap"). More generally, there is a cross-linguistic tendency for the unmarked place of articulation for coda nasals, and perhaps also for stops, to be velar. In four experiments, we explored why the cross-linguistically unmarked place for the excrescent nasal is velar. The experiments test Ohala's (1975) acoustic explanation: that velar nasals, having no oral antiformants, are acoustically more similar to nasalized vowels than are bilabial or alveolar nasals. The experiments also tested an explanation based on the visual phonetics of nasalized vowels and velar nasals: velar nasals, having no visible consonant articulation, are visually more similar to nasalized vowels than are bilabial or alveolar nasals. American English listeners gave place of articulation judgments for audio-only and audio-visual tokens ending in nasal consonants or nasalized vowels. In the first and second experiments, we embedded recorded tokens of CVN (N = /m/, /n/, or /ŋ/) words in masking noise and presented them in audio-only and audio-visual trials. We also synthesized "placeless" nasals by repeating pitch periods from the nasalized vowel to replace the final consonant in CVm with a nasalized vowel. These stimuli provide a direct test of Ohala's acoustic explanation of coda velarity in nasals. The third and fourth experiments extended these results with tokens in which the last portion of CVN (N = /m/, /n/, or /ŋ/) and CṼ syllables was obscured with masking noise. These experiments were designed to force listeners to assume the existence of a final consonant and to rely primarily on visual cues in a more direct test of the visual similarity of nasalized vowels and velar nasals. Taken together, the results of these four experiments suggest that excrescent coda nasals tend to be velar because nasalized vowels are both acoustically and visually similar to velar nasals.
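
The "placeless" nasal manipulation can be sketched as follows; this is a simplified reconstruction of the splicing described above (the function name and parameters are ours, and real stimuli would also need amplitude matching at the splice point):

    import numpy as np

    def placeless_nasal(signal, sr, vowel_end_s, pitch_period_s, tail_s):
        """Replace the final nasal consonant with repetitions of one
        pitch period taken from the end of the nasalized vowel."""
        period = int(round(pitch_period_s * sr))
        end = int(round(vowel_end_s * sr))
        cycle = signal[end - period:end]        # one vowel pitch period
        n_samples = int(round(tail_s * sr))
        reps = int(np.ceil(n_samples / period))
        tail = np.tile(cycle, reps)[:n_samples]
        return np.concatenate([signal[:end], tail])

    # e.g., placeless_nasal(x, sr=22050, vowel_end_s=0.45,
    #                       pitch_period_s=0.008, tail_s=0.20)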

FEBRUARY 11 - GRANT MCGUIRE, UNIVERSITY OF CALIFORNIA, BERKELEY: THE ROLE OF EXPERIENCE IN THE USE OF PHONETIC CUES

In order to be a competent perceiver of a language, a listener must have full command of the relevant phonetic contrasts in that language. Many aspects of the acoustic signal, or phonetic cues, differentiate these contrasts, and the use of these cues differs by linguistic background (Wagner et al. 2006) and development (e.g. Nittrouer 1992). However, the ways in which listeners come to know which cues are most relevant, and how this affects perception, are understudied. This talk reports data from three studies on cue use and learning, in both adults and infants, which focus on how experience affects cue use. The first study presents data demonstrating that knowledge of cues and their integrality is localized and based in specific linguistic experience with the relevant contrast. A second study demonstrates that learning to rely on specific cues heightens sensitivity to the relevant dimensions of contrast, changing the perceptual space. A final study examines how changing the distribution of tokens can influence which cues infants rely on to differentiate categories. Together, the three studies provide strong evidence that cue use is a localized phenomenon that develops with specific experience of the relevant contrast.

FEBRUARY 25 - KATIE DRAGER, UNIVERSITY OF CANTERBURY: EFFECT OF THE EXPERIMENTER: EVIDENCE FROM NEW ZEALAND ENGLISH

It is common practice to have multiple researchers meet with participants for a single experiment. However, post hoc analysis of experimental data collected in our lab in New Zealand (NZ) suggests that experimenter identity can affect performance on production and perception tasks. This talk will review these results and will discuss two follow-up experiments investigating the degree to which performance on a lexical access task can be influenced both by exposure to non-NZ accents and by the concept of countries other than NZ. These results will be explored within the context of an exemplar model of speech production and perception.

MARCH 3 - CHEN ZHONGMIN, UNIVERSITY OF CALIFORNIA, BERKELEY: ON THE IMPLOSIVES IN HAIYAN DIALECT

Haiyan is located in the southeast corner of the Yangtze River delta. Its dialect is one of the Wu dialects of Chinese. Besides the tripartite distinction of obstruents (unaspirated voiceless, aspirated voiceless, and murmured), which is one of the most important features of Wu dialects, there is another type of obstruent: implosives. In this study I discuss four different implosives in the Haiyan dialect, including the rather rare implosive affricate, their acoustic features, and the relationship between implosives and tonal registers. I also discuss the historical development of the implosives and argue that they developed from murmured obstruents.

MARCH 10 - RACHELLE WAKSLER (SAN FRANCISCO STATE UNIVERSITY), LINDA WHEELDON (BIRMINGHAM UNIVERSITY), AND JENNY WING (BIRMINGHAM UNIVERSITY): FEATURE SPECIFICATION AND UNDERSPECIFICATION IN THE MENTAL LEXICON

Research in both the psycholinguistic and neurobiological arenas provides mounting support for phonologically underspecified lexical entries in the mental lexicon (e.g., Wheeldon & Waksler 2004, Friedrich, Eulitz & Lahiri 2006). Models of speech processing incorporating phonologically underspecified lexical entries (Lahiri & Reetz 2002) have so far assumed a traditional theory of underspecification in which all phonologically redundant features are underspecified in the lexical entry. Recent treatments of redundant features in theoretical linguistics (e.g., Inkelas 1995, Ito, Mester & Padgett 1995, McCarthy 2003, Beckman & Ringen 2004), however, have argued that some redundant features are lexically specified. The scope of underspecification, i.e., whether underspecification is absolute or whether some redundant features are represented in lexical entries, has not yet been examined in psycholinguistic research, and is the focus of our present study.

In two psycholinguistic experiments controlled for the syntax, semantics, and prosodic structure of the stimulus sentences, we investigate the degree of phonological abstractness in lexical entries using form priming and semantic priming tasks. Results from both experiments support phonological underspecification in the mental lexicon, and are consistent with predictions based on some redundant feature specification in lexical entries.

MARCH 17 - ROSEMARY ORR, ICSI SPEECH GROUP: A HUMAN BENCHMARK FOR LANGUAGE RECOGNITION

A good deal of work has been carried out in the area of automatic language recognition. Systems are built, refined, fused, and evaluated, and the latest results from, for example, the NIST Language Recognition Evaluations show that automatic systems can recognize languages with as little as 3% error on a 10-second stretch of speech.

To date, little work has been done to see how well a human performs this task. Muthusamy reported, in 1994, that humans outperformed the automatic systems when recognizing their native languages or languages with which they were familiar. However, systems have improved greatly since then, and it is important to try to establish a solid human benchmark against which to compare current systems.

The challenges to be met for this task are not trivial, not least because of the difficulty inherent in establishing a suitable experimental paradigm. We have a well-organized and well-established method of evaluating machine performance, as well as a large multilingual database to use, all stemming from the NIST LRE evaluations in 2008. This has its drawbacks, in that the speech we will use is telephone speech, and it is only available in 14 languages. Furthermore, if we use the evaluation methods that were applied to machines, we must in some way classify the amount of training that human subjects have had in a language. For machines, this can be measured exactly as the number of hours of training the machine has had. For humans, we cannot make such estimates.

My presentation to the group will be a description of our current experiment, our planned experiment, and a sketch of the pilot data that we have found so far. I will be particularly glad of comments and suggestions, as the design is not yet set in stone, and we would like to have as solid a setup as possible for this work.

MARCH 31 - YUNI KIM, UNIVERSITY OF CALIFORNIA, BERKELEY: DIPHTHONGIZATION, FISSION, AND METATHESIS IN HUAVE

Huave (a language isolate of Oaxaca State, Mexico) has two sets of surface diphthongs: rising diphthongs like [ja], and falling diphthongs like [aj] and [oj]. None of these are present underlyingly. Rising diphthongs come from fission of underlying front vowels, where the features of a monophthong are split and distributed between two root nodes, while falling diphthongs come about through metathesis of secondary consonant palatalization onto a preceding vowel. Although these are superficially different processes, I claim that they are driven by common pressures: preservation of input features, and agreement for place of articulation at certain VC transitions. I formalize the analysis within Optimality Theory, drawing on Particle Phonology (Schane 1984, 1995) and the aperture node theory of Steriade (1993, 1994) to construct representations that are sufficiently nuanced to express the correct generalizations.

APRIL 7 - SHIRA KATSEFF, UNIVERSITY OF CALIFORNIA, BERKELEY: AN EXPERIMENT IN SENSORIMOTOR ADAPTATION: MAKING HUDS OUT OF HEADS

Sensorimotor speech adaptation refers to the process by which individuals modify their speech production on the basis of auditory feedback. The present study exploits this phenomenon by electronically modulating the feedback process in real time, enabling us to explore and characterize auditory "targets" of linguistic planning.

Among other interesting features, the methodology in this study consistently reveals that compensation for experimental perturbations in feedback is only partial: even after adjusting to the modified feedback, subjects' processed speech signals still differ from their respective baseline productions.

I present new evidence suggesting that such partial compensation is readily explained by a theory of linguistic targets consisting of both a motor and an auditory component. Additionally, I describe a future feedback experiment designed to measure the relative contributions of motor and auditory targets to speech adaptation.

APRIL 14 - LARRY HYMAN, UNIVERSITY OF CALIFORNIA, BERKELEY: ENLARGING THE SCOPE OF PHONOLOGIZATION

In this talk I have three goals: (i) to define and delimit the notion of "phonologization"; (ii) to determine how phonologization fits into the bigger picture; (iii) to discuss a few examples of (continued) interest to me, e.g. the effects of voiced obstruents ("depressor consonants") on pitch. I begin by considering the original definition of phonologization ("A universal phonetic tendency is said to become 'phonologized' when language-specific reference must be made to it, as in a phonological rule." (Hyman 1972:170)), a concept which can be traced back at least as far as Baudouin de Courtenay (1895 [1972:184]). Particular attention is paid to the role of contrast in the phonologization process. After presenting canonical examples of phonologization (particularly transphonologizations, whereby a contrast is shifted or transformed but maintained), I suggest that the term "phonologization" needs to be extended to cover other ways that phonological structure either changes or comes into being. Throughout the talk emphasis is on what Hopper (1987:148) identifies as "movements towards structure": the emergence of grammar (grammaticalization) and its subsequent transformations (regrammaticalization, degrammaticalization). After showing that phonologization has important parallels to well-known aspects of "grammaticalization" (Hyman 1984), I conclude that phonologization is but one aspect of the larger issue of how (phonetic, semantic, pragmatic) substance becomes linguistically codified into form.

APRIL 21 - PRACTICE TALKS

Talk 1 - Intergestural Inhibition Counteracts Phonologization

Sam Tilsen

In the Ohalan hypocorrective model, vowel-to-vowel coarticulation can lead to phonologization of vowel harmony. What is missing from this model is a mechanism for restricting phonologization: why don't all vowels in all words in all languages always eventually harmonize? Appeals to preservation of contrast or lexical faithfulness do not give us much insight into filling this gap in the theory. I present unexpected results from a cross-phonemic primed vowel shadowing experiment, which suggest the presence of a dissimilatory speech target-planning mechanism that opposes coarticulation and thereby restricts phonologization.

Talk 2 - The Phonetic Space of Phonological Categories in Heritage Speakers of Mandarin

Charles Chang, Erin Haynes, Russell Rhodes, and Yao Yao

Previous research on the phonological competence of heritage language (HL) speakers has found that even very limited childhood experience with an HL improves pronunciation in comparison to second language (L2) learners with no prior experience (e.g. Au et al. 2002; Knightly et al. 2003; Oh et al. 2002, 2003). In the present study, we delve deeper into the question of categorical neutralization: though HL speakers may end up with better accents than L2 learners, do they make the same phonological distinctions as native speakers, in the same way and to the same degree?

Our study consists of a series of experiments on the realization of Mandarin and English phonological contrasts by Mandarin HL speakers, in comparison to native Mandarin speakers and L2 learners of Mandarin. Our previous experiment (cf. Chang et al. 2008) examined the place contrast between Mandarin retroflex /ṣ/ and alveolo-palatal /ɕ/, as well as English palato-alveolar /ʃ/. We found that both native speakers and L2 speakers tend to merge Mandarin retroflex and English palato-alveolar (not necessarily in the same direction, though), while HL speakers tend to keep them apart. In addition, there was a correlation between HL speakers' amount of exposure to Mandarin and their production performance, with the most advanced HL speakers patterning with native speakers and the least advanced with L2 learners.

In this talk, we report two other experiments along the same lines. Experiment 1 examined the place contrast in the back rounded vowels, and Experiment 2 examined the laryngeal contrast in Mandarin compared to English. Sixteen speakers of Mandarin participated: 5 native speakers, 8 HL speakers, and 5 L2 learners. The results of Experiment 1 show that almost all speakers distinguish all four back vowels: English /u/, English /ou/, Mandarin /u/, and Mandarin /ou/. The difference between English /u/ and Mandarin /u/, as well as that between English /ou/ and Mandarin /ou/, lies mainly in the degree of backness: English back vowels are more fronted than Mandarin ones for all speakers. More interestingly, the degree of backness in both English and Mandarin back vowels increases with the speakers' experience in Mandarin (i.e. L2 < HL < native). There is also a tendency for HL speakers to maintain the largest distance between English back vowels and Mandarin back vowels, which resonates with the conclusions of our previous experiment on fricatives. The results of Experiment 2 show that all speakers maintain a VOT difference between Mandarin unaspirated and aspirated stops. Moreover, almost all of them also distinguish Mandarin aspirated stops from English voiceless (aspirated) ones by producing longer VOTs for Mandarin. This pattern is most consistent among native and HL speakers, and least so in the L2 group.

In summary, the current results support our previous finding that HL speakers tend to be better at maintaining contrasts between "similar" categories in two languages, probably due to the fact that they have early exposure to both languages. Our data also show that there is a wide range of possibilities in terms of language production in the HL group, and that a continuum in the amount of exposure to the heritage language can probably be found to correlate with their production in both languages.

APRIL 28 - REIKO KATAOKA, UNIVERSITY OF CALIFORNIA, BERKELEY: MECHANISMS OF SOUND CHANGE: A STUDY ON PERCEPTUAL COMPENSATION FOR /U/-FRONTING

Listeners' identification of speech sounds is influenced by the perceived characteristics of surrounding sounds (e.g. Ladefoged and Broadbent, 1957; Lindblom and Studdert-Kennedy, 1967). For example, listeners 'compensate' for the expected coarticulatory effect on a speech sound when the context is clearly detected. In this talk, I will present results from a series of experiments investigating perceptual compensation for /u/-fronting in alveolar contexts. I will argue that perceptual compensation has both cognitive and mechanical causes, and that cognitively based compensation is responsible for hypo- and hyper-corrective speech misperception.

Experiment 1 replicated the experiment of Ohala and Feder (1994). It demonstrates that American listeners judge a vowel stimulus that, to the English ear, sounds intermediate between /i/ and /u/ as /u/ more frequently and more readily in an alveolar context than in a bilabial context, and that they do so with a cognitively 'restored' context as well as with an acoustic one.

Experiment 2 shows that perceptual compensation becomes stronger as the speech rate of the precursor sentence increases. The results from Experiments 1 and 2 suggest that listeners use both cognitively based categorical compensation and mechanically based gradient compensation.

Experiment 3 investigates the role of speech production in perceptual compensation. A moderate correlation between the degree of /u/-fronting in production and the perceptual boundary between the /i/ and /u/ categories was obtained, suggesting a link between speech production and speech perception.

MAY 5 - JOHN OHALA, UNIVERSITY OF CALIFORNIA, BERKELEY: A BRIEF, INTERPRETIVE HISTORY OF PHONOLOGY OVER THE PAST 2.5 MILLENNIA

Phonology, the superordinate discipline that includes phonetics, has made stellar progress in (1) describing languages phonologically and (2) figuring out the phonological history of languages (documenting sound change, family relationships, etc.). (Refs. to Panini, Al Khalil, the Icelandic "Grammarian", King Sejong, van Boxhorn, ten Kate, de Brosses, Sajnovics, Hervas, Rask, Grimm et al.) All of this developed gradually from approx. the 5th c. BCE to, say, the 18th c. AD. Beginning in the late 19th c. and continuing into the 20th, attention turned to the psychological underpinnings of languages' phonology (e.g., Meringer & Mayer 1895, Saussure 1916, Sapir 1933, Chomsky & Halle 1968). My claim: the claims made about the psychological basis of language are, for the most part, just re-applications of the methods and concepts used in the description of languages' sound patterns and their historical reconstruction. From 1968 on, mainstream phonologists -- and this now includes OT-iose phonology -- have assumed that phonological alternations such as 'want to' and 'wanna', or even 'cock' (the bird) and 'chicken', have common underlying forms with different surface forms depending on differential application of ordered processes ("constraints", if one prefers). Understanding the true psychological basis of sound patterns will require (is this a surprise?) new methods and new data beyond what was obtained up to the 19th century. Examples of the new methodology will be given.

MAY 12 - EURIE SHIN, UNIVERSITY OF CALIFORNIA, BERKELEY: CROSS-DIALECT IDENTIFICATION OF CONSONANTS IN SEOUL AND NORTH KYUNGSANG KOREAN

The goal of this study is to examine the important perceptual cues for stops and fricatives in Seoul Korean and North Kyungsang Korean. North Kyungsang Korean, spoken in the southeastern part of Korea, has lexical tones whereas Seoul Korean does not, although the two dialects are mutually intelligible. Previous studies have found various acoustic cues (e.g. VOT, F0, H1-H2, closure duration, fricative noise duration, and aspiration noise duration) for the three-way contrast of stops and the two-way contrast of fricatives in Seoul Korean (Johnson & Oh 1995, Cho et al. 1999, Choi 2002, and Chang 2007); however, the perceptual characteristics of the same consonants in North Kyungsang Korean have not been investigated extensively. In this study, I examine the perceptual cues of consonants in both Seoul and North Kyungsang Korean by testing (1) VOT, F0, and H1-H2 of the following vowel for stops in word-initial position, (2) VOT, stop closure duration, and H1-H2 of the following vowel for stops in word-medial position, and (3) frication noise duration, aspiration duration, and H1-H2 of the following vowel for fricatives in word-initial position. Results of the perception experiment, from thirty-two native speakers of Seoul Korean and thirty native speakers of North Kyungsang Korean, are discussed with a focus on the important cues for the identification of consonants in each dialect and on cross-dialect differences.