Phorum 2009

SCHEDULE OF TALKS: Fall 2009 

AUGUST 31 - TAFFETA ELLIOTT, THEUNISSEN LABORATORY, HELEN WILLS NEUROSCIENCE INSTITUTE, UC BERKELEY: SPECTROTEMPORAL MODULATIONS ESSENTIAL FOR SPEECH COMPREHENSION AND SPEAKER GENDER IDENTIFICATION

The acoustic signal of speech is rich in temporal and frequency patterns. These power fluctuations in time and frequency are called modulations. Spoken words remain intelligible after drastic degradations in either time or frequency. To fully understand the perception of speech, and to be able to reduce the speech signal to essential components, we need to completely characterize how modulations in amplitude and frequency contribute together to the comprehensibility of speech. Landmark research has distorted speech in time and frequency, but the manipulations have been described only in terms of one domain or the other, without quantifying the remaining and missing portions of the signal. We used a novel sound filtering technique to systematically investigate the spectrotemporal modulations that are crucial for understanding speech. Our conceptually new filtering procedure operates within a framework that completely describes the spectral information left intact by temporal smearing, and the temporal information left after smearing in frequency. Both the modulation-filtering approach and the resulting characterization of speech could be used to reduce the bandwidth of speech while best preserving intelligibility. They could potentially change the fundamental terms by which researchers characterize communication signals.

SEPTEMBER 14 - DASHA BULATOVA, UNIVERSITY OF CALIFORNIA, BERKELEY: THE EFFECT OF FUNDAMENTAL FREQUENCY ON PHONETIC CONVERGENCE

This talk will describe research done within the framework of an undergraduate honors thesis at UC Berkeley, under the advisorship of Professor Keith Johnson. This study examines the importance of F0 in the process of phonetic convergence using an immediate-repetition, or "shadowing," task. Previous research has suggested that F0 facilitates the transmission of social information that individuals can use to assess their social orientation in regard to a talker (Gregory et al., 1991, 1996, 2001). Social theories of accommodation assert that this process mediates a subconscious decision to converge (imitate) or diverge in speech. My data support this hypothesis: participants who shadowed a talker whose F0 had been high-pass filtered imitated less than participants who shadowed the talker's full range of speech. In addition, several compelling interactions between gender and imitation point to the need for a new, integrative model that allows for social processes to intervene in an exemplar-based perception-production link.

SEPTEMBER 21 - NORMA MENDOZA-DENTON, UNIVERSITY OF ARIZONA: THE FRESCA MODEL: MAKING SOCIAL SENSE OF EXEMPLAR-THEORETIC ACCOUNTS IN SOCIOPHONETICS.

The purpose of this discussion is twofold: 1) to draw together some themes that have emerged in recent work in linguistics, especially from interfaces between phonetics and phonology in sociolinguistics, and 2) to argue that we are in fact facing a state of what Kuhn (1962), in his study The Structure of Scientific Revolutions, called "crisis science." What impact have quantitative and experimental phonetic and sociolinguistic approaches had on the production of this crisis? These themes will be synthesized under an exemplar-theoretic framework in which I posit Frequency, Recency, Expectation, Social saliency, Clustering, and Agency as the primary components that can help bring psycholinguistic exemplar theory more in line with the sociolinguistic literature. Throughout, we consider phonological, phonetic, and sociolinguistic implications for communities of speakers, as well as for the changing paradigms in our field.

SEPTEMBER 28 - KEITH JOHNSON, UNIVERSITY OF CALIFORNIA, BERKELEY: A STUDY ON COMPENSATION FOR COARTICULATION

In speech perception there is a controversial perceptual phenomenon called compensation for coarticulation. In this phenomenon listeners label speech as if they are subtracting coarticulation caused by a neighboring segment. For example, when an /r/ precedes a segment that is ambiguous between /d/ and /g/ listeners report more /d/'s than they do when an /l/ precedes. This perceptually subtracts the retracting effect of the preceding /r/. Some types of sound change may be due to perceptual parsing such as this.

However, compensation for coarticulation is controversial in the psycholinguistics literature because some see it as evidence for a "gesture recovery" model of speech perception (Mann, 1980), while others (Lotto & Kluender, 1998) have suggested that the effects are due to a low level frequency contrast effect in auditory perception. This latter "auditory contrast" theory gives a quite different view of how speech perception influences sound change.

I will present experimental results that (1) replicate a famous case of compensation for coarticulation (Mann, 1980), (2) extend the effect to much more natural sounding stimuli (where the effect is much weaker even though the frequency contrasts are comparable), and (3) show that compensation for coarticulation is influenced by the perceived context even when the speech signal is unchanged (extending Ohala & Feder's 1994 study). The overall implication of this work is that compensation for coarticulation must involve at least some gestural or linguistic component, and therefore that the "auditory contrast" theory is insufficient.

Lotto, A.J. & Kluender, K.R. (1998) General contrast effects in speech perception: Effect of preceding liquid on stop consonant identification. Perception and Psychophysics 60, 602-619.

Mann, V.A. (1980) Influence of preceding liquid on stop-consonant perception. Perception and Psychophysics 28, 407-412.

Ohala, J.J. & Feder, D. (1994) Listeners' identification of speech sounds is influenced by adjacent "restored" phonemes. Phonetica 51, 111-118.

OCTOBER 5 - WILL CHANG, JOHN SYLAK, AND MELINDA WOODLEY, UC BERKELEY: QUICHUA PHONOLOGY SO FAR: A PRESENTATION OF ONGOING WORK BY STUDENTS IN LING 240A (FIELD METHODS)

In this talk, we will present our preliminary findings on the phonology of Imbabura Quichua, a dialect of Quechua spoken in Ecuador. This work is ongoing and still evolving as we elicit new data; consequently, we will be presenting our best hypotheses to date, along with some discussion of the theoretical and practical difficulties involved in collecting phonetic and phonological data from scratch.

OCTOBER 12 - TE-HSIN LIU, UNIVERSITY OF CALIFORNIA, BERKELEY: THE PHONOLOGY OF INCOMPLETE TONE MERGER IN DALIAN

The theory of tone merger in Northern Chinese dialects was first proposed by Wang (1980) and further developed by Lien (1986), with the migration of IIb (Yangshang) into III (Qu) being a common characteristic. The present work aims to provide an update on the current state of tone merger in Northern Chinese, with a special focus on Dalian, a less well-known Mandarin dialect spoken in Liaoning province in Northeast China.

According to Song (1963), four lexical tones are observed in citation form, i.e. 312, 34, 213 and 53 (henceforth Old Dalian). Our data, obtained from a young female speaker of Dalian (henceforth Modern Dalian), suggest an inventory of three lexical tones, i.e. 51, 35 and 213. The lexical tone 312 in Old Dalian, derived from Ia (Yinping), is merging with the falling contour tone derived from III (Qu). This tendency is consistent with some dialects spoken in the neighboring Shandong province, where a reduced tonal inventory of three tones has become more and more frequent over the last decade.

However, the tone merger in Modern Dalian is incomplete on two grounds. On the one hand, a slight phonetic difference is observed between these two falling tones: both have similar F0 values, but the falling contour derived from Ia (Yinping) has a longer duration than the falling contour derived from III (Qu). Nevertheless, the speaker judges the contours to be the same. On the other hand, the underlying contrasts of these two contours surface in tone sandhi contexts, such that the lexical tone 312 of Old Dalian emerges in combination forms in Modern Dalian. A phonological analysis will be proposed to account for the apparently complex tone sandhi rules in Modern Dalian.

OCTOBER 19: RONALD L. SPROUSE (UNIVERSITY OF CALIFORNIA, BERKELEY), MARIA-JOSEP SOLÉ (UNIVERSITAT AUTÒNOMA DE BARCELONA), JOHN J. OHALA (UNIVERSITY OF CALIFORNIA, BERKELEY): TRACKING LARYNGEAL GESTURES AND THEIR IMPACT ON VOICING

It is known that vertical displacement of the larynx changes the size of the oropharyngeal cavity and that such changes affect the oral pressure build-up during stop sounds (Rothenberg 1968, Ohala 1983). A number of studies have observed the relationship between larynx height and stop voicing, with the larynx tending to be lower for voiced than for voiceless stops in Swedish, American English, French, and Thai (e.g., Ewan and Krones 1974). The present study examines the relation between larynx movement during closure and oral pressure build-up in pulmonic and non-pulmonic stops, and its consequences for voicing maintenance. A number of phonological patterns involving prolonged voicing during stops and historical implosivization of voiced stops may be related to adjustments in the timing and rate of larynx movement gestures.

High-speed video recording with subsequent automatic image processing was used to track the vertical and horizontal movement of the larynx in two male American English speakers (trained phoneticians) during the production of utterance-initial and intervocalic fully voiced stops, and intervocalic voiceless and nasal stops, long stops, and implosives in the context of high and low vowels. Oral pressure and acoustic data were collected simultaneously. Measures of larynx displacement (along the diagonal plane) were related to oral pressure values, and amplitude of voicing in the different consonant types and contexts.

OCTOBER 26 - GRANT MCGUIRE, UNIVERSITY OF CALIFORNIA, SANTA CRUZ: SELECTIVE ATTENTION AND GENERALIZATION OF DIMENSIONS IN SPEECH-LIKE STIMULI

Speech categories are multi-dimensional concepts in that many cues are available to a listener for the categorization of a given contrast. These dimensions must be properly attended to and weighted by listeners (Raphael 2005). This paper presents results from a series of perceptual learning experiments exploring the role of selective attention in dimensional learning and generalization in non-speech stimuli. Specifically, listeners were trained to categorize different regions of a stimulus space using one or two dimensions. They were then asked to categorize novel stimuli in an adjacent region to assess transfer.

Results demonstrate that trained dimensions were preferred for categorizing novel sounds and that any bias towards a dimension significantly increased reliance on that dimension, regardless of its relevance to the task. Interestingly, while most listeners who were trained to categorize using both dimensions in a specific relationship did indeed tend to use both dimensions when categorizing novel stimuli, the relationship between the dimensions did not transfer. In terms of speech categories, these results suggest that listeners will attempt to use learned cues in new contexts, but that the specific relations between cues may be less adaptable.

NOVEMBER 2 - MICHAEL GROSVALD, UNIVERSITY OF CALIFORNIA, DAVIS: A PRODUCTION AND PERCEPTION STUDY OF COARTICULATION IN AMERICAN SIGN LANGUAGE

This project examines the production and perception of long-distance coarticulation in ASL. Five signers were filmed while signing ASL sentences and, additionally, were outfitted with motion-capture sensors via which the three-dimensional coordinates of key points on the signer's body (e.g. the palm of each hand) could be recorded during the signing of each sentence. Evidence of coarticulatory effects of one sign on another was found across up to three intervening signs, though the effects were generally weaker than those found in analogous spoken-language studies (Magen, 1997; Grosvald, 2009). This difference appears to be due in part to greater variability among these signers in their articulatory behavior, relative to that of users of spoken language.

A perception experiment using stimuli derived from filmed excerpts of the production experiment showed that both deaf signers and hearing non-signers were sensitive to these coarticulatory effects, though again to a lesser degree than has been found for listeners in spoken-language studies. One possible explanation for this difference is that the visual modality offers direct perceptual access to the relevant articulators, so perceivers of spoken languages might rely more than sign language users on extra cues such as coarticulatory information.

NOVEMBER 9 - YAO YAO, UNIVERSITY OF CALIFORNIA, BERKELEY: PHONOLOGICAL NEIGHBORHOOD DENSITY AND SPEECH PRODUCTION: WHAT WE KNOW AND WHAT WE DON'T

Phonological neighborhood density (PND) refers to the number of words that are phonologically similar to a given word (e.g. cat and kit are phonological neighbors of each other). Previous research has shown that unlike other frequency measures, PND has different effects on comprehension and production: high PND has an inhibitory effect on comprehension but a facilitatory effect on production (see Dell & Gordon 2003 for a review). My work focuses on the effect of PND on spontaneous speech production. In this talk, I will first briefly review my previous findings on PND and word duration. Then I will present current results on PND and vowel production, followed by a discussion of possible interpretations that integrate results from both the duration study and the vowel study.
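To make the neighbor counts concrete, here is a minimal sketch of how PND is typically computed, taking "neighbor" in its standard sense of a one-phone substitution, deletion, or addition; the toy lexicon and rough phone transcriptions below are hypothetical stand-ins, not the speaker's actual materials:

```python
def is_neighbor(a, b):
    """True if phone sequences a and b differ by exactly one
    substitution, deletion, or addition."""
    if a == b:
        return False
    la, lb = len(a), len(b)
    if abs(la - lb) > 1:
        return False
    if la == lb:
        # Same length: neighbor iff exactly one substitution.
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: neighbor iff deleting one phone
    # from the longer sequence yields the shorter one.
    short, long_ = (a, b) if la < lb else (b, a)
    return any(long_[:i] + long_[i + 1:] == short
               for i in range(len(long_)))

def neighborhood_density(word, lexicon):
    """Count the lexicon entries that are phonological neighbors of word."""
    return sum(is_neighbor(word, other) for other in lexicon)

# Toy lexicon in a rough phonemic transcription (illustrative only):
# cat, kit, bat, cap, at, dog.
lexicon = [("k", "ae", "t"), ("k", "ih", "t"), ("b", "ae", "t"),
           ("k", "ae", "p"), ("ae", "t"), ("d", "ao", "g")]

print(neighborhood_density(("k", "ae", "t"), lexicon))  # prints 4 (kit, bat, cap, at)
```

Real studies compute this over a pronouncing dictionary rather than a toy list, but the counting logic is the same.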

NOVEMBER 16 - NATHANIEL W. DUMAS, UNIVERSITY OF CALIFORNIA, BERKELEY: LANDAR'S HYPOTHESIS AND VARIATIONAL DUPLICATION AS SOCIOLINGUISTIC PRACTICE

This presentation describes and applies a sociolinguistic variationist model to the phenomenon vernacularly known as "stuttering." I expand on Herbert Landar's (1961) previously unexplored hypothesis of stuttering forms as special morphemes by modeling these forms using Morphological Doubling Theory (Inkelas and Zoll 2005). The descriptive model proposed here diverges from the conventional, prescriptive ideologies endorsed and constituted by research in speech‐language pathology that relegates this kind of duplication to evidence of cognitive disorder, rather than creativity and variation (Le Page and Tabouret‐Keller 1985). In the descriptive model, I define variational duplication as a morphological process in conversation, or the language of turn and sequence (Ford, Fox, and Thompson 2002), that appears in both plain and expressive morphology (Zwicky and Pullum 1987). Moreover, similar to aggressive reduplication (Zuraw 2002), variational duplication produces constructions that are morpho‐semantically driven duplication outputs; however, this process does not participate in inflection or derivation, but sociolinguistic variation (i.e., two forms, one semantic meaning).

Using the variationist perspective, I re-analyze previous linguistic data that speech pathologists have uncovered with regard to these forms. However, I diverge from the interpretations of these previous scholars. Whereas they see the findings as evidence for disability and deformation of the same underlying input, I interpret the findings as morphologically-conditioned phonology and variation by way of a theory of emergent grammar (Hopper 1987, 1988). In sum, I offer the variationist perspective as an alternative and integrated model that relies on inductively grounded observations, as opposed to pre-existing principles of what is and is not language. Moreover, this model takes seriously the interwoven nature of grammar and interaction (Ochs, Schegloff, and Thompson 1996) and concludes that a study of variational duplication must also contend with properties of interaction. I conclude the talk with the future goals that such a model must meet as part of a formal theory of phonological variation (Anttila 2002), and offer some methodological suggestions for future studies taking the variationist perspective.

NOVEMBER 23 - DAN SILVERMAN, SAN JOSE STATE UNIVERSITY: BOUNDARY SIGNALS: REASON OVER RHYME IN PHONOLOGY

Neutralization limits the amount of phonetic distinctness among morphemes, though, in doing so, it increases the number of cues to morpheme boundaries. In this talk I summarize some early insights into these "boundary signals" (Trubetzkoy's and Firth's), as well as more recent work on so-called "transitional probabilities".

NOVEMBER 30 - STEPHANIE SHIH (WITH JASON GRAFMILLER, RICHARD FUTRELL, AND JOAN BRESNAN), STANFORD UNIVERSITY: RHYTHM'S ROLE IN GENITIVE CONSTRUCTION CHOICE IN SPOKEN ENGLISH

Effects of rhythmic structure on English word order have been found in psycholinguistic studies of production, studies of historical change and variation, and theoretical studies of phonology-syntax interactions. Yet, as observed by Rosenbach (2005), much work on English word order has emphasized syntactic complexity, which can be efficiently measured by simply counting graphemic words, and which has been the subject of influential theories of language processing. Several recent multivariable studies of alternative word order choices with English genitive constructions have only used the word count measure on spoken language corpora, where one might expect direct effects of rhythmic structure on vocal communication. And, despite the growing recent interest in prosodic effects on syntactic constituent order, the relationship of phonological and non-phonological factors has yet to be fully explored. This study therefore seeks to fill a methodological gap by comparing rhythmic factors with known semantic, syntactic, and informational predictors in the genitive alternation in spoken English. We consider two important questions: (1) How good are rhythmic properties (metrical weight, clash, lapse) as predictors of construction choice in spoken language? and (2) How important are rhythmic factors relative to semantic, syntactic, and informational predictors?

Using an annotated database of spoken genitive constructions from the Treebank Switchboard corpus, we examined the effect of weak-strong stress alternation across constituent boundaries (i.e., possessor/possessum) on the genitive alternation using logistic regression models. Findings demonstrate that rhythm plays a significant role in determining genitive construction choice. The overall effect of rhythm on the model, however, still remains smaller than that of other known syntactic and semantic predictors. Hence, the importance of both non-phonological and phonological—especially rhythm-based—factors together cannot be discounted in any model of spoken word order alternations and in theories of variation, change, and language processing built around the relative ordering of syntactic constituents.
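For readers unfamiliar with the modeling step, the following is a minimal stand-in sketch of fitting a logistic regression to a binary construction choice; the predictors (a rhythmic clash indicator and possessor animacy) and the simulated data are fabricated for illustration and are not the study's actual corpus annotations or coefficient estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Hypothetical per-token predictors (stand-ins for corpus annotations):
# clash = 1 if choosing the s-genitive would create a stress clash,
# animate = 1 if the possessor is animate.
clash = rng.integers(0, 2, n)
animate = rng.integers(0, 2, n)

# Simulate choices: animacy favors the s-genitive, clash disfavors it.
true_logit = -0.5 + 2.0 * animate - 1.5 * clash
s_genitive = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

# Fit logistic regression by plain gradient ascent on the log-likelihood.
X = np.column_stack([np.ones(n), animate, clash])
w = np.zeros(3)
for _ in range(3000):
    pred = 1 / (1 + np.exp(-X @ w))
    w += 0.1 * X.T @ (s_genitive - pred) / n

# w = [intercept, animacy coefficient (+), clash coefficient (-)]
print(w)
```

In practice one would use a standard statistics package and include the full set of semantic, syntactic, and informational predictors; the point here is only the shape of the model: predictor coefficients whose sign and magnitude indicate whether each factor favors or disfavors one construction.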


SCHEDULE OF TALKS: Spring 2009 

JANUARY 27 - EDDIE CHANG, UNIVERSITY OF CALIFORNIA, SAN FRANCISCO: COORDINATED NEOCORTICAL DYNAMICS UNDERLYING PHONOLOGICAL TARGET DETECTION

Selectively processing task-relevant stimuli while ignoring irrelevant stimuli is critical for producing goal-directed behavior. To understand the mechanisms involved in selective attention, we used electrocorticography in epilepsy patients to track the spatiotemporal pattern of event-related high-gamma cortical activation during a phonological target detection task. Simultaneous monitoring of multiple areas in the lateral hemisphere revealed a highly ordered, but overlapping, temporal progression of phasic activity across the lateral cortical surface in the following sequence: 1) speech-sound-specific sensory processing in the posterior superior temporal gyrus (STG) and superior ventral premotor cortex, 2) task-dependent processing in ventrolateral prefrontal cortex (PFC), 3) action planning in the inferior parietal lobule and ventral premotor cortex, and 4) motor response execution and proprioception in the hand sensorimotor cortex. STG activation was modestly greater for target stimuli during active behavior than passive listening, representing sensory selectivity under general attentional modulation. In contrast, PFC was mostly target-selective and highly enhanced by specific task demands, supporting a role in guiding behaviorally relevant processing. These results demonstrate the utility of high-gamma cortical activity as a powerful tool for evaluating the sensory, cognitive, and motor processes underlying everyday goal-directed human behavior.

FEBRUARY 2 - SEUNG-EUN CHANG, UNIVERSITY OF CALIFORNIA, BERKELEY: THE EFFECT OF WEAK TONE ON THE F0 PEAK ALIGNMENT

The f0 peak sometimes occurs after the syllable with which it is associated, and the peak alignment varies depending on several factors, such as lexical tone target, neighboring tones, and focus. This study investigates the effect of weak tones on the alignment of f0 peaks with three tone types (i.e., H, M, and R) of South Kyungsang Korean, spoken in the southeastern part of Korea. When the three tone types are followed by unstressed suffixes of one or two syllables, R was found to have the maximum amount of peak delay and M the minimum: the peak came in the second syllable following the R-toned syllable, in the syllable immediately following the H-toned syllable, and with no delay at all for M. Thus, it is argued that the tone alternation patterns in suffixed words are not random; rather, they systematically reflect the phonetic implementation of each tonal target. For example, the peak is in the final portion of a syllable in R, and it takes more time for the peak to be fully realized. This effect is clearly implemented when the following tone is weak, as in suffixed words, while it is hardly realized in word-final position, as in unsuffixed words.

FEBRUARY 9 - STEPHEN WILSON, UNIVERSITY OF CALIFORNIA, SAN FRANCISCO: CORTICAL SPEECH MOTOR AREAS AND THEIR ROLES IN SPEECH PERCEPTION AND PRODUCTION

A network of speech motor areas in frontal cortex can be identified based on lesion-deficit correlations, cortical stimulation studies, and functional neuroimaging. In this talk I will describe this network and present some recent studies on the roles of these regions in speech perception and production.

In the first part of the talk, I will discuss the controversial idea that speech motor areas play a role in speech perception. I will present two functional MRI studies and one repetitive transcranial magnetic stimulation study which support this hypothesis. Based on a review of the literature, it appears that the role for motor regions may be limited to perception in "degraded" acoustic conditions, which is facilitated by an additional top-down source of information (i.e. motor representations).

In the second part of the talk, I will present our current work in progress on speech production. We are attempting to identify brain areas that are differentially recruited as articulatory complexity increases. Our goal is to study the functional status of these regions in patients with apraxia of speech due to neurodegenerative conditions or stroke.

FEBRUARY 23 - LARRY HYMAN, UNIVERSITY OF CALIFORNIA, BERKELEY: WHAT DOES 'TRADITIONAL PHONOLOGY' HAVE TO CONTRIBUTE TO A NEW PHON LAB?

This will be a trial run of a presentation I will make in Madrid on March 3 as part of Fonhispania 2009, where six of us have been invited to a workshop entitled "New Approaches to the Phonetics-Phonology Interface" to inaugurate a new phonetics and phonology laboratory:

http://www.cchs.csic.es/estudiosfonicos/fonhispania.html

As can be seen from the list of invitees, a range of views will be represented. My own talk will focus on the symbiosis of phonetics and phonology, but start from the following observations concerning the current state of phonology, which I describe as:

* diverse, disjointed, unclear boundaries, disparate goals
* a lot of introspection, stock-taking, critiques, healthy diversity of views and agendas
* largely oriented towards the surface due to optimality theory and technology
* cutting-edge research tends to be experimental, instrumental, quantitative, computational
* increasing rejection of the basic concepts and methodologies of the structuralist-generative heritage, ultimately denying that phonology is anything like we used to think

Given the above trends, one could justifiably ask: What does a traditional phonologist such as myself have to offer a new phonetics-phonology laboratory?

After establishing what is meant by "traditional phonology" I consider two questions concerning the phonetic vs. phonological properties of pre-vocalic nasal + voiced consonants (NDV):

* the question of why NDV devoices to NTV in the Sotho-Tswana subgroup of Bantu, where an in-progress phonetic investigation by Maria-Josep Solé et al. on Shekgalagari confirms this allegedly "anti-phonetic" process.

* the question of why NDV has variable behavior with respect to pitch-lowering effects in various African languages, sometimes patterning with the voiced obstruents /b, d, g/ as F0-depressors, sometimes not.

The discussion of these examples will establish that both traditional phonological and phonetic analyses are necessary to resolve such questions. While traditional phonology has centered around the development and application of theories and methodologies to help gain insight into the nature of phonological systems as they function in a grammar, many who deny structural phonology do so either because of differing agendas or because they wish to look at speech sounds at a different "level". One question I raise is whether phonologists have become too literal in approaching linguistics as a branch of cognitive science. I intend for my talk to be provocative, but also to present the above two real examples concerning NDV, where rigorous (structural) phonological and (instrumental) phonetic analyses must cooperate if we wish to understand what is a possible phonological system--and why.

MARCH 2 - REIKO KATAOKA, UNIVERSITY OF CALIFORNIA, BERKELEY: A STUDY ON PERCEPTUAL COMPENSATION FOR /U/-FRONTING IN AMERICAN ENGLISH

Listeners' identification of speech sounds is influenced by both the perceived and the expected characteristics of surrounding sounds. For example, a vowel ambiguous between /i/ and /e/ is heard more often as /e/ when the precursor sentence has low F1 but as /i/ when the precursor has high F1 (Ladefoged & Broadbent 1957), and a greater degree of vowel undershoot is perceptually accepted in fast speech than in slow speech (Lindblom & Studdert-Kennedy 1967). Later, Ohala and Feder (1994) showed that American listeners judge a vowel stimulus ambiguous between /i/ and /u/ more frequently as /u/ in an alveolar context than in a bilabial context, and do so both when the context is provided as an acoustic signal and when it is cognitively "restored" after the acoustic signals for the contexts have been replaced with white noise. In this talk I will report the results of a perception experiment with native speakers of American English, aiming to extend Ohala & Feder's study with additional measures to reveal the locus of perceptual compensation in the human speech processing system.

In the experiment, listeners identified words containing a vowel test sound in a /bip/-/bup/ continuum and a /dit/-/dut/ continuum. The stimuli were presented in the following conditions: 1) with or without a precursor phrase before each word, and 2) at a fast, medium, or slow speech rate. The continuum's phoneme boundary shifted in a manner consistent with perceptual compensation for the alveolar context that causes fronting of /u/, but not with the same magnitude across conditions. The degree of boundary shift was non-significant when the stimuli were presented in isolation, but became significant when the same stimuli were presented with a precursor phrase. Further, a greater degree of boundary shift was observed when the speech rate of the whole sentence (precursor and target stimuli) was increased. In addition, for an ambiguous stimulus, faster reaction times (RTs) were observed in the alveolar context than in the bilabial context in the majority of conditions.

That perceptual compensation occurs but varies in degree across conditions suggests that listeners might use both cognitively based categorical compensation and mechanically based gradient compensation. Further, the fairly consistent consonant effect on RT seems to indicate that perceptual compensation might facilitate phoneme decision making as well as shift the category boundary. The results of the current study shed some light on how human listeners benefit from both bottom-up processing, using the surrounding acoustic signals, and top-down processing, using the acoustic images of speech sounds stored in long-term memory. The theoretical implications regarding the relationship between speech production and speech perception will be discussed.

MARCH 9 - MOLLY BABEL, UNIVERSITY OF CALIFORNIA, BERKELEY: SPONTANEOUS PHONETIC IMITATION: THE GOOD, THE BAD, AND THE UGLY.

Spontaneous phonetic imitation – the phenomenon where interacting talkers come to be more similar-sounding – may be an important mechanism in dialect convergence and historical sound change. Recent research has been concerned with whether spontaneous imitation is an automatic (and hence unavoidable) process, or whether it is mediated by social factors (e.g., Giles & Coupland, 1991; Goldinger, 1998; Pickering & Garrod, 2004; Pardo, 2006). In this talk I briefly present the "clean" results from a project that investigates phonetic imitation of vowels. The results show that talkers converge on the first and second formants of the model talker in the task, but that not all vowels are imitated to a significant degree. In this study of American English, only the low vowels /a/ and /æ/ exhibit strong convergence effects.

What will make this talk different from previous versions is that I will also reveal the ugly data from the crossed audio-visual conditions where the Black talker's image is presented with the White talker's voice and vice versa. The pattern of imitation from these conditions does not reflect participants' behaviors in the natural conditions. In the White voice/Black image condition participants exhibit imitation earlier than in the other conditions, and the degree of imitation lessens across shadowing blocks (as opposed to increasing across blocks in the natural conditions). In the Black voice/White image condition participants diverge in the first shadowing block, only to slowly imitate after subsequent exposure. It is my hope that Phorum audience members will be familiar with audio-visual work that provided *funny* results, so that these data can be properly interpreted and understood.

MARCH 16 - SAM TILSEN, UNIVERSITY OF CALIFORNIA, BERKELEY: PRELIMINARY RESULTS OF A STOP-SIGNAL EXPERIMENT

A commonly used approach to investigating how speech is planned is to analyze how controlled variables affect the reaction time to initiate an utterance. A much less commonly used approach is to analyze how quickly speech can be halted. I present preliminary results of an experiment that investigates how quickly speakers can stop speaking in mid-utterance. Two main questions that I address are (1) does the stress of an upcoming syllable influence the time it takes to stop speaking and (2) does the rhythmicity of a sentence influence stop-RT?

MARCH 30 - BOB PORT, INDIANA UNIVERSITY: OBSERVATIONS ON SPEECH TIMING IN ENGLISH AND JAPANESE

A review of some rhythmical properties of speech that I have examined over the years. I will play a number of short clips of rhythmically spoken speech that I have collected, show how my own thinking has evolved, and make some suggestions for younger researchers.

APRIL 6 - YAO YAO, UNIVERSITY OF CALIFORNIA, BERKELEY: THE EFFECT OF PHONOLOGICAL NEIGHBORHOOD ON WORD DURATION IN SPONTANEOUS SPEECH

It has been shown extensively in the literature on word perception that the existence of similar-sounding words (i.e. phonological neighbors) in the lexicon inhibits auditory recognition of the target word (Luce & Pisoni 1998). However, the effect of phonological neighborhood density on word production has received less attention. Previous studies have proposed two accounts. One is the facilitation account, i.e. similar-sounding neighbors contribute to the activation of the target word and thus facilitate production (Vitevitch 1997, 2002). The other is the hyperarticulation account, i.e. speakers hyperarticulate words from dense neighborhoods for the sake of listeners (Wright 1997, Munson & Solomon 2004, Scarborough 2002).

The current work explores the effect of neighborhood density in spontaneous speech production, using the Buckeye corpus. The variable under investigation is word duration in CVC monomorphemic content words. The two existing accounts make opposite predictions for duration (facilitation -> shortening; hyperarticulation -> lengthening). A mixed-effects model is built to test the effect of neighborhood size and average neighbor frequency on word duration, while controlling for other linguistic and nonlinguistic factors. Current results show that neighborhood size has a robust facilitative effect on word duration – words with more neighbors are produced faster than those with fewer neighbors. Furthermore, comparison of neighborhood measures calculated from different dictionaries reveals that when more realistic word frequency measures are used, the effect of average neighbor frequency also reaches significance, in the same direction. Together these results provide evidence for the facilitation account: the more neighbors a word has, and the more frequent those neighbors are, the faster the word is produced.
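Phonological neighbors are conventionally defined as words differing from the target by a single segment substitution, addition, or deletion (Luce & Pisoni 1998). As a minimal sketch of how neighborhood size is counted – not the author's code; the toy lexicon and the phoneme-string representation are illustrative assumptions, whereas the actual study draws on pronouncing dictionaries for the Buckeye corpus:

```python
def is_neighbor(a, b):
    """True if b differs from a by exactly one segment
    substitution, addition, or deletion."""
    if a == b:
        return False
    la, lb = len(a), len(b)
    if abs(la - lb) > 1:
        return False
    if la == lb:
        # same length: must be exactly one substitution
        return sum(x != y for x, y in zip(a, b)) == 1
    # length differs by one: deleting one segment from the
    # longer string must yield the shorter string
    short, long = (a, b) if la < lb else (b, a)
    return any(long[:i] + long[i + 1:] == short for i in range(len(long)))

def neighborhood_size(word, lexicon):
    """Number of phonological neighbors of `word` in `lexicon`."""
    return sum(is_neighbor(word, w) for w in lexicon)

# Hypothetical segment-string lexicon for illustration only.
lexicon = ["kat", "bat", "kap", "at", "kart", "dog"]
print(neighborhood_size("kat", lexicon))  # -> 4 (bat, kap, at, kart)
```

The density measure is then simply this count per word; the average neighbor frequency measure discussed above would additionally weight each neighbor by its corpus frequency.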

APRIL 13 - TE-HSIN LIU, UNIVERSITY OF CALIFORNIA, BERKELEY (VISITING SCHOLAR): ON THE STATUS OF TONE IN COMPENSATORY LENGTHENING

Yue dialects such as Cantonese, Bobai and Xinyi have a process whereby a rising tone replaces the lexical tone of the head noun to derive diminutive forms, referred to as Pinjam (changed tones) in the literature. Chao (1947) and Benedict (1942) noticed that, in Cantonese, this derived rising tone has a slightly longer duration than the lexical rising tone. The same phenomenon is observed in Bobai and Xinyi, where the derived rising tones are longer than the lexical rising tone (Wang 1932, Ye & Tang 1982). We conducted a phonetic study to validate the post-1940 data, and found that the average duration of the lexical rising tone is 0.256 seconds, whereas that of the derived rising tone is 0.518 seconds. Even the longest lexical rising tone is, on average, shorter than the shortest derived rising tone.

Establishing a correspondence between the Mandarin diminutive suffix [-ɻ] and the Cantonese high rising Pinjam, Chao (1959) used the mora to describe this additional length, suggesting that the Cantonese mora is a suffix taking the form of a high tone rather than of sound segments. This conjecture, which can explain the additional length associated with the Pinjam, is contrary to current theories according to which tones, being suprasegmental objects, have no temporal basis of their own. How can this paradox be resolved?

Following O'Melia (1939) and Whitaker (1956), according to whom the additional length compensates for the elided diminutive suffix [ɲ] in Bobai, a more conservative dialect than Cantonese, we claim that it is tones, rather than vowels, that lengthen in order to fill the vacuum left by the elision of the neighboring syllable. A conjecture based on segmental compensatory lengthening encounters one problem: if the additional tonal duration were explained by the compensatory lengthening of vowels, no change in length would be expected in closed syllables. However, the additional length is observed in both open and closed syllables in Pinjam. If we posit that the codas in Yue are moraic, we encounter another problem: it is difficult to understand why, in the case of the three entering tones, tonal duration is shortened by final stops in closed syllables in Cantonese, whereas the same final stops are capable of bearing a long rising tone in Pinjam. Only one possibility remains: it is the tone that lengthens under syllable elision, not the vowel. In other words, the vowel lengthens under the pressure of the tone, not the tone under the pressure of the vowel.

APRIL 20 - DAVID TEEPLE, UNIVERSITY OF CALIFORNIA, SANTA CRUZ: BICONDITIONAL PROMINENCE CORRELATION

Positional augmentation constraints are potentially neutralizing, since they call for a prominent property to be realized in a strong position (e.g., "If stressed, then heavy"), and since that property therefore cannot contrast with its absence (e.g., no contrast for vowel length in stressed syllables) (Smith 2005). Because augmentation constraints make no counterbalancing demands of weak positions, the same contrast which they neutralize in strong positions can be maintained in weak positions.

This sort of pattern – which I will refer to as Strong-Position Neutralization, or SPN – is either altogether absent or exceedingly rare, and one of my central goals is to account for this.

My proposal involves two formal components, one related to markedness and one to faithfulness, each based on a very general idea:

1. No constraint demands augmentation only.

2. Contrasts are generally more likely to survive in prominent positions than in weak ones, because contrast cues are bolstered by prominence correlates.

The first idea gives rise to the formal proposal that all markedness constraints which correlate prominence with phonetic correlates are biconditionals; e.g., "If and only if a syllable is prominent, it is relatively long." Thus a demand for augmentation is also a demand for a corresponding reduction. This is built on Liberman and Prince's (1977) notion that stress is only defined relationally in a local domain: the strong is only defined relative to its weak neighbors.

The second idea gives rise to the formal proposal that all cue-based faithfulness constraints for a given feature outrank all general faithfulness constraints for the same feature. This means that, e.g., DEP-µ/INTENSE, "In a relatively intense syllable, don't add a mora," outranks MAX-µ, "Don't remove a mora," despite the fact that they monitor different types of correspondence. This gives preference to moraic contrasts in strong positions (i.e., positions of relatively high intensity).

With both formal components in place, SPN does not emerge as a prediction, while other attested patterns of positional neutralization do. The components of the theory also make other interesting predictions, which I believe are borne out.

References

Liberman, Mark, and Alan Prince. 1977. On stress and linguistic rhythm. Linguistic Inquiry 8:249-336.
Smith, Jennifer. 2005. Phonological augmentation in prominent positions. Outstanding Dissertations in Linguistics. New York and London: Routledge.

APRIL 27 - LAUREN HALL-LEW, STANFORD UNIVERSITY: ASIAN ETHNICITY & SAN FRANCISCO ENGLISH.

The progression of regional vowel shifts has been of central concern in sociophonetic research (Labov, Yaeger & Steiner 1972; Labov, Ash & Boberg 2006). Researchers' aim has been to determine which linguistic and social constraints propel or inhibit particular sections of a given change-in-progress. For example, Labov (2001) has argued that speaker ethnicity interacts with the adoption of sound change: White speakers further the advancement of vocalic chain shifts while non-White speakers do not.

In the first half of this talk I will consider these issues with respect to the California Vowel Shift (Eckert 2008). Based on sociolinguistic interviews and semi-ethnographic fieldwork in a neighborhood of San Francisco, I analyze vowel production patterns across speakers of varying ethnicities. The data collected thus far indicate that Asian Americans are not only indistinguishable from their White counterparts for some vowels (the fronting of /u/ and /o/), but are, in fact, leading in broader regional sound changes (the merger of /a/ and /ɔ/). However, these results are not surprising for San Francisco, where the social history and current demographics suggest that equating regional sound change with White speech patterns is inappropriate. Rather, since regional variation is inextricably tied to social variation, the particular social constraints on sound change must be determined with respect to a given community.

Building on this initial analysis, I will then discuss one of the most important linguistic constraints inhibiting all of these back vowel shifts: the presence of a following /l/. These data present an interesting situation for the analysis of coda-/l/ syllables, because /l/ is undergoing variable vocalization to a semi-vowel, and vocalization appears to be favored by preceding back vowels. Vocalization is also favored by Asian Americans, suggesting a heritage language substrate effect. While the potential impact of vocalization on the progression of the California Vowel Shift remains to be seen, particularly given the notorious methodological challenges in measuring degree of vocalization accurately, its interaction with ethnicity suggests some probable outcomes.

MAY 4 - CHARLES CHANG, UC BERKELEY: SHORT-TERM PHONETIC DRIFT IN AN L2 IMMERSION ENVIRONMENT

The earliest research on "interlanguage" phonologies was based on two related assumptions: the existence of a critical period for language acquisition, and unidirectionality of cross-language influence (going from the first language, L1, to the second language, L2). Recent research has challenged both of these assumptions. Some (e.g. Flege 1987b) have pointed out numerous problems with the enterprise of proving that a critical period exists. Furthermore, there is mounting evidence that cross-language influence can, in fact, go from L2 to L1. Flege (1987a) and Sancier and Fowler (1997), for example, show that voice onset time in L1 voiceless stops shifts towards the phonetic norm of L2 voiceless stops when L1/L2 speakers are immersed in an L2 environment for an extended period of time. In the framework of Flege's (1995) Speech Learning Model, this change in L1 arises from an "equivalence classification" of similar L1 and L2 sounds that ties them to the same higher-level category, thereby allowing both sounds to be affected by input in L1 or L2.

However, we still know very little about how and when equivalence classification occurs. Nearly all of the work in this area examines the pronunciation of highly proficient bilinguals after they have spent a long time in an L2 environment; moreover, it focuses on L1/L2 pairs that share the same alphabet. This makes it unclear how generalizable these results are to learners at lower levels of proficiency (who normally constitute the majority of adult L2 learners) and to cases of contact between languages that do not overtly equate similar sounds via identical orthography.

Thus, I delve deeper into the nature and time course of L1 phonetic drift by examining the very first weeks of 20 native English speakers' immersion in a Korean language environment. In a weekly elicited production task, these adult acquirers of Korean read the same series of Korean and English words, and acoustic measurements of voice onset time (VOT) and fundamental frequency (f0) onset were taken on their productions of 60 words beginning with stop consonants. These data indicate that learning Korean stops affects the production of English stops in as little as one week. In the case of English voiced stops (which in initial position resemble Korean fortis stops in having a short VOT, but differ in having a lower f0 onset), a repeated-measures ANOVA shows no main effect of time on VOT, but does show a main effect of time on f0 onset. In the case of English voiceless stops (which resemble Korean aspirated stops in having a long VOT and high f0 onset, though not as long or high as the Korean stops), a repeated-measures ANOVA shows a main effect of time on both VOT and f0 onset.

In both of these cases, the pattern of change in the English sounds approximates the characteristics of the Korean sounds to which the English sounds are most phonetically similar. English voiced stops do not change significantly in VOT over time, since they are already similar to Korean fortis stops in this respect; however, their f0 onset rises in approximation to the elevated f0 onset typical of the Korean fortis category. Meanwhile, English voiceless stops become longer in VOT and higher in f0 onset in approximation to the Korean aspirated stops. These results indicate that L1 phonological categories are much more malleable than previously imagined, subject to phonetic drift on a timescale of weeks rather than months or years – even when there is no clear orthographic correspondence between the L1 and L2 sounds. Given that in other tasks most of the learners in this study show command of only two (English) laryngeal categories during this time period, these findings suggest that the equivalence classification that gives rise to this phonetic drift may be rather low-level in nature.