Phorum 2011

Schedule of Talks for Fall 2011

PREVIOUS MEETINGS:

SEPTEMBER 12 -

LARRY HYMAN
UC BERKELEY

Tonal Density and Tonal Typology

In previous work I have documented several ways in which tone is "different" from segmental and metrical phonology. I believe that a reasonable case can be made that tone offers both greater complexity and greater diversity than other aspects of phonology. Concerning such diversity, a relatively small number of languages distinguish up to five contrasting tone heights and multiple contours on each syllable. On the other hand, many two-height tone systems are best analyzed with a single contrast of /H/ vs. ø, where the "privative" /H/ may be significantly restricted in distribution (e.g. only one per word, contrast only on the stressed syllable). Recognizing this extreme variation, Gussenhoven (2001:15296) introduces the notion of "tonal density": "A phonological typology of tone might be based on tonal density: how many locations are specified for tone, that is, have tonal associations? ... in the 'densest' case they specify every mora for tone, and in the sparsest case they just mark the phrasing" (cf. Gussenhoven 2004:35). In this talk, I raise the question of whether, and if so how, tonal density either constrains or enables the aforementioned wide range of complexities and diversity. Are there certain properties of tone systems which are found mostly/only in highly dense systems, in highly sparse systems, or independently of tonal density? My ultimate goal is to raise the question of whether it is possible by this criterion or any other to establish a "canonical typology" of tone in the sense of Corbett (2005), and if not, why not.

SEPTEMBER 19 -

REIKO KATAOKA
UC BERKELEY

Phonetic and Cognitive Bases of Sound Change — Past, Present, and Future

In this talk I will report some of the main results from my dissertation research and explore possible directions for future research. To investigate phonetic and cognitive factors that may serve as preconditions for coarticulation-based sound changes, I examined how speakers of American English produce, perceive, and repeat the high back vowel /u/ in fronting and non-fronting contexts. The production study found that (1) the relative acoustic difference between the fronted /u/ and the non-fronted /u/ was maintained across elicited ranges of vowel duration, and (2) acoustic variability was smaller for the fronted /u/ than for the non-fronted /u/. These results were interpreted as evidence that speakers have a distinct and more narrowly specified articulatory target for the fronted /u/ in the alveolar context than for the non-fronted /u/. The perception study found evidence for (1) compensation for coarticulation (i.e., a phonemic category boundary shift as a function of consonantal environment), (2) systematic individual variation in perceptual category judgments, and (3) similarity between the distributional properties of /u/ observed in the production experiment and the range of perceptual responses. Together, these results suggest that one systematic source of individual variation in speech perception is individual differences in phonological grammar (the perceptual category boundary), and that this grammar emerges in response to the ambient language data to which listeners have been exposed. Finally, the vowel repetition study found evidence for (1) compensation for coarticulation and (2) systematic individual variation in the repetition task, as well as (3) that perceptual category boundary judgments guide vowel repetition behavior, suggesting that perceptual interpretation determines the mental representation of spoken inputs.

Based on these experiments, I contend that pronunciation variation emerges through a system of mutual dependency and multiple causal loops between and among speech perception, speech production, knowledge of pronunciation norms, and ambient language data. These properties of language use govern the output of communicative interactions among members of a speech community, and one such output is members' knowledge of the multiple sub-phonemic pronunciation categories that exist in any speech community. Additionally, I argue that any speech community is in a constant state of readiness to adopt an innovative pronunciation as a new community norm, because members have rich pronunciation repertoires even when there is no observable community-level sound change. The talk will end with a discussion of the remaining issues.

SEPTEMBER 26 -

MEGHAN SUMNER
STANFORD UNIVERSITY

Global Frequency, Canonical Forms, and the Representation Paradox

Variation in speech abounds both within and across speakers. Much of this variation is systematic, governed by linguistic principles of coarticulation and covariation. These principles, though, are not the only factors governing speech variation: a particular word may be produced differently each time it is uttered depending on speaking rate, listening conditions (such as ambient noise), and to whom a person is speaking. These latter types of variation range from highly reduced (hypo-articulated) speech to clearly articulated (hyper-articulated) speech. Given the massive variation in natural speech, how listeners perceive sounds and words is a central issue for linguistic theory.

For the past 20 years or so, work in speech perception has used what we know about how we speak to explain how spoken words are understood despite massive variation in the speech signal. This focus on language use has yielded many factors that influence speech perception, with frequency standing out among them. How often a word or sound is produced has shaped our understanding of how listeners recognize spoken words, with much work showing that units that are experienced frequently are easier to perceive. This perspective is in direct conflict with another view, one that suggests the perceptually privileged form is a more abstract, canonical form.

In this talk, I consider these two perspectives in greater detail and highlight an interesting paradox in the literature that I call the representation paradox: both frequent forms and canonical forms facilitate perception, and both frequent forms and canonical forms hinder perception. These data have led to arguments that each is 'the' stored form. In my view, this paradox arises for at least three reasons. First, in the work that gives rise to the paradox, phonological variation is treated as categorical. Second, experience is often equated with how often a sound is experienced, rather than how often a sound is experienced in a given linguistic or social context. Finally, the makeup of the stimuli biases listeners toward either a frequent form or a canonical form.

I present data from three experiments designed to address this representation paradox to show that (1) when situated in an appropriate social or linguistic context, variation is unproblematic, (2) the bias toward a frequent linguistic unit (here, a sound) is predictable, and stems from unintentionally controlled word-level phonetics, and (3) the sounds and forms that facilitate speech perception are those that are frequent in a given linguistic or social context.

From these data, I provide a language use perspective that moves away from global frequency toward contingent frequency based on two broad assumptions. First, given any time window of speech, listeners extract information about sounds, sound categories, words, speakers, and social situations. Second, this information is gradient and cumulative, such that as cues to a particular social or linguistic context increase (sound, word, speaker), word recognition is increasingly facilitated.

OCTOBER 3 -

ANNE VILAIN
UC BERKELEY & GIPSA-LAB, SPEECH & COGNITION DEPARTMENT, UNIVERSITÉ STENDHAL, GRENOBLE, FRANCE

Phonetic and Kinematic Correlates of Distance Encoding in Deictic Pointing: Grounding Sound Symbolism in Gestural Strategies

Deictic pointing is described as one of the core processes of linguistic communication, both in language phylogeny and ontogeny (Tomasello 2004, Leavens et al. 2005, Özçaliskan & Goldin-Meadow 2005). It is a ubiquitous example of multimodal language production, as it usually involves both a manual gesture and a deictic word, which makes it an ideal process for studying the interactions between speech, gesture, and language. On the one hand, behavioral and neurophysiological studies have suggested that speech and manual gestures are two modes of the same communication system (McNeill 1992, Kita 2003, Bernardis & Gentilucci 2006, Loevenbruck et al. 2005). On the other hand, typological studies have shown that deictic pointing is also one of the most consistent examples of sound symbolism in the world's languages: phonological variation correlated with meaning (Ultan 1978, Woodworth 1991, Traunmüller 1996, Diessel 1999).

In this talk I will present two experimental studies (Gonseth, Vilain & Vilain 2011) exploring the encoding of distance information in vocal and manual pointing, and its potential relation with the phonological structure of deictic words, as well as the nature of the interactions between the speech mode and the manual mode within the process of deixis. Articulatory, acoustic and kinematic data have been recorded during a pointing task contrasting different distances of the designated targets, different pointing words and non-words, and different modalities: speech + gesture, speech only, and gesture only. The results show that distance induces intra-categorical phonetic modifications of the deictic word, and kinematic differences on the manual gesture. Results also evidence different communicative strategies related to the use of one vs. two modalities, supporting the idea of speech and gesture being two cooperating effectors for language. Preliminary data on infant behavior illustrate the development of this cooperation.

OCTOBER 10 -

SHARON INKELAS
UC BERKELEY

Confidence Scales: a New Approach to Derived Environment Effects

This talk introduces confidence strength as a new dimension in phonological representations. In addition to being represented in terms of distinctive features, each individual segment is, on this proposal, lexically stored with a confidence value reflecting the robustness of its storage in memory. This proposal offers a way of distilling psycholinguistic and phonetic salience, both continuously valued variables, into categorical phonological representations.

The case study that will be used to demonstrate the usefulness of this proposal is morphologically derived environment effects (MDEEs), also known as nonderived environment blocking (NDEB). These effects are evidence of a behavioral asymmetry across segments that are otherwise equivalent, and they have proved difficult to capture despite 40 years of attempts. The proposal in this talk is that these behavioral asymmetries can be attributed to strength differences.

OCTOBER 17 -

MARC ETTLINGER
HUMAN COGNITIVE NEUROPHYSIOLOGY LABORATORY, VA MEDICAL CENTER

The Interaction of Memory and Language

In this talk, I address the question of how general cognitive capabilities help shape what human languages look like. I do so by showing, using both behavioral and neuroimaging data, that the acquisition of certain aspects of morphophonology depends on specific types of memory. In particular, I demonstrate that morphological rules and analogy are supported by procedural memory (memory for sequences) and declarative memory (memory for facts), respectively. This suggests that some people may be better at acquiring certain types of languages depending on non-linguistic cognitive skills, neuroanatomical structure and, ultimately, genetics. I conclude by outlining some alternatives to genetic determinism, as well as current research exploring the opposite question of how language supports memory.

OCTOBER 24 -

— CANCELLED —

DARYA KAVITSKAYA
UC BERKELEY, DEPT. OF SLAVIC LANGUAGES AND LITERATURES

OCTOBER 31 -

SHINAE KANG AND EMILY CIBELLI
UC BERKELEY

Introducing ECoG: A Tool for Investigating the Neural Basis of Speech

Meet at 11 AM!

NOVEMBER 7 -

QP FEST
370 DWINELLE, 2:15-6:00 PM

Presentation of QP Work by 3rd and 2nd Year Graduate Students

2:15 QP Fest "Starts"

2:25 Welcome (Line Mikkelsen, 201 Instructor)

Session 1

2:30 Emily Cibelli
"The Neural Pathways of Lexical Processing: Evidence from High-Density ECoG"

2:45 Shinae Kang
"Neural Basis of the Word Frequency Effect and Its Relation to Lexical Retrieval"

3:00 I-Hsuan Chen
"The Licensing of Minimal Negative Polarity Items in Mandarin Chinese"

3:15 Eric Prendergast
"Animacy and Volitionality As Conditions on -eeNoun Derivation"

3:30 Break

Session 2

3:45 Florian Lionnet
"Unusual Grammar Leads to Unusual Grammaticalization: Verbal Demonstratives in Juu Languages"

4:00 Roslyn Burns
"Inheritance, Contact, or Innovation: Palatalization in Eastern Low German"

4:15 Clare Sandy
"Accent in Karuk"

4:30 Break

Session 3

4:45 Chundra Cathcart
"Tapping in, Flapping out: Articulatory Patterns of the Alveolar Tap and Implications for Sound Change"

5:00 Greg Finley
"A Speech-Specific Perceptual Mode: Evidence from Compensation for Coarticulation"

5:15 Elise Stickles
"Color Naming Does Not Covary With Color Diet"

5:30 Refreshments

NOVEMBER 14 -

— CANCELLED —

DANIEL ABRAMS
STANFORD UNIVERSITY

NOVEMBER 21 -

KIE ZURAW
UCLA

Novel Contrasts in Tongan Loan Adaptation (joint work with Kaeli Ward and Kathleen O'Flynn)

We examine secondary stress, vowel deletion, and vowel length in a corpus of Tongan loans from English. Although stress and vowel deletability are not contrastive in Tongan, they reflect English contrasts such as CC vs. CVC. Vowel length is contrastive in Tongan, but in these loans it is sensitive to English stress and CC/CVC contrasts in adjacent syllables. We conclude that Tongan loan adaptation shows sensitivity to non-L1 contrasts and that the perceptual mapping from L2 to L1 here is not segment-to-segment or even syllable-to-syllable but looks more like comparison of full candidate forms.

DECEMBER 5 -

MARISA TICE AND MELINDA WOODLEY
STANFORD UNIVERSITY AND UC BERKELEY

L2 Immersion Effects on L1 Speech Perception

The idea that second language (L2) learners' phonological categories are tightly linked to their native language categories is not a new one (Laeufer 1996). Typically, the role of L2 phonology has been seen as subordinate to native (L1) phonology. However, in the present study we find that, though L1 and L2 categories may be tightly linked in early L2 acquisition, the relationship between them is not strictly one-way. Our findings demonstrate interference from the L2 on the L1 in both a phoneme discrimination task and a semantic priming experiment, underscoring the continued malleability of L1 phonological categories well into adulthood.

Five beginning learners of French (L1 English) completed perception and production tasks in French and English on a weekly basis for four to six weeks (Tice et al. 2011). In a phoneme discrimination task, French and English phoneme perception appeared to be tightly linked; listeners used their L1 categories for both languages. However, in the 3-4 week L2 exposure range, participants began to show a shift in their category boundary for English, moving toward a lower, French-like boundary. By week five, their category boundaries had shifted back to L1 norms.

This phoneme categorization finding is corroborated by the results of a semantic priming experiment, which also indicated category confusion during the 3-4 week exposure period. But unlike in the categorization task, listeners did not switch back to L1 norms at week five; instead, they showed further effects on their L1 processing.

We hypothesize that the difference between the tasks is due to the different types of linguistic processing required: explicit categorization versus lexical processing. The shifts in perception may reflect not only the learners' exposure to French, but also their exposure to French-accented English. These findings support recent work on cross-language effects (e.g. Flege, Schirru, and MacKay 2003, Flege 2007, Chang 2010), and dovetail neatly with exemplar-type models of speech perception.

DECEMBER 12 -

KRISTOFER E. BOUCHARD (1), NIMA MESGARANI (1), MIRANDA BABIAK (1), KEITH JOHNSON (2), EDWARD CHANG (1)
(1) UC SAN FRANCISCO AND (2) UC BERKELEY

Electrophysiological Foundations of Human Speech Production

No behavior is as unique to humans as the ability to produce spoken language, and few behaviors that every human performs are as complicated to control as speech. Though much is known about the behavioral mechanics and anatomical substrates of speech, the spatio-temporal structure of neural activity controlling the speech articulators is largely unknown. We recorded from the surface of the speech somato-motor cortex in neurosurgical patients during the production of consonant-vowel syllables. We found that neuro-electrical activity exhibits temporal structure that reflects the sequence of articulator engagements and is somatotopically mapped onto the cortical surface. Furthermore, the spatial structure of neural activity is hierarchically organized by the pattern of articulator engagements and exhibits divergent and convergent dynamics. These results provide an electro-physiological foundation of human speech production and shed light on how the neocortex dynamically controls complex, multi-articulator behaviors, which is crucial to our understanding of basic nervous system function.


Schedule of Talks for Spring 2011

PREVIOUS MEETINGS:

JANUARY 31 -

EDWARD CHANG
UNIVERSITY OF CALIFORNIA, SAN FRANCISCO, DEPT. OF NEUROLOGICAL SURGERY

Speech Mechanisms: Perception and Production

FEBRUARY 7 -

CARRIE NIZIOLEK
UNIVERSITY OF CALIFORNIA, SAN FRANCISCO

The Role of Linguistic Contrasts in Speech Feedback Control

Speakers use auditory feedback to monitor their own speech, making adjustments to keep their output on target. This feedback-based control may occur at a relatively low level, based on pure acoustic input, or it may occur at a higher level, after language-dependent transforms take place in auditory cortex. The two experiments discussed here tested the influence of perceptual categories on auditory feedback control. Linguistic influences were assessed using real-time auditory perturbation to induce differences between what is expected and what is heard. The results demonstrate that auditory feedback control is sensitive to linguistic contrasts learned through auditory experience.

FEBRUARY 14 -

KEITH JOHNSON
UNIVERSITY OF CALIFORNIA, BERKELEY

Two Studies on Compensation for Coarticulation

One of the fundamental processes of speech perception is a contextual normalization process in which segments are "parsed" so that the effects of coarticulation are reduced or eliminated. For example, when consonants are said in sequential order (e.g. the [ld] in "tall dot" or the [rg] in "tar got") the tongue positions for the consonants interact with each other. This "coarticulation" is undone in speech perception by a process that is called compensation for coarticulation. The basis of this process is a source of much controversy in the speech perception literature. The studies that I report in this talk probe the compensation for coarticulation process in two ways. The first set of experiments examines the role of top-down expectations, finding that the compensation effect is produced when people think they hear the context, whether the context is present or not. This dissociation of the context effect from any acoustic stimulus parameter indicates that at least a portion of the compensation effect is driven by expectations. The second set of experiments examines the role of articulatory detail in the compensation effect, finding that the compensation effect is driven at least partly by detection of particular tongue configurations. This set of experiments looked at perception in context of the "retroflex" and "bunched" variants of English "r" and found that this low-level articulatory parameter influences perceptual boundaries. The overall picture that emerges is one of listeners who make use of fine-grained articulatory expectations during speech perception.

FEBRUARY 28 -

ARLO FARIA
UNIVERSITY OF CALIFORNIA, BERKELEY, COMPUTER SCIENCE PH.D. CANDIDATE AT THE INTERNATIONAL COMPUTER SCIENCE INSTITUTE

Data-driven Automatic Speech Recognition

Automatic speech recognition (ASR) is a computer technology that performs speech-to-text processing, using systems that integrate linguistic knowledge with statistical methods for learning from data. This talk will describe the modern statistical approach to artificial intelligence, while highlighting specific ways in which language structure is present in ASR systems. In addition, some analysis and discussion will consider the suitability of these engineering assumptions. For example, characteristics of human auditory sensitivity are encoded in a front-end signal processing module, which uses normalization and adaptation techniques to provide robustness against speaker variability and environmental conditions. An ASR system's acoustic model -- as well as other components -- requires a phonemic pronunciation dictionary to define the composition of words; evidence suggests this assumption is fundamentally flawed, although in practice it remains more effective than the alternatives. Rather than presenting speech technology in the familiar context of transcription, I think this audience would appreciate exposure to a lesser-known application: HMM-based forced alignment. This automatic procedure can provide an accurate word-level segmentation of transcribed speech, and could enable convenient indexing for large collections of audio or video recordings.
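
To make the forced-alignment idea concrete, here is a minimal sketch (in Python, assuming NumPy), not the speaker's system: given per-frame log-likelihoods for the phones of a known transcription, a left-to-right Viterbi pass picks the best frame-level segmentation. The function name and the toy numbers are invented for illustration.

# Minimal Viterbi forced alignment over a known phone sequence (illustrative only).
import numpy as np

def force_align(frame_loglik):
    """frame_loglik: (T, P) array of log-likelihoods for each of P phones
    (in transcription order) at each of T frames. Returns, for each frame,
    the index of the phone it is aligned to."""
    T, P = frame_loglik.shape
    NEG = -np.inf
    score = np.full((T, P), NEG)      # best path score ending at (frame, phone)
    back = np.zeros((T, P), dtype=int)
    score[0, 0] = frame_loglik[0, 0]  # alignment must start in the first phone
    for t in range(1, T):
        for p in range(P):
            stay = score[t - 1, p]
            advance = score[t - 1, p - 1] if p > 0 else NEG
            if advance > stay:
                score[t, p] = advance + frame_loglik[t, p]
                back[t, p] = p - 1
            else:
                score[t, p] = stay + frame_loglik[t, p]
                back[t, p] = p
    path = np.empty(T, dtype=int)     # trace back from the obligatory last phone
    path[-1] = P - 1
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Toy example: 10 frames, 3 phones, likelihoods favoring a 4-3-3 segmentation.
ll = np.full((10, 3), np.log(0.1))
ll[:4, 0] = ll[4:7, 1] = ll[7:, 2] = np.log(0.8)
print(force_align(ll))                # -> [0 0 0 0 1 1 1 2 2 2]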

MARCH 7 -

SAM TILSEN
UNIVERSITY OF SOUTHERN CALIFORNIA

An Experimental Investigation of Right/Left Ear Advantage for Prosodic/Segmental Processing

Phorum will be an open discussion of an experimental design that is currently in development. The experiment aims to investigate REA (right ear advantage) and LEA (left ear advantage) for segmental/prosodic processing in the context of a discrimination task. REA/LEA phenomena arise from biased projections from the right and left subcortical auditory systems to the contralateral hemispheres. Segmental speech processing has been associated with the LH/REA and prosodic processing with the RH/LEA. The discussion is intended to elicit feedback and brainstorming on issues of task design and stimulus construction relevant to the experiment.

MARCH 14 -

MELINDA WOODLEY
UNIVERSITY OF CALIFORNIA, BERKELEY

Effects of Bilingual Language Acquisition on Processing English Sounds

Preschoolers in the UC Berkeley child development program come from hugely diverse linguistic backgrounds. Of the children tested so far in the present study, approximately half could be considered monolingual English speakers. The non-monolinguals are evenly split between children who speak English and another language at home, and children who speak no English at home.

In this talk, I will compare results from these three groups of children on two speech processing tasks (one perception- and one production-oriented) in a first pass at examining the effects of amount of home English exposure on learning to process English words and sounds. While more subjects are still needed, the results so far suggest that the developmental path toward adult-like speech processing likely varies significantly for children simultaneously acquiring multiple languages.

MARCH 28 -

JENNIFER ARNOLD
UNIVERSITY OF NORTH CAROLINA, CHAPEL HILL

Does Audience Design Impact Acoustic Reduction?

Spoken words vary in their degree of acoustic prominence, or intelligibility. Discourse-given or predictable words tend to be reduced; new or unpredictable words tend to be acoustically prominent (e.g. Bell et al. 2009; Fowler & Housum 1987). An unresolved question is whether this variation results from audience design or purely speaker-internal constraints. I consider two possible versions of the audience design view: 1) speakers use acoustic reduction in order to mark the discourse status of referents, which is defined with respect to a shared, common-ground discourse model, and 2) speakers model the addressee's comprehension needs, and provide more explicit acoustic input when the word or referent is difficult to identify, such as when it is discourse-new or unpredictable. I present the results of three experiments that test these ideas, and conclude that while audience design has some impact on acoustic reduction, it is not in the ways suggested by either of these accounts. Instead, acoustic reduction is primarily driven by speaker-internal constraints on planning and production.

APRIL 4 -

GRANT MCGUIRE
UNIVERSITY OF CALIFORNIA, SANTA CRUZ

--CANCELLED--

APRIL 11 -

WILL CHANG
UNIVERSITY OF CALIFORNIA, BERKELEY

The "Population Structure" of South American Sound Inventories

This will be a workshop-like talk on the application of STRUCTURE [1] to linguistic data.

STRUCTURE is a model for explaining the distribution of features in a sampled population in terms of an arbitrary number of "ancestral populations". Each specimen is explained as inheriting its features from one or more ancestral populations. Though originally developed for studying population genetics, STRUCTURE can conceivably be applied to clustering and classification problems in historical linguistics and dialectology, particularly when contact-induced borrowing, not internal development, is the primary source of linguistic change.

I will discuss the statistical underpinnings of the model in some detail, and will review Reesink et al.'s use of STRUCTURE in classifying the languages of Southeast Asia and Oceania [2]. With Tammy Stark, I will also present some preliminary results from applying STRUCTURE to the sound inventories of 300+ South American languages.

  1. Jonathan K. Pritchard, Matthew Stephens & Peter Donnelly. 2000. "Inference of population structure using multilocus genotype data." Genetics 155(2).
  2. Ger Reesink, Ruth Singer & Michael Dunn. 2009. "Explaining the linguistic diversity of Sahul using population models." PLoS Biology 7(11).
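
To make the underlying model concrete, the sketch below (in Python, assuming NumPy) simulates the kind of admixture model that STRUCTURE fits, recast for binary sound-inventory data: each language mixes K "ancestral populations", and each feature is inherited from one of them according to that population's feature frequency. The actual program infers the population frequencies and admixture proportions from the data (via MCMC) rather than simulating from assumed values; all names and numbers here are illustrative, not from the talk.

# Toy generative version of the admixture model, for binary sound-inventory data.
import numpy as np

rng = np.random.default_rng(0)

K = 2    # number of "ancestral populations" (chosen by the analyst)
F = 20   # number of binary features (e.g. presence/absence of a sound)
N = 50   # number of sampled languages

# Per-population probability that each feature is present.
pop_freq = rng.uniform(0.05, 0.95, size=(K, F))

# Per-language admixture proportions over the K populations (rows sum to 1).
admixture = rng.dirichlet(np.ones(K), size=N)

def sample_inventory(theta):
    """Draw one language: each feature picks a source population according to
    the admixture vector theta, then is present with that population's
    feature frequency."""
    sources = rng.choice(K, size=F, p=theta)
    return (rng.random(F) < pop_freq[sources, np.arange(F)]).astype(int)

data = np.array([sample_inventory(theta) for theta in admixture])

def log_likelihood(data, admixture, pop_freq):
    """Log-probability of the observed inventories, marginalizing over the
    latent source population of every feature."""
    p_present = admixture @ pop_freq              # (N, F) mixture probabilities
    p = np.where(data == 1, p_present, 1.0 - p_present)
    return np.log(p).sum()

print(log_likelihood(data, admixture, pop_freq))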

APRIL 18 -

JAYE PADGETT
UNIVERSITY OF CALIFORNIA, SANTA CRUZ

Domain Generalization (work in collaboration with Scott Myers, UT Austin)

Word-final devoicing is a claimed example of domain generalization, a shift in a sound pattern from a larger prosodic domain to a smaller one: it derives diachronically first from the deterioration of voicing utterance-finally, and then is generalized to the end of all words. However, it has never been established that people perform domain generalization. We report on three artificial language learning experiments in which subjects were exposed to a final devoicing pattern only in utterance-final position, and then were tested on whether they apply the pattern to new utterances and utterance-medial words. Results of two of the experiments support domain generalization. We discuss implications of domain generalization for phonological theory.

APRIL 25 -

JEREMY O'BRIEN
UNIVERSITY OF CALIFORNIA, SANTA CRUZ

Debuccalization and Neutralization Avoidance

Debuccalization is a type of sound alternation or sound change that involves lenition of a consonant to a laryngeal consonant. Although it is often discussed in the literature as a subtype of lenition, it is unclear if debuccalization is a unified phenomenon. Any attempt to unify these various debuccalization processes must be able to account for the fact that the same segment weakens to different laryngeals in different languages (e.g. /k/ → [ʔ] in Indonesian, /k/ → [h] in Florentine Italian fast speech).

One possible explanation for the variation in debuccalization involves neutralization avoidance. It is plausible that neutralization causes difficulty in rule learning and processing, and for that reason neutralization avoidance may be a part of the grammar. An artificial grammar experiment was performed to investigate this possible effect. The artificial grammar includes a debuccalization rule modeled on Florentine Italian. Two versions of the same basic language were created — the phoneme inventories were manipulated to make the rule non-neutralizing for Language A, but neutralizing for Language B. Preliminary results of the experiment are presented.