Language and Cognition

Interpreting Intermediate Convolutional Layers of Generative CNNs Trained on Waveforms

Gašper Beguš
Alan Zhou

This paper presents a technique to interpret and visualize intermediate layers in generative CNNs trained on raw speech data in an unsupervised manner. We argue that averaging over feature maps after ReLU activation in each transpose convolutional layer yields interpretable time-series data. This technique allows for acoustic analysis of intermediate layers that parallels the acoustic analysis of human speech data: we can extract F0, intensity, duration, formants, and other acoustic properties from intermediate layers in order to test where and how CNNs encode various types of...
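The averaging step the abstract describes can be illustrated with a minimal sketch (the array shapes, function name, and random data below are hypothetical, not the authors' implementation): take the feature maps a transpose-convolutional layer produces for one generated sample, apply ReLU, and average across channels to obtain a single interpretable time series.

```python
import numpy as np

def layer_time_series(feature_maps):
    """Average ReLU-activated feature maps across channels.

    feature_maps: array of shape (channels, time), the output of one
    transpose-convolutional layer for a single generated sample.
    Returns a 1-D time series that can be analyzed acoustically
    (F0, intensity, formants, ...) much like a speech waveform.
    """
    activated = np.maximum(feature_maps, 0.0)  # ReLU activation
    return activated.mean(axis=0)              # average over feature maps

# Hypothetical example: 8 channels, 16 time steps of random activations
rng = np.random.default_rng(0)
fmap = rng.normal(size=(8, 16))
series = layer_time_series(fmap)
```

Because the ReLU clips negative activations before averaging, the resulting series is non-negative, and standard acoustic measures can then be extracted from it with ordinary signal-processing tools.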

Encoding of speech in convolutional layers and the brain stem based on language experience

Gašper Beguš
Alan Zhou
Christina Zhao

Comparing artificial neural networks with outputs of neuroimaging techniques has recently seen substantial advances in (computer) vision and text-based language models. Here, we propose a framework to compare biological and artificial neural computations of spoken language representations and propose several new challenges to this paradigm. The proposed technique is based on a principle similar to the one underlying electroencephalography (EEG): averaging neural (artificial or biological) activity across neurons in the time domain, which allows us to compare the encoding of any acoustic property in the...
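The EEG-style principle described above can be sketched as follows (a toy illustration with NumPy; the function names, noise model, and data are hypothetical, not the paper's actual analysis): average unit activity across neurons in the time domain, then compare the artificial and biological averaged responses, for instance via correlation.

```python
import numpy as np

def averaged_response(unit_activity):
    """EEG-style average: collapse activity across neurons/units.

    unit_activity: (units, time) array of responses to one stimulus.
    Returns a (time,) averaged signal, analogous to an evoked response.
    """
    return unit_activity.mean(axis=0)

def encoding_similarity(artificial, biological):
    """Pearson correlation between two averaged time-domain responses."""
    return float(np.corrcoef(artificial, biological)[0, 1])

# Toy data: 100 artificial units and 100 neurons, each responding to the
# same stimulus as a noisy copy of a shared underlying waveform.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
signal = np.sin(2 * np.pi * 7 * t)
art = signal + rng.normal(scale=2.0, size=(100, 200))
bio = signal + rng.normal(scale=2.0, size=(100, 200))
r = encoding_similarity(averaged_response(art), averaged_response(bio))
```

Averaging suppresses unit-level noise (here by a factor of 10 for 100 units), so the shared stimulus-driven component dominates the comparison even when individual responses are noisy.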

Berkeley linguists @ RaAM16

April 11, 2023

The following papers from our department have been accepted for presentation at the 16th Researching and Applying Metaphor (RaAM16) conference, hosted at the Universidad de Alcalá, Spain, from June 28 to 30, 2023:

Bryce Wallace and Eve Sweetser: "Anti-Vax framings and metaphors: What makes an Anti-Vaxxer?"

Eve Sweetser: "Culturally based metaphors, frame metonymy, and 'culturally primary' associations."

Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data

Gašper Beguš
Alan Zhou

Human speakers encode information into raw speech which is then decoded by the listeners. This complex relationship between encoding (production) and decoding (perception) is often modeled separately. Here, we test how encoding and decoding of lexical semantic information can emerge automatically from raw speech in unsupervised generative deep convolutional networks that combine the production and perception principles of speech. We introduce, to our knowledge, the most challenging objective in unsupervised lexical learning: a network that must learn unique representations for...

Toward understanding the communication in sperm whales

J. Andreas
Gašper Beguš
M. Bronstein
R. Diamant
D. Delaney
S. Gero
S. Goldwasser
D. Gruber
S. de Haas
P. Malkin
N. Pavlov
R. Payne
G. Petri
D. Rus
P. Sharma
D. Tchernov
P. Tønnesen
A. Torralba
D. Vogt
R. Wood

Machine learning has been advancing dramatically over the past decade. Most strides have come in human-based applications due to the availability of large-scale datasets; however, opportunities are ripe to apply this technology to more deeply understand non-human communication. We detail a scientific roadmap for advancing the understanding of whale communication that can be built upon further as a template to decipher other forms of animal and non-human communication. Sperm whales, with their highly developed neuroanatomical features, cognitive abilities, social structures, and discrete...

Berkeley linguists published in Journal of Language Evolution

May 1, 2022

A new paper by Berkeley linguists and colleagues has just appeared in the Journal of Language Evolution:

Noga Zaslavsky*, Karee Garvin* (PhD 2021), Charles Kemp, Naftali Tishby, and Terry Regier. 2022. The evolution of color naming reflects pressure for efficiency: Evidence from the recent past. Journal of Language Evolution. (* = co-first authors, contributed equally)

Click here for the preprint PDF. Congrats to all!

Didn't hear that coming: Effects of withholding phonetic cues to code-switching.

Alice Shen
Susanne Gahl
Keith Johnson

Code-switching has been found to incur a processing cost in auditory comprehension. However, listeners may have access to anticipatory phonetic cues to code-switches (Piccinini & Garellek, 2014; Fricke et al., 2016), thus mitigating the switch cost. We investigated the effects of withholding anticipatory phonetic cues on code-switched word recognition by splicing English-to-Mandarin code-switches into unilingual English sentences. In a concept monitoring experiment, Mandarin–English bilinguals took longer to recognize code-switches, suggesting a switch cost. In an eye tracking experiment, the...

Twenty-eight years of vowels

Susanne Gahl
Harald Baayen

Research on age-related changes in speech has primarily focused on comparing “young” vs. “elderly” adults. Yet, listeners are able to guess talker age more accurately than a binary distinction would imply, suggesting that acoustic characteristics of speech change continually and gradually throughout adulthood. We describe acoustic properties of vowels produced by eleven talkers based on naturalistic speech samples spanning a period of 28 years, from ages 21 to 49. We find that the position of vowels in F1/F2 space shifts towards the periphery with increasing talker age. Based on...