Computational and Experimental Methods

Beguš and Zhou publish in IEEE/ACM TASLP

October 18, 2022

Gašper Beguš and Alan Zhou (Berkeley Speech and Computation lab alum) have published a paper titled "Interpreting Intermediate Convolutional Layers of Generative CNNs Trained on Waveforms" in IEEE/ACM Transactions on Audio, Speech, and Language Processing. The paper is available open access at https://doi.org/10.1109/TASLP.2022.3209938

Regier colloquia

October 11, 2022

Terry Regier recently gave colloquium presentations at the University of Pennsylvania (September 30) and UC Irvine (October 4).

Beguš speaks at Yale

September 27, 2022

On October 3, Gašper Beguš will give a colloquium talk at the Yale University Department of Linguistics titled "Deep Phonology: Modeling language from raw acoustic data in a fully unsupervised manner."

Beguš, Bleaman, and Zhou publish in Interspeech 2022

September 20, 2022

Congratulations to Gašper Beguš, Isaac Bleaman, and Alan Zhou (BA 2021), whose papers were just published in the Proceedings of Interspeech 2022!

Beguš, Gašper and Alan Zhou. 2022. Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data. Proc. Interspeech 2022, 5298-5302.

Webber, Jacob J., Samuel K. Lo, and Isaac L. Bleaman. 2022. REYD – The first Yiddish text-to-speech dataset and system. Proc. Interspeech 2022, 2363-2367.

Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data

Gašper Beguš
Alan Zhou
2022

Human speakers encode information into raw speech which is then decoded by the listeners. This complex relationship between encoding (production) and decoding (perception) is often modeled separately. Here, we test how encoding and decoding of lexical semantic information can emerge automatically from raw speech in unsupervised generative deep convolutional networks that combine the production and perception principles of speech. We introduce, to our knowledge, the most challenging objective in unsupervised lexical learning: a network that must learn unique representations for...

Bleaman receives NSF CAREER Award

August 16, 2022

Congratulations to Isaac Bleaman, who has received a 5-year CAREER grant from the National Science Foundation! His project is entitled "Documenting and Analyzing Sociolinguistic Variation in the Speech of Holocaust Survivors," and it will involve developing a large corpus of conversational Yiddish for language research and community engagement. The project was described in a recent announcement to LSA members and publicized in the Forward (first in Yiddish and then in English translation).

Toward understanding the communication in sperm whales

J. Andreas
Gašper Beguš
M. Bronstein
R. Diamant
D. Delaney
S. Gero
S. Goldwasser
D. Gruber
S. de Haas
P. Malkin
N. Pavlov
R. Payne
G. Petri
D. Rus
P. Sharma
D. Tchernov
P. Tønnesen
A. Torralba
D. Vogt
R. Wood
2022

Machine learning has been advancing dramatically over the past decade. Most strides are human-based applications due to the availability of large-scale datasets; however, opportunities are ripe to apply this technology to more deeply understand non-human communication. We detail a scientific roadmap for advancing the understanding of communication of whales that can be built further upon as a template to decipher other forms of animal and non-human communication. Sperm whales, with their highly developed neuroanatomical features, cognitive abilities, social structures, and discrete...

Distinguishing cognitive from historical influences in phonology

Gašper Beguš
2022

Distinguishing cognitive influences from historical influences on human behavior has long been a disputed topic in behavioral sciences, including linguistics. The discussion is often complicated due to empirical evidence being consistent with both the cognitive and the historical approach. This article argues that phonology offers a unique test case for distinguishing historical and cognitive influences on grammar, and it proposes an experimental technique for testing the cognitive factor which controls for the historical factor. The article outlines a model called catalysis for...

Interpreting Intermediate Convolutional Layers In Unsupervised Acoustic Word Classification

Gašper Beguš
Alan Zhou
2022

Understanding how deep convolutional neural networks classify data has been subject to extensive research. This paper proposes a technique to visualize and interpret intermediate layers of unsupervised deep convolutional networks by averaging over individual feature maps in each convolutional layer and inferring underlying distributions of words with non-linear regression techniques. A GAN-based architecture (ciwGAN [1]) that includes a Generator, a Discriminator, and a classifier was trained on unlabeled sliced lexical items from TIMIT. The training process results in a deep...

Terry Regier

Chair of Linguistics, Professor of Linguistics and Cognitive Science

PhD, UC Berkeley

Language and cognition; semantic variation and universals; computational linguistics