Computational and Experimental Methods

Beguš presents two talks

October 26, 2023

Gašper Beguš gave two invited talks this week, one at the Stanford NLP Seminar and the other at the Linguistics Colloquium at UC Davis.

Beguš speaks at U of Arizona

September 26, 2023

Gašper Beguš gave an invited colloquium talk on "Modeling Language as a dependency between the latent space and data" at the University of Arizona's Department of Linguistics on September 22, 2023. The abstract can be found here.

Beguš presents at IIT Guwahati

September 21, 2023

Gašper Beguš gave a virtual invited talk titled "Modeling language from raw speech with GANs" at the CHAI: Chat about AI colloquium at the School of Data Science and AI, Indian Institute of Technology (IIT Guwahati) on September 13, 2023.

Beguš on podcasts

September 14, 2023

Gašper Beguš recently appeared on the Stack Overflow Podcast and the STARTS podcast, talking about computers and language. You can listen to the podcasts at the following links.

Beguš in ICASSP 2023

May 8, 2023

Gašper Beguš published a paper titled "Articulation GAN: Unsupervised modeling of articulatory learning" in the proceedings of ICASSP 2023 (IEEE International Conference on Acoustics, Speech and Signal Processing) with Alan Zhou, Peter Wu, and Gopala K. Anumanchipalli. The paper is available here.

A video of the presentation, scheduled to be given at the conference in Rhodes, Greece, on June 9, is available here.

Interpreting Intermediate Convolutional Layers of Generative CNNs Trained on Waveforms

Gašper Beguš
Alan Zhou

This paper presents a technique to interpret and visualize intermediate layers in generative CNNs trained on raw speech data in an unsupervised manner. We argue that averaging over feature maps after ReLU activation in each transpose convolutional layer yields interpretable time-series data. This technique allows for acoustic analysis of intermediate layers that parallels the acoustic analysis of human speech data: we can extract F0, intensity, duration, formants, and other acoustic properties from intermediate layers in order to test where and how CNNs encode various types of...
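The averaging step the abstract describes can be sketched as follows. This is an illustrative assumption about shapes and names, not the paper's implementation: it takes one transpose-convolutional layer's feature maps, applies ReLU, and averages across the feature-map axis to get a single interpretable time series.

```python
import numpy as np

def layer_timeseries(feature_maps: np.ndarray) -> np.ndarray:
    """Collapse a layer's feature maps into one time series.

    feature_maps: assumed shape (n_channels, n_timesteps), the activations
    of a single transpose-convolutional layer.
    """
    activated = np.maximum(feature_maps, 0.0)  # ReLU activation
    return activated.mean(axis=0)              # average over feature maps

# Toy example: two feature maps, three time steps.
maps = np.array([[1.0, -2.0, 3.0],
                 [-1.0, 4.0, 0.0]])
series = layer_timeseries(maps)  # -> array([0.5, 2. , 1.5])
```

The resulting one-dimensional signal can then be analyzed with the same acoustic tools used for speech (F0, intensity, formant tracking, and so on).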

Encoding of speech in convolutional layers and the brain stem based on language experience

Gašper Beguš
Alan Zhou
Christina Zhao

Comparing artificial neural networks with the outputs of neuroimaging techniques has recently seen substantial advances in (computer) vision and text-based language models. Here, we propose a framework to compare biological and artificial neural computations of spoken language representations and propose several new challenges to this paradigm. The proposed technique is based on a principle similar to the one underlying electroencephalography (EEG): averaging neural (artificial or biological) activity across neurons in the time domain, which allows us to compare the encoding of any acoustic property in the...

Bleaman and Sprouse publish tutorial on speaker diarization

March 21, 2023

Isaac Bleaman and Ronald Sprouse have published a tutorial on speaker diarization at the Linguistics Methods Hub. The process allows researchers to automatically generate ELAN or Praat files for audio recordings with speech segments marked off on the appropriate speaker tiers — an important first step in the transcription workflow.

Bleaman and Nove speak at AJS

December 7, 2022

Isaac Bleaman and Chaya Nove will be giving a research talk at the 54th annual meeting of the Association for Jewish Studies, held in Boston, December 18-20. Their talk is titled "The Corpus of Spoken Yiddish in Europe: A new resource for language research and pedagogy," and it is part of a panel on "Jewish Corpus Linguistics and Language Documentation."