Given the acoustic consequences of physiological differences between talkers, there is a practical need for effective and theoretically motivated procedures for vowel normalization to facilitate comparison of speech produced by people who differ by dialect or language. In addition, there is the question of whether listeners might use a normalization procedure during speech perception. This paper reports the results of two studies that explore these questions, with particular focus on vocal tract length normalization. Drawing on research in speech engineering, where accurate estimates of vocal tract length are needed in some approaches to automatic speech recognition and speaker verification, a new model of vowel normalization is introduced. The model uses a direct measure of average formant spacing (ΔF), which can be used to estimate vocal tract length. The acoustic consequences of vocal tract length differences are removed from vowel measurements by scaling the formant measurements by ΔF. Study 1 found that this method performs comparably to Nearey’s (1978) uniform normalization method, while providing an explicit vocal tract length interpretation and a rationalized unit of measure. Study 2 found that uniform normalization measures (which let each formant serve as a noisy estimator of ΔF) improve vowel classification even when only a couple of randomly selected vowel tokens are available. This suggests that vocal tract length normalization could be involved in speech perception.
July 22, 2020
Johnson, K. (2020). The Delta F method of vocal tract length normalization for vowels. Laboratory Phonology, 11(1), 10. DOI: http://doi.org/10.5334/labphon.196
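The scaling described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the text above: that the vocal tract is modeled as a uniform tube closed at one end, so that the i-th formant lies near (i - 0.5) × ΔF; that ΔF is therefore estimated as the mean of F_i / (i - 0.5) over all formants and tokens of a talker; and that the speed of sound is 35000 cm/s. The function names are hypothetical and chosen only for illustration.

```python
# Assumed speed of sound in warm, moist air (cm/s); not given in the text.
SPEED_OF_SOUND_CM_S = 35000.0


def delta_f(tokens):
    """Estimate a talker's average formant spacing (Hz).

    tokens: list of vowel tokens, each a list [F1, F2, ...] in Hz.
    Under the assumed uniform-tube model, each F_i / (i - 0.5) is a
    noisy estimate of delta-F; the estimates are averaged over all
    formants and all tokens.
    """
    estimates = [
        f / (i - 0.5)
        for token in tokens
        for i, f in enumerate(token, start=1)
    ]
    return sum(estimates) / len(estimates)


def normalize(tokens, df):
    """Scale every formant by delta-F, giving unitless values."""
    return [[f / df for f in token] for token in tokens]


def vocal_tract_length_cm(df, c=SPEED_OF_SOUND_CM_S):
    """Vocal tract length implied by delta-F for the assumed tube model."""
    return c / (2.0 * df)


# Idealized neutral vowel from a 17.5 cm uniform tube (illustrative values).
tube = [[500.0, 1500.0, 2500.0, 3500.0]]
df = delta_f(tube)
print(df, normalize(tube, df), vocal_tract_length_cm(df))
```

In this sketch a normalized value of 1.5 means that a formant sits one and a half average formant spacings above zero, which is one way to read the "rationalized unit of measure" mentioned in the abstract. Real vowel tokens deviate from the uniform-tube pattern, so a usable estimate would average over many tokens.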