Nichols colloquium

October 8, 2020

The 2020-2021 colloquium series kicks off this coming Monday, October 12, with a talk by Johanna Nichols (UC Berkeley), held via Zoom. The talk is entitled "Proper measurement of linguistic complexity (and why it matters)", and the abstract is as follows:

Hypotheses involving linguistic complexity generate interesting research in a variety of subfields – typology, historical linguistics, sociolinguistics, language acquisition, cognition, neurolinguistics, language processing, and others. Good measures of complexity in various linguistic domains are essential, then, but we have very few and those are mostly single-feature (chiefly size of phoneme inventory and morphemes per word in text).
In other ways as well, what we have is not up to the task. The kind of complexity that is favored by certain sociolinguistic factors is not what is usually surveyed in studies invoking the sociolinguistic work. Phonological and morphological complexity are very strongly inversely correlated and form opposite worldwide frequency clines, yet surveys of just one or the other, or both lumped together, are used to support cross-linguistic generalizations about the distribution of complexity writ large. Complexity of derivation, syntax, and lexicon is largely unexplored. Measuring the complexity of polysynthetic languages in the right terms has not been seriously addressed.
This paper proposes a tripartite metric (enumerative, transparency-based, and relational), using a set of different assays across different parts of the grammar and lexicon, that addresses these problems and should help increase the grammatical sophistication of complexity-based hypotheses and choice of targets for computational extraction of complexity levels from corpora. Meeting current expectations of sustainability and replicability, the set is reusable, revealing, reasonably granular, and (at least mostly) amenable to computational implementation. I demonstrate its usefulness to typology and historical linguistics with some cross-linguistic and within-family surveys.