Libraries and Corpora

Department holdings

Linguistics Department book holdings (HEMLOC)
Survey of California and Other Indian Languages book holdings

Berkeley Libraries

Main library
Linguistics collections at the Main Library
Library's Connecting from Off Campus page

Language Corpora

The department houses several online, searchable corpora developed by its own faculty and students:

Ararahih'urípih: A Dictionary and Text Corpus of the Karuk Language
Sino-Tibetan Etymological Dictionary and Thesaurus (STEDT)
Turkish Electronic Living Lexicon (TELL)
Yurok Language Project

The Library provides access to corpora, including materials from the Linguistic Data Consortium (LDC), COCA, COHA, GloWbE, Switchboard, and many others:

Library sources for Text Mining & Computational Text Analysis (This is the best place to start searching. See especially the 'Linguistic Corpora' section for links to materials in OskiCat.)
Language corpora in the Library catalog (CalNet login required. Currently, few materials are accessible this way.)

In addition, the department has access to other national and international corpora, including:

Berkeley Language Center Collections/Archives
The Oxford English Dictionary. Unrestricted searches are available from the campus network. For off-campus access, see the Library's Connecting from Off Campus page.
