A comprehensive and accessible introduction to statistics in corpus linguistics, covering multiple techniques of quantitative language analysis and data visualisation.
Vaclav Brezina is a research fellow and lecturer at the Department of Linguistics and English Language, Lancaster University. He specialises in corpus linguistics, statistics and applied linguistics, and has designed a number of different tools for corpus analysis.
1. Introduction: statistics meets corpus linguistics; 2. Vocabulary: frequency, dispersion and diversity; 3. Semantics and discourse: collocations, keywords and reliability of manual coding; 4. Lexico-grammar: from simple counts to complex models; 5. Register variation: correlation, clusters and factors; 6. Sociolinguistics and stylistics: individual and social variation; 7. Change over time: working with diachronic data; 8. Bringing everything together: ten principles of statistical thinking, meta-analysis and effect sizes.