From Zeldes (2017): "Selecting the right kinds of texts in correct proportions for a representative corpus has been a hotly debated topic for over two decades (Biber 1993; Crowdy 1993; Hunston 2008). Corpus design is of course intimately related to the research questions that a corpus is meant to answer (cf. Reppen 2010), ..."
- Zeldes, A. (2017). The GUM corpus: creating multilayer resources in the classroom. Language Resources and Evaluation, 51(3), 581–612. https://doi.org/10.1007/s10579-016-9343-x