From Zeldes (2017)[1]: "Selecting the right kinds of texts in correct proportions for a representative corpus has been a hotly debated topic for over two decades (Biber 1993; Crowdy 1993; Hunston 2008). Corpus design is of course intimately related to the research questions that a corpus is meant to answer (cf. Reppen 2010),"

References

  1. Zeldes, A. (2017). The GUM corpus: creating multilayer resources in the classroom. Language Resources and Evaluation, 51(3), 581–612.
