According to Lee et al. (2013):
The idea of beginning with the most accurate models or starting with smaller subproblems that allow for high-precision solutions combines the intuitions of “shaping” or “successive approximations” first proposed for learning by Skinner (1938), and widely used in NLP (e.g., the successively trained IBM MT models of Brown et al. ) and the “islands of reliability” approaches to parsing and speech recognition [Borghesi and Favareto 1982; Corazza et al. 1991]). The idea of beginning with a high-recall list of candidates that are followed by a series of high-precision filters dates back to one of the earliest architectures in natural language processing, the part of speech tagging algorithm of the Computational Grammar Coder (Klein and Simmons 1963) and the TAGGIT tagger (Greene and Rubin 1971), which begin with a high-recall list of all possible tags for words, and then used high-precision rules to filter likely tags based on context.
- Lee, H., Chang, A., Peirsman, Y., Chambers, N., Surdeanu, M., & Jurafsky, D. (2013). Deterministic coreference resolution based on entity-centric, precision-ranked rules. Computational Linguistics, 39(4), 885-916.