From Toutanova et al. (2015): "corpora labeled with POS tags at the token level are only available for around 35 languages, while tag dictionaries of the form displayed in Fig. 1 are available for many more languages, either in commercial dictionaries or community created resources such as Wiktionary. Tag dictionaries provide type-level supervision for word types in the lexicon. Similarly, while sentences labeled with named entities are scarce, gazetteers and databases are more readily available (Bollacker et al., 2008).
There has been substantial research on how best to build models using such type-level supervision, for POS tagging, super sense tagging, NER, and relation extraction (Craven et al., 1999; Smith and Eisner, 2005; Carlson et al., 2009; Mintz et al., 2009; Johannsen et al., 2014) ..."
- Toutanova, K., Ammar, W., Choudhury, P., & Poon, H. (2015). Model Selection for Type-Supervised Learning with Application to POS Tagging. In CoNLL 2015 (pp. 332–337).