Domain adaptation of syntactic parsers

From Plank (2011)^[1]: "... supported by the results of Rimell and Clark (2008)^[2]. They found that, for parser performance, retraining the PoS tagger accounted for a greater proportion of the improvement on the biomedical data, while retraining the supertagger (that includes subcategorization information) was more beneficial for the questions domain. Thus, intuitively, they argue that the main difference between newspaper and biomedical text is in vocabulary, while the main difference between newspaper text and questions is syntactic."

Datasets

Biomedical

Others

TED talks: NAIST-NTT TED Treebank
QuestionBank and Stanford improvements

References

[1] Plank, B. (2011). Domain Adaptation for Parsing. PhD thesis. http://doi.org/10.4337/9781845420536.00006
[2] Rimell, L. & Clark, S. (2008). Adapting a Lexicalized-Grammar Parser to Contrasting Domains. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (pp. 475–484). Honolulu, Hawaii: Association for Computational Linguistics.
