From Durrett & Klein (2014): "ACE 2005 corpus (NIST, 2005): this corpus annotates mentions complete with coreference, semantic types (per mention), and entity links (also per mention) later added by Bentivogli et al. (2010). [...] train/test split from Stoyanov et al. (2009), Haghighi and Klein (2010), and Bansal and Klein (2012)."
- Official contest: test data is only available to participants
- LDC2005E18: enlarged version
- LDC2006T06: further enlarged version
- Stoyanov et al.'s split: newswire only; it is unclear which release the documents were drawn from; 57 documents for training, 24 for testing (roughly 70/30)
- Rahman and Ng's split: the full 599 documents from LDC2006T06, split into 482 documents for training and 117 for testing (roughly 80/20), balanced across genres
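For readers who want to build a comparable genre-balanced split of their own, the following is a minimal sketch in Python. It is not the procedure used by Rahman and Ng or by Stoyanov et al., whose exact document lists are not given here. The function name `stratified_split`, the document IDs, and the genre labels are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def stratified_split(doc_genres, train_fraction=0.8, seed=0):
    """Split document IDs into train/test so each genre keeps roughly the same ratio.

    doc_genres maps a document ID to its genre label (both hypothetical here).
    """
    by_genre = defaultdict(list)
    for doc_id, genre in doc_genres.items():
        by_genre[genre].append(doc_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for genre in sorted(by_genre):
        docs = sorted(by_genre[genre])
        rng.shuffle(docs)
        cut = round(len(docs) * train_fraction)
        train.extend(docs[:cut])
        test.extend(docs[cut:])
    return train, test


# Toy corpus (assumption): 10 documents in each of two made-up genres.
toy = {f"nw_{i:02d}": "newswire" for i in range(10)}
toy.update({f"bc_{i:02d}": "broadcast" for i in range(10)})

train, test = stratified_split(toy)
print(len(train), len(test))  # 16 4
```

Splitting within each genre separately, rather than shuffling the whole corpus once, is what keeps the genre proportions of the training and test sets close to those of the full collection.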
See also
- Official website (dead link, archived version): documentation, software, resources
- CMU machine learning wiki
- LDC page (download requires a paid license)
- Event coreference resolution (state of the art)
References
- G Durrett and D Klein. 2014. A Joint Model for Entity Analysis: Coreference, Typing, and Linking. In Transactions of the Association for Computational Linguistics, Vol. 2, pp. 477–490. Retrieved from https://transacl.org/ojs/index.php/tacl/article/view/412
- V Stoyanov, N Gilbert, C Cardie, and E Riloff. 2009. Conundrums in Noun Phrase Coreference Resolution: Making Sense of the State-of-the-art. In Proceedings of the Association for Computational Linguistics (ACL).
- A Rahman and V Ng. 2009. Supervised Models for Coreference Resolution. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing.