| Corpus | Words | Entity mentions | Event mentions | Documents | Layers | All sent.? | All events? | All entities? | Notes |
|---|---|---|---|---|---|---|---|---|---|
| OntoNotes (English)[1] | 1.5M[2] | ? | ? | ? | dependency, PropBank SRL, entity coref. | yes | | only NPs | |
| ACE 2005[3] + ACEtoWiki[4] | 300,000[5] | 16,310[4] | ? | 535 | event coref., semantic type, entity linking | ? | only 8 types | any Wikipedia topic | |
| Lee et al.[6] (extension of ECB[7]) | 167,494[8] | 5,447[9] | 2,533[9] | 482[9] | entity coref., event coref. | no[10] | yes[11] or no[12]? | | |
| ECB+[13] | 262,601[14] | 2,255 + 2,392 + 9,577 + 2,963 = 17,187[15] | 14,884[16] | 482 + 502 = 982 | event coref. | no | no | | cross-document |
| TimeBank 1.2 (TimeML) + Causal TimeBank[17] | 61,000[18] | ? | 7,923[18] | 183 | temporal relations, causal relations (CSIGNALs, CLINKs) | | | | |
| MEANTIME[19] | | | | | | no | no | | |
| Hong et al. (2016)[20] | | | 863 | 125 | | | no[21] | | |
| Richer Event Description (RED) corpus[22] | 54,287 | 10,320 entities + 1,127 temporal expressions | 8,731 | 92 | entity coref.[23], event coref., bridging, temporal, causal and subevent relations | yes | yes[24] | | $1,750 for non-members; newswire and discussion forums |
| Event nugget[25] | 139,444 training + 98,414 testing = 237,858 | | 6,538 training + 6,438 test = 12,976 | 202 | event coref. | | | | newswire and discussion forums |
| ACE2004 | | | | 57 | entity linking | ? | ? | | news |
| AIDA-CoNLL | ~300,000 | 34,956 (no entity: 7,136) | | 1,393 | entity recognition, entity linking | ? | ? | PER, ORG, LOC, MISC | news |
| NP4E[26] | 50,000[27] | ? | ? | 94[27] | entity coref. (full), event coref. (partial)[27] | yes | no | | news (Reuters) |
| Aquaint | | | | 50 | entity linking (common + named) | ? | ? | | news |
| IITB | | | | 103 | entity linking | ? | ? | | mixed |
| KORE 50 | | | | 50 | entity linking | ? | ? | | mixed |
| Meij | | | | 502 | entity linking | ? | ? | | tweets |
| Microposts2014 | | | | 3,505 | entity linking | ? | ? | | tweets |
| MSNBC | | | | 20 | entity linking | ? | ? | | news |
| N3 Reuters-128 | | | | 128 | entity linking | ? | ? | | news |
| N3 RSS-500 | | | | 500 | entity linking | ? | ? | | RSS feeds |
| Spotlight Corpus | | | | 58 | entity linking | ? | ? | | news |
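Several cells in the table are sums over annotation types or data splits. A minimal Python sketch that re-checks that arithmetic; the per-split figures are the ones cited above, the mapping of figures to columns follows the reconstruction in the table, and the variable names are purely illustrative:

```python
# Re-check the derived totals reported in the table.

# ECB+ entity mentions: location + time + human + non-human participant
# mentions (Table 4, Cybulska and Vossen 2014)
ecb_plus_entity_mentions = [2255, 2392, 9577, 2963]
assert sum(ecb_plus_entity_mentions) == 17187

# ECB+ documents: the original ECB documents plus the ECB+ additions
ecb_plus_documents = [482, 502]
assert sum(ecb_plus_documents) == 982

# Event nugget: word counts of the training and testing portions
# (assumed to be words, per the reconstruction above)
event_nugget_words = [139444, 98414]
assert sum(event_nugget_words) == 237858

# Event nugget: annotated nuggets in the training and test portions
event_nugget_mentions = [6538, 6438]
assert sum(event_nugget_mentions) == 12976

print("All derived totals in the table are consistent.")
```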

References

  1. Weischedel, R., Palmer, M., Marcus, M., Hovy, E., Pradhan, S., Ramshaw, L., … Houston, A. (2013). OntoNotes Release 5.0 LDC2013T19. Linguistic Data Consortium.
  2. Weischedel et al. (2013), page 4.
  3. LDC. (2005). ACE (Automatic Content Extraction) English annotation guidelines for events, ver. 5.4.3, 2005.07.01. Technical report, Linguistic Data Consortium.
  4. Bentivogli, L., Forner, P., Giuliano, C., Marchetti, A., Pianta, E., & Tymoshenko, K. (2010). Extending English ACE 2005 corpus annotation with ground-truth links to Wikipedia. In Proceedings of the 23rd International Conference on Computational Linguistics (pp. 19–26).
  5. ACE. (2005). The ACE 2005 (ACE05) Evaluation Plan, 1–20.
  6. Lee, H., Recasens, M., Chang, A., Surdeanu, M., & Jurafsky, D. (2012). Joint Entity and Event Coreference Resolution across Documents. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2012) (pp. 489–500). Retrieved from http://www.aclweb.org/anthology/D12-1045
  7. Bejan, C. A., & Harabagiu, S. (2010). Unsupervised Event Coreference Resolution with Rich Linguistic Features. In ACL 2010 (pp. 1412–1422).
  8. Code: https://gist.github.com/minhlab/75811994c75549465cd3fc0ba7d20f13
  9. Table 3 in Lee et al. (2012).
  10. In jcoref-corpus/README.txt: "Note that this is not a full annotation of the corpus. Only the sentences annotated in ECB are fully annotated."
  11. From Lee et al. (2012): "We extended the OntoNotes guidelines by also annotating singletons (but we do not score them; see below), and by including all events mentions (not only those mentioned at least once with an NP)."
  12. How could it be that in 482 documents there are only 774 events?
  13. Cybulska, A., & Vossen, P. (2014). Using a sledgehammer to crack a nut? Lexical diversity and event coreference resolution. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), 4545–4552. Retrieved from http://www.lrec-conf.org/proceedings/lrec2014/pdf/840_Paper.pdf
  14. Code: https://gist.github.com/minhlab/0379f030176e85f747c1d2f6e1232932
  15. Sum of the location, time, human and non-human participant mention counts in Table 4 of Cybulska and Vossen (2014).
  16. "Action mentions" in Table 4, Cybulska and Vossen (2014)
  17. Mirza, P., Sprugnoli, R., Tonelli, S., & Speranza, M. (2014). Annotating causality in the TempEval-3 corpus. In Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL) (pp. 10–19). Gothenburg, Sweden: Association for Computational Linguistics.
  18. TimeBank 1.2 on LDC.
  19. Minard, A.-L., Speranza, M., Urizar, R., Altuna, B., van Erp, M., Schoen, A., & van Son, C. (2016). MEANTIME, the NewsReader multilingual event and time corpus. In Proceedings of LREC 2016.
  20. Hong, Y., Zhang, T., O'Gorman, T., Horowit-Hendler, S., Ji, H., & Palmer, M. (2016). Building a Cross-document Event-Event Relation Corpus, 1–6.
  21. From O'Gorman et al. (2016): "Hong et al, (2016) annotated a wide inventory of event-event relations, but covered only events within the ERE ontology."
  22. O'Gorman, T., Wright-Bettner, K., & Palmer, M. (2016). Richer Event Description: Integrating event coreference with temporal, causal and bridging annotation, 47–56.
  23. From O'Gorman et al. (2016): "Coreference in the first pass is done between all entities in the document"
  24. From O'Gorman et al. (2016): "In RED annotation, events and entities are annotated regardless of whether they participate in a coreference chain. All occurrences and timeline-relevant states are annotated as events, and entities are annotated according to whether or not they represent an actual discourse referent in the discourse."
  25. Mitamura, T., Liu, Z., & Hovy, E. (2015). Overview of TAC KBP 2015 Event Nugget Track. In Text Analysis Conference (TAC 2015).
  26. Hasler, L., Orasan, C., & Naumann, K. (2006). NPs for Events: Experiments in Coreference Annotation. In LREC 2006 (pp. 1167–1172).
  27. Recasens, M., Martí, M. A., & Orasan, C. (2012). Annotating Near-Identity from Coreference Disagreements. In Proceedings of LREC 2012 (pp. 165–172).