TODO: a survey on AI Magazine, a blogpost
https://futurism.com/deepmind-abstract-reasoning-iq/
Multi-hop reasoning
- HotpotQA (Yang et al. 2018)[1]
Comprehension
- LAMBADA
- CSI (Frermann et al., 2017)[2]
- Reading children's books: http://arxiv.org/abs/1511.02301
- EpiReader (Trischler et al., 2016)[3]
- CNN/Daily Mail (CNNDM) benchmark by Hermann et al. (2015)
- CBT by Hill et al. (2016)
- Google DeepMind's comprehension test
- Quiz bowl
- Microsoft's MCTest
- MSRCC dataset (Zweig and Burges, 2011).
- Biological processes (Berant et al., 2014[4]): 200 paragraphs from the textbook Biology with 585 multiple-choice questions
- ROCStories (Mostafazadeh et al., 2016)
Mathematical reasoning
- Saxton et al. (2019)[5]
- Euclid from AllenAI and SemEval 2019, Task 10
TO CATEGORIZE
- bAbI: http://arxiv.org/pdf/1502.05698v3.pdf
- GABITS (inspired by bAbI): http://www.thespermwhale.com/jaseweston/ram/papers/paper_14.pdf
- FRACAS
- [1]
- Bongard problems
- IQ test
- Imagining: [2]
- Entrance exam 2015
- Allen AI Challenge
- Visual Turing challenge
- VQA: Visual Question Answering [6]
- TODO: http://arxiv.org/abs/1502.05698
- WikiReading from Google
References
- Yang, Z., Qi, P., Zhang, S., Bengio, Y., Cohen, W. W., Salakhutdinov, R., & Manning, C. D. (2018). HotpotQA: A dataset for diverse, explainable multi-hop question answering. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018, 2369–2380.
- Frermann, L., Cohen, S. B., & Lapata, M. (2017). Whodunnit? Crime Drama as a Case for Natural Language Understanding. Retrieved from http://arxiv.org/abs/1710.11601
- Trischler, A., Ye, Z., Yuan, X., & Suleman, K. (2016). Natural Language Comprehension with the EpiReader. Retrieved from http://arxiv.org/abs/1606.02270
- Berant, J., Srikumar, V., Chen, P.-C., Vander Linden, A., Harding, B., Huang, B., … Manning, C. D. (2014). Modeling Biological Processes for Reading Comprehension. In Empirical Methods in Natural Language Processing (EMNLP).
- Saxton, D., Grefenstette, E., Hill, F., & Kohli, P. (2019). Analysing mathematical reasoning abilities of neural models. ICLR 2019, 1–17. Retrieved from https://arxiv.org/pdf/1904.01557.pdf
- Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Zitnick, C. L., & Parikh, D. (2015). VQA: Visual Question Answering. In International Conference on Computer Vision (ICCV).