Credit for the original idea: Abney (1991)[1].

From Kübler et al. (2007)[2]: "transition-based approach was first explored by Matsumoto and colleagues (Kudo and Matsumoto, 2002[3]; Yamada and Matsumoto, 2003[4])"

TODO: a good paper detailing training biases: Aufrant et al. (2017)[5]. I think many of the problems they address are implicitly solved by using reinforcement learning.

References

  1. Abney, S. P. (1991). Parsing By Chunks. Principle-Based Parsing, 257–278.
  2. Kübler, S., McDonald, R., & Nivre, J. (2007). Dependency Parsing.
  3. Kudo, T., & Matsumoto, Y. (2002). Japanese dependency analysis using cascaded chunking. Proceedings of the 6th Workshop on Computational Language Learning (CoNLL), Taipei, Taiwan, pp. 63–69.
  4. Yamada, H., & Matsumoto, Y. (2003). Statistical dependency analysis with support vector machines. Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), Nancy, France, pp. 195–206.
  5. Aufrant, L., Wisniewski, G., & Yvon, F. (2017). Don’t Stop Me Now! Using Global Dynamic Oracles to Correct Training Biases of Transition-Based Dependency Parsers. EACL 2017.