Natural Language Understanding Wiki

Pseudo-recurrent NNLM

Le (2012)[1] introduced an unfolded, truncated recurrent NNLM, which he called pseudo-recurrent.

Long short-term memory (LSTM)

LSTMs have been applied to many NLP tasks, for example:
  • Semantic role labeling: He et al. (2017)[2]
  • Dependency parsing: Dyer et al. (2015)[3]
  • Machine translation: Sutskever et al. (2014)[4]
  • ...

Constrained recurrent neural network

Mikolov et al. (2015)[5] use a much simpler architecture to achieve performance similar to that of LSTM.
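
To make the simpler architecture concrete, here is a minimal sketch of one step of the structurally constrained recurrent network from that paper, assuming its formulation: a slowly changing context state s_t = (1 - alpha) B x_t + alpha s_{t-1} next to an ordinary sigmoid hidden state h_t = sigmoid(P s_t + A x_t + R h_{t-1}). The names `scrn_step` and `matvec`, the list-of-lists matrices, and the default `alpha=0.95` are illustrative assumptions, and the output layer is omitted.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def scrn_step(x, h_prev, s_prev, A, B, P, R, alpha=0.95):
    """One step of a structurally constrained RNN (sketch, no output layer).

    x: input vector, h_prev: previous hidden state, s_prev: previous context state.
    A, B, P, R: weight matrices (hypothetical names following the equations above).
    """
    # Context units: a leaky average of projected inputs, with no nonlinearity.
    Bx = matvec(B, x)
    s = [(1.0 - alpha) * bx + alpha * sp for bx, sp in zip(Bx, s_prev)]
    # Hidden units: standard sigmoid recurrence that also reads the context state.
    pre = [p + a + r for p, a, r in zip(matvec(P, s), matvec(A, x), matvec(R, h_prev))]
    h = [1.0 / (1.0 + math.exp(-z)) for z in pre]
    return h, s


# Toy usage: 2 inputs, 2 context units, 1 hidden unit, all-zero hidden weights.
B = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0]]
h, s = scrn_step([1.0, 0.0], [0.0], [0.0, 0.0], A=zeros, B=B, P=zeros, R=[[0.0]], alpha=0.5)
```

With `alpha` close to 1 the context state changes slowly, which is how this design retains longer-range information without gates.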

FOFE (fixed-size ordinally-forgetting encoding): a simple recurrent model that achieves good results.
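
As a rough illustration of the FOFE recurrence, the sketch below encodes a token sequence into a single vocabulary-sized vector via z_t = alpha * z_{t-1} + e_t, where e_t is the one-hot vector of the t-th token. The function name `fofe_encode`, the integer token ids, and the default `alpha=0.5` are assumptions made for this example.

```python
def fofe_encode(token_ids, vocab_size, alpha=0.5):
    """Return the FOFE code of a sequence: z_t = alpha * z_{t-1} + e_t."""
    z = [0.0] * vocab_size
    for tok in token_ids:
        # Decay everything seen so far, then add the current token's one-hot vector.
        z = [alpha * v for v in z]
        z[tok] += 1.0
    return z


# Toy vocabulary {A: 0, B: 1, C: 2}; encode the sequence "A B C".
code = fofe_encode([0, 1, 2], vocab_size=3)
print(code)  # [0.25, 0.5, 1.0]
```

More recent tokens receive larger weights, so the word order is reflected in a code whose size does not depend on the sequence length.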

Le et al. (2015): a simple RNN with ReLU activations whose recurrent weights are initialized to the identity matrix.
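
The initialization trick can be sketched as follows, assuming the usual description of it: recurrent weights start as the identity matrix, biases start at zero, and the hidden units are ReLUs. The names `init_irnn` and `irnn_step` and the small Gaussian scale used for the input weights (`input_scale`) are assumptions for illustration.

```python
import random


def init_irnn(hidden_size, input_size, input_scale=0.001, seed=0):
    """Identity recurrent weights, small random input weights, zero biases."""
    rng = random.Random(seed)
    W_hh = [[1.0 if i == j else 0.0 for j in range(hidden_size)]
            for i in range(hidden_size)]
    W_xh = [[rng.gauss(0.0, input_scale) for _ in range(input_size)]
            for _ in range(hidden_size)]
    b = [0.0] * hidden_size
    return W_hh, W_xh, b


def irnn_step(h_prev, x, W_hh, W_xh, b):
    """One recurrent step: h_t = ReLU(W_hh h_{t-1} + W_xh x_t + b)."""
    h = []
    for i in range(len(b)):
        pre = b[i]
        pre += sum(W_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        pre += sum(W_xh[i][k] * x[k] for k in range(len(x)))
        h.append(max(0.0, pre))  # ReLU
    return h


W_hh, W_xh, b = init_irnn(hidden_size=3, input_size=2)
# With zero input, a non-negative hidden state is copied forward unchanged.
h = irnn_step([0.5, 2.0, 0.0], [0.0, 0.0], W_hh, W_xh, b)
```

At initialization the network simply carries its hidden state forward when there is no input, which is the property that helps it retain information over many steps early in training.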



  1. Le, H. S. (2012). Continuous space models with neural networks in natural language processing (Doctoral dissertation, Université Paris Sud-Paris XI).
  2. He, L., Lee, K., Lewis, M., & Zettlemoyer, L. (2017). Deep Semantic Role Labeling: What Works and What's Next. In ACL 2017.
  3. Dyer, C., Ballesteros, M., Ling, W., Matthews, A., & Smith, N. A. (2015). Transition-Based Dependency Parsing with Stack Long Short-Term Memory. In ACL 2015 (pp. 334–343).
  4. Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in neural information processing systems (pp. 3104–3112).
  5. Mikolov, T., Joulin, A., Chopra, S., Mathieu, M., & Ranzato, M. A. (2014). Learning Longer Memory in Recurrent Neural Networks. arXiv preprint arXiv:1412.7753.