Revision as of 08:28, 29 January 2018
Training
Problem: gradients vanish or explode when backpropagated through many time steps.
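A minimal numerical sketch of why this happens: backpropagation through time multiplies the gradient by the transposed recurrent matrix once per step, so its norm scales geometrically with the matrix's spectral radius. The dimensions and the orthogonal-times-scale construction below are illustrative assumptions, not taken from any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)

def gradient_norm_after(steps, scale):
    """Norm of a gradient pushed back through `steps` recurrent steps.

    The recurrent matrix is scale * Q with Q orthogonal, so its
    spectral radius is exactly `scale` and the gradient norm changes
    by that factor at every step.
    """
    Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
    W = scale * Q
    g = np.ones(8)
    for _ in range(steps):
        g = W.T @ g  # one step of backpropagation through time
    return np.linalg.norm(g)

vanish = gradient_norm_after(50, 0.9)   # radius < 1: norm shrinks geometrically
explode = gradient_norm_after(50, 1.1)  # radius > 1: norm grows geometrically
```

After 50 steps the two norms differ by roughly four orders of magnitude, which is the core difficulty the methods below try to avoid.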
Long Short-Term Memory
Structurally constrained network
Mikolov et al. (2014)[1] add a structurally constrained "context" layer whose recurrent matrix is fixed close to the identity, so its units change slowly and act as a cache of longer-range context alongside the standard hidden layer.
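A hedged sketch of the idea, not a faithful reimplementation of the paper: the slow "context" state uses a fixed scalar decay (a diagonal recurrent matrix close to the identity), while the fast hidden state uses ordinary learned recurrence. All dimensions, the decay value, and the tanh nonlinearity are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_ctx, d_hid = 4, 3, 5
alpha = 0.95  # fixed decay close to 1: context units change slowly (assumed value)
B = rng.standard_normal((d_ctx, d_in)) * 0.1   # input -> context
P = rng.standard_normal((d_hid, d_ctx)) * 0.1  # context -> hidden
A = rng.standard_normal((d_hid, d_in)) * 0.1   # input -> hidden
R = rng.standard_normal((d_hid, d_hid)) * 0.1  # hidden -> hidden (learned)

def step(x, s, h):
    # constrained slow state: exponential moving average of projected inputs
    s_new = (1 - alpha) * (B @ x) + alpha * s
    # ordinary fast state, fed by input, context, and its own recurrence
    h_new = np.tanh(P @ s_new + A @ x + R @ h)
    return s_new, h_new

s, h = np.zeros(d_ctx), np.zeros(d_hid)
for _ in range(10):
    s, h = step(rng.standard_normal(d_in), s, h)
```

Because the context recurrence is a fixed constant near 1 rather than a learned dense matrix, gradients flowing through it decay only slowly instead of vanishing or exploding.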
Rectified units with initialization trick
Le et al. (2015)[2] use rectified linear units and initialize the recurrent weight matrix to the identity matrix (or a scaled-down version of it).
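A minimal sketch of that initialization trick: with the recurrent matrix set to the identity and ReLU activations, a non-negative hidden state is copied forward unchanged at initialization, so signals neither vanish nor explode before training starts. The zero input weights and dimensions here are demo assumptions.

```python
import numpy as np

d = 6
W_rec = np.eye(d)        # identity initialization (a scaled version, e.g. 0.5 * np.eye(d), also works)
W_in = np.zeros((d, d))  # input weights zeroed so the demo isolates the recurrence
h = np.ones(d)
x = np.zeros(d)
for _ in range(100):
    h = np.maximum(0, W_rec @ h + W_in @ x)  # ReLU recurrence
# with identity init, the state survives 100 steps exactly unchanged
```

A randomly initialized dense `W_rec` would instead shrink or blow up the state over 100 steps, which is what the repeated-multiplication sketch under "Training" illustrates.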
Limitations
- Vanilla sequence-to-sequence RNNs cannot model reduplication: Prickett (2017)[3]
References
- ↑ Mikolov, T., Joulin, A., Chopra, S., Mathieu, M., & Ranzato, M. A. (2014). Learning Longer Memory in Recurrent Neural Networks. arXiv preprint arXiv:1412.7753.
- ↑ Le, Q. V., Jaitly, N., & Hinton, G. E. (2015). A Simple Way to Initialize Recurrent Networks of Rectified Linear Units. arXiv preprint arXiv:1504.00941.
- ↑ Prickett, B. (2017). Vanilla Sequence-to-Sequence Neural Nets cannot Model Reduplication.