Revision as of 08:28, 29 January 2018
Training
Problem: gradients vanish or explode when backpropagated through many time steps.
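A minimal numerical sketch of why this happens: backpropagation through time multiplies the gradient by the transposed recurrent matrix once per step, so its norm scales geometrically with the matrix's spectral radius. The dimensions and the orthogonal-times-scale construction below are illustrative assumptions, not taken from any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)

def gradient_norm_after(steps, scale):
    """Norm of a gradient pushed back through `steps` recurrent steps.

    The recurrent matrix is scale * Q with Q orthogonal, so its
    spectral radius is exactly `scale` and the gradient norm changes
    by that factor at every step.
    """
    Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
    W = scale * Q
    g = np.ones(8)
    for _ in range(steps):
        g = W.T @ g  # one step of backpropagation through time
    return np.linalg.norm(g)

vanish = gradient_norm_after(50, 0.9)   # radius < 1: norm shrinks geometrically
explode = gradient_norm_after(50, 1.1)  # radius > 1: norm grows geometrically
```

After 50 steps the two norms differ by roughly four orders of magnitude, which is the core difficulty the methods below try to avoid.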
Long Short-Term Memory
Structurally constrained network
Mikolov et al. (2014)[1] add a structurally constrained "context" layer whose recurrent matrix is fixed close to the identity, so its units change slowly and act as a cache of longer-range context alongside the standard hidden layer.
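A hedged sketch of the idea, not a faithful reimplementation of the paper: the slow "context" state uses a fixed scalar decay (a diagonal recurrent matrix close to the identity), while the fast hidden state uses ordinary learned recurrence. All dimensions, the decay value, and the tanh nonlinearity are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_ctx, d_hid = 4, 3, 5
alpha = 0.95  # fixed decay close to 1: context units change slowly (assumed value)
B = rng.standard_normal((d_ctx, d_in)) * 0.1   # input -> context
P = rng.standard_normal((d_hid, d_ctx)) * 0.1  # context -> hidden
A = rng.standard_normal((d_hid, d_in)) * 0.1   # input -> hidden
R = rng.standard_normal((d_hid, d_hid)) * 0.1  # hidden -> hidden (learned)

def step(x, s, h):
    # constrained slow state: exponential moving average of projected inputs
    s_new = (1 - alpha) * (B @ x) + alpha * s
    # ordinary fast state, fed by input, context, and its own recurrence
    h_new = np.tanh(P @ s_new + A @ x + R @ h)
    return s_new, h_new

s, h = np.zeros(d_ctx), np.zeros(d_hid)
for _ in range(10):
    s, h = step(rng.standard_normal(d_in), s, h)
```

Because the context recurrence is a fixed constant near 1 rather than a learned dense matrix, gradients flowing through it decay only slowly instead of vanishing or exploding.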
Rectified units with initialization trick
Le et al. (2015)[2] use rectified linear units and initialize the recurrent weight matrix to the identity matrix (or a scaled-down version of it).
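A minimal sketch of that initialization trick: with the recurrent matrix set to the identity and ReLU activations, a non-negative hidden state is copied forward unchanged at initialization, so signals neither vanish nor explode before training starts. The zero input weights and dimensions here are demo assumptions.

```python
import numpy as np

d = 6
W_rec = np.eye(d)        # identity initialization (a scaled version, e.g. 0.5 * np.eye(d), also works)
W_in = np.zeros((d, d))  # input weights zeroed so the demo isolates the recurrence
h = np.ones(d)
x = np.zeros(d)
for _ in range(100):
    h = np.maximum(0, W_rec @ h + W_in @ x)  # ReLU recurrence
# with identity init, the state survives 100 steps exactly unchanged
```

A randomly initialized dense `W_rec` would instead shrink or blow up the state over 100 steps, which is what the repeated-multiplication sketch under "Training" illustrates.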
Limitations
- Vanilla sequence-to-sequence RNNs cannot model reduplication: Prickett (2017)[3]
References
- ↑ Mikolov, T., Joulin, A., Chopra, S., Mathieu, M., & Ranzato, M. A. (2014). Learning Longer Memory in Recurrent Neural Networks. arXiv preprint arXiv:1412.7753.
- ↑ Le, Q. V., Jaitly, N., & Hinton, G. E. (2015). A Simple Way to Initialize Recurrent Networks of Rectified Linear Units. arXiv preprint arXiv:1504.00941.
- ↑ Prickett, B. (2017). Vanilla Sequence-to-Sequence Neural Nets cannot Model Reduplication.