The attention mechanism was originally invented for machine translation but quickly found applications in many other tasks. It is useful whenever one needs to "translate" from one structure (images, sequences, trees) to another.
The basic idea is to read the input structure twice: once to encode its gist, and again at each decoding step to "pay attention" to the details relevant to the current output.
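The two-pass idea above can be sketched in a few lines of NumPy: encoder states summarize the input, and at each decoding step a softmax over similarity scores decides which states to "pay attention" to. This is a minimal illustration of global dot-product attention (in the spirit of Luong et al.), not any particular paper's exact model; the function and variable names here are made up for the example.

```python
import numpy as np

def attend(query, encoder_states):
    """Dot-product attention: weight encoder states by similarity to the query.

    query:          decoder state at the current step, shape (d,)
    encoder_states: one vector per input position, shape (T, d)
    Returns a context vector (d,) and the attention weights (T,).
    """
    scores = encoder_states @ query            # similarity per input position, (T,)
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()                   # weights are non-negative, sum to 1
    context = weights @ encoder_states         # weighted average of encoder states, (d,)
    return context, weights

rng = np.random.default_rng(0)
encoder_states = rng.standard_normal((5, 8))   # T=5 input positions, d=8
query = rng.standard_normal(8)                 # hypothetical decoder state
context, weights = attend(query, encoder_states)
```

The decoder would typically concatenate `context` with its own state to predict the next output symbol, and recompute the weights at every step.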
Machine translation
TODO: Luong et al. (2015)[1]
Text processing/understanding
Natural language inference: Parikh et al. (2016)[2]
Visual
Mnih, V., Heess, N., Graves, A., & Kavukcuoglu, K. (2014). Recurrent Models of Visual Attention, 1–12. Retrieved from http://arxiv.org/abs/1406.6247
Ba, J., Mnih, V., & Kavukcuoglu, K. (2014). Multiple Object Recognition with Visual Attention. arXiv Preprint arXiv:1412.7755.
Audio
Chan, W., Jaitly, N., Le, Q. V., & Vinyals, O. (2015). Listen, Attend and Spell. Retrieved from http://arxiv.org/abs/1508.01211
References
- ↑ Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of EMNLP 2015. Retrieved from http://arxiv.org/abs/1508.04025
- ↑ Parikh, A. P., Täckström, O., Das, D., & Uszkoreit, J. (2016). A Decomposable Attention Model for Natural Language Inference. Retrieved from http://arxiv.org/abs/1606.01933