TODO: an important technique for applications of neural networks: "Decoupled Neural Interfaces using Synthetic Gradients" (Jaderberg et al., 2016, Google DeepMind)
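The core idea of synthetic gradients is that a layer updates immediately using a small module that *predicts* the gradient of the loss with respect to its output, instead of waiting for the true backward pass; the module itself is trained toward the true gradient once it becomes available. A minimal numpy sketch on a linear toy problem (the linear synthetic-gradient module `M`, the two-layer net, and all hyperparameters are illustrative assumptions, not DeepMind's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression target y = W_true @ x, fit by a two-layer linear net.
d_in, d_h, d_out = 4, 8, 2
W1 = rng.normal(0, 0.1, (d_h, d_in))
W2 = rng.normal(0, 0.1, (d_out, d_h))
M = np.zeros((d_h, d_h))          # linear synthetic-gradient module: dL/dh ≈ M @ h
W_true = rng.normal(size=(d_out, d_in))

lr, lr_m = 0.01, 0.01
losses = []
for step in range(2000):
    x = rng.normal(size=(d_in,))
    y = W_true @ x

    h = W1 @ x                    # forward through layer 1
    g_syn = M @ h                 # predicted dL/dh (synthetic gradient)
    W1 -= lr * np.outer(g_syn, x) # layer 1 updates without waiting for backprop

    y_hat = W2 @ h                # forward through layer 2
    err = y_hat - y               # dL/dy_hat for squared error
    g_true = W2.T @ err           # true dL/dh, available only after layer 2
    W2 -= lr * np.outer(err, h)
    M -= lr_m * np.outer(g_syn - g_true, h)  # train the SG module toward g_true

    losses.append(float((err ** 2).mean()))
```

In the full method each layer gets its own synthetic-gradient module, so the layers can be trained asynchronously ("decoupled"); this sketch only decouples the first layer.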
Beyond back-propagation
Classification region
Fawzi et al. (2017) show that the classification regions of deep neural networks are connected: each class region forms a single connected component rather than several disjoint pieces.
Activation functions
A list of activation functions:
- piecewise linear
- complementary log-log
- bipolar sigmoid
- LeCun's tanh
- hard tanh
- softplus
- softmax
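The listed activations can be sketched in numpy as follows (the piecewise-linear clipping bounds and the 1.7159 and 2/3 constants in LeCun's tanh are common conventions, assumed here for illustration):

```python
import numpy as np

def piecewise_linear(x, a=-1.0, b=1.0):
    # identity inside [a, b], clipped outside; bounds are illustrative
    return np.clip(x, a, b)

def complementary_log_log(x):
    return 1.0 - np.exp(-np.exp(x))

def bipolar_sigmoid(x):
    # maps to (-1, 1); algebraically equal to tanh(x / 2)
    return (1.0 - np.exp(-x)) / (1.0 + np.exp(-x))

def lecun_tanh(x):
    return 1.7159 * np.tanh(2.0 / 3.0 * x)

def hard_tanh(x):
    return np.clip(x, -1.0, 1.0)

def softplus(x):
    # smooth approximation of ReLU; log1p for numerical accuracy near 0
    return np.log1p(np.exp(x))

def softmax(x):
    z = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return z / z.sum()
```

Unlike the others, softmax acts on a whole vector and returns a probability distribution (non-negative entries summing to 1), which is why it is typically used in the output layer.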
Reference on modeling probability distributions with neural networks: Baum and Wilczek (1988)
Phase-change memory is potentially faster than GPUs for neural-network training (demonstrated via a partial hardware implementation plus simulation).
- ↑ Fawzi, Alhussein, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, and Stefano Soatto. "Classification regions of deep neural networks." arXiv preprint arXiv:1705.09552 (2017).
- ↑ Nwankpa, C., Ijomah, W., Gachagan, A., & Marshall, S. (2018). Activation Functions: Comparison of Trends in Practice and Research for Deep Learning. arXiv preprint arXiv:1811.03378.
- ↑ Baum, E. B., & Wilczek, F. (1988). Supervised Learning of Probability Distributions by Neural Networks. Neural Information Processing Systems, American Institute of Physics.
- ↑ "Training a neural network in phase-change memory beats GPUs". Ars Technica.