TODO: cover an important technique for applications of neural networks, "Decoupled Neural Interfaces using Synthetic Gradients" from Google DeepMind (Jaderberg et al., 2016).

Beyond back-propagation

Alternatives to back-propagation include the Border Pairs Method and Bipropagation.

Classification region

Fawzi et al. (2017)[1] show empirically that the classification regions learned by deep neural networks are connected.

Activation functions

A list of activation functions:

  • identity
  • step
  • piecewise linear
  • sigmoid
  • complementary log-log
  • bipolar
  • bipolar sigmoid
  • tanh
    • LeCun's tanh
  • hard tanh
  • absolute value
  • rectifier (ReLU)
  • softplus
  • softmax
  • maxout
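
Several of the functions listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names are chosen here for readability, the scalar (non-vectorized) form is an assumption made for brevity, and the constants in `lecun_tanh` are the commonly quoted 1.7159 and 2/3.

```python
import math

def identity(x):
    return x

def step(x):
    # Heaviside step; the value at exactly 0 is a convention (1.0 here).
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    # Logistic sigmoid, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def bipolar_sigmoid(x):
    # Sigmoid rescaled to (-1, 1); equal to tanh(x / 2).
    return 2.0 * sigmoid(x) - 1.0

def cloglog(x):
    # Complementary log-log: 1 - exp(-exp(x)). Overflows for very large x.
    return 1.0 - math.exp(-math.exp(x))

def lecun_tanh(x):
    # Scaled tanh with the commonly quoted constants 1.7159 and 2/3.
    return 1.7159 * math.tanh(2.0 * x / 3.0)

def hard_tanh(x):
    # Piecewise-linear approximation of tanh, clipped to [-1, 1].
    return max(-1.0, min(1.0, x))

def rectifier(x):
    return max(0.0, x)

def softplus(x):
    # log(1 + exp(x)), rearranged so exp never receives a large argument.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def softmax(xs):
    # Subtracting the maximum leaves the result unchanged and avoids overflow.
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]

def maxout(xs):
    # Maximum over a group of pre-activations (the affine maps are omitted).
    return max(xs)
```

Softmax and maxout differ from the rest in that they take a group of values rather than a single scalar, which is why they accept a list here.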

See also: Nwankpa et al. (2018).[2]

Sigmoid

For a reference on using neural networks to model probability distributions, see Baum and Wilczek (1988).[3]
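
One common way to read a sigmoid output as a probability is to treat it as the parameter of a Bernoulli distribution and train by maximum likelihood. The sketch below shows that standard reading only; it is not a reproduction of the formulation in the cited paper, and the names `bernoulli_nll` and `bernoulli_nll_grad` are hypothetical.

```python
import math

def sigmoid(z):
    # Logistic sigmoid, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bernoulli_nll(z, y):
    # Negative log-likelihood of label y in {0, 1} when P(y = 1) = sigmoid(z).
    # Algebraically equal to softplus(z) - y * z, which stays finite for large |z|.
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return softplus - y * z

def bernoulli_nll_grad(z, y):
    # Derivative of the loss with respect to the pre-activation z.
    return sigmoid(z) - y
```

The gradient with respect to the pre-activation reduces to the predicted probability minus the label, which is one reason the sigmoid pairs naturally with a log-likelihood objective.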

Optimization

Hardware

Training a neural network in phase-change memory is potentially faster than training on GPUs. This has been shown with a partial hardware implementation combined with simulation.[4]

References

  1. Fawzi, Alhussein, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, and Stefano Soatto. "Classification regions of deep neural networks." arXiv preprint arXiv:1705.09552 (2017).
  2. Nwankpa, C., Ijomah, W., Gachagan, A., & Marshall, S. (2018). Activation Functions: Comparison of trends in Practice and Research for Deep Learning, 1–20.
  3. E.B. Baum and F. Wilczek. Supervised Learning of Probability Distributions by Neural Networks. Neural Information Processing Systems, American Institute of Physics, 1988.
  4. "Training a neural network in phase-change memory beats GPUs". Ars Technica.