Related works Edit
A neural network with a maxout layer can be viewed as automatically learning and choosing multiple senses of a word. Devlin et al. (2015) made use of such architecture (among others) but didn't mention word senses.
Community content is available under CC-BY-SA unless otherwise noted.