Classic overview of multi-task learning: Caruana (1997)
Yang et al. (2016):
"We present a deep hierarchical recurrent neural network for sequence tagging. Given a sequence of words, our model employs deep gated recurrent units on both character and word levels to encode morphology and context information, and applies a conditional random field layer to predict the tags. Our model is task independent, language independent, and feature engineering free. We further extend our model to multi-task and crosslingual joint training by sharing the architecture and parameters."
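The abstract above describes a two-level recurrent encoder: a character-level GRU summarizes each word's morphology, its final state is concatenated with a word embedding, a word-level GRU encodes sentence context, and a CRF layer scores tag sequences. A minimal NumPy sketch of that forward pass is below. It is an illustration, not the paper's implementation: the GRUs are unidirectional and single-layer (the paper uses deep/bidirectional stacks), weights are random and untrained, and Viterbi decoding over emission plus transition scores stands in for full CRF training. All dimensions and names (`CHAR_HID`, `tag_sentence`, etc.) are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, p):
    # Standard GRU cell: update gate z, reset gate r, candidate state c.
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])
    c = np.tanh(p["Wc"] @ x + p["Uc"] @ (r * h) + p["bc"])
    return (1.0 - z) * h + z * c

def make_gru(in_dim, hid_dim):
    p = {}
    for g in "zrc":
        p["W" + g] = rng.normal(0, 0.1, (hid_dim, in_dim))
        p["U" + g] = rng.normal(0, 0.1, (hid_dim, hid_dim))
        p["b" + g] = np.zeros(hid_dim)
    return p

def run_gru(xs, p, hid_dim):
    h = np.zeros(hid_dim)
    hs = []
    for x in xs:
        h = gru_step(x, h, p)
        hs.append(h)
    return np.array(hs)

# Toy dimensions (assumptions, not the paper's settings).
CHAR_DIM, CHAR_HID = 8, 12
WORD_DIM, WORD_HID = 16, 20
N_CHARS, N_WORDS, N_TAGS = 30, 50, 5

char_emb = rng.normal(0, 0.1, (N_CHARS, CHAR_DIM))
word_emb = rng.normal(0, 0.1, (N_WORDS, WORD_DIM))
char_gru = make_gru(CHAR_DIM, CHAR_HID)
word_gru = make_gru(WORD_DIM + CHAR_HID, WORD_HID)
W_out = rng.normal(0, 0.1, (WORD_HID, N_TAGS))
trans = rng.normal(0, 0.1, (N_TAGS, N_TAGS))  # CRF-style transition scores

def viterbi(emissions, trans):
    # Best-scoring tag sequence under emission + transition scores.
    T = len(emissions)
    delta = emissions[0].copy()
    back = np.zeros((T, N_TAGS), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + trans  # rows: previous tag, cols: next tag
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + emissions[t]
    tags = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        tags.append(int(back[t][tags[-1]]))
    return tags[::-1]

def tag_sentence(word_ids, char_ids_per_word):
    feats = []
    for wid, cids in zip(word_ids, char_ids_per_word):
        # Char-level GRU encodes morphology; its final state summarizes the word.
        ch = run_gru(char_emb[cids], char_gru, CHAR_HID)[-1]
        feats.append(np.concatenate([word_emb[wid], ch]))
    # Word-level GRU encodes sentence context over the concatenated features.
    H = run_gru(feats, word_gru, WORD_HID)
    emissions = H @ W_out
    return viterbi(emissions, trans)

print(tag_sentence([3, 17, 42], [[1, 2, 5], [4], [9, 9, 2, 7]]))
```

Multi-task or cross-lingual sharing, as the abstract describes it, would amount to reusing subsets of these parameters (e.g. the character GRU across tasks, or word-level layers across languages) while keeping task-specific output layers.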
- R. Caruana. 1997. Multitask Learning. Machine Learning, 28:41–75.
- Z. Yang, R. Salakhutdinov, and W. Cohen. 2016. Multi-Task Cross-Lingual Sequence Tagging from Scratch.