Word embeddings initialization

Main idea: initialize the embedding table so that similar words start out with similar vectors (see Goldberg (2015)^[1], Sections 5.1-5.3).

Chen and Manning (2014)^[2]: "Figure 4 (middle) shows that using pre-trained word embeddings can obtain around 0.7% improvement on PTB and 1.7% improvement on CTB, compared with using random initialization within (−0.01, 0.01)."
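The two initialization strategies compared in this quote can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `init_embeddings`, the toy vectors, and the fallback rule (a random vector for any word without a pre-trained vector) are assumptions made for the example. Only the random range (−0.01, 0.01) comes from the quote.

```python
import random

def init_embeddings(vocab, pretrained, dim, scale=0.01, seed=0):
    """Build an embedding table: pre-trained vector if available, else small random.

    Hypothetical helper for illustration. Returns the table and the list of
    words that had no usable pre-trained vector.
    """
    rng = random.Random(seed)
    table, missing = {}, []
    for word in vocab:
        vec = pretrained.get(word)
        if vec is not None and len(vec) == dim:
            # Copy so that later updates do not modify the pre-trained source.
            table[word] = list(vec)
        else:
            # Random initialization within the range given in the quote.
            table[word] = [rng.uniform(-scale, scale) for _ in range(dim)]
            missing.append(word)
    return table, missing

# Toy, made-up "pre-trained" vectors (illustrative values only).
pretrained = {"cat": [0.21, -0.40, 0.05], "dog": [0.19, -0.38, 0.07]}
vocab = ["cat", "dog", "zyzzyva"]

table, missing = init_embeddings(vocab, pretrained, dim=3)
print(missing)  # words that fell back to random initialization
```

With pre-trained vectors, "cat" and "dog" start close together, while a word that falls back to random initialization starts near the origin with no relation to its neighbours.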

Lebret et al. (2015)^[3]: "This task confirms the importance of embedding fine-tuning for NLP tasks with a high semantic component. We note that our tuned embeddings leads to a performance gain of about 1% or 2% for NER, while the gain is between about 4% and 8% for the movie review."
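As a rough illustration of what fine-tuning means here, the sketch below applies one plain SGD step to the embedding rows that received a gradient, and leaves the table untouched when the embeddings are frozen. The function `sgd_update`, the learning rate, and the gradient values are hypothetical; the quoted paper's actual training procedure may differ.

```python
def sgd_update(table, grads, lr=0.1, fine_tune=True):
    """Apply one SGD step to the embedding rows listed in grads.

    Hypothetical helper: with fine_tune=False the embeddings stay frozen
    at their initial (e.g. pre-trained) values.
    """
    if not fine_tune:
        return table
    for word, grad in grads.items():
        table[word] = [w - lr * g for w, g in zip(table[word], grad)]
    return table

# Made-up initial vectors and a made-up gradient from some task loss.
table = {"good": [0.5, -0.2], "bad": [0.4, -0.1]}
grads = {"good": [0.5, -0.5]}

sgd_update(table, grads, lr=0.1)
print(table["good"])  # moved by the task gradient
print(table["bad"])   # unchanged: no gradient for this word in the step
```

Only words that occur in the training batch are updated, so fine-tuned vectors drift toward what the task needs while unseen words keep their initial values.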

Pei et al. (2014)^[4]: "Previous work found that the performance can be improved by pre-training the character embeddings on large unlabeled data and using the obtained embeddings to initialize the character lookup table instead of random initialization (Mansur et al., 2013; Zheng et al., 2013). [...] We pre-train the embeddings on the Chinese Giga-word corpus (Graff and Chen, 2005). As shown in Table 5 (last three rows), both the F-score and OOV recall of our model boost by using pre-training."

References

[1] Goldberg, Y. (2015). A Primer on Neural Network Models for Natural Language Processing, 1–76.
[2] Chen, Danqi, and Christopher D. Manning. "A Fast and Accurate Dependency Parser using Neural Networks." EMNLP. 2014.
[3] Lebret, Rémi, Joël Legrand, and Ronan Collobert. "Is deep learning really necessary for word embeddings?" EPFL-REPORT-196986. Idiap, 2013.
[4] Pei, Wenzhe, Tao Ge, and Baobao Chang. "Max-Margin Tensor Neural Network for Chinese Word Segmentation." ACL (1). 2014.
[5] Labutov, Igor, and Hod Lipson. "Re-embedding words." ACL (2). 2013.
