Knowledge base
The key idea is that embeddings which are good at predicting triples in a knowledge base are those that cluster similar synsets together. By training a model on the knowledge base completion task, researchers hope to obtain good synset embeddings as a byproduct.
Given a triple, e.g. (cat, has_part, tail), let l denote the left synset, r the right synset, and t the relation between them.
In the literature, there are two ways to formalize this task:
- Margin-based: a scoring function $g$ should score observed triples higher than corrupted triples by at least some margin; hence we maximize $L + R$, where $L$ replaces the left entity with a random one, $L = \sum \min\big(g(l, t, r) - g(l', t, r) - \text{margin},\ 0\big)$, and $R$ replaces the right entity, $R = \sum \min\big(g(l, t, r) - g(l, t, r') - \text{margin},\ 0\big)$.
- Negative sampling: a function $f$ is trained to discriminate triples drawn from the correct distribution $D$ from triples drawn from a uniform noise distribution $N$. If $f(\cdot)$ is the probability that a triple comes from $D$, we minimize the negative log-likelihood: $\sum_{(l, t, r) \in D} -\log f(l, t, r) + \sum_{(l', t', r') \in N} -\log\big(1 - f(l', t', r')\big)$. (Both objectives are sketched in code after this list.)
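Both objectives are easy to express given batched scores from any of the models below. Here is a minimal PyTorch sketch; the function names, the toy batch, and the rewriting of the margin objective as an equivalent hinge loss are illustrative choices, and the papers differ in details such as how corrupted triples are sampled:

```python
import torch

def margin_loss(g_pos, g_neg, margin=1.0):
    # Hinge form of the margin objective: minimizing this is equivalent
    # to maximizing sum(min(g_pos - g_neg - margin, 0)) from the text.
    return torch.clamp(margin - g_pos + g_neg, min=0).sum()

def negative_sampling_loss(f_pos, f_neg):
    # f_* are probabilities in (0, 1) that a triple was drawn from D.
    return -(torch.log(f_pos).sum() + torch.log(1.0 - f_neg).sum())

# Toy usage on a batch of 8 (observed, corrupted) triples.
g_pos, g_neg = torch.randn(8), torch.randn(8)
f_pos, f_neg = torch.sigmoid(torch.randn(8)), torch.sigmoid(torch.randn(8))
print(margin_loss(g_pos, g_neg), negative_sampling_loss(f_pos, f_neg))
```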
Many models have been proposed (see Yang et al., 2014^{[1]} for a review):
- Unstructured^{[2]}: treats all relations identically, ignoring the relation entirely: $g(l, t, r) = \|l - r\|_p$
- RESCAL^{[3]}: each relation is represented by a full matrix $M_t$, and a triple is scored by a bilinear form: $g(l, t, r) = l^\top M_t\, r$
- SE^{[4]}: a relation is represented by two matrices acting as linear transformations on each side: $g(l, t, r) = \|t_L\, l - t_R\, r\|_p$
- SME(LINEAR)^{[2]}: a relation is represented by two vectors; the semantic matching energy function combines each side linearly and compares them with a dot product: $g(l, t, r) = (W_{L1}\, l + W_{L2}\, t_L + b_L)^\top (W_{R1}\, r + W_{R2}\, t_R + b_R)$
- SME(BILINEAR)^{[2]}: the relation representation stays the same, but the weights become rank-3 tensors contracted with the relation vectors: $g(l, t, r) = \big((W_L \times_3 t_L)\, l\big)^\top \big((W_R \times_3 t_R)\, r\big)$
- LFM^{[5]}: a bilinear latent factor model in which each relation matrix is a linear combination of shared rank-one factors: $g(l, t, r) = l^\top \big(\sum_i \alpha_i^t\, u_i v_i^\top\big)\, r$
- TransE^{[6]}: a relation is a translation from the left entity to the right entity: $g(l, t, r) = \|l + t - r\|_p$ (implemented in the sketch after this list)
- TransM^{[7]}: scales the TransE score with a relation-specific weight: $g(l, t, r) = w_t\, \|l + t - r\|_p$
- Neural tensor network^{[8]}: relations are represented as rank-3 tensors; the score has two parts: a "tensor" part that takes the tensor product of the two entities with the relation, and a "neural" part that adds a linear combination of the entities: $g(l, t, r) = l^\top t\, r + w_L^\top l + w_R^\top r$
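To make the notation concrete, here is a small numpy sketch of three of the scoring functions above; the dimensionality, the random embeddings, and the choice $p = 1$ are illustrative assumptions. Note that these are energies (lower means more plausible), so a margin-based trainer as above would use their negation as the score:

```python
import numpy as np

def unstructured(l, t, r, p=1):
    # Ignores the relation entirely.
    return np.linalg.norm(l - r, ord=p)

def se(l, t_L, t_R, r, p=1):
    # SE: two relation-specific matrices transform each side.
    return np.linalg.norm(t_L @ l - t_R @ r, ord=p)

def transe(l, t, r, p=1):
    # TransE: the relation acts as a translation vector.
    return np.linalg.norm(l + t - r, ord=p)

# Illustrative 50-dimensional embeddings.
d = 50
rng = np.random.default_rng(0)
l, t, r = rng.normal(size=(3, d))
t_L, t_R = rng.normal(size=(2, d, d))
print(unstructured(l, t, r), se(l, t_L, t_R, r), transe(l, t, r))
```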
Knowledge base + Text
Wang et al. (2014)^{[9]} train two separate models, one for entities and one for words, and align them using Wikipedia anchors or the names of entities.
Bordes et al. (2012)^{[10]} train embeddings on the knowledge base completion and word sense disambiguation tasks simultaneously, thereby making use of both knowledge bases and corpora.
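Both joint approaches boil down to optimizing a sum of per-source objectives over a shared (or aligned) embedding space. A minimal sketch of that pattern, with hypothetical sub-loss names and weights that come from neither paper:

```python
# Hypothetical composition of per-source objectives; in Wang et al.
# (2014) the terms correspond to a knowledge model, a text model, and
# an alignment model that ties the two embedding spaces together.
def joint_loss(kb_loss, text_loss, align_loss, alpha=1.0, beta=1.0):
    # Gradients from all three terms flow into the shared embeddings.
    return kb_loss + alpha * text_loss + beta * align_loss
```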
References
1. Yang, B., Yih, W., He, X., Gao, J., & Deng, L. (2014). Embedding Entities and Relations for Learning and Inference in Knowledge Bases. Retrieved from http://arxiv.org/abs/1412.6575
2. Bordes, A., Glorot, X., Weston, J., & Bengio, Y. (2013). A Semantic Matching Energy Function for Learning with Multi-relational Data. Machine Learning.
3. Nickel, M., Tresp, V., & Kriegel, H.-P. (2011). A Three-Way Model for Collective Learning on Multi-relational Data. In Proceedings of the 28th International Conference on Machine Learning (ICML).
4. Bordes, A., Weston, J., Collobert, R., & Bengio, Y. (2011). Learning Structured Embeddings of Knowledge Bases. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI).
5. Jenatton, R., Le Roux, N., Bordes, A., & Obozinski, G. (2012). A Latent Factor Model for Highly Multi-relational Data. In Advances in Neural Information Processing Systems (NIPS 25).
6. Bordes, A., Usunier, N., Weston, J., & Yakhnenko, O. (2013). Translating Embeddings for Modeling Multi-relational Data. In Advances in Neural Information Processing Systems (NIPS 26).
7. Fan, M., Zhou, Q., Chang, E., & Zheng, T. F. (2014). Transition-based Knowledge Graph Embedding with Relational Mapping Properties. In Proceedings of PACLIC 2014.
8. Socher, R., Chen, D., Manning, C. D., & Ng, A. Y. (2013). Learning New Facts from Knowledge Bases with Neural Tensor Networks and Semantic Word Vectors. In Advances in Neural Information Processing Systems (NIPS 26).
9. Wang, Z., Zhang, J., Feng, J., & Chen, Z. (2014). Knowledge Graph and Text Jointly Embedding. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Retrieved from http://research.microsoft.com/apps/pubs/default.aspx?id=228269
10. Bordes, A., & Weston, J. (2012). Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing. In AISTATS (JMLR W&CP 22, pp. 127–135).