

Knowledge base

The key idea is that embeddings which are good at predicting triples in a knowledge base tend to cluster similar synsets together. By training a model on a knowledge base completion task, researchers therefore hope to obtain good synset embeddings as a byproduct.

Given a triple such as (cat, has_part, tail), let l denote the left synset, r the right synset, and t the relation between them.

In the literature, there are two ways to formalize this task:

  • Margin-based: A scoring function g should score observed triples higher than corrupted ones by at least some margin $\gamma$; one therefore minimizes the ranking loss $\sum_{(l,t,r)} \sum_{(l',t,r') \in L \cup R} \max\big(0,\ \gamma + g(l',t,r') - g(l,t,r)\big)$, where the outer sum runs over observed triples, in L the left entities are replaced by random entities, and in R the right entities are randomized.
  • Negative sampling: A function f is trained to differentiate between triples drawn from the correct distribution D and triples drawn from a uniform noise distribution N. If $f(l,t,r)$ stands for the probability that a triple comes from D, we minimize the negative log-likelihood $-\sum_{(l,t,r) \in D} \log f(l,t,r) - \sum_{(l',t',r') \in N} \log\big(1 - f(l',t',r')\big)$ (both criteria are sketched in code after this list).
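
To make the two training criteria concrete, here is a minimal NumPy sketch that uses a TransE-style translation score as the scoring function g; the embedding tables, dimensions, and corruption scheme are illustrative assumptions, not the setup of any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embedding tables: `ent` holds synset/entity vectors, `rel` holds relation vectors.
n_entities, n_relations, dim = 100, 10, 20
ent = rng.normal(size=(n_entities, dim))
rel = rng.normal(size=(n_relations, dim))

def score(l, t, r):
    """TransE-style score g(l, t, r): higher means 'more plausible'."""
    return -np.linalg.norm(ent[l] + rel[t] - ent[r])

def margin_loss(triple, corrupted, gamma=1.0):
    """Margin-based criterion: the observed triple should outscore the
    corrupted one by at least gamma (hinge loss to be minimized)."""
    l, t, r = triple
    lc, tc, rc = corrupted
    return max(0.0, gamma + score(lc, tc, rc) - score(l, t, r))

def nll_loss(triple, corrupted):
    """Negative-sampling criterion: f is a sigmoid of the score, read as the
    probability that the triple comes from the true distribution D."""
    f = lambda l, t, r: 1.0 / (1.0 + np.exp(-score(l, t, r)))
    l, t, r = triple
    lc, tc, rc = corrupted
    return -np.log(f(l, t, r)) - np.log(1.0 - f(lc, tc, rc))

# Example: corrupt the right entity of an observed triple at random.
observed = (3, 1, 7)
corrupted = (3, 1, rng.integers(n_entities))
print(margin_loss(observed, corrupted), nll_loss(observed, corrupted))
```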

Many models have been proposed (see Yang et al., 2014[1] for a review); a small code sketch of several of these scoring functions follows the list:

  1. Unstructured[2]: Treats all relations identically: the relation is ignored and a triple is scored only by how close the two entity embeddings are
  2. RESCAL[3]: Each relation is represented by a matrix $M_t$ and triples are scored bilinearly: $g(l,t,r) = \mathbf{l}^\top M_t\, \mathbf{r}$
  3. SE[4]: A relation is represented by two matrices, working as linear transformations on each side; a triple is scored by the negative distance between the two transformed entities: $g(l,t,r) = -\|W_t^{\mathrm{left}}\,\mathbf{l} - W_t^{\mathrm{right}}\,\mathbf{r}\|_1$
  4. SME(LINEAR)[2]: A relation is represented by an embedding vector that is combined linearly with each side; the semantic matching energy function then compares the two sides: $g(l,t,r) = \big(W_{u1}\mathbf{l} + W_{u2}\mathbf{t} + b_u\big)^\top \big(W_{v1}\mathbf{r} + W_{v2}\mathbf{t} + b_v\big)$
  5. SME(BILINEAR)[2]: The representation of relations stays the same, but the combination weights become rank-3 tensors multiplied with the relation vector along the third mode: $g(l,t,r) = \big((W_u \bar{\times}_3 \mathbf{t})\,\mathbf{l} + b_u\big)^\top \big((W_v \bar{\times}_3 \mathbf{t})\,\mathbf{r} + b_v\big)$
  6. LFM[5]: A latent factor model in which triples are scored bilinearly, $g(l,t,r) = \mathbf{l}^\top M_t\, \mathbf{r}$, with the relation matrices $M_t$ constrained to be combinations of a shared set of low-rank factors
  7. TransE[6]: A relation is a translation from the left-hand side to the right-hand side: $g(l,t,r) = -\|\mathbf{l} + \mathbf{t} - \mathbf{r}\|$ (using the $L_1$ or $L_2$ norm)
  8. TransM[7]: Scales the translation-based score with a relation-specific weight $w_t$ reflecting the relation's mapping properties (e.g. one-to-many vs. many-to-one): $g(l,t,r) = -w_t\,\|\mathbf{l} + \mathbf{t} - \mathbf{r}\|$
  9. Neural tensor network[8]: Relations are represented as rank-3 tensors and the score consists of two parts: a "tensor" part that takes the bilinear tensor product of the two entities with the relation tensor, and a "neural" part that adds a linear combination of the entities: $g(l,t,r) = \mathbf{u}_t^\top \tanh\big(\mathbf{l}^\top T_t\, \mathbf{r} + W_t\,[\mathbf{l}; \mathbf{r}] + b_t\big)$
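
The following is a small, hedged sketch of a few of these scoring functions in NumPy, meant only to make the shapes and operations concrete; the parameter names, dimensions, and choice of norms are assumptions for illustration rather than the authors' exact parameterizations.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5                                           # embedding dimension (illustrative)
l, r = rng.normal(size=d), rng.normal(size=d)   # left / right entity vectors
t = rng.normal(size=d)                          # relation vector (SME, TransE, TransM)

# SE: the relation is two matrices acting on each side; score is a negative distance.
W_left, W_right = rng.normal(size=(d, d)), rng.normal(size=(d, d))
se = -np.linalg.norm(W_left @ l - W_right @ r, ord=1)

# SME(linear): combine each side with the relation vector, then take a dot product.
Wu1, Wu2, Wv1, Wv2 = (rng.normal(size=(d, d)) for _ in range(4))
sme_linear = (Wu1 @ l + Wu2 @ t) @ (Wv1 @ r + Wv2 @ t)

# TransE: the relation translates the left entity toward the right one.
transe = -np.linalg.norm(l + t - r)

# TransM: the same translation score, scaled by a per-relation weight w_t.
w_t = 0.5
transm = -w_t * np.linalg.norm(l + t - r)

# Neural tensor network: a bilinear ("tensor") term plus a linear ("neural") term.
k = 3                                   # number of tensor slices (illustrative)
T = rng.normal(size=(k, d, d))          # relation-specific rank-3 tensor
W = rng.normal(size=(k, 2 * d))
u, b = rng.normal(size=k), rng.normal(size=k)
ntn = u @ np.tanh(np.einsum('i,kij,j->k', l, T, r) + W @ np.concatenate([l, r]) + b)

print(se, sme_linear, transe, transm, ntn)
```

Training any of these models then plugs the chosen scoring function into one of the two criteria described above.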

Knowledge base + Text

Wang et al. (2014)[9] create two embedding models, one for entities and one for words, and align them using Wikipedia anchors or entity names.

Bordes et al. (2012)[10] train embeddings on knowledge base completion and word sense disambiguation simultaneously, thereby making use of both knowledge bases and corpora.
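
As a rough, illustrative sketch of the joint-embedding idea (not the exact objective of either paper), one can think of an alignment term that pulls a word embedding toward the embedding of the entity it names, optimized together with the knowledge base and text objectives; all names and shapes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
entity_vec = {"cat": rng.normal(size=d)}   # embeddings from the knowledge base model
word_vec = {"cat": rng.normal(size=d)}     # embeddings from the text (corpus) model

def alignment_loss(anchor_pairs):
    """Pull the word embedding and the entity embedding of each anchor pair together.
    In practice this term would be minimized jointly with the KB and text objectives."""
    return sum(np.sum((word_vec[w] - entity_vec[e]) ** 2)
               for w, e in anchor_pairs)

# A Wikipedia anchor text "cat" linking to the entity `cat` gives one alignment pair.
print(alignment_loss([("cat", "cat")]))
```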

References

  1. Yang, B., Yih, W., He, X., Gao, J., & Deng, L. (2014). Embedding Entities and Relations for Learning and Inference in Knowledge Bases. arXiv:1412.6575. Retrieved from http://arxiv.org/abs/1412.6575
  2. A. Bordes, X. Glorot, J. Weston, and Y. Bengio. A semantic matching energy function for learning with multi-relational data. Machine Learning, 2013.
  3. M. Nickel, V. Tresp, and H.-P. Kriegel. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning (ICML), 2011.
  4. A. Bordes, J. Weston, R. Collobert, and Y. Bengio. Learning structured embeddings of knowledge bases. In Proceedings of the 25th Annual Conference on Artificial Intelligence (AAAI), 2011.
  5. R. Jenatton, N. Le Roux, A. Bordes, G. Obozinski, et al. A latent factor model for highly multi-relational data. In Advances in Neural Information Processing Systems (NIPS 25), 2012.
  6. Bordes, A., Usunier, N., Weston, J., & Yakhnenko, O. (2013). Translating Embeddings for Modeling Multi-relational Data. In Advances in Neural Information Processing Systems (NIPS 26), pp. 1–9.
  7. Miao Fan, Qiang Zhou, Emily Chang, Thomas Fang Zheng. Transition-based Knowledge Graph Embedding with Relational Mapping Properties. PACLIC'14
  8. R. Socher, D. Chen, C. D. Manning, and A. Y. Ng. Learning new facts from knowledge bases with neural tensor networks and semantic word vectors. In Advances in Neural Information Processing Systems (NIPS 26), 2013.
  9. Wang, Z., Zhang, J., Feng, J., & Chen, Z. (2014). Knowledge Graph and Text Jointly Embedding. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics. Retrieved from http://research.microsoft.com/apps/pubs/default.aspx?id=228269
  10. Bordes, A., & Weston, J. (2012). Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing. In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 127–135.