Sources of information
- Lexical resources
- WordNet
- Ontologies
- Corpora
Theoretical considerations
- Identity: maximal score for identical concepts
Triangle inequality
Triangle inequality: if A is close to B and B is close to C, then A and C cannot be too far apart.[1][2] The triangle inequality is one of the metric axioms. If it does not hold, a distance measure is not a proper metric.
Tversky argued that the triangle inequality is not valid,[1] but Rada et al. (1989)[3] showed that his examples were inconsistent.
Lin (1998)[4] also argued that the triangle inequality was undesirable, but he used an artificial and limited example.
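The triangle inequality for a distance d requires d(A, C) ≤ d(A, B) + d(B, C) for every triple of concepts. As a minimal sketch of how one might test a candidate measure empirically, the Python snippet below enumerates all ordered triples and reports the ones that violate the inequality. The function name `triangle_violations`, the tolerance parameter, and the toy distances between "cat", "dog" and "wolf" are all hypothetical and chosen only for illustration; they are not taken from any of the cited papers.

```python
from itertools import permutations


def triangle_violations(items, dist, tol=1e-9):
    """Return ordered triples (a, b, c) where dist(a, c) > dist(a, b) + dist(b, c)."""
    return [
        (a, b, c)
        for a, b, c in permutations(items, 3)
        if dist(a, c) > dist(a, b) + dist(b, c) + tol
    ]


# Hypothetical, hand-picked symmetric distances (illustration only).
toy = {
    frozenset(("cat", "dog")): 1.0,
    frozenset(("dog", "wolf")): 1.0,
    frozenset(("cat", "wolf")): 3.0,
}


def toy_dist(x, y):
    # Identity: an item is at distance 0 from itself.
    return 0.0 if x == y else toy[frozenset((x, y))]


violations = triangle_violations(["cat", "dog", "wolf"], toy_dist)
for a, b, c in violations:
    print(f"d({a}, {c}) > d({a}, {b}) + d({b}, {c})")
```

With these toy values, the cat-to-wolf distance (3.0) exceeds the detour through dog (1.0 + 1.0), so the measure would not be a proper metric. A check like this can only find counterexamples on the items it is given; an empty result does not prove that the inequality holds in general.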
Similarity measures
Purely WordNet
Purely corpus-based
Hybrid
Applications
- Semantic Role Labeling: Fuerstenau and Lapata (2012)[5]
- Textual Entailment: Berant et al. (2012)[6]
- Question Answering: Surdeanu et al. (2011)[7]
References
- ↑ 1.0 1.1 Tversky, Amos (1977). "Features of Similarity" (PDF). Psychological Review 84 (4): 327–352.
- ↑ There is also a "reverse triangle inequality" for similarity: the similarity of A to C is greater than the sum of the similarity of A to B and the similarity of B to C. However, it has been shown not to hold (Rada et al., 1989).
- ↑ Rada, R., Mili, H., Bicknell, E., & Blettner, M. (1989). Development and application of a metric on semantic nets. Systems, Man and Cybernetics, IEEE Transactions on, 19(1), 17-30.
- ↑ Lin, Dekang. 1998. An information-theoretic definition of similarity (PDF). In Proceedings of the 15th International Conference on Machine Learning, pages 296–304, July.
- ↑ Hagen Fuerstenau and Mirella Lapata. Semisupervised semantic role labeling via structural alignment. Computational Linguistics, 38(1): 135–171, 2012.
- ↑ Jonathan Berant, Ido Dagan, and Jacob Goldberger. Learning entailment relations by global graph structure optimization. Computational Linguistics, 38(1):73–111, 2012.
- ↑ Mihai Surdeanu, Massimiliano Ciaramita, and Hugo Zaragoza. Learning to rank answers to non-factoid questions from web collections. Computational Linguistics, 37(2):351–383, 2011.