Similarity, association and relatedness

Similarity, association and relatedness are closely related but distinct concepts in lexical semantics. They are a major source of confusion: many researchers have used the terms interchangeably, and some datasets have even been constructed without disentangling the concepts. This page highlights the differences between them and surveys the literature that deals with them, separately or as a mixture.

Contrasting similarity, association and relatedness

Relatedness is more general than similarity

"It’s important to note that semantic relatedness is a more general concept than similarity; similar entities are semantically related by virtue of their similarity (bank–trust company), but dissimilar entities may also be semantically related by lexical relationships such as meronymy (car–wheel) and antonymy (hot–cold), or just by any kind of functional relationship or frequent association (pencil–paper, penguin–Antarctica, rain–flood)."^[1]

"Cars and gasoline would seem to be more closely related than, say, cars and bicycles, but the latter pair are certainly more similar."^[2]

Similarity is different from association

Relatedness includes similarity, association and many other relations

"These relationships include not just hyponymy and the non-hyponymy relationships in WordNet such as meronymy but also associative and ad hoc relationships. As mentioned in the introduction, these can include just about any kind of functional relation or frequent association in the world.

Table 6 in Budanitsky & Hirst (2006)
Name	Example
IS-USED-TO	bed–sleep
WORKS-IN	judge–court
LIVES-IN	camel–desert
IS-THE-OUTSIDE-OF	husk–corn

For the last century, many researchers have attempted to enumerate these kinds of relationships. Some elements from a typical list (that of Spellman, Holyoak, and Morrison (2001)) are shown in Table 6. Morris and Hirst (2004; 2005) have termed these non-classical lexical semantic relationships (following Lakoff’s (1987) non-classical categories), and Morris has shown in experiments with human subjects that around 60% of the lexical relationships that readers perceive in a text are of this nature (Morris, 2006). [...]

But lists of such relationships can never be exhaustive, as lexical relationships can also arise ad hoc in context (Barsalou, 1983; Barsalou, 1989), in particular as co-membership of an ad hoc category. For example, Morris’s subjects reported that the words sex, drinking, and drag racing were semantically related, by all being “dangerous behaviors”, in the context of an article about teenagers emulating what they see in movies. Thus lexical semantic relatedness is sometimes constructed in context and cannot always be determined purely from an a priori lexical resource such as WordNet. It’s very unclear how ad hoc semantic relationships could be quantified in any meaningful way, let alone compared with prior quantifications of the classical and non-classical relationships."^[1]

Evaluating

Theoretical examination "of a proposed measure for those mathematical properties thought desirable, such as whether it is a metric (or the inverse of a metric), whether it has singularities, whether its parameter-projections are smooth functions, and so on."^[1] See Wei (1993)^[3]; Lin (1998)^[4]
Comparison with human judgments: "human judgments of similarity and relatedness are deemed to be correct by definition, this clearly gives the best assessment of the ‘goodness’ of a measure. Its main drawback lies in the difficulty of obtaining a large set of reliable, subject-independent judgments for comparison—designing a psycholinguistic experiment, validating its results, and so on."^[1]
Application (external evaluation): "evaluate the measures with respect to their performance in the framework of a particular application. If some particular NLP system requires a measure of semantic relatedness, we can compare different measures by seeing which one the system is most effective with, while holding all other aspects of the system constant."^[1]
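
Comparison with human judgments is typically reported as a correlation between a measure's scores and the human ratings for the same word pairs. The following is a minimal sketch in Python, not the procedure of any cited paper. The word pairs, ratings and scores are invented for illustration, and the helper names `pearson`, `ranks` and `spearman` are assumptions of this sketch.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical data, invented for illustration only.
pairs = [("car", "automobile"), ("car", "bicycle"),
         ("car", "gasoline"), ("car", "noon")]
human_scores = [3.9, 3.0, 1.5, 0.2]    # made-up human ratings
measure_scores = [1.0, 0.5, 0.6, 0.1]  # made-up output of some measure

print("Pearson: ", pearson(human_scores, measure_scores))
print("Spearman:", spearman(human_scores, measure_scores))
```

A rank correlation only asks whether the measure orders the pairs as the human subjects do, which may be preferable when the two scales are not comparable.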

Desiderata

Not subject to the triangle inequality^[1]
Identity: maximal score for identical concepts
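
Both properties can be checked mechanically for a given measure. The sketch below is illustrative only: the scores are invented, the conversion `distance = 1 - relatedness` is an assumption of this sketch, and `triangle_violations` is a hypothetical helper. It shows how two strongly related pairs (car–gasoline, gasoline–fire) can coexist with a weakly related pair (car–fire), which no metric distance would allow.

```python
from itertools import permutations

# Hypothetical relatedness scores in [0, 1], invented for illustration.
SCORES = {
    frozenset(["car", "gasoline"]): 0.8,
    frozenset(["gasoline", "fire"]): 0.8,
    frozenset(["car", "fire"]): 0.1,
}

def relatedness(a, b):
    """Symmetric toy measure; identical concepts get the maximal score 1.0."""
    if a == b:
        return 1.0
    return SCORES[frozenset([a, b])]

def distance(a, b):
    """Assumed conversion from a relatedness score to a distance."""
    return 1.0 - relatedness(a, b)

def triangle_violations(words):
    """Triples (a, b, c) for which d(a, c) > d(a, b) + d(b, c)."""
    eps = 1e-12  # tolerance for floating-point rounding
    return [
        (a, b, c)
        for a, b, c in permutations(words, 3)
        if distance(a, c) > distance(a, b) + distance(b, c) + eps
    ]

words = ["car", "gasoline", "fire"]
violations = triangle_violations(words)
for a, b, c in violations:
    print(f"d({a}, {c}) > d({a}, {b}) + d({b}, {c})")
```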

Some approaches

Path length
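
Path-length approaches score two concepts by the number of edges between them in a taxonomy such as WordNet. The sketch below is a minimal illustration under stated assumptions: the is-a hierarchy `PARENT` is a toy invented for this page (it is not WordNet), and the scoring function `1 / (1 + path length)` is one common choice rather than the definition used by any particular cited paper.

```python
# Hypothetical is-a taxonomy (child -> parent), invented for illustration.
PARENT = {
    "car": "motor_vehicle",
    "truck": "motor_vehicle",
    "motor_vehicle": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "artifact",
    "gasoline": "fuel",
    "fuel": "substance",
    "artifact": "entity",
    "substance": "entity",
}

def ancestors(concept):
    """Chain from a concept up to the root, starting with the concept itself."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def path_length(a, b):
    """Number of is-a edges on the shortest path between a and b."""
    steps_from_b = {node: steps for steps, node in enumerate(ancestors(b))}
    for steps_from_a, node in enumerate(ancestors(a)):
        if node in steps_from_b:
            # First shared ancestor on the way up from a is the lowest one.
            return steps_from_a + steps_from_b[node]
    raise ValueError("concepts share no ancestor")

def path_similarity(a, b):
    """Similarity that decreases with path length; 1.0 for identical concepts."""
    return 1.0 / (1.0 + path_length(a, b))

for pair in [("car", "truck"), ("car", "bicycle"), ("car", "gasoline")]:
    print(pair, path_length(*pair), path_similarity(*pair))
```

In this toy hierarchy car and bicycle are closer than car and gasoline, which matches the point of the cars, gasoline and bicycles quotation above: a taxonomy-based path captures similarity, not the functional relatedness of car and gasoline.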

Information content
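
Information-content approaches, following Resnik (1995)^[2], score two concepts by the information content, -log p(c), of the most informative concept that subsumes both, where p(c) is the probability of encountering an instance of c. The sketch below is a minimal illustration under stated assumptions: the taxonomy `PARENT` and the corpus counts `COUNTS` are invented for this page, and the function names are hypothetical.

```python
from math import log

# Hypothetical is-a taxonomy (child -> parent), invented for illustration.
PARENT = {
    "car": "motor_vehicle",
    "truck": "motor_vehicle",
    "motor_vehicle": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "artifact",
    "gasoline": "fuel",
    "fuel": "substance",
    "artifact": "entity",
    "substance": "entity",
}

# Hypothetical corpus counts for the leaf concepts, invented for illustration.
COUNTS = {"car": 40, "truck": 10, "bicycle": 10, "gasoline": 20}

def ancestors(concept):
    """Chain from a concept up to the root, starting with the concept itself."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def cumulative_counts():
    """Each occurrence of a concept also counts for all of its ancestors."""
    totals = {}
    for concept, n in COUNTS.items():
        for node in ancestors(concept):
            totals[node] = totals.get(node, 0) + n
    return totals

TOTALS = cumulative_counts()
ROOT_TOTAL = TOTALS["entity"]

def information_content(concept):
    """-log p(c), written as log(N / freq(c)); the root has content 0."""
    return log(ROOT_TOTAL / TOTALS[concept])

def resnik_similarity(a, b):
    """Information content of the most informative common subsumer."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(information_content(c) for c in common)

for pair in [("car", "truck"), ("car", "bicycle"), ("car", "gasoline")]:
    print(pair, round(resnik_similarity(*pair), 3))
```

With these toy counts, car and gasoline share only the root, so their score is 0, while car and truck share the more specific (and hence more informative) concept motor_vehicle.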

References

[1] Budanitsky, A., & Hirst, G. (2006). Evaluating WordNet-based Measures of Lexical Semantic Relatedness. Computational Linguistics, 32(1), 13–47. doi:10.1162/coli.2006.32.1.13
[2] Resnik, P. (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal, Canada, August.
[3] Wei, M. (1993). An analysis of word relatedness correlation measures. Master’s thesis, University of Western Ontario, London, Ontario, May.
[4] Lin, D. (1998). An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, pages 296–304, July.
