Natural Language Understanding Wiki

Martínez et al. (2002)[1] use syntactic features to improve the precision of word sense disambiguation (WSD), at some cost in recall.
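The precision/recall trade-off can be made concrete with a small sketch. It assumes the usual Senseval-style convention, in which precision is computed over the instances a system attempts and recall over all instances, so a system that abstains on uncertain cases can gain precision while losing recall. The counts below are hypothetical and are not results from any of the cited papers.

```python
def precision_recall(n_correct, n_attempted, n_total):
    """Precision over attempted instances, recall over all instances."""
    precision = n_correct / n_attempted if n_attempted else 0.0
    recall = n_correct / n_total if n_total else 0.0
    return precision, recall


# Hypothetical counts, for illustration only.
# A system that answers every instance:
full_coverage = precision_recall(n_correct=60, n_attempted=100, n_total=100)
# A system that answers only when a reliable syntactic feature fires:
high_precision = precision_recall(n_correct=36, n_attempted=40, n_total=100)

print(full_coverage)
print(high_precision)
```

In this toy setting the second system is more precise on the instances it answers but recovers fewer correct senses overall.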

From Martínez et al. (2002)[1]:

"Ng (1996) uses a basic set of features similar to those defined by Yarowsky, but they also use syntactic information: verb-object and subject-verb relations. [...]

Stetina et al. (1998) achieve good results with syntactic relations as features. They use a measure of semantic distance based on WordNet to find similar features. The features are extracted using a statistical parser (Collins, 1996), and consist of the head and modifiers of each phrase.

[Senseval-2 workshop] Regarding syntactic information, in the Japanese tasks, several groups relied on dependency trees to extract features that were used by different models (SVM, Bayes, or vector space models). For the English tasks, the team from the University of Sussex extracted selectional preferences based on subject-verb and verb-object relations. The John Hopkins team applied syntactic features obtained using simple heuristic patterns and regular expressions. Finally, WASP-bench used finite-state techniques to create a grammatical relation database, which was later used in the disambiguation process. The papers in the proceedings do not provide specific evaluation of the syntactic features, and it is difficult to derive whether they were really useful or not."
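As a rough illustration of the subject-verb and verb-object features mentioned in the excerpt above, the sketch below turns dependency arcs into feature strings for a target word. The relation labels ("nsubj", "obj"), the Arc type, the function name, and the example sentence are all assumptions made for illustration; the arcs are written by hand rather than produced by any of the parsers the cited systems used.

```python
from collections import namedtuple

# One dependency arc: relation label, head word, dependent word.
# Labels and triples are hand-written assumptions, not parser output.
Arc = namedtuple("Arc", ["rel", "head", "dep"])


def syntactic_features(target, arcs, relations=("nsubj", "obj")):
    """Return feature strings for the selected relations that involve `target`."""
    features = []
    for arc in arcs:
        if arc.rel not in relations:
            continue
        if arc.head == target:
            # target governs the other word
            features.append(f"{arc.rel}_dep={arc.dep}")
        elif arc.dep == target:
            # target is governed by the other word
            features.append(f"{arc.rel}_head={arc.head}")
    return features


# Hypothetical parse of "The bank approved the loan".
arcs = [
    Arc("det", "bank", "The"),
    Arc("nsubj", "approved", "bank"),
    Arc("det", "loan", "the"),
    Arc("obj", "approved", "loan"),
]

print(syntactic_features("bank", arcs))      # ['nsubj_head=approved']
print(syntactic_features("approved", arcs))  # ['nsubj_dep=bank', 'obj_dep=loan']
```

Matching words by string is a simplification; a fuller version would identify tokens by position so that repeated words are not confused.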

From Agirre and Edmonds (2007)[2]:

"Yarowsky et al. (2001) [...] For training the VSM component, they applied a rich set of features (including syntactic information).

Regarding the English all-words task at Senseval-3, 20 systems competed, [...] Most of the participant systems included rich features in their models, especially syntactic dependencies and domain information."

Also from Agirre and Edmonds (2007)[2]:

"Subcategorization. Directly encodes KS 4. Details of a word’s subcategorization behavior are most easily obtained from tagged corpora using a robust parser (e.g., Minipar (Lin 1993) or RASP (Carroll and Briscoe 2001)). Martínez et al. (2002) used Minipar to derive subcategorization information for verbs from tagged corpora. For instance, from The unfortunate hiker fell-1 into a crevasse2 we can derive that the first sense of the verb to fall allows for a subject but no other arguments. Some dictionaries (e.g., LDOCE and WordNet) list information about the syntactic behavior of words although this has not been extensively used in WSD.

Syntactic Dependencies. This feature encodes syntagmatic relations (KS 6b). The dependencies of a particular word sense can be extracted from a corpus which is parsed and tagged with word senses (Lin 1997, Yarowsky and Florian 2002)."
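The subcategorization idea in the excerpt above can be sketched as collecting, for each verb sense, the argument frames seen in a parsed, sense-tagged corpus. Everything in the snippet is an assumption made for illustration: the relation labels, the set of labels treated as arguments, the function name, and the decision to treat "into a crevasse" as a modifier (which is consistent with the quoted reading that fall-1 takes a subject and no other arguments). It does not reproduce Minipar output.

```python
from collections import defaultdict

# Assumed set of relation labels that count as verb arguments.
ARGUMENT_RELATIONS = {"subj", "obj", "iobj"}


def subcat_frames(tagged_instances):
    """Map each verb sense to the set of argument frames observed for it."""
    frames = defaultdict(set)
    for sense, relations in tagged_instances:
        # Keep only argument relations; modifiers do not enter the frame.
        frame = tuple(sorted(r for r in relations if r in ARGUMENT_RELATIONS))
        frames[sense].add(frame)
    return dict(frames)


# Each instance: (sense-tagged verb, relations attached to that verb).
instances = [
    # "The unfortunate hiker fell-1 into a crevasse" (example from the excerpt)
    ("fall-1", ["subj", "mod"]),
    # Hypothetical second instance: "The hiker fell-1"
    ("fall-1", ["subj"]),
]

print(subcat_frames(instances))  # {'fall-1': {('subj',)}}
```

A frame observed for a sense can then be used as a feature: a new occurrence of the verb with a direct object would not match the frame recorded for fall-1 here.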

From Gaustad (2004)[3], chapter 7 (WSD for Dutch): "including dependency relations as sole feature already performs significantly better than the baseline classifier which proves that deep syntactic knowledge provides valuable information for disambiguation".


  1. Martínez, David, Eneko Agirre, and Lluís Màrquez. "Syntactic features for high precision word sense disambiguation." Proceedings of the 19th International Conference on Computational Linguistics, Volume 1. Association for Computational Linguistics, 2002.
  2. Agirre, Eneko, and Philip Edmonds, eds. Word sense disambiguation: Algorithms and applications. Vol. 33. Springer Science & Business Media, 2007.
  3. Gaustad, Tanja. Linguistic knowledge and word sense disambiguation. Diss. Rijksuniversiteit Groningen, 2004.