In the 1990s, natural language understanding (NLU) was widely regarded as a "monolithic problem", as Abney (1997)[1] describes:

"The initial impetus for the current popularity of statistical methods in computational linguistics was provided in large part by the papers on part-of-speech tagging by Church [20], DeRose [25], and Garside [34]. In contradiction to common wisdom, these taggers showed that it was indeed possible to carve part-of-speech disambiguation out of the apparently monolithic problem of natural language understanding, and solve it with impressive accuracy.

The consensus at the time was that part-of-speech disambiguation could only be done as part of a global analysis, including syntactic analysis, discourse analysis, and even world knowledge. For instance, to correctly disambiguate help in give John helpN versus let John helpV, one apparently needs to parse the sentences, making reference to the differing subcategorization frames of give and let. Similar examples show that even world knowledge must be taken into account. For instance, off is a preposition in I turned off highway I-90, but a particle in I turned off my radio, so assigning the correct part of speech in I turned off the spectroroute depends on knowing whether spectroroute is the name of a road or the name of a device.

Such examples do demonstrate that the problem of part-of-speech disambiguation cannot be solved without solving all the rest of the natural-language understanding problem. But Church, DeRose and Garside showed that, even if an exact solution is far beyond reach, a reasonable approximate solution is quite feasible."
