[drink:60, green:55, milk:40, sip:30, enjoy:10, . . .]. Each entry shows a word that appeared within five words before or after tea, together with the number of times the pair was seen in a large text collection. Keeping only the occurrence counts of context words makes the representation even more convenient, because various standard (geometric) approaches exist for measuring the distance between numeric vectors. In this manner, a machine can compute the similarity between any two words.
Here is an example from Pantel and Lin (2002) of the 15 words most similar to wine computed by this approach:
Wine: beer, white wine, red wine, Chardonnay, champagne, fruit, food, coffee, juice, Cabernet, cognac, vinegar, Pinot noir, milk, vodka, . . .
The list may not look immediately useful, but it is certainly impressive if one considers how little similarity there is among the letter sequences wine, beer, and Chardonnay.
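The vector comparison described above can be illustrated with a minimal sketch. It assumes cosine similarity as the geometric measure, which is one common choice and not necessarily the one used by Pantel and Lin (2002). The counts for tea come from the text; the vectors for coffee and car are invented for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors stored as dicts."""
    # Only context words present in both vectors contribute to the dot product.
    shared = set(u) & set(v)
    dot = sum(u[w] * v[w] for w in shared)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Context counts for "tea" are taken from the text above.
tea = {"drink": 60, "green": 55, "milk": 40, "sip": 30, "enjoy": 10}
# The next two vectors are hypothetical, made up for this example.
coffee = {"drink": 70, "milk": 35, "sip": 25, "cup": 50, "black": 20}
car = {"drive": 80, "park": 40, "enjoy": 5, "green": 3}

print("tea vs. coffee:", cosine_similarity(tea, coffee))
print("tea vs. car:   ", cosine_similarity(tea, car))
```

With these toy counts, tea comes out far closer to coffee than to car, because tea and coffee share frequent context words such as drink and milk. Ranking every word in the vocabulary by such a score is what produces a most-similar list like the one shown for wine.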
Building upon these representations, it has become possible to automatically discover words with multiple senses by clustering the words similar to them (plant: (plant, factory, facility, refinery) (shrub, ground cover, perennial, bulb)), and to find synonyms and antonyms. To aid analysis of customer reviews, researchers at Google developed a large lexicon of almost 200,000 positive and negative words and phrases, identified through their similarity to a handful of predefined positive or negative words such as excellent, amazing, bad, horrible. Among the positive phrases in the automatically constructed lexicon were cute, fabulous, top of the line, melt in your mouth; negative examples included subpar, crappy, out of touch, sick to my stomach (Velikovich et al., 2010).
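The idea of labeling phrases by their similarity to a few seed words can be sketched as follows. This is a deliberately simplified illustration, not the procedure of Velikovich et al. (2010), whose details are not given here. The scoring rule (mean similarity to positive seeds minus mean similarity to negative seeds) and all context counts are assumptions made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in set(u) & set(v))
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Seed words named in the text.
POSITIVE_SEEDS = ["excellent", "amazing"]
NEGATIVE_SEEDS = ["bad", "horrible"]

# Hypothetical context-count vectors; a real system would collect
# these counts from a large text collection.
VECTORS = {
    "excellent": {"service": 30, "food": 25, "really": 10},
    "amazing": {"food": 20, "view": 15, "really": 12},
    "bad": {"service": 20, "smell": 15, "terribly": 10},
    "horrible": {"smell": 18, "experience": 12, "terribly": 8},
    "fabulous": {"food": 18, "view": 10, "really": 6},
    "subpar": {"service": 5, "smell": 9, "terribly": 7, "experience": 4},
}

def polarity(phrase):
    """Mean similarity to positive seeds minus mean similarity to negative seeds."""
    vec = VECTORS[phrase]
    pos = sum(cosine(vec, VECTORS[s]) for s in POSITIVE_SEEDS) / len(POSITIVE_SEEDS)
    neg = sum(cosine(vec, VECTORS[s]) for s in NEGATIVE_SEEDS) / len(NEGATIVE_SEEDS)
    return pos - neg

for phrase in ["fabulous", "subpar"]:
    label = "positive" if polarity(phrase) > 0 else "negative"
    print(phrase, label)
```

With these toy counts, fabulous receives a positive score and subpar a negative one. A lexicon of the size described above would need far more robust scoring, but the seed-and-similarity principle is the same.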
Another line of research in semantic processing exploits the stable meaning of some contexts. For example, a pattern like “X such as Y,” if it occurs often in texts, is very likely an indicator that Y is a kind of X (e.g., “Red wines such as Cabernet and Pinot noir . . .”). Similarly, a phrase like “The mayor of X” is a good indicator that X is a city. NELL (Never Ending Language Learning, http://rtw.ml.cmu.edu/rtw/) is a system that constantly learns unary and binary predicates, corresponding to categories and relations such as isCity(Philadelphia) and playsInstrument(George_Harrison, guitar). The learning of each type of fact starts with minimal supervision: the researchers supply several examples of category instances, or of entities between which a relation holds. Then the system starts an infinite loop in which it finds web pages that contain the examples, finds phrase patterns that typically occur with the examples, selects the best patterns that indicate the predicate with high probability, and then applies the patterns to new texts to discover more instances for which the predicate is true. Different flavors of this approach to machine understanding