In search of smarter search technology, two ISD grad students and their advisor won second-place honors recently in one of the competition tracks of the annual Text REtrieval Conference (TREC).
Soo-Min Kim, Deepak Ravichandran and advisor Eduard Hovy pitted their systems for information retrieval and natural language processing against those of some 34 other competitors in the event, which is sponsored by the National Institute of Standards and Technology (NIST).
The challenge was to scan the text of news stories and identify sentences that either offered opinions or described events. The competitors were given 26 topics (such as abortion or drugs) and told to extract the target data from 50 newspaper articles on each topic.
Kim designed her program to take the texts on a given topic (e.g. “abortion”), read through each one sentence by sentence and output either Yes (opinion-bearing) or No (not opinion-bearing) for each sentence.
Ravichandran constructed his program to identify sentences that contained events relevant to a given topic. In this case, the topic was an earthquake in Afghanistan, and the program was set to read each sentence and determine whether it described other events relevant to the earthquake.
The output of the competitors’ programs was then scored on precision (what share of the Yes answers were in fact correct) and recall (what share of the full set of correct Yes answers the program logged). Both scores were computed against a human-scored reading of each text.
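For readers who want to see how the two scores relate, here is a minimal sketch in Python. It assumes each sentence carries a system label and a human label, both Yes or No; the function name and the sample labels are hypothetical and are not taken from the competition or from either student's program.

```python
def precision_recall(predicted, gold):
    """Compute precision and recall for Yes/No sentence labels.

    predicted, gold: equal-length lists of booleans (True = Yes).
    """
    if len(predicted) != len(gold):
        raise ValueError("label lists must have the same length")
    # Sentences the system marked Yes that the human also marked Yes.
    true_pos = sum(1 for p, g in zip(predicted, gold) if p and g)
    predicted_yes = sum(predicted)  # all Yes answers from the system
    gold_yes = sum(gold)            # all Yes answers from the human
    precision = true_pos / predicted_yes if predicted_yes else 0.0
    recall = true_pos / gold_yes if gold_yes else 0.0
    return precision, recall


# Hypothetical labels for five sentences (illustration only).
system_labels = [True, True, False, True, False]
human_labels = [True, False, True, True, True]

precision, recall = precision_recall(system_labels, human_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up example the system says Yes three times and is right twice, while the human marked four sentences Yes, so precision is two out of three and recall is two out of four.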
“In both cases, the technology will form part of future QA systems, which is the direction in which Web search engines are evolving,” says Hovy.
“If you ask Google today, ‘Who likes Al Qaeda?’ all you get is a list of documents about Al Qaeda – it has no idea about like or dislike or opinions in general,” he says. “Soo-Min’s system will help give you real answers.
“Similarly, Deepak’s will help answer questions like ‘What happened after the earthquake?’ or ‘Tell me about Einstein’s life in Berne.’ The QA system needs to know what constitutes an event.”
Published on December 11th, 2003
Last updated on August 9th, 2021