This is a link to the Second Recognizing Textual Entailment Challenge.
http://www.pascal-network.org/Challenges/RTE2
Entailment is a relation in which the truth of one statement guarantees the truth of another. Here is an example from O’Grady (2005).
The park wardens killed the bear. ==> The bear is dead.
If it is true that the park wardens killed the bear, then it must also be true that the bear is dead.
This challenge involves a list of sentence pairs. The object is to write a program that separates the pairs where one sentence entails the other from the pairs that exhibit no entailment. See the link for the samples as well as links to entries to the challenge.
After I saw this challenge, I started thinking about applications for the ability to perform this entailment test. For example, imagine a web search that returns a list of pages. From each of the pages, take the sentences that contain the words in the original search. If the search was for ‘republican crisis’, then each sentence returned would contain those words. On this list of sentences, we perform the entailment test. If there are 1000 sentences, then we are doing roughly 500,000 sentence comparisons (499,500 unordered pairs, and twice that if each direction is tested separately; probably not very efficient, but ignore that for now).
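To put a number on that comparison count, here is a small Python sketch. The function name `count_comparisons` is made up for illustration; it simply applies the n(n-1)/2 formula for unordered pairs and doubles it when each direction of entailment is tested separately.

```python
from itertools import combinations


def count_comparisons(n, directional=False):
    """Number of sentence comparisons among n sentences.

    directional=False counts each unordered pair once;
    directional=True counts A-vs-B and B-vs-A as two tests.
    """
    unordered = n * (n - 1) // 2
    return 2 * unordered if directional else unordered


# Cross-check the formula by actually enumerating the pairs for a small n.
assert count_comparisons(5) == len(list(combinations(range(5), 2)))

print(count_comparisons(1000))                    # 499500
print(count_comparisons(1000, directional=True))  # 999000
```

The quadratic growth is the real problem: ten times as many sentences means about a hundred times as many comparisons.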
After the sentence comparisons, we can sort based on which sentences entailed which ones. This will create groups of sentences where entailment was detected either one way or the other between each pair of sentences. Now that the search pages are grouped according to this entailment test, let the user pick which group to wade into first.
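The grouping step could be sketched roughly as follows. Everything here is an assumption for illustration: `toy_entails` is a crude word-overlap placeholder, not a real entailment detector, and `group_by_entailment` is a hypothetical helper. It also groups by connected components (a sentence joins a group if it is linked to any member), which is looser than requiring entailment between every pair in a group.

```python
from itertools import permutations


def toy_entails(text, hypothesis):
    """Placeholder only: true when every word of hypothesis occurs in text.
    A real RTE system would replace this function."""
    return set(hypothesis.lower().split()) <= set(text.lower().split())


def group_by_entailment(sentences, entails=toy_entails):
    """Group sentences linked by entailment in either direction,
    using union-find to build connected components."""
    parent = list(range(len(sentences)))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Test every ordered pair, since entailment is directional.
    for i, j in permutations(range(len(sentences)), 2):
        if entails(sentences[i], sentences[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, sentence in enumerate(sentences):
        groups.setdefault(find(i), []).append(sentence)
    return list(groups.values())


sentences = [
    "republicans caused the crisis",
    "republicans caused the crisis themselves",
    "republicans offered crisis relief",
    "republicans offered crisis relief quickly",
]
groups = group_by_entailment(sentences)
for group in groups:
    print(group)
```

With the toy test, the four made-up sentences fall into two groups, one about causing the crisis and one about offering relief. How well this would work on real pages depends entirely on the quality of the entailment test that gets plugged in.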
In other words, let the entailment test group the pages and then let the user choose among the groups. Continuing our ‘republican crisis’ example, maybe this entailment sorting method would be able to separate pages that talk about republicans who have caused their own crisis from pages that talk about a crisis that republicans were working to provide relief for.
Anyway, this is just a question to ponder at this point. This was only the second running of the entailment challenge, and there is still much to be learned in order to improve the accuracy of the entailment detection algorithms. And certainly, my suggestion of doing a many-to-many comparison between all of the pages returned from a search is not very practical. But the idea of being able to group the pages according to this entailment criterion is nevertheless very intriguing.