Archive for the ‘Local Development’ Category

Open Domain Question Answering

Monday, May 19th, 2008

As part of the CLMA program here at University of Washington, we were asked to investigate an area of interest and write a short summary of what we read about.

So I chose to look at the TREC Question Answering challenge and read about some of the techniques that were used for submissions to that evaluation.  The kinds of questions that are used for this include simple factoid questions such as “How many calories are in a Big Mac?”, list questions such as “Which past and present NFL players have the last name of Johnson?”, and definition question such as “What is a Golden Parachute?”.

Many of the system descriptions submitted to TREC have pipelined architectures.  Here are some examples of components that were described:

  • Question Type Classification – Decide what the answer should be like.  Is it asking for a name, a date, a description or a list, etc.
  • Question Rewrite – Convert the question to search terms that are used for the IR search.  This step typically expands the list of words used in the search by adding synonyms.  This gives a broader list of pages returned that can then be further analyzed.
  • IR Search – generally a simple search using established technologies.
  • Passage Selection – deciding which portions of the text returned by the IR search to include in the results.
  • Answer extraction and ranking – creating the actual text that will be returned as answers.

The paper is not intended to be a thorough description of the field.  It helped me get a handle on what current designs are being developed.

Here is the link to my exploration paper: QuestionAnsweringAtTrec

Entailment Revisited

Sunday, February 3rd, 2008

In a previous entry, I wrote about the Entailment Challenge. Entailment is a linguistic knowledge concept that concerns two sentences about an event or idea. It is said that sentence A ‘entails’ sentence B when all of the meaning in B is contained in A. Follow this link to see more discussion: Recognizing Entailment.

As part of the Master’s program in Computational Linguistics at the University of Washington, we are preparing for internship appointments in the summer of 2008. As part of that preparation, we were asked to pick a topic that concerns Computational Linguistics and write a short summary of a few papers that covered the topic. This is a PDF of that paper. PreInternship Topic

Comparing Bernoulli and Multinomial

Sunday, February 3rd, 2008

As part of our NLP statistical processing class in the CLMA program at the University of Washington, we did a comparison of Naive-Bayes learning algorithms. We compared the Bernoulli and Multinomial approaches to this problem and the results are shown in the table below.

This test was run by ‘training’ a classifier on a set of data instances. Each instance has a vector of features. In the Bernoulli case, we treat the features as binary. In the multinomial case, we treat the features as a numeric value from 0 to n where n is the number of instances the given word was found in the instance. Both sets of data have a ‘class’ assigned to them. After we train on the data, we check our system results with the actual class value assigned to each instance. This is where the percent accuracy comes from.

Of course these numbers are from a single test on one set of training and test data, but the fact that the Multinomial results are 91% accurate compared to 88% accurate for the Bernoulli is pretty telling. Also you can see that the elapsed runtime for the test is significantly different as well. The ‘cross_prob_delta’ is a parameter for tweaking the ‘add-one’ smoothing in the training stage.

Bernoull

Cross
prob
delta

Training Accuracy

Test Accuracy

Wall Clock (seconds)

0.1

0.9303

0.8800

303.23

0.5

0.9103

0.8633

318.10

1.0

0.8970

0.8400

305.45

2.0

0.8796

0.8233

305.56

Multinomial

Cross prob
delta

Training Accuracy

Test Accuracy

Wall Clock (seconds)

0.1

0.9570

0.9133

8.33

0.5

0.9503

0.9066

8.50

1.0

0.9448

0.9000

8.29

2.0

0.9400

0.8966

8.26

While we were working on this evaluation, we were brainstorming on other ways to compare the Bernoulli and Multinomial approaches. This is what we came up with.

  • Office chair comparison. Put signs on two contestants that are sitting in office chairs. The signs are ‘Bernoulli’ and ‘Multinomial’. The contestants race down the hallway without lifting themselves out of the chair. The first sign to the finish line is the method of choice.
  • Date getting comparison. Again, put signs on two contestants. Have them stand back to back in the center of the student lounge and randomly ask girls for dates. The sign that gets the most dates is the method of choice.

All kidding aside, the NLP stats processing class is challenging and fun. We are gaining insight into how these methods can be used for basic classification of data.

Diphones in Text To Speech

Saturday, January 5th, 2008

For a phonetics class that is part of the Master’s program at the University of Washington, I wrote a research paper on how Diphones are used in text to speech systems. Essentially, Diphones are portions of words that are extracted from a recording of words or sentences.

One of the main problems with Text To Speech systems is making them sound natural by varying the prosody of the output. Prosody is the term for the variation in pitch, duration and intensity that all people use when speaking an utterance. By splitting a recording into Diphones, the system can select from a list of candidates for each slot in the output. The system finds the Diphone candidate that is closest to the desired prosody.

Here is an image that showing the word ‘maybe’. There are four phones or segments ‘m’ ‘ay’ ‘b’ ‘e’. A Diphone is two halves of two adjacent phones. The middle of the phone is the most stable portion. By splitting the recording at the middle of each Diphone there is less disturbance at the joints between Diphones that are concatenated in the simulated speech output.
Maybe
Here is a link to the paper that describes the technique of using Diphones for text to speech systems.

Text To Speech Using Diphones.pdf

Machine Learning

Monday, May 14th, 2007

As part of a Machine Learning class, my lab partner and I wrote this project description that describes Neural Networks.   We used it on a few different data sets – one of which was a hand written zip code character recognition task.

Typical Neural Network

Paper that describes neural networks

Yoda Speak

Monday, April 9th, 2007

This week I have been improving my phrase definitions for parsing sentences.  I added another rule that accepts modifiers before the phrase head, so for example “big dog” has an adjective “big” that modifies “dog”.  This is in contrast to “dog with a big tail” where “with a big tail” comes after “dog”.  In both cases they are considered modifiers.

Today I was working on “the dog with a stick ran”.  Here “with a stick” is a modifier to “dog”.  But it can also be considered a modifier to “ran”.   This is one of the things that Yoda does to sentence structures, is the modifiers are moved around in the sentence.

Here is a tree showing “with a stick” as a modifier to dog.

 

dog with a stick

Here is a tree of the same sentence showing “with a stick” as a modifier to “ran”.

with a stick ran

Comps Versus Mods

Friday, March 16th, 2007

As I have been developing my feature structure tool and a grammar to use in the tool, I have gradually refined the grammar.  Recently I added a small tweak to the head complement rule and doing so reduced some over generation of parses for sentences.

This document gives a summary of how the feature structure rules work.  It also talks briefly about the tweak that I made and the kind of over generation that the tweak fixed.

Comps Versus Mods

 Chase Modifier

Good And Bad Sentences

Friday, March 16th, 2007

Developing a grammer using feature structures is an iterative process.  After writing some rules and word defition entries for the lexicon, you try parsing sentences.  This means parsing sentences that are correct in english and also some incorrect sentences.  If the tool allows incorrect sentences it is called over-generation.  For example ‘the dog ran’ is correct, but ‘the dog run’ is incorrect.

Here is a list of sentences that I have tested on my current grammar implementation.

the dog chased the cat
* the dogs chases the cat
* the dog chase the cat
the dogs chase the cat
the dogs chase the cats
* the dogs sleeps
the dog sleeps
the dogs sleep

*These dog sleep
*these dog sleeps
these dogs sleep
*these dogs sleeps

*This dog sleep
this dog sleeps
*this dogs sleep
*this dogs sleeps

these dogs chase these cats
*these dogs chase this cats
*these dogs chase these cat
these dogs chase this cat

the dog chased the cat in the yard
*the dog chased the cat in the
*the dog chased the cat in yard
*the dog sleeps the cat

Sentences with ‘*’ are incorrect, and the grammer rejects them.

Home Brew Feature Structures

Wednesday, February 28th, 2007

After spending some time with (Sag, Wasow, Bender, 2003) and (Copestake, 2002), I wanted to see just what is involved in implementing a processor for feature structures and the grammar for parsing sentences.  The two texts mentioned helped me know where this was going, and having LKB (http://wiki.delph-in.net/moin/LkbTop) as a model meant I didn’t have to venture into uncharted territory.

So, I wrote a program that works with feature structures based on the descriptions in the two texts listed.  There are several pieces involved in this program:

  • TDL parser.  Read in a file that contains Type Definitions.  The syntax of these definitions are taken from Copestake.
  • Feature Structure processor.  Unification is the main task for managing feature structures.  This means combining two or more feature structures into a combined structure while ensuring that rules for unification are not broken.
  • Earley style Chart Parser.  The Earley algorithm (http://en.wikipedia.org/wiki/Earley_parser) manages a list of potential partial solutions to the parse and continues to expand these partial solutions until one or more complete solutions are found.
  • Viewer.  The feature structures are displayed in an Attribute Value Matrix (AVM) viewer that allows the user to selectively minimize portions of a structure to maximize screen usage.  These tend to be large images, so the selective minimization is a necessity even for relatively short sentences.

So, the program reads in the type definition file, manages a list of feature structures for grammar rules as well as word definitions, parses sentences, and displays the feature structure results for the sentence parses. 

This is an image of a sentence that was parsed using feature structures.  This is the minimized form.

A simple Sentence

This image shows the structure of the sentence.  ‘the dogs’ is a noun phrase, as is ‘the cats’.  ‘chased’ is a verb with ‘the dogs’ as its specifier and ‘the cat’ as its complement.  ’start’ is the term used here for the top level container of the sentence.

Here is the same sentence expanded to show the ‘feature structures’ of each part of the sentence.

expandedsentence05.bmp

The feature structures are the various portions shown in square brackets and angle brackets.  It is a nested structure which means there are pieces inside other pieces.  So ‘the’ and ‘dogs’ are inside the np-rule.  The np-rule and the vp-rule-intrans are inside ’start’ which is the sentence.

The features are the items in square brackets.  For example ‘CATEG’, ‘NUMAGR’ and ‘ORTH’  are all features in the structure for ‘dogs’.  ‘CATEG N’ means that this stucture is a noun.  ‘NUMAGR pl’ means the structure requires agreement for number which in this case is ‘pl’ (plural).

The numbers inside squares are coreferences.  These mean that whereever a number is repeated, those features must have the same value.  So there are 6 places that show a ‘3′ inside a square.  Each of these places must ‘allow’ the same value.  So for example ‘the’ and ‘dogs’ both allow plural.  Whereas “a dogs” would not have worked because ‘a’ can only be used with singular nouns.  Similarly ‘ran’ allows plural, whereas ’sleeps’ doesn’t allow plural – you don’t say “the dogs sleeps”.

This has been a pretty involved project.  I am glad to have made it this far.  I plan to describe a few of the components of the program in a little more detail.  As for future work, I am currently working on making the parser more efficient.  Making the parser run faster is required for it to be useful in future work.  After that, I want to see how to apply this to solve the entailment problem (see earlier posts).

Comparing Word Mutual Information Scores

Saturday, February 18th, 2006

=== See Entry for April 1, 2006.  ===

=== I found the problem with the computation. ===

Below is a table that shows the computed mutual information for noun-verb pairs. The table displays 5 different ways of numerically sorting the noun-verb pairs.

The first column gives the noun that is used for the pairs. The number under the noun is the number of noun-verb pairs that were detected by the parser and clustering algorithm. (See entry for Feb 11, 2006 – Streaming Clusters).

The rest of the columns show the verbs having high mutual information with the noun. These words are sorted, so the top verb in each list has the highest mutual information value with the noun. The numbers in column 2 to 6 are the instance count for the given noun-verb pair.

The definition for the mutual information was taken from Hindle (1990), Lin (1998) and Gamallo, et. al. (2005) (See the reference page). Hindle and Lin each use the same definition. Gamallo, et. al., use a slightly different definition.

Columns 4 and 5 show a variation on the Hindle definition. In column 4, after computing the value according to Lin, the result is then multiplied by the number of instances of the given word pair (wrw). The term ‘wrw’ comes from Lin.

‘Door opened’ was found as a pair 526 times, so the result is

MI_Modified = MI * 526.

In column 5, the Hindle definition is multiplied by the log of the number of instances,

MI_Modified = MI * log(526).

The final column 6 shows the instance count for the noun-verb pairs.

 

 

Hindle
or Lin
Gamallo
Lin
With *wrw
Lin
With *log(wrw)
Instance
Count
Door
3090 
2 creaminess 1 reflectors
1 musquitoes
1 azeglio
1 atlantis’
26 slammed
5 flinched
14 banged
2 hughes
2 thornie
2 unbarred
1 munich
2 burby
Musquitos Azeglio
Atlantis
Slammed
Fliched
Banged
Hughes
Thornie
Unbarred
Munich
Burby
opened
526 opened 207 open
127 closed
511 was
80 locked
87 shut
54 leading
39 swung
26 slammed
36 bell
26 flung
42 stood
14 banged
22 opening
526 opened 80 locked
207 open
127 closed
26 slammed
87 shut
54 leading
39 swung
14 banged
36 bell
26 flung
22 opening
11 neighbor
11 unlocked
10 barred
8 creaked
526 opened 511 was
207 open
127 clsoed
87 shut
80 locked
77 had
72 is
54 leading
42 stood
39 swung
36 bell
26 slammed
26 flung
25 be
22 opening
21 being
Death
1736
2 deowe 2 distilling
2 furens
1 supervenes
1 titmouse
1 fouras
6 tristram
1 copperplate
2 usurp
1 resplendently
1 suffe
1 bonnemains
deowe distilling
furens
supervenes
titmouse
fouras
tristram
copperplate
96 bed 88 rate
27 blow
127 is
159 was
15 warrant
15 wound
10 rates
13 struggle
37 like
96 had
6 stistram
9 sentence
13 comes
62 be
6 beds
8 occurred
96 bed 88 rate
27 blow
15 warrant
10 rates
15 wound
6 tristram
13 struggle
9 sentence
6 beds
4 maximus
6 trap
4 overtake
8 occurred
13 comes
6 brings
3 overweight
159 was 127 is
96 bed
96 had
88 rate
62 be
37 like
35 will
34 have
27 blow
27 are
24 were
22 has
17 come
15 warrant
15 wound
15 came
Child
2682
1 topknots 1 humouring
2 coughs
1 develish
1 stiddy
2 sneezed
1 cons
1 antagonized
5 stope
1 playhouses
1 ashiel
1 riposte
1 praisest
humouring coughs
develish
stiddy
sneezed
cons
antagonized
stope
playhouses
84 born 269 is
233 was
98 said
69 has
103 be
44 like
11 bearing
10 birth
10 skillful
7 learns
5 stope
12 sitting
7 asks
52 will
84 born 5 stope
7 learns
10 birth
11 bearing
10 skillful
7 asks
6 dies
269 is
12 sitting
3 drogo
6 feels
9 learn
69 has
7 study
269 is 233 was
103 be
98 said
93 had
84 born
69 has
52 will
44 like
40 have
27 can
15 up
15 do
12 sitting
12 looked

 

There is something amiss with columns 2 and 3. Some of the verbs selected by the Hindle and Gamallo definitions are intuitively associated with the noun, but many of the words do not seem to be particularly related. And when compared to columns 4 and 5, columns 2 and 3 do not seem very close to target.

I suspect that this is a result of steps in the process that occur earlier in the process. The parsing algorithm that I use is substantially different from that desribed by Hindle, Lin or Gamallo. Similarly, the cluster gathering algorithm is different.

Another potential cause for these results to be different from the works referenced is the interpretation of the word pairs. Keeping track of which instances represent a cluster starting with a given noun and ending with a given verb is a difficult task. I have been optimizing the storage of the instances to speed up the computation. I need to do another pass on the code to see if I am correctly managing everything.