Natural Language Processing - University Of Texas At Austin

Transcription

CS 343: Artificial Intelligence
Natural Language Processing
Raymond J. Mooney, University of Texas at Austin

Natural Language Processing (NLP) is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language.
- Also called Computational Linguistics.
- Also concerns how computational methods can aid the understanding of human language.

Communication

The goal in the production and comprehension of natural language is communication. Communication for the speaker:
- Intention: Decide when and what information should be transmitted (a.k.a. strategic generation). May require planning and reasoning about agents' goals and beliefs.
- Generation: Translate the information to be communicated (in an internal logical representation or "language of thought") into a string of words in the desired natural language (a.k.a. tactical generation).
- Synthesis: Output the string in the desired modality, text or speech.

Communication (cont.)

Communication for the hearer:
- Perception: Map the input modality to a string of words, e.g. optical character recognition (OCR) or speech recognition.
- Analysis: Determine the information content of the string.
  - Syntactic interpretation (parsing): Find the correct parse tree showing the phrase structure of the string.
  - Semantic interpretation: Extract the (literal) meaning of the string (logical form).
  - Pragmatic interpretation: Consider the effect of the overall context on altering the literal meaning of a sentence.
- Incorporation: Decide whether or not to believe the content of the string and add it to the KB.

Syntax, Semantics, Pragmatics

- Syntax concerns the proper ordering of words and its effect on meaning.
  - The dog bit the boy.
  - The boy bit the dog.
  - * Bit boy dog the the.
  - Colorless green ideas sleep furiously.
- Semantics concerns the (literal) meaning of words, phrases, and sentences.
  - "plant" as a photosynthetic organism
  - "plant" as a manufacturing facility
  - "plant" as the act of sowing
- Pragmatics concerns the overall communicative and social context and its effect on interpretation.
  - The ham sandwich wants another beer. (co-reference, anaphora)
  - John thinks vanilla. (ellipsis)

Modular Comprehension

(Figure slide: comprehension modeled as a pipeline of modular processing stages.)

Ambiguity

Natural language is highly ambiguous and must be disambiguated to determine meaning (contextualized):
- I saw the man on the hill with a telescope.
- I saw the Grand Canyon flying to LA.
- Time flies like an arrow.
- Horse flies like a sugar cube.
- Time runners like a coach.
- Time cars like a Porsche.

Ambiguity is Ubiquitous

- Speech recognition:
  - "recognize speech" vs. "wreck a nice beach"
  - "youth in Asia" vs. "euthanasia"
- Syntactic analysis:
  - "I ate spaghetti with chopsticks" vs. "I ate spaghetti with meatballs."
- Semantic analysis:
  - "The dog is in the pen." vs. "The ink is in the pen."
  - "I put the plant in the window" vs. "Ford put the plant in Mexico"
- Pragmatic analysis, from "The Pink Panther Strikes Again":
  - Clouseau: Does your dog bite?
    Hotel Clerk: No.
    Clouseau: [bowing down to pet the dog] Nice doggie.
    [Dog barks and bites Clouseau in the hand]
    Clouseau: I thought you said your dog did not bite!
    Hotel Clerk: That is not my dog.

Ambiguity is Explosive

Ambiguities compound to generate enormous numbers of possible interpretations. In English, a sentence ending in n prepositional phrases has over 2^n syntactic interpretations (cf. the Catalan numbers):
- "I saw the man with the telescope.": 2 parses
- "I saw the man on the hill with the telescope.": 5 parses
- "I saw the man on the hill in Texas with the telescope.": 14 parses
- "I saw the man on the hill in Texas with the telescope at noon.": 42 parses
- "I saw the man on the hill in Texas with the telescope at noon on Monday.": 132 parses

Humor and Ambiguity

Many jokes rely on the ambiguity of language:
- Groucho Marx: One morning I shot an elephant in my pajamas. How he got into my pajamas, I'll never know.
- She criticized my apartment, so I knocked her flat.
- Noah took all of the animals on the ark in pairs. Except the worms, they came in apples.
- Policeman to little boy: "We are looking for a thief with a bicycle." Little boy: "Wouldn't you be better using your eyes?"
- Why is the teacher wearing sun-glasses? Because the class is so bright.

Natural Languages vs. Computer Languages

- Ambiguity is the primary difference between natural and computer languages.
- Formal programming languages are designed to be unambiguous, i.e. they can be defined by a grammar that produces a unique parse for each sentence in the language.
- Programming languages are also designed for efficient (deterministic) parsing, i.e. they are deterministic context-free languages (DCFLs).
  - A sentence in a DCFL can be parsed in O(n) time, where n is the length of the string.
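The parse counts listed under "Ambiguity is Explosive" are exactly the Catalan numbers, which grow faster than 2^n. A minimal Python check of this (my own sketch, not from the slides; the closed form C(2n, n) / (n + 1) is the standard formula for the nth Catalan number):

    from math import comb

    def catalan(n: int) -> int:
        # nth Catalan number, closed form: C(2n, n) / (n + 1)
        return comb(2 * n, n) // (n + 1)

    # A sentence ending in m prepositional phrases has catalan(m + 1) parses:
    for m in range(1, 6):
        print(m, "PPs ->", catalan(m + 1), "parses")
    # 1 PPs -> 2, 2 PPs -> 5, 3 PPs -> 14, 4 PPs -> 42, 5 PPs -> 132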

Syntactic Parsing

Produce the correct syntactic parse tree for a sentence.

Context Free Grammars (CFG)

- N: a set of non-terminal symbols (or variables)
- Σ: a set of terminal symbols (disjoint from N)
- R: a set of productions or rules of the form A → β, where A is a non-terminal and β is a string of symbols from (Σ ∪ N)*
- S: a designated non-terminal called the start symbol

Simple CFG for ATIS English

Grammar:
  S → NP VP
  S → Aux NP VP
  S → VP
  NP → Pronoun
  NP → Proper-Noun
  NP → Det Nominal
  Nominal → Noun
  Nominal → Nominal Noun
  Nominal → Nominal PP
  VP → Verb
  VP → Verb NP
  VP → VP PP
  PP → Prep NP

Lexicon:
  Det → the | a | that | this
  Noun → book | flight | meal | money
  Verb → book | include | prefer
  Pronoun → I | he | she | me
  Proper-Noun → Houston | NWA
  Aux → does
  Prep → from | to | on | near | through

Sentence Generation

Sentences are generated by recursively rewriting the start symbol using the productions until only terminal symbols remain. (Figure: a derivation, i.e. parse tree, generating a sentence ending in "... through Houston".)

Parse Trees and Syntactic Ambiguity

If a sentence has more than one possible derivation (parse tree) it is said to be syntactically ambiguous.

Prepositional Phrase Attachment Explosion

A transitive English sentence ending in m prepositional phrases has at least 2^m parses:
  I saw the man on the hill with a telescope on Tuesday in Austin ...
The exact number of parses is given by the Catalan numbers (where n = m + 1): 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, ...
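The ATIS toy grammar above is easy to experiment with. A hedged sketch using NLTK's CFG machinery (assumes NLTK is installed; I rename Proper-Noun to ProperNoun for simplicity in the grammar string, which is my adjustment, not the slides'):

    import nltk

    # The ATIS toy grammar from the slides, in NLTK's CFG notation.
    grammar = nltk.CFG.fromstring("""
    S -> NP VP | Aux NP VP | VP
    NP -> Pronoun | ProperNoun | Det Nominal
    Nominal -> Noun | Nominal Noun | Nominal PP
    VP -> Verb | Verb NP | VP PP
    PP -> Prep NP
    Det -> 'the' | 'a' | 'that' | 'this'
    Noun -> 'book' | 'flight' | 'meal' | 'money'
    Verb -> 'book' | 'include' | 'prefer'
    Pronoun -> 'I' | 'he' | 'she' | 'me'
    ProperNoun -> 'Houston' | 'NWA'
    Aux -> 'does'
    Prep -> 'from' | 'to' | 'on' | 'near' | 'through'
    """)

    # Enumerate all parse trees the grammar assigns to a sentence.
    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("book that flight".split()):
        tree.pretty_print()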

Spurious Ambiguity

Most parse trees of most NL sentences make no sense.

Parsing

- Given a string of terminals and a CFG, determine if the string can be generated by the CFG.
  - Also return a parse tree for the string.
  - Also return all possible parse trees for the string.
- Must search the space of derivations for one that derives the given string.
  - Top-Down Parsing: Start searching the space of derivations from the start symbol.
  - Bottom-Up Parsing: Start searching the space of reverse derivations from the terminal symbols in the string.

Parsing Example / Top Down Parsing

(Figure slides: a sequence steps through a top-down parse of "book that flight". Expansions of S are tried and rejected in turn until S → VP and VP → Verb NP succeed, with Verb → "book", NP → Det Nominal, Det → "that", Nominal → Noun, Noun → "flight".)
Bottom Up Parsing

(Figure slides: a second sequence steps through a bottom-up parse of the same sentence, tagging "book" as Noun or Verb, "that" as Det, and "flight" as Noun, then repeatedly combining partial constituents and discarding dead ends until the tree rooted in S via VP → Verb NP is found.)

Top Down vs. Bottom Up

- Top-down never explores options that will not lead to a full parse, but can explore many options that never connect to the actual sentence.
- Bottom-up never explores options that do not connect to the actual sentence, but can explore options that can never lead to a full parse.
- Relative amounts of wasted search depend on how much the grammar branches in each direction.
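To make the top-down strategy concrete, here is a minimal recursive-descent recognizer over a pruned version of the toy grammar (my sketch, not the course's code; the left-recursive rules such as VP → VP PP are omitted because naive top-down search loops forever on them):

    # A minimal top-down (recursive-descent) recognizer.
    GRAMMAR = {
        "S":  [["NP", "VP"], ["VP"]],
        "NP": [["Det", "Nominal"]],
        "Nominal": [["Noun"]],
        "VP": [["Verb"], ["Verb", "NP"]],
        "Det": [["that"]], "Noun": [["flight"]], "Verb": [["book"]],
    }

    def parse(symbols, words):
        """Can the symbol sequence derive exactly `words`? Depth-first, top-down."""
        if not symbols:
            return not words                     # success iff all input consumed
        first, rest = symbols[0], symbols[1:]
        if first in GRAMMAR:                     # non-terminal: try each expansion
            return any(parse(exp + rest, words) for exp in GRAMMAR[first])
        # terminal: must match the next input word
        return bool(words) and words[0] == first and parse(rest, words[1:])

    print(parse(["S"], ["book", "that", "flight"]))   # True

Note how the search behaves exactly as described above: the dead-end expansion S → NP VP is explored even though "book" can never start an NP in this grammar.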

Syntactic Parsing & Ambiguity

Syntactic parsing just produces all possible parse trees. It does not address the important issue of ambiguity resolution.

Statistical Parsing

- Statistical parsing uses a probabilistic model of syntax in order to assign probabilities to each parse tree.
- Provides a principled approach to resolving syntactic ambiguity.
- Allows supervised learning of parsers from treebanks of parse trees provided by human linguists.
- Also allows unsupervised learning of parsers from unannotated text, but the accuracy of such parsers has been limited.

Probabilistic Context Free Grammar (PCFG)

- A PCFG is a probabilistic version of a CFG where each production has a probability.
- The probabilities of all productions rewriting a given non-terminal must add to 1, defining a distribution for each non-terminal.
- String generation is now probabilistic: production probabilities are used to non-deterministically select a production for rewriting a given non-terminal.

Simple PCFG for ATIS English

Grammar:
  S → NP VP          0.8
  S → Aux NP VP      0.1
  S → VP             0.1
  NP → Pronoun       0.2
  NP → Proper-Noun   0.2
  NP → Det Nominal   0.6
  Nominal → Noun          0.3
  Nominal → Nominal Noun  0.2
  Nominal → Nominal PP    0.5
  VP → Verb          0.2
  VP → Verb NP       0.5
  VP → VP PP         0.3
  PP → Prep NP       1.0

Lexicon:
  Det → the 0.6 | a 0.2 | that 0.1 | this 0.1
  Noun → book 0.1 | flight 0.5 | meal 0.2 | money 0.2
  Verb → book 0.5 | include 0.2 | prefer 0.3
  Pronoun → I 0.5 | he 0.1 | she 0.1 | me 0.3
  Proper-Noun → Houston 0.8 | NWA 0.2
  Aux → does 1.0
  Prep → from 0.25 | to 0.25 | on 0.1 | near 0.2 | through 0.2

Sentence Probability and Syntactic Disambiguation

- Assume productions for each node are chosen independently.
- The probability of a derivation is the product of the probabilities of its productions.
- Resolve ambiguity by picking the most probable parse tree.

For "book the flight through Houston" there are two derivations: in D1 the PP attaches to the Nominal, in D2 it attaches to the VP:
  P(D1) = 0.1 × 0.5 × 0.5 × 0.6 × 0.6 × 0.5 × 0.3 × 1.0 × 0.2 × 0.2 × 0.5 × 0.8 = 0.0000216
  P(D2) = 0.1 × 0.3 × 0.5 × 0.6 × 0.5 × 0.6 × 0.3 × 1.0 × 0.5 × 0.2 × 0.2 × 0.8 = 0.00001296
So D1 (Nominal attachment) is preferred.
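The two derivation probabilities can be checked mechanically. This small sketch (my own, reading the production probabilities off the PCFG above) reproduces the slide's numbers (math.prod requires Python 3.8+):

    from math import prod

    # D1 (PP attached to the Nominal): S->VP, VP->Verb NP, Verb->book,
    # NP->Det Nominal, Det->the, Nominal->Nominal PP, Nominal->Noun,
    # Noun->flight, PP->Prep NP, Prep->through, NP->Proper-Noun,
    # Proper-Noun->Houston.
    d1 = [0.1, 0.5, 0.5, 0.6, 0.6, 0.5, 0.3, 0.5, 1.0, 0.2, 0.2, 0.8]
    # D2 (PP attached to the VP): same leaves, but VP->VP PP (0.3) replaces
    # Nominal->Nominal PP (0.5).
    d2 = [0.1, 0.3, 0.5, 0.5, 0.6, 0.6, 0.3, 0.5, 1.0, 0.2, 0.2, 0.8]

    print(prod(d1))   # ~0.0000216
    print(prod(d2))   # ~0.00001296 -> D1 (Nominal attachment) wins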

Sentence Probability

The probability of a sentence is the sum of the probabilities of all of its derivations:
  P("book the flight through Houston") = P(D1) + P(D2) = 0.0000216 + 0.00001296 = 0.00003456

Three Useful PCFG Tasks

- Observation likelihood: To classify and order sentences.
- Most likely derivation: To determine the most likely parse tree for a sentence.
- Maximum likelihood training: To train a PCFG to fit empirical training data.

(The following slides use a small example PCFG for English:
  S → NP VP 0.9, S → VP 0.1, NP → Det A N 0.5, NP → NP PP 0.3, NP → PropN 0.2,
  A → ε 0.6, A → Adj A 0.4, PP → Prep NP 1.0, VP → V NP 0.7, VP → VP PP 0.3)

PCFG: Observation Likelihood

What is the probability that a given string is produced by a given PCFG? A PCFG can be used as a language model to choose between alternative sentences for speech recognition or machine translation:
  O1: The dog big barked.
  O2: The big dog barked.
  Is P(O2 | English) > P(O1 | English)?

PCFG: Most Likely Derivation

What is the most probable derivation (parse tree) for a sentence? E.g., for "John liked the dog in the pen.", the parser should prefer the tree in which the PP "in the pen" attaches to the NP "the dog" rather than to the VP.

PCFG: Supervised Training

If parse trees are provided for training sentences, a grammar and its parameters can all be estimated directly from counts accumulated from the tree-bank (with appropriate smoothing). (Figure: a small treebank of example trees and the PCFG estimated from it.)
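The most-likely-derivation task is what NLTK's ViterbiParser computes. A hedged sketch using the ATIS PCFG above (assumes NLTK; as before, Proper-Noun is renamed ProperNoun in the grammar string):

    import nltk

    pcfg = nltk.PCFG.fromstring("""
    S -> NP VP [0.8] | Aux NP VP [0.1] | VP [0.1]
    NP -> Pronoun [0.2] | ProperNoun [0.2] | Det Nominal [0.6]
    Nominal -> Noun [0.3] | Nominal Noun [0.2] | Nominal PP [0.5]
    VP -> Verb [0.2] | Verb NP [0.5] | VP PP [0.3]
    PP -> Prep NP [1.0]
    Det -> 'the' [0.6] | 'a' [0.2] | 'that' [0.1] | 'this' [0.1]
    Noun -> 'book' [0.1] | 'flight' [0.5] | 'meal' [0.2] | 'money' [0.2]
    Verb -> 'book' [0.5] | 'include' [0.2] | 'prefer' [0.3]
    Pronoun -> 'I' [0.5] | 'he' [0.1] | 'she' [0.1] | 'me' [0.3]
    ProperNoun -> 'Houston' [0.8] | 'NWA' [0.2]
    Aux -> 'does' [1.0]
    Prep -> 'from' [0.25] | 'to' [0.25] | 'on' [0.1] | 'near' [0.2] | 'through' [0.2]
    """)

    # Viterbi search returns the single most probable parse tree.
    parser = nltk.ViterbiParser(pcfg)
    for tree in parser.parse("book the flight through Houston".split()):
        print(tree)   # the Nominal-attachment tree, with its probability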

Estimating Production Probabilities

- The set of production rules can be taken directly from the set of rewrites in the treebank.
- Parameters can be directly estimated from frequency counts in the treebank:
  P(A → β | A) = count(A → β) / count(A)
(A code sketch of this estimation appears below.)

PCFG: Maximum Likelihood Training

- Given a set of sentences, induce a grammar that maximizes the probability that this data was generated from this grammar.
- Assume the number of non-terminals in the grammar is specified.
- Only needs an unannotated set of sentences generated from the model; it does not need correct parse trees for these sentences. In this sense, it is unsupervised.

Training sentences:
  John ate the apple
  A dog bit Mary
  Mary hit the dog
  John gave Mary the cat.

Vanilla PCFG Limitations

- Since the probabilities of productions do not rely on specific words or concepts, only general structural disambiguation is possible (e.g. prefer to attach PPs to Nominals).
- Consequently, vanilla PCFGs cannot resolve syntactic ambiguities that require semantics to resolve, e.g. "ate with fork" vs. "ate with meatballs".
- In order to work well, PCFGs must be lexicalized, i.e. productions must be specialized to specific words by including their head word in their LHS non-terminals.

Example of Importance of Lexicalization

- A general preference for attaching PPs to NPs rather than VPs can be learned by a vanilla PCFG.
- But the desired preference can depend on specific words. For "John put the dog in the pen.", the PP "in the pen" must attach to the VP, since "put" requires a location argument; the NP-attachment parse (marked X on the slide) is wrong.
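The "Estimating Production Probabilities" recipe is a few lines with NLTK, which ships a small sample of the Penn Treebank. A sketch assuming NLTK is installed and the treebank data package has been fetched via nltk.download('treebank'):

    import nltk
    from nltk.corpus import treebank

    # Collect every production used in the treebank sample's parse trees,
    # then estimate rule probabilities by maximum likelihood:
    # P(A -> beta | A) = count(A -> beta) / count(A).
    productions = []
    for tree in treebank.parsed_sents():
        productions += tree.productions()

    pcfg = nltk.induce_pcfg(nltk.Nonterminal('S'), productions)
    print(len(pcfg.productions()), "rules, e.g.:")
    for prod in pcfg.productions()[:5]:
        print(prod)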

Treebanks

- English Penn Treebank: the standard corpus for testing syntactic parsing; consists of 1.2M words of text from the Wall Street Journal (WSJ).
- Typical to train on about 40,000 parsed sentences and test on an additional standard disjoint test set of 2,416 sentences.
- Chinese Penn Treebank: 100K words from the Xinhua news service.
- Other corpora exist in many languages; see the Wikipedia article "Treebank".

First WSJ Sentence

  (S (NP-SBJ (NP (NNP Pierre) (NNP Vinken))
             (, ,)
             (ADJP (NP (CD 61) (NNS years)) (JJ old))
             (, ,))
     (VP (MD will)
         (VP (VB join)
             (NP (DT the) (NN board))
             (PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
             (NP-TMP (NNP Nov.) (CD 29))))
     (. .))

Parsing Evaluation Metrics

- PARSEVAL metrics measure the fraction of the constituents that match between the computed and human parse trees. If P is the system's parse tree and T is the human parse tree (the "gold standard"):
  - Recall = (# correct constituents in P) / (# constituents in T)
  - Precision = (# correct constituents in P) / (# constituents in P)
- Labeled precision and labeled recall require getting the non-terminal label on the constituent node correct in order to count it as correct.
- F1 is the harmonic mean of precision and recall.

Computing Evaluation Metrics

(Figure: a computed tree P and correct tree T for "book the flight through Houston", differing only in PP attachment; a minimal code sketch of the computation follows below.)
  # Constituents in P: 12   # Constituents in T: 12   # Correct constituents: 10
  Recall = 10/12 = 83.3%   Precision = 10/12 = 83.3%   F1 = 83.3%

Treebank Results

Results of current state-of-the-art systems on the English Penn WSJ treebank are slightly greater than 90% labeled precision and recall.
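The PARSEVAL computation above reduces to set operations over labeled constituent spans. A minimal sketch (my illustration; the toy span sets are contrived only to reproduce the 10-of-12 counts from the slide):

    def parseval(system, gold):
        """Labeled precision/recall/F1 over constituents, each tree given
        as a set of (label, start, end) spans."""
        correct = len(system & gold)
        precision = correct / len(system)
        recall = correct / len(gold)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    # Toy data matching the slide: 12 constituents per tree, 10 shared.
    gold = {("C", i, i + 1) for i in range(12)}
    system = {("C", i, i + 1) for i in range(10)} | {("X", 90, 91), ("X", 92, 93)}
    print(parseval(system, gold))   # (0.833..., 0.833..., 0.833...)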

Word Sense Disambiguation (WSD)

- Words in natural language usually have a fair number of different possible meanings:
  - Ellen has a strong interest in computational linguistics.
  - Ellen pays a large amount of interest on her credit card.
- For many tasks (question answering, translation), the proper sense of each ambiguous word in a sentence must be determined.

Ambiguity Resolution is Required for Translation

- Syntactic and semantic ambiguities must be properly resolved for correct translation:
  - "John plays the guitar." → "John toca la guitarra."
  - "John plays soccer." → "John juega el fútbol."
- An apocryphal story is that an early MT system gave the following results when translating from English to Russian and then back to English:
  - "The spirit is willing but the flesh is weak." → "The liquor is good but the meat is spoiled."
  - "Out of sight, out of mind." → "Invisible idiot."

Word Sense Disambiguation (WSD) as Text Categorization

- Each sense of an ambiguous word is treated as a category:
  - "play" (verb): play-game, play-instrument, play-role
  - "pen" (noun): writing-instrument, enclosure
- Treat the current sentence (or the preceding and current sentence) as a document to be classified:
  - "play": play-game: "John played soccer in the stadium on Friday."; play-instrument: "John played guitar in the band on Friday."; play-role: "John played Hamlet in the theater on Friday."
  - "pen": writing-instrument: "John wrote the letter with a pen in New York."; enclosure: "John put the dog in the pen in New York."

Learning for WSD

- Assume the part of speech (POS), e.g. noun, verb, adjective, of the target word is determined.
- Treat as a classification problem, with the appropriate potential senses for the target word given its POS as the categories.
- Encode context using a set of features to be used for disambiguation.
- Train a classifier on labeled data encoded using these features.
- Use the trained classifier to disambiguate future instances of the target word given their contextual features.

WSD "line" Corpus

- 4,149 examples from newspaper articles containing the word "line."
- Each instance of "line" is labeled with one of 6 senses from WordNet.
- Each example includes the sentence containing "line" and the previous sentence for context.

Senses of "line"

- Product: "While he wouldn't estimate the sale price, analysts have estimated that it would exceed 1 billion. Kraft also told analysts it plans to develop and test a line of refrigerated entrees and desserts, under the Chillery brand name."
- Formation: "C-LD-R L-V-S V-NNA reads a sign in Caldor's book department. The 1,000 or so people fighting for a place in line have no trouble filling in the blanks."
- Text: "Newspaper editor Francis P. Church became famous for an 1897 editorial, addressed to a child, that included the line 'Yes, Virginia, there is a Santa Claus.'"
- Cord: "It is known as an aggressive, tenacious litigator. Richard D. Parsons, a partner at Patterson, Belknap, Webb and Tyler, likes the experience of opposing Sullivan & Cromwell to 'having a thousand-pound tuna on the line.'"
- Division: "Today, it is more vital than ever. In 1983, the act was entrenched in a new constitution, which established a tricameral parliament along racial lines, with separate chambers for whites, coloreds and Asians but none for blacks."
- Phone: "On the tape recording of Mrs. Guba's call to the 911 emergency line, played at the trial, the baby sitter is heard begging for an ambulance."

Experimental Data for WSD of "line"

- Sample an equal number of examples of each sense to construct a corpus of 2,094.
- Represent as simple binary vectors of word occurrences in the 2-sentence context.
  - Stop words eliminated
  - Stemmed to eliminate morphological variation
- Final examples are represented with 2,859 binary word features.
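This WSD-as-text-categorization recipe maps directly onto standard classifier libraries. A hedged miniature of the "line" setup using scikit-learn (my stand-in sentences, not the real 4,149-example corpus; binary=True gives the binary word-occurrence features described above):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import BernoulliNB
    from sklearn.pipeline import make_pipeline

    # Tiny stand-in corpus: context sentences labeled with senses of "line".
    train_texts = [
        "the baby sitter begged for an ambulance on the emergency line",
        "a thousand pound tuna on the line",
        "a line of refrigerated entrees and desserts",
        "people fighting for a place in line",
    ]
    train_senses = ["phone", "cord", "product", "formation"]

    # Binary bag-of-words features + Naive Bayes, as in the experiments above.
    clf = make_pipeline(CountVectorizer(binary=True), BernoulliNB())
    clf.fit(train_texts, train_senses)
    print(clf.predict(["she waited in line at the store"]))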

Learning Algorithms

- Naïve Bayes: binary features.
- K Nearest Neighbor: simple instance-based algorithm with k = 3 and Hamming distance.
- Perceptron: simple neural-network algorithm.
- C4.5: state-of-the-art decision-tree induction algorithm.
- PFOIL-DNF: simple logical rule learner for Disjunctive Normal Form.
- PFOIL-CNF: simple logical rule learner for Conjunctive Normal Form.
- PFOIL-DLIST: simple logical rule learner for decision lists of conjunctive rules.

Learning Curves for WSD of "line"

(Figure: learning curves for the algorithms above.)

Discussion of Learning Curves for WSD of "line"

- Naïve Bayes and Perceptron give the best results.
- Both use a weighted linear combination of evidence from many features.
- Symbolic systems that try to find a small set of relevant features tend to overfit the training data and are not as accurate.
- The nearest-neighbor method, which weights all features equally, is also not as accurate.
- Of the symbolic systems, decision lists work the best.

Other Syntactic Tasks

Word Segmentation

- Breaking a string of characters (graphemes) into a sequence of words.
- In some written languages (e.g. Chinese) words are not separated by spaces.
- Even in English, characters other than white-space can be used to separate words, e.g. , ; . - : ( )
- Examples from English URLs (see the maximum-matching sketch below):
  - jumptheshark.com → jump the shark .com
  - myspace.com/pluckerswingbar → myspace .com pluckers wing bar, or myspace .com plucker swing bar

Morphological Analysis

- Morphology is the field of linguistics that studies the internal structure of words. (Wikipedia)
- A morpheme is the smallest linguistic unit that has semantic meaning (Wikipedia), e.g. "carry", "pre", "ed", "ly", "s".
- Morphological analysis is the task of segmenting a word into its morphemes:
  - carried → carry + ed (past tense)
  - independently → in + (depend + ent) + ly
  - Googlers → (Google + er) + s (plural)
  - unlockable → un + (lock + able)? or (un + lock) + able?
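The word-segmentation examples above can be handled by greedy maximum matching, a common baseline for languages written without spaces (my sketch; the tiny lexicon is illustrative):

    def max_match(text, lexicon):
        """Greedy maximum-matching segmentation: repeatedly take the longest
        known word that prefixes the remaining text; fall back to a single
        character when nothing matches."""
        words = []
        while text:
            for end in range(len(text), 0, -1):   # longest candidate first
                if text[:end] in lexicon or end == 1:
                    words.append(text[:end])
                    text = text[end:]
                    break
        return words

    lexicon = {"jump", "the", "shark", "my", "space", "myspace"}
    print(max_match("jumptheshark", lexicon))   # ['jump', 'the', 'shark']

Note that greedy matching cannot return both segmentations of an ambiguous string like "pluckerswingbar"; that is exactly why statistical segmenters are preferred in practice.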

Part Of Speech (POS) Tagging

Annotate each word in a sentence with a part of speech:
  I    ate  the  spaghetti  with  meatballs .
  Pro  V    Det  N          Prep  N

  John  saw  the  saw  and  decided  to    take  it   to    the  table .
  PN    V    Det  N    Con  V        Part  V     Pro  Prep  Det  N

Useful for subsequent syntactic parsing and word sense disambiguation (see the NLTK sketch below).

Phrase Chunking

Find all non-recursive noun phrases (NPs) and verb phrases (VPs) in a sentence:
  [NP I] [VP ate] [NP the spaghetti] [PP with] [NP meatballs].
  [NP He] [VP reckons] [NP the current account deficit] [VP will narrow] [PP to] [NP only # 1.8 billion] [PP in] [NP September].

Other Semantic Tasks

Semantic Role Labeling (SRL)

- For each clause, determine the semantic role played by each noun phrase that is an argument to the verb: agent, patient, source, destination, instrument.
  - John drove Mary from Austin to Dallas in his Toyota Prius.
  - The hammer broke the window.
- Also referred to as "case role analysis," "thematic analysis," and "shallow semantic parsing."

Semantic Parsing

- A semantic parser maps a natural-language sentence to a complete, detailed semantic representation (logical form).
- For many applications, the desired output is immediately executable by another program.
- Example: mapping an English database query to Prolog:
    How many cities are there in the US?
    answer(A, count(B, (city(B), loc(B, C), const(C, countryid(USA))), A))
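As promised above, a short NLTK sketch of POS tagging and non-recursive NP chunking (assumes the punkt and averaged_perceptron_tagger data packages have been downloaded; the chunk pattern is a textbook regular-expression grammar, not the course's):

    import nltk

    tokens = nltk.word_tokenize("I ate the spaghetti with meatballs.")
    tagged = nltk.pos_tag(tokens)
    print(tagged)   # e.g. [('I', 'PRP'), ('ate', 'VBD'), ('the', 'DT'), ...]

    # Non-recursive NP chunking: an optional determiner, any number of
    # adjectives, then one or more nouns.
    chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")
    print(chunker.parse(tagged))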

Textual Entailment

Determine whether one natural language sentence entails (implies) another under an ordinary interpretation.

Textual Entailment Problems from the PASCAL Challenge

1. TEXT: Eyeing the huge market potential, currently led by Google, Yahoo took over search company Overture Services Inc last year.
   HYPOTHESIS: Yahoo bought Overture.
   ENTAILMENT: TRUE
2. TEXT: Microsoft's rival Sun Microsystems Inc. bought Star Office last month and plans to boost its development as a Web-based device running over the Net on personal computers and Internet appliances.
   HYPOTHESIS: Microsoft bought Star Office.
   ENTAILMENT: FALSE
3. TEXT: The National Institute for Psychobiology in Israel was established in May 1971 as the Israel Center for Psychobiology by Prof. Joel.
   HYPOTHESIS: Israel was established in May 1971.
   ENTAILMENT: FALSE
4. TEXT: Since its formation in 1948, Israel fought many wars with neighboring Arab countries.
   HYPOTHESIS: Israel was established in 1948.
   ENTAILMENT: TRUE

Pragmatics/Discourse Tasks

Anaphora Resolution / Co-Reference

Determine which phrases in a document refer to the same underlying entity:
- John put the carrot on the plate and ate it.
- Bush started the war in Iraq. But the president needed the consent of Congress.

Ellipsis Resolution

Frequently words and phrases are omitted from sentences when they can be inferred from context:
  "Wise men talk because they have something to say; fools, because they have to say something." (Plato)
Some cases require difficult reasoning:
  Today was Jack's birthday. Penny and Janet went to the store. They were going to get presents. Janet decided to get a kite. "Don't do that," said Penny. "Jack has a kite. He will make you take it back."

Other Tasks

Information Extraction (IE)

- Identify phrases in language that refer to specific types of entities and relations in text.
- Named entity recognition is the task of identifying names of people, places, organizations, etc. in text:
  - Michael Dell [person] is the CEO of Dell Computer Corporation [organization] and lives in Austin Texas [place].
- Relation extraction identifies specific relations between entities:
  - Michael Dell is the CEO of Dell Computer Corporation and lives in Austin Texas. (relations: is-CEO-of, lives-in)
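Named entity recognition, the first IE step above, can be sketched with NLTK's off-the-shelf chunker (assumes the punkt, averaged_perceptron_tagger, maxent_ne_chunker, and words data packages; the labels such as PERSON and GPE are NLTK's, and results on this sentence may be imperfect):

    import nltk

    sent = ("Michael Dell is the CEO of Dell Computer Corporation "
            "and lives in Austin Texas.")
    tree = nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(sent)))

    # Print each named-entity subtree with its label.
    for subtree in tree.subtrees():
        if subtree.label() != "S":
            print(subtree.label(), " ".join(word for word, tag in subtree.leaves()))
    # e.g. PERSON Michael Dell / ORGANIZATION Dell Computer Corporation / GPE Austin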

Question Answering

Directly answer natural language questions based on information presented in a corpus of textual documents (e.g. the web):
- When was Barack Obama born? (factoid) → August 4, 1961
- Who was president when Barack Obama was born? → John F. Kennedy
- How many presidents have there been since Barack Obama was born? → 9

Text Summarization

Produce a short summary of a longer document or article:
- Article: With a split decision in the final two primaries and a flurry of superdelegate endorsements, Sen. Barack Obama sealed the Democratic presidential nomination last night after a grueling and history-making campaign against Sen. Hillary Rodham Clinton that will make him the first African American to head a major-party ticket. Before a chanting and cheering audience in St. Paul, Minn., the first-term senator from Illinois savored what once seemed an unlikely outcome to the Democratic race with a nod to the marathon that was ending and to what will be another hard-fought battle, against Sen. John McCain, the presumptive Republican nominee ...
- Summary: Senator Barack Obama was declared the presumptive Democratic presidential nominee.

Machine Translation (MT)

Translate a sentence from one natural language to another:
- Hasta la vista, bebé. → Until we see each other again, baby.

NLP Conclusions

- The need for disambiguation makes language understanding difficult.
- Levels of linguistic processing: syntax, semantics, pragmatics.
- CFGs can be used to parse natural language but produce many spurious parses.
- Statistical learning methods can be used to:
  - Automatically learn grammars from (annotated) corpora.
  - Compute the most likely interpretation based on a learned statistical model.
