Trainable Sentence Planning for Complex Information Presentation in Spoken Dialog Systems
Amanda Stent, Stony Brook University, Stony Brook, NY 11794, U.S.A. (stent@cs.sunysb.edu)
Rashmi Prasad, University of Pennsylvania, Philadelphia, PA 19104, U.S.A. (rjprasad@linc.cis.upenn.edu)
Marilyn Walker, University of Sheffield, Sheffield S1 4DP, U.K. (M.A.Walker@sheffield.ac.uk)

Abstract

A challenging problem for spoken dialog systems is the design of utterance generation modules that are fast, flexible and general, yet produce high quality output in particular domains. A promising approach is trainable generation, which uses general-purpose linguistic knowledge automatically adapted to the application domain. This paper presents a trainable sentence planner for the MATCH dialog system. We show that trainable sentence planning can produce output comparable to that of MATCH's template-based generator even for quite complex information presentations.

1 Introduction

One very challenging problem for spoken dialog systems is the design of the utterance generation module. This challenge arises partly from the need for the generator to adapt to many features of the dialog domain, user population, and dialog context.

There are three possible approaches to generating system utterances. The first is template-based generation, used in most dialog systems today. Template-based generation enables a programmer without linguistic training to program a generator that can efficiently produce high quality output specific to different dialog situations. Its drawbacks include the need to (1) create templates anew by hand for each application; (2) design and maintain a set of templates that work well together in many dialog contexts; and (3) repeatedly encode linguistic constraints such as subject-verb agreement.

The second approach is natural language generation (NLG), which divides generation into (1) text (or content) planning, (2) sentence planning, and (3) surface realization. NLG promises portability across domains and dialog contexts by using general rules for each generation module. However, the quality of the output for a particular domain, or a particular dialog context, may be inferior to that of a template-based system unless domain-specific rules are developed or general rules are tuned for the particular domain. Furthermore, full NLG may be too slow for use in dialog systems.

A third, more recent, approach is trainable generation: techniques for automatically training NLG modules, or hybrid techniques that adapt NLG modules to particular domains or user groups, e.g. (Langkilde, 2000; Mellish, 1998; Walker, Rambow and Rogati, 2002). Open questions about the trainable approach include (1) whether the output quality is high enough, and (2) whether the techniques work well across domains. For example, the training method used in SPoT (Sentence Planner Trainable), as described in (Walker, Rambow and Rogati, 2002), was only shown to work in the travel domain, for the information-gathering phase of the dialog, and with simple content plans involving no rhetorical relations.

This paper describes trainable sentence planning for information presentation in the MATCH (Multimodal Access To City Help) dialog system (Johnston et al., 2002). We provide evidence that the trainable approach is feasible by showing (1) that the training technique used for SPoT can be extended to a new domain (restaurant information); (2) that this technique, previously used for information-gathering utterances, can be used for information presentations, namely recommendations and comparisons; and (3) that the quality of the output is comparable to that of a template-based generator previously developed and experimentally evaluated with MATCH users (Walker et al., 2002; Stent et al., 2002).

Section 2 describes SPaRKy (Sentence Planning with Rhetorical Knowledge), an extension of SPoT that uses rhetorical relations. SPaRKy consists of a randomized sentence plan generator (SPG) and a trainable sentence plan ranker (SPR); these are described in Sections 3 and 4. Section 5 presents the results of two experiments. The first experiment shows that, given a content plan such as that in Figure 1, SPaRKy can select sentence plans that communicate the desired rhetorical relations, are significantly better than a randomly selected sentence plan, and are on average less than 10% worse than a sentence plan ranked highest by human judges. The second experiment shows that the quality of SPaRKy's output is comparable to that of MATCH's template-based generator. We sum up in Section 6.
2 SPaRKy Architecture

Information presentation in the MATCH system focuses on user-tailored recommendations and comparisons of restaurants (Walker et al., 2002). Following the bottom-up approach to text planning described in (Marcu, 1997; Mellish, 1998), each presentation consists of a set of assertions about a set of restaurants and a specification of the rhetorical relations that hold between them. Example content plans are shown in Figures 1 and 2. The job of the sentence planner is to choose linguistic resources to realize a content plan and then rank the resulting alternative realizations. Figures 3 and 4 show alternative realizations for the content plans in Figures 1 and 2.

  strategy:  recommend
  items:     Chanpen Thai
  relations: justify(nuc:1;sat:2); justify(nuc:1;sat:3); justify(nuc:1;sat:4)
  content:   1. assert(best(Chanpen Thai))
             2. assert(has-att(Chanpen Thai, decor(decent)))
             3. assert(has-att(Chanpen Thai, service(good)))
             4. assert(has-att(Chanpen Thai, cuisine(Thai)))

Figure 1: A content plan for a recommendation for a restaurant in midtown Manhattan

  strategy:  compare3
  items:     Above, Carmine's
  relations: elaboration(1;2); elaboration(1;3); elaboration(1;4); elaboration(1;5);
             elaboration(1;6); elaboration(1;7); contrast(2;3); contrast(4;5); contrast(6;7)
  content:   1. assert(exceptional(Above, Carmine's))
             2. assert(has-att(Above, decor(good)))
             3. assert(has-att(Carmine's, decor(decent)))
             4. assert(has-att(Above, service(good)))
             5. assert(has-att(Carmine's, service(good)))
             6. assert(has-att(Above, cuisine(New American)))
             7. assert(has-att(Carmine's, cuisine(Italian)))

Figure 2: A content plan for a comparison between restaurants in midtown Manhattan

  Alt  Realization                                                        H    SPR
  2    Chanpen Thai, which is a Thai restaurant, has decent decor. It     3    .28
       has good service. It has the best overall quality among the
       selected restaurants.
  5    Since Chanpen Thai is a Thai restaurant, with good service, and    2.5  .14
       it has decent decor, it has the best overall quality among the
       selected restaurants.
  6    Chanpen Thai, which is a Thai restaurant, with decent decor and    4    .70
       good service, has the best overall quality among the selected
       restaurants.

Figure 3: Some alternative sentence plan realizations for the recommendation in Figure 1. H = Humans' score. SPR = SPR's score.

  Alt  Realization                                                        H    SPR
  11   Above and Carmine's offer exceptional value among the selected     2    .73
       restaurants. Above, which is a New American restaurant, with
       good decor, has good service. Carmine's, which is an Italian
       restaurant, with good service, has decent decor.
  12   Above and Carmine's offer exceptional value among the selected     2.5  .50
       restaurants. Above has good decor, and Carmine's has decent
       decor. Above and Carmine's have good service. Above is a New
       American restaurant. On the other hand, Carmine's is an Italian
       restaurant.
  13   Above and Carmine's offer exceptional value among the selected     3    .67
       restaurants. Above is a New American restaurant. It has good
       decor. It has good service. Carmine's, which is an Italian
       restaurant, has decent decor and good service.
  20   Above and Carmine's offer exceptional value among the selected     2.5  .49
       restaurants. Carmine's has decent decor but Above has good
       decor, and Carmine's and Above have good service. Carmine's is
       an Italian restaurant. Above, however, is a New American
       restaurant.
  25   Above and Carmine's offer exceptional value among the selected     NR   NR
       restaurants. Above has good decor. Carmine's is an Italian
       restaurant. Above has good service. Carmine's has decent decor.
       Above is a New American restaurant. Carmine's has good service.

Figure 4: Some of the alternative sentence plan realizations for the comparison in Figure 2. H = Humans' score. SPR = SPR's score. NR = not generated or ranked.

The architecture of the spoken language generation module in MATCH is shown in Figure 5. The dialog manager sends a high-level communicative goal to the SPUR text planner, which selects the content to be communicated using a user model and brevity constraints (see (Walker et al., 2002)). The output is a content plan for a recommendation or comparison such as those in Figures 1 and 2.

[Figure 5 (diagram): the dialog manager sends communicative goals to the SPUR text planner ("what to say"); its output passes through the sentence planner, the surface realizer and the prosody assigner ("how to say it") to the speech synthesizer, which produces the system utterance.]

Figure 5: A dialog system with a spoken language generator

SPaRKy, the sentence planner, gets the content plan, and then a sentence plan generator (SPG) generates one or more sentence plans (Figure 7) and a sentence plan ranker (SPR) ranks the generated plans. In order for the SPG to avoid generating sentence plans that are clearly bad, a content-structuring module first finds one or more ways to linearly order the input content plan, using principles of entity-based coherence based on rhetorical relations (Knott et al., 2001). It outputs a set of text plan trees (tp-trees), consisting of a set of speech acts to be communicated and the rhetorical relations that hold between them. For example, the two tp-trees in Figure 6 are generated for the content plan in Figure 2. Sentence plans such as alternative 25 in Figure 4 are avoided; it is clearly worse than alternatives 12, 13 and 20, since it neither combines information based on a restaurant entity (e.g. Babbo) nor on an attribute (e.g. decor).

  Tree 1: elaboration(nucleus: assert-com-list_exceptional,
                      infer(contrast(assert-com-decor, assert-com-decor),
                            contrast(assert-com-service, assert-com-service),
                            contrast(assert-com-cuisine, assert-com-cuisine)))

  Tree 2: elaboration(nucleus: assert-com-list_exceptional,
                      contrast(infer(assert-com-decor, assert-com-cuisine, assert-com-service),
                               infer(assert-com-decor, assert-com-cuisine, assert-com-service)))

Figure 6: Two tp-trees for alternative 13 in Figure 4
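To make the content-structuring step concrete, the sketch below (Python) enumerates two candidate linearizations of the assertions in Figure 2, one grouped by attribute and one grouped by entity; these correspond to the groupings underlying the two tp-trees in Figure 6, and an interleaved ordering like the one realized in alternative 25 is simply never proposed. The data structures and names are illustrative assumptions, not the MATCH implementation.

    # Hypothetical sketch of content structuring: propose only linear
    # orderings of the content plan's assertions that respect entity-based
    # coherence. All names here are illustrative, not SPaRKy's source code.

    # The non-nucleus assertions of the content plan in Figure 2, tagged
    # with the entity and attribute each one describes.
    assertions = [
        {"id": 2, "entity": "Above",     "attribute": "decor"},
        {"id": 3, "entity": "Carmine's", "attribute": "decor"},
        {"id": 4, "entity": "Above",     "attribute": "service"},
        {"id": 5, "entity": "Carmine's", "attribute": "service"},
        {"id": 6, "entity": "Above",     "attribute": "cuisine"},
        {"id": 7, "entity": "Carmine's", "attribute": "cuisine"},
    ]

    def orderings(asserts):
        """Return the two coherent linearizations: grouped by attribute
        (one contrast per attribute, as in the first tp-tree of Figure 6)
        and grouped by entity (as in the second tp-tree)."""
        return {
            "by_attribute": sorted(asserts, key=lambda a: (a["attribute"], a["entity"])),
            "by_entity":    sorted(asserts, key=lambda a: (a["entity"], a["attribute"])),
        }

    for name, order in orderings(assertions).items():
        print(name, [a["id"] for a in order])
    # by_attribute [6, 7, 2, 3, 4, 5]   (cuisine pair, decor pair, service pair)
    # by_entity    [6, 2, 4, 7, 3, 5]   (all of Above, then all of Carmine's)

A full implementation would then build a tp-tree over each ordering, attaching the contrast, elaboration and infer relations before handing the results to the SPG.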
The top-ranked sentence plan output by the SPR is input to the RealPro surface realizer, which produces a surface linguistic utterance (Lavoie and Rambow, 1997). A prosody assignment module uses the prior levels of linguistic representation to determine the appropriate prosody for the utterance, and passes a marked-up string to the text-to-speech module.

3 Sentence Plan Generation

As in SPoT, the basis of the SPG is a set of clause-combining operations that operate on tp-trees and incrementally transform the elementary predicate-argument lexico-structural representations (called DSyntS (Melčuk, 1988)) associated with the speech acts on the leaves of the tree. The operations are applied in a bottom-up, left-to-right fashion, and the resulting representation may contain one or more sentences. The application of the operations yields two parallel structures: (1) a sentence plan tree (sp-tree), a binary tree with leaves labeled by the assertions from the input tp-tree and interior nodes labeled with clause-combining operations; and (2) one or more DSyntS trees (d-trees), which reflect the parallel operations on the predicate-argument representations.

We generate a random sample of possible sentence plans for each tp-tree, up to a pre-specified number of sentence plans, by randomly selecting among the operations according to a probability distribution that favors preferred operations.¹ The choice of operation is further constrained by the rhetorical relation that relates the assertions to be combined, as in other work, e.g. (Scott and de Souza, 1990). In the current work, three RST rhetorical relations (Mann and Thompson, 1987) are used in the content planning phase to express the relations between assertions: the justify relation for recommendations, and the contrast and elaboration relations for comparisons. We added another relation to be used during the content-structuring phase, called infer, which holds for combinations of speech acts for which there is no rhetorical relation expressed in the content plan, as in (Marcu, 1997). By explicitly representing the discourse structure of the information presentation, we can generate information presentations with considerably more internal complexity than those generated in (Walker, Rambow and Rogati, 2002), and eliminate those that violate certain coherence principles, as described in Section 2.

¹Although the probability distribution here is hand-crafted, based on assumed preferences for operations such as merge, relative-clause and with-reduction, it might also be possible to learn this probability distribution from the data by training in two phases.

The clause-combining operations are general operations similar to aggregation operations used in other research (Rambow and Korelsky, 1992; Danlos, 2000). The operations and the constraints on their use are described below.

merge applies to two clauses with identical matrix verbs and all but one identical arguments. The clauses are combined and the non-identical arguments are coordinated. For example, merge(Above has good service; Carmine's has good service) yields Above and Carmine's have good service. merge applies only for the relations infer and contrast.

with-reduction is treated as a kind of "verbless" participial clause formation in which the participial clause is interpreted with the subject of the unreduced clause. For example, with-reduction(Above is a New American restaurant; Above has good decor) yields Above is a New American restaurant, with good decor. with-reduction uses two syntactic constraints: (a) the subjects of the clauses must be identical, and (b) the clause that undergoes the participial formation must have a have-possession predicate. In the example above, for instance, the clause Above is a New American restaurant cannot undergo participial formation, since its predicate is not one of have-possession. with-reduction applies only for the relations infer and justify.

relative-clause combines two clauses with identical subjects, using the second clause to relativize the first clause's subject. For example, relative-clause(Chanpen Thai is a Thai restaurant, with decent decor and good service; Chanpen Thai has the best overall quality among the selected restaurants) yields Chanpen Thai, which is a Thai restaurant, with decent decor and good service, has the best overall quality among the selected restaurants. relative-clause also applies only for the relations infer and justify.

cue-word inserts a discourse connective (one of since, however, while, and, but, and on the other hand) between the two clauses to be combined. cue-word conjunction combines two distinct clauses into a single sentence with a coordinating or subordinating conjunction (e.g. Above has decent decor BUT Carmine's has good decor), while cue-word insertion inserts a cue word at the start of the second clause, producing two separate sentences (e.g. Carmine's is an Italian restaurant. HOWEVER, Above is a New American restaurant). The choice of cue word depends on the rhetorical relation holding between the clauses.

Finally, period applies to two clauses that are to be treated as two independent sentences.

Note that a tp-tree can have very different realizations, depending on the operations of the SPG. For example, the second tp-tree in Figure 6 yields both Alt 11 and Alt 13 in Figure 4; however, Alt 13 is more highly rated than Alt 11. The sp-tree and d-tree produced by the SPG for Alt 13 are shown in Figures 7 and 8. The composite labels on the interior nodes of the sp-tree indicate the clause-combining operation selected to communicate the specified rhetorical relation.
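As a concrete illustration of how a clause-combining operation and its relation constraint might be implemented, the sketch below (Python) applies merge to a simplified clause representation. The dictionary encoding and string-level coordination are expository assumptions; the actual SPG operates on DSyntS trees rather than strings.

    # Illustrative sketch of the merge operation and its relation constraint.

    def merge(clause1, clause2, relation):
        """Combine two clauses with identical verbs and all but one
        identical argument, coordinating the differing argument.
        Licensed only for the infer and contrast relations."""
        if relation not in ("infer", "contrast"):
            return None
        if clause1["verb"] != clause2["verb"]:
            return None
        # Find the argument slots where the two clauses differ.
        diffs = [slot for slot in clause1["args"]
                 if clause1["args"][slot] != clause2["args"][slot]]
        if len(diffs) != 1:
            return None
        slot = diffs[0]
        merged = {"verb": clause1["verb"], "args": dict(clause1["args"])}
        merged["args"][slot] = clause1["args"][slot] + " and " + clause2["args"][slot]
        return merged

    c1 = {"verb": "have", "args": {"subject": "Above",     "object": "good service"}}
    c2 = {"verb": "have", "args": {"subject": "Carmine's", "object": "good service"}}
    print(merge(c1, c2, "contrast"))
    # {'verb': 'have', 'args': {'subject': "Above and Carmine's", 'object': 'good service'}}

The coordinated result corresponds to the paper's example output, Above and Carmine's have good service; passing an unlicensed relation such as justify would return None and force the SPG to pick a different operation.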
  PERIOD_elaboration
  ├── assert-com-list_exceptional
  └── PERIOD_contrast
      ├── PERIOD_infer
      │   ├── PERIOD_infer
      │   │   ├── assert-com-cuisine
      │   │   └── assert-com-decor
      │   └── assert-com-service
      └── RELATIVE_CLAUSE_infer
          ├── assert-com-cuisine
          └── MERGE_infer
              ├── assert-com-decor
              └── assert-com-service

Figure 7: Sentence plan tree (sp-tree) for alternative 13 in Figure 4

[Figure 8 (diagram): the corresponding dependency structure, with PERIOD nodes dominating the lexicalized predicate-argument structures (offer, HAVE1, BE3, AND2, ...) for each sentence.]

Figure 8: Dependency tree (d-tree) for alternative 13 in Figure 4

The d-tree for Alt 13 in Figure 8 shows that the SPG treats the period operation as part of the lexico-structural representation for the d-tree. After sentence planning, the d-tree is split into multiple d-trees at period nodes; these are sent to the RealPro surface realizer.

Separately, the SPG also handles referring expression generation by converting proper names to pronouns when they appear in the previous utterance. The rules are applied locally, across adjacent sequences of utterances (Brennan et al., 1987). Referring expressions are manipulated in the d-trees, either intrasententially during the creation of the sp-tree, or intersententially if the full sp-tree contains any period operations. The third and fourth sentences for Alt 13 in Figure 4 show the conversion of a named restaurant (Carmine's) to a pronoun.

4 Training the Sentence Plan Ranker

The SPR takes as input a set of sp-trees generated by the SPG and ranks them. The SPR's rules for ranking sp-trees are learned from a labeled set of sentence-plan training examples using the RankBoost algorithm (Schapire, 1999).

Examples and feedback: To apply RankBoost, a set of human-rated sp-trees is encoded in terms of a set of features. We started with a set of 30 representative content plans for each strategy. The SPG produced as many as 20 distinct sp-trees for each content plan. The sentences realized by RealPro from these sp-trees were then rated by two expert judges on a scale from 1 to 5, and the ratings were averaged. Each sp-tree was an example input for RankBoost, with each corresponding rating as its feedback.

Features used by RankBoost: RankBoost requires each example to be encoded as a set of real-valued features (binary features have values 0 and 1). A strength of RankBoost is that the set of features can be very large. We used 7024 features for training the SPR. These features count the number of occurrences of certain structural configurations in the sp-trees and the d-trees, in order to capture declaratively the decisions made by the randomized SPG, as in (Walker, Rambow and Rogati, 2002). The features were automatically generated using feature templates. For this experiment, we use two classes of feature: (1) rule features, which are derived from the sp-trees and represent the ways in which merge, infer and cue-word operations are applied to the tp-trees (these feature names start with "rule"); and (2) sentence features, which are derived from the DSyntSs and describe the deep-syntactic structure of the utterance, including the chosen lexemes, so that some may be domain specific (these feature names are prefixed with "sent").

We now describe the feature templates used in the discovery process. Three templates were used for both sp-tree and d-tree features; two were used only for sp-tree features. Local feature templates record structural configurations local to a particular node (its ancestors, daughters, etc.). Global feature templates, which are used only for sp-tree features, record properties of the entire sp-tree. We discard features that occur fewer than 10 times, to avoid features specific to particular text plans.

There are four types of local feature template: traversal features, sister features, ancestor features and leaf features. Local feature templates are applied to all nodes in an sp-tree or d-tree (except that the leaf template is not used for d-trees); the value of the resulting feature is the number of occurrences of the described configuration in the tree. For each node in the tree, traversal features record the preorder traversal of the subtree rooted at that node, for all subtrees of all depths. An example is the feature "rule traversal assert-com-list exceptional" (with value 1) of the tree in Figure 7. Sister features record all consecutive sister nodes. An example is the feature "rule sisters PERIOD infer RELATIVE CLAUSE infer" (with value 1) of the tree in Figure 7. For each node in the tree, ancestor features record all the initial subpaths of the path from that node to the root. An example is the feature "rule ancestor PERIOD contrast*PERIOD infer" (with value 1) of the tree in Figure 7. Finally, leaf features record all initial substrings of the frontier of the sp-tree. For example, the sp-tree of Figure 7 has value 1 for the feature "leaf #assert-com-list exceptional#assert-com-cuisine".

Global features apply only to the sp-tree. For each sp-tree, and for each clause-combining operation labeling a non-frontier node, they record (1) the minimal number of leaves dominated by a node labeled with that operation in that tree (MIN); (2) the maximal number of leaves dominated by such a node (MAX); and (3) the average number of leaves dominated by such a node (AVG). For example, the sp-tree in Figure 7 has value 3 for "PERIOD infer max", value 2 for "PERIOD infer min" and value 2.5 for "PERIOD infer avg".
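The sketch below (Python) illustrates two of these templates, ancestor features and global MIN/MAX/AVG features, over a toy encoding of the sp-tree in Figure 7. The Node class and the underscore spellings of feature names are assumptions for exposition, but the global feature values reproduce the "PERIOD infer" example above.

    # Sketch (under assumed encodings) of the ancestor and global feature
    # templates, applied to a toy version of the sp-tree in Figure 7.

    class Node:
        def __init__(self, label, children=()):
            self.label = label
            self.children = list(children)

    sp_tree = Node("PERIOD_elaboration", [
        Node("assert-com-list_exceptional"),
        Node("PERIOD_contrast", [
            Node("PERIOD_infer", [
                Node("PERIOD_infer", [Node("assert-com-cuisine"),
                                      Node("assert-com-decor")]),
                Node("assert-com-service")]),
            Node("RELATIVE_CLAUSE_infer", [
                Node("assert-com-cuisine"),
                Node("MERGE_infer", [Node("assert-com-decor"),
                                     Node("assert-com-service")])])])])

    def ancestor_features(node, ancestors=()):
        """For every node, count each initial subpath of the path from the
        node up to the root (the ancestor feature template)."""
        feats = {}
        path = (node.label,) + ancestors
        for i in range(2, len(path) + 1):
            key = "rule_ancestor_" + "*".join(path[:i])
            feats[key] = feats.get(key, 0) + 1
        for child in node.children:
            for k, v in ancestor_features(child, path).items():
                feats[k] = feats.get(k, 0) + v
        return feats

    def num_leaves(node):
        return 1 if not node.children else sum(num_leaves(c) for c in node.children)

    def global_features(tree, op):
        """MIN/MAX/AVG number of leaves dominated by nodes labeled op."""
        counts, stack = [], [tree]
        while stack:
            n = stack.pop()
            if n.label == op:
                counts.append(num_leaves(n))
            stack.extend(n.children)
        return {op + "_min": min(counts), op + "_max": max(counts),
                op + "_avg": sum(counts) / len(counts)}

    feats = ancestor_features(sp_tree)
    print(feats["rule_ancestor_assert-com-service*MERGE_infer"])  # 1
    print(global_features(sp_tree, "PERIOD_infer"))
    # {'PERIOD_infer_min': 2, 'PERIOD_infer_max': 3, 'PERIOD_infer_avg': 2.5}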
5 Experimental Results

We report two sets of experiments. The first tests the ability of the SPR to select a high quality sentence plan from a population of sentence plans randomly generated by the SPG. Because the discriminatory power of the SPR is best tested on the largest possible population of sentence plans, we use 2-fold cross-validation for this experiment. The second experiment compares SPaRKy to template-based generation.

Cross-validation experiment: We repeatedly tested SPaRKy on the half of the corpus of 1756 sp-trees held out as test data for each fold. The evaluation metric is the human-assigned score of the variant rated highest by SPaRKy for each text plan, for each task/user combination. We evaluated SPaRKy on the test sets by comparing three data points for each text plan: HUMAN (the score of the sentence plan ranked highest by the human judges); SPARKY (the score of the sentence plan selected by the SPR); and RANDOM (the score of a sentence plan randomly selected from the alternate sentence plans).

We report results separately for comparisons between two entities and comparisons among three or more entities. These two types of comparison are generated using different strategies in the SPG, and can produce text that differs considerably in both length and structure.

  Strategy    System  Min  Max  Mean  S.D.
  Recommend   SPaRKy  2.0  5.0  3.6   .71
              HUMAN   2.5  5.0  3.9   .55
              RANDOM  1.5  5.0  2.9   .88
  Compare2    SPaRKy  2.5  5.0  3.9   .71
              HUMAN   2.5  5.0  4.4   .54
              RANDOM  1.0  5.0  2.9   1.3
  Compare3    SPaRKy  1.5  4.5  3.4   .63
              HUMAN   3.0  5.0  4.0   .49
              RANDOM  1.0  4.5  2.7   1.0

Table 1: Summary of Recommend, Compare2 and Compare3 results (N = 180)

Table 1 summarizes the differences between SPaRKy, HUMAN and RANDOM for recommendations, comparisons between two entities and comparisons among three or more entities. For all three presentation types, paired t-tests comparing SPaRKy with HUMAN and with RANDOM showed that SPaRKy was significantly better than RANDOM (df = 59, p < .001) and significantly worse than HUMAN (df = 59, p < .001). This demonstrates that the use of a trainable sentence planner can lead to sentence plans that are significantly better than baseline (RANDOM), with less human effort than programming templates.
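For concreteness, the snippet below (Python) computes the three data points for a single text plan from the scores in Figure 3; it sketches the evaluation metric only, not the actual experiment code.

    import random

    # Human and SPR scores for the alternatives of one text plan (Figure 3).
    alternatives = [
        {"alt": 2, "human": 3.0, "spr": 0.28},
        {"alt": 5, "human": 2.5, "spr": 0.14},
        {"alt": 6, "human": 4.0, "spr": 0.70},
    ]

    human  = max(a["human"] for a in alternatives)               # best human-rated plan
    sparky = max(alternatives, key=lambda a: a["spr"])["human"]  # human score of the SPR's pick
    rand   = random.choice(alternatives)["human"]                # random baseline

    print(human, sparky, rand)
    # For this text plan the SPR's pick (Alt 6) matches the human favorite: 4.0 4.0 ...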
Comparison with template-based generation: For each content plan input to SPaRKy, the judges also rated the output of a template-based generator for MATCH. This generator performs text planning and sentence planning (the focus of the current paper), including some discourse cue insertion, clause combining and referring expression generation; the templates themselves are described in (Walker et al., 2002). Because the templates are highly tailored to this domain, this generator can be expected to perform well. Example template-based and SPaRKy outputs for a comparison among three or more items are shown in Figure 9.

  System    Realization                                                   H
  Template  Among the selected restaurants, the following offer           4.5
            exceptional overall value. Uguale's price is 33 dollars.
            It has good decor and very good service. It's a French,
            Italian restaurant. Da Andrea's price is 28 dollars. It
            has good decor and very good service. It's an Italian
            restaurant. John's Pizzeria's price is 20 dollars. It has
            mediocre decor and decent service. It's an Italian, Pizza
            restaurant.
  SPaRKy    Da Andrea, Uguale, and John's Pizzeria offer exceptional      4
            value among the selected restaurants. Da Andrea is an
            Italian restaurant, with very good service, it has good
            decor, and its price is 28 dollars. John's Pizzeria is an
            Italian, Pizza restaurant. It has decent service. It has
            mediocre decor. Its price is 20 dollars. Uguale is a
            French, Italian restaurant, with very good service. It has
            good decor, and its price is 33 dollars.

Figure 9: Comparisons between 3 or more items. H = Humans' score.

  Strategy    System    Min  Max   Mean   S.D.
  Recommend   Template  2.5  5.0   4.22   0.74
              SPaRKy    2.5  4.5   3.57   0.59
              HUMAN     4.0  5.0   4.37   0.37
  Compare2    Template  2.0  5.0   3.62   0.75
              SPaRKy    2.5  4.75  3.87   0.52
              HUMAN     4.0  5.0   4.62   0.39
  Compare3    Template  1.0  5.0   4.08   1.23
              SPaRKy    2.5  4.25  3.375  0.38
              HUMAN     4.0  5.0   4.63   0.35

Table 2: Summary of template-based generation results (N = 180)

Table 2 shows the mean HUMAN scores for template-based sentence planning. A paired t-test comparing HUMAN and template-based scores showed that HUMAN was significantly better than template-based sentence planning only for Compare2 (df = 29, t = 6.2, p < .001); the judges evidently did not like the template for comparisons between two items. A paired t-test comparing SPaRKy and template-based sentence planning showed that template-based sentence planning was significantly better than SPaRKy only for recommendations (df = 29, t = 3.55, p < .01). These results demonstrate that trainable sentence planning shows promise for producing output comparable to that of a template-based generator, with less programming effort and more flexibility.

The standard deviation for all three template-based strategies was wider than for HUMAN or SPaRKy, indicating that there may be content-specific aspects to the sentence planning done by SPaRKy that contribute to output variation. The data show this to be correct: SPaRKy learned content-specific preferences about clause combining and discourse cue insertion that a template-based generator cannot easily model, but that a trainable sentence planner can. For example, Table 3 shows the nine rules generated on the first test fold that have the largest negative impact on the final RankBoost score (rules 1-7) and the largest positive impact (rules 8-9), for comparisons among three or more entities. The rule with the largest positive impact shows that SPaRKy learned to prefer that justifications involving price be merged with other information using a conjunction.

  N  Condition                                                       αs
  1  sent anc PROPERNOUN RESTAURANT*HAVE1 ≥ 16.5                     -0.859
  2  sent anc II Upper East Side*ATTR IN1*locate ≥ 4.5               -0.852
  3  sent anc PERIOD infer*PERIOD infer*PERIOD elaboration ≥ -∞      -0.542
  4  rule anc assert-com-service*MERGE infer ≥ 1.5                   -0.356
  5  sent tvl depth 0 BE3 ≥ 4.5                                      -0.346
  6  rule anc PERIOD infer*PERIOD infer*PERIOD elaboration ≥ -∞      -0.345
  7  rule anc assert-com-decor*PERIOD infer*PERIOD infer             -0.342
     *PERIOD contrast*PERIOD elaboration ≥ -∞
  8  rule anc assert-com-food quality*MERGE infer ≥ 1.5               0.398
  9  rule anc assert-com-price*CW CONJUNCTION infer                   0.527
     *PERIOD justify ≥ -∞

Table 3: The nine rules generated on the first test fold with the largest negative impact on the final RankBoost score (rules 1-7) and the largest positive impact (rules 8-9), for Compare3. αs represents the increment or decrement associated with satisfying the condition.
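To clarify how the rules in Table 3 act on a sentence plan's score: each rule is a threshold test on a single feature value, and under the usual boosted-ruleset reading the final RankBoost score of an sp-tree is the sum of the αs values of the rules it satisfies (a condition of "≥ -∞" is satisfied by every example, since any feature value, including 0 for an absent feature, exceeds -∞). The sketch below (Python) illustrates this scoring form with three abridged rules from Table 3; it reflects our reading of the table, not SPaRKy's actual scoring code.

    # Hedged sketch of rule-based scoring in the style of Table 3.
    rules = [
        # (feature name, threshold, alpha), abridged from Table 3
        ("rule_anc_assert-com-service*MERGE_infer", 1.5, -0.356),
        ("rule_anc_assert-com-food_quality*MERGE_infer", 1.5, 0.398),
        ("rule_anc_assert-com-price*CW_CONJUNCTION_infer*PERIOD_justify",
         float("-inf"), 0.527),
    ]

    def rankboost_score(features, rules):
        """Sum the alpha of every rule whose feature value meets its
        threshold; a threshold of -inf fires for every example."""
        return sum(alpha for name, threshold, alpha in rules
                   if features.get(name, 0.0) >= threshold)

    # Feature counts for a hypothetical sp-tree that merges a price
    # justification using a conjunction but never merges service assertions.
    features = {"rule_anc_assert-com-price*CW_CONJUNCTION_infer*PERIOD_justify": 1.0}
    print(rankboost_score(features, rules))  # 0.527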
These rules are also specific to presentation type. Averaging over both folds of the experiment, the number of unique features appearing in rules is 708, of which 66 appear in the rule sets for two presentation types and 9 appear in the rule sets for all three presentation types. There are on average 214 rule features, 428 sentence features and 26 leaf features. The majority of the features are ancestor features (319), followed by traversal features (264) and sister features (60); the remaining 67 features are for specific lexemes.

To sum up, this experiment shows that the ability to model the interactions between domain content, task and presentation type is a strength of the trainable approach to sentence planning.

6 Conclusions

This paper shows that the training technique used in SPoT can be easily extended to a new domain and used for information presentation as well as information gathering. Previous work on SPoT also compared trainable sentence planning to a template-based generator that had previously been developed for the same application (Rambow et al., 2001). The evaluation results for SPaRKy (1) support the results for SPoT, by showing that trainable sentence generation can produce output comparable to that of template-based generation, even for complex information presentations such as extended comparisons; and (2) show that trainable sentence generation is sensitive to variations in domain application, presentation type, and even human preferences about the arrangement of particular types of information.

7 Acknowledgments

We thank AT&T for supporting this research, and the anonymous reviewers for their helpful comments on this paper.

References

S. E. Brennan, M. Walker Friedman, and C. J. Pollard. 1987. A centering approach to pronouns. In Proceedings of the 25th Annual Meeting of the ACL, pages 155-162.

L. Danlos. 2000. G-TAG: A lexicalized formalism for text generation inspired by tree adjoining grammar. In Tree Adjoining Grammars: Formalisms, Linguistic Analysis, and Processing. CSLI Publications.

M. Johnston, S. Bangalore, G. Vasireddy, A. Stent, P. Ehlen, M. Walker, S. Whittaker, and P. Maloor. 2002. MATCH: An architecture for multimodal dialogue systems. In Proceedings of the Annual Meeting of the ACL.

A. Knott, J. Oberlander, M. O'Donnell, and C. Mellish. 2001. Beyond elaboration: the interaction of relations and focus in coherent text. In Text Representation: Linguistic and Psycholinguistic Aspects, pages 181-196.

I. Langkilde. 2000. Forest-based statistical sentence generation. In Proceedings of NAACL 2000.

B. Lavoie and O. Rambow. 1997. A fast and portable realizer for text generation systems. In Proceedings of the Fifth Conference on Applied Natural Language Processing (ANLP-97), pages 265-268.

W. C. Mann and S. A. Thompson. 1987. Rhetorical structure theory: A framework for the analysis of texts. Technical Report RS-87-190, USC/Information Sciences Institute.

D. Marcu. 1997. From local to global coherence: a bottom-up approach to text planning. In Proceedings of the National Conference on Artificial Intelligence (AAAI-97).

I. A. Melčuk. 1988. Dependency Syntax: Theory and Practice. SUNY Press, Albany, NY.

C. Mellish, A. Knott, J. Oberlander, and M. O'Donnell. 1998. Experiments using stochastic search for text planning. In Proceedings of INLG-98.

O. Rambow and T. Korelsky. 1992. Applied text generation. In Proceedings of the Third Conference on Applied Natural Language Processing (ANLP-92), pages 40-47.

O. Rambow, M. Rogati, and M. A. Walker. 2001. Evaluating a trainable sentence planner for a spoken dialogue travel system. In Proceedings of the Annual Meeting of the ACL.

R. E. Schapire. 1999. A brief introduction to boosting. In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI-99).

D. R. Scott and C. Sieckenius de Souza. 1990. Getting the message across in RST-based text generation. In Current Research in Natural Language Generation, pages 47-73.

A. Stent, M. Walker, S. Whittaker, and P. Maloor. 2002. User-tailored generation for spoken dialogue: an experiment. In Proceedings of ICSLP 2002.

M. A. Walker, S. J. Whittaker, A. Stent, P. Maloor, J. D. Moore, M. Johnston, and G. Vasireddy. 2002. Speech-Plans: generating evaluative responses in spoken dialogue. In Proceedings of INLG-02.

M. Walker, O. Rambow, and M. Rogati. 2002. Training a sentence planner for spoken dialogue using boosting. Computer Speech and Language, Special Issue on Spoken Language Generation.