An implemented model of punning riddles
Kim Binsted and Graeme Ritchie
Department of Artificial Intelligence, University of Edinburgh
Edinburgh, Scotland EH1 1HN
kimb@aisb.ed.ac.uk / graeme@aisb.ed.ac.uk

(Thanks are due to Canada Student Loans, the Overseas Research Students Scheme, and the St Andrew's Society of Washington, DC, for their financial support.)

Abstract

In this paper, we discuss a model of simple question-answer punning, implemented in a program, JAPE-1, which generates riddles from humour-independent lexical entries. The model uses two main types of structure: schemata, which determine the relationships between key words in a joke, and templates, which produce the surface form of the joke. JAPE-1 succeeds in generating pieces of text that are recognizably jokes, but some of them are not very good jokes. We mention some potential improvements and extensions, including post-production heuristics for ordering the jokes according to quality.

Humour and artificial intelligence

If a suitable goal for AI research is to get a computer to do ". . . a task which, if done by a human, requires intelligence to perform" (Minsky 1963), then the production of humorous texts, including jokes and riddles, is a fit topic for AI research. As well as probing some intriguing aspects of the notion of "intelligence", it has the methodological advantage (unlike, say, computer art) of leading to more directly falsifiable theories: the resulting humorous artefacts can be tested on human subjects.

Although no computationally tractable model of humour as a whole has yet been developed (see (Attardo & Raskin 1991) for a general theory of verbal humour, and (Attardo 1994) for a comprehensive survey), we believe that by tackling a very limited and linguistically-based set of phenomena, it is realistic to start developing a formal symbolic account.

One very common form of humour is the question-answer joke, or riddle. Most of these jokes (e.g. almost a third of the riddles in the Crack-a-Joke Book (Webb 1978)) are based on some form of pun. For example:

    What do you use to flatten a ghost? A spirit level. (Webb 1978)

This riddle is of a general sort which is of particular interest for a number of reasons. The linguistics of riddles has been investigated before (e.g. (Pepicello & Green 1984)). Also, there is a large corpus of riddles to examine: books such as (Webb 1978) record them by the thousand. Finally, riddles exhibit more regular structures and mechanisms than some other forms of humour.

We have devised a formal model of the punning mechanisms underlying some subclasses of riddle, and have implemented a computer program which uses these symbolic rules and structures to construct punning riddles from a humour-independent (i.e. linguistically general) lexicon. An informal evaluation of the performance of this program suggests that its output is not significantly worse than that produced by human composers of such riddles.
Punning riddles

Pepicello and Green (1984) describe the various strategies incorporated in riddles. They hold the common view that humour is closely related to ambiguity, whether it be linguistic (such as the phonological ambiguity in a punning riddle) or contextual (such as riddles that manipulate social conventions to confuse the listener). What the linguistic strategies have in common is that they ask the "riddlee" to accept a similarity on a phonological, morphological, or syntactic level as a point of semantic comparison, and thus get fooled (cf. "iconism" (Attardo 1994)). Riddles of this type are known as puns.

We decided to select a subset of riddles which displayed regularities at the level of semantic, or logical, structure, and whose structures could be described in fairly conventional linguistic terms (simple lexical relations). As a sample of existing riddles, we studied The Crack-a-Joke Book (Webb 1978), a collection of jokes chosen by British children. These riddles are simple, and their humour generally arises from their punning nature, rather than their subject matter. This sample does not represent sophisticated adult humour, but it suffices for an initial exploration.

There are three main strategies used in puns to exploit phonological ambiguity: syllable substitution, word substitution, and metathesis. This is not to say that other strategies do not exist; however, none were found among the large number of punning jokes examined.

Syllable substitution: Puns using this strategy confuse a syllable (or syllables) in a word with a similar- or identical-sounding word. For example:

    What do short-sighted ghosts wear? Spooktacles. (Webb 1978)

Word substitution: Word substitution is very similar to syllable substitution. In this strategy, an entire word is confused with another similar- or identical-sounding word. For example:

    How do you make gold soup? Put fourteen carrots in it. (Webb 1978)

Metathesis: Metathesis is quite different from syllable or word substitution. Also known as spoonerism, it uses a reversal of sounds and words to suggest (wrongly) a similarity in meaning between two semantically-distinct phrases. For example:

    What's the difference between a very short witch and a deer running from hunters? One's a stunted hag and the other's a hunted stag. (Webb 1978)
All three of the above-described types of pun are potentially tractable for detailed formalisation and hence computer generation. We chose to generate only word-substitution puns, simply because lists of phonologically identical words (homonyms) are readily available, whereas the other two types require some kind of sub-word comparison. In particular, the class of jokes which we chose to generate all: use word substitution; have the substituted word in the punchline of the joke, rather than the question; and substitute a homonym for a word in a common noun phrase (cf. the "spirit level" riddle cited earlier). These restrictions are simply to reduce the scope of the research even further, so that the chosen subset of jokes can be covered in a comprehensive, rigorous manner. We believe that our basic model, with some straightforward extensions, is general enough to cover other forms.
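To make the chosen class concrete, here is a minimal sketch of ours (not part of the original model; the homophone table and phrases are invented examples) of building a fake phrase by substituting a homonym into a common noun phrase:

    # Sketch: build candidate punchlines by swapping one word of a
    # common noun phrase for a homophone. The data here is illustrative.
    HOMOPHONES = {"serial": ["cereal"], "fur": ["fir"]}

    def fake_phrases(noun_phrase: str) -> list[str]:
        """Phrases made by replacing one word with a homophone."""
        words = noun_phrase.split()
        results = []
        for i, word in enumerate(words):
            for sound_alike in HOMOPHONES.get(word, []):
                results.append(" ".join(words[:i] + [sound_alike] + words[i + 1:]))
        return results

    print(fake_phrases("serial killer"))   # ['cereal killer']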
Symbolic descriptions

Our analysis of word-substitution riddles is based (semi-formally) on the following essential items, related as shown in Figure 1:

• a valid English word/phrase
• the meaning of the word/phrase
• a shorter word, phonologically similar to part of the word/phrase
• the meaning of the shorter word
• a fake word/phrase, made by substituting the shorter word into the word/phrase
• the meaning of the fake word/phrase, made by combining the meanings of the original word/phrase and the shorter word.

[Figure 1: The relationships between parts of a pun. Each valid form has a meaning; substituting valid word 2 into valid word/phrase 1 constructs the fake word/phrase, and the two meanings jointly construct the fake meaning.]
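These six items can be held in a small record. The sketch below is ours (the field names are ad hoc); it instantiates the structure for the "trunkquillizer" riddle analysed in the next paragraph:

    from dataclasses import dataclass

    @dataclass
    class PunStructure:
        valid_form: str       # a valid English word/phrase
        valid_meaning: str    # the meaning of the word/phrase
        short_form: str       # shorter word, phonologically similar to part of it
        short_meaning: str    # the meaning of the shorter word
        fake_form: str        # shorter word substituted into the valid form
        fake_meaning: str     # combination of the two meanings

    example = PunStructure(
        valid_form="tranquillizer",
        valid_meaning="a sedative drug",
        short_form="trunk",
        short_meaning="part of an elephant",
        fake_form="trunkquillizer",
        fake_meaning="a tranquillizer for elephants",
    )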
At this point, it is important to distinguish between the mechanism for building the meaning of the fake word/phrase, and the mechanism that uses that meaning to build a question with the word/phrase as an answer. Consider the joke:

    What do you give an elephant that's exhausted? Trunkquillizers. (Webb 1978)

In this joke, the word "trunk", which is phonologically similar to the syllable "tranq", is substituted into the valid English word "tranquillizer". The resulting fake word "trunkquillizer" is given a meaning, referred to in the question part of the riddle, which is some combination of the meanings of "trunk" and "tranquillizer" (in this case, a tranquillizer for elephants). The following questions use the same meaning for "trunkquillizer", but refer to that meaning in different ways:

• What do you use to sedate an elephant?
• What do you call elephant sedatives?
• What kind of medicine do you give to a stressed-out elephant?

On the other hand, these questions are all put together in the same way, but from different constructed meanings:

• What do you use to sedate an elephant?
• What do you use to sedate a piece of luggage?
• What do you use to medicate a nose?

We have adopted the term schema for the symbolic description of the underlying configuration of meanings and words, and template for the textual patterns used to construct a question-answer pair.

Lexicon

Our minimal assumptions about the structure of the lexicon are as follows. There is a (finite) set of lexemes. A lexeme is an abstract entity, roughly corresponding to a meaning of a word or phrase. Each lexeme has exactly one entry in the lexicon, so if a word has two meanings, it will have two corresponding lexemes. Each lexeme may have some properties which are true of it (e.g. being a noun), and there are a number of possible relations which may hold between lexemes (e.g. synonym, homonym, subclass). Each lexeme is also associated with a near-surface form which indicates (roughly) the written form of the word or phrase.
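A minimal rendering of these assumptions, as a sketch of our own (the particular properties and relations shown are examples only, not JAPE-1's actual data structures):

    from dataclasses import dataclass, field

    @dataclass
    class Lexeme:
        name: str                  # unique key: one entry per meaning, e.g. "jumper_1"
        near_surface_form: str     # roughly the written form of the word/phrase
        properties: set[str] = field(default_factory=set)   # e.g. {"noun"}

    jumper_1 = Lexeme("jumper_1", "jumper", {"noun"})
    jumper_2 = Lexeme("jumper_2", "jumper", {"noun"})   # second meaning, own entry

    # Relations hold between lexemes; one table per relation name.
    relations = {
        "synonym": {("jumper_1", "sweater_1")},
        "homonym": {("jumper_1", "jumper_2")},
    }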
Schemata

A schema stipulates a set of relationships which must hold between the lexemes used to build a joke. More specifically, a schema determines how real words/phrases are glued together to make a fake word/phrase, and which parts of the lexical entries for real words/phrases are used to construct the meaning of the fake word/phrase.

There are many different possible schemata (with obscure symbolic labels which the reader can ignore). For example, the schema in Figure 2 constructs a fake phrase by substituting a homonym for the first word in a real phrase, then builds its meaning from the meaning of the homonym and the real phrase.

[Figure 2: The lotus schema. A homophone of word 1 joins word 2 to form the constructed phrase; characteristic links from the homophone and from the noun phrase supply the constructed meaning; word 1 and word 2 form the original noun phrase.]

The schema shown in Figure 2 is uninstantiated; that is, the actual lexemes to use have not yet been specified. Moreover, some of the relationships are still quite general: the characteristic link merely indicates that some lexical relationship must be present, and the homonym link allows either a homophone or the same word with an alternative meaning. Instantiating a schema means inserting lexemes in the schema, and specifying the exact relationships between those lexemes (i.e. making exact the characteristic links). For example, in the lexicon, the lexeme spring_cabbage might participate in relations as follows:

    class(spring_cabbage, vegetable)
    location(spring_cabbage, garden)
    action(spring_cabbage, grows)
    adjective(spring_cabbage, green)
    ....

If spring_cabbage were to be included in a schema, at one end of a characteristic link, the other end of the link could be associated with any one, or any combination of, these values (vegetable, garden, etc), depending on the exact label (class, location, etc.) chosen for the characteristic link.

[Figure 3: A completely instantiated lotus schema. The homophone spring_2 contributes "bounces" via an act_verb link, and the noun phrase spring_cabbage contributes "green" via an adjective link, giving the constructed meaning.]

The completely instantiated lotus schema in Figure 3 could (with an appropriate template, see below) be used to construct the joke:

    What's green and bounces? A spring cabbage. (Webb 1978)
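Read as a constraint, instantiating a schema amounts to a search over the lexical relations. The following is our own sketch of how the lotus schema might be matched (the toy relation tables and the search loop are illustrative, not JAPE-1's code):

    # Sketch: match the lotus schema against toy lexical relations.
    noun_phrases = {"spring_cabbage": ("spring_1", "cabbage_1")}
    homonym = {"spring_1": "spring_2"}   # same word, alternative meaning
    characteristics = {
        "spring_2": [("act_verb", "bounces")],
        "spring_cabbage": [("adjective", "green")],
    }

    def instantiate_lotus():
        for phrase, (word1, _word2) in noun_phrases.items():
            if word1 not in homonym:
                continue
            # The constructed meaning combines a characteristic of the
            # homonym with a characteristic of the original phrase.
            for _, char1 in characteristics.get(homonym[word1], []):
                for _, char2 in characteristics.get(phrase, []):
                    yield phrase, (char1, char2)

    print(list(instantiate_lotus()))   # [('spring_cabbage', ('bounces', 'green'))]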
Templates

A template is used to produce the surface form of a joke from the lexemes and relationships specified in an instantiated schema. Templates are not inherently humour-related. Given a (real or nonsense) noun phrase, and a meaning for that noun phrase (genuine or constructed), a template builds a suitable question-answer pair. Because of the need to provide a suitable amount of information in the riddle question, every schema has to be associated with a set of appropriate templates. Notice that the precise choice of relations for the under-specified "characteristic" links will also affect the appropriateness of a template. (Conversely, one could say that the choice of template influences the choice of lexical relation for the characteristic link, and this is in fact how we have implemented it.) Abstractly, a template is a mechanism which maps a set of lexemes (from the instantiated schema) to the surface form of a joke.

The JAPE-1 computer program

Introduction

We have implemented the model described earlier in a computer program called JAPE-1, which produces the chosen subtype of jokes: riddles that use homonym substitution and have a noun phrase punchline. Such riddles are representative of punning riddles in general, and include approximately one quarter of the punning riddles in (Webb 1978).

JAPE-1 is significantly different from other attempts to computationally generate humour in various ways: its lexicon is humour-independent (i.e. the structures that generate the riddles are distinct from the semantic and syntactic data they manipulate), and it generates riddles that are similar on a strategic and structural level, rather than in surface form.

JAPE-1's main mechanism attempts to construct a punning riddle based on a common noun phrase. It has several distinct knowledge bases with which to accomplish this task: the lexicon (including the homonym base), a set of schemata, a set of templates, and a post-production checker.

Lexicon

The lexicon contains humour-independent semantic and syntactic information about the words and noun phrases entered in it, in the form of "slots" which can contain other lexemes or may contain other symbols. A typical entry might be:

    lexeme = jumper_1
    countable = yes
    category = noun
    class = clothing
    written_form = "jumper"
    specifying_adj = warm
    vowel_start = no
    synonym = sweater

Although the lexicon stores syntactic information, the amount of syntax used by the rest of the program is minimal. Because the templates are based on certain fixed forms, the only necessary syntactic information has to do with the syntactic category, verb person, and determiner agreement. Also, the lexicon need only contain entries for nouns, verbs, adjectives, and common noun phrases; other types of word (conjunctions, determiners, etc) are built into the templates. Moreover, because the model implemented in JAPE-1 is restricted to covering riddles with noun phrase punchlines, the schemata require semantic information only for nouns and adjectives.

The "homonym" relation between lexemes was implemented as a separate homonym base derived from a list (Townsend & Antworth 1993) of homophones in American English, shortened considerably for our purposes. The list now contains only common, concrete nouns and adjectives. The homonym base also includes words with two distinct meanings (e.g. "lemon", the fruit, and "lemon", slang for a low-quality car).
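The homonym base itself can be pictured as a symmetric pair table. This is a sketch of ours (the entries echo examples from the text; the lookup interface is assumed, not JAPE-1's):

    # Sketch of the homonym base: pairs of lexemes with identical sound,
    # covering true homophones and one spelling with two meanings.
    HOMONYM_PAIRS = [
        ("cereal_1", "serial_1"),   # homophones
        ("lemon_1", "lemon_2"),     # the fruit vs. a low-quality car
    ]

    def homonyms(lexeme: str) -> list[str]:
        """Lexemes that sound like the given one, in either pair order."""
        return ([b for a, b in HOMONYM_PAIRS if a == lexeme] +
                [a for a, b in HOMONYM_PAIRS if b == lexeme])

    print(homonyms("lemon_2"))   # ['lemon_1']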
Schemata

JAPE-1 has a set of six schemata, one of which is the jumper schema, shown in Figure 4. The same schema, instantiated in two different ways, is shown in Figure 5 and Figure 6.

[Figure 4: The uninstantiated jumper schema. Characteristic links from word 1 and from a homophone of word 2 supply the constructed meaning; word 1 plus the homophone form the constructed phrase; word 1 and word 2 form the original noun phrase.]

[Figure 5: The instantiated jumper schema, with describes_all links to sheep and kangaroo, suitable for the syn_syn template. Gives the riddle: What do you get when you cross a sheep and a kangaroo? A woolly jumper.]

[Figure 6: The instantiated jumper schema, with a describes_all link to sheep and an act_verb link to leap, suitable for the syn_verb template. Gives the riddle: What do you call a sheep that can leap? A woolly jumper.]
Templates

Since riddles often use certain fixed forms (for example, "What do you get when you cross ___ with ___?"), JAPE-1's templates embody such standard forms. A JAPE-1 template consists of some fragments of canned text with "slots" where generated words or phrases can be inserted, derived from the lexemes in an instantiated schema. For example, the syn_syn template:

    What do you get when you cross [text fragment generated from the
    first characteristic lexeme(s)] with [text fragment generated from
    the second characteristic lexeme(s)]? [the constructed noun phrase].

A template also specifies the values it requires to be used for "characteristic" links in the schema; the describes_all labels in Figure 5 are derived from the syn_syn template. When the schema has been fully instantiated, JAPE-1 selects one of the associated templates, generates text fragments from the lexemes, and slots those fragments into the template.

Another template which can be used with the jumper schema (see Figure 6) is the syn_verb template:

    What do you call [text fragment generated from the first
    characteristic lexeme(s)] that [text fragment generated from the
    second characteristic lexeme(s)]? [the constructed noun phrase].
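Filling a template is then a matter of slotting generated fragments into the canned text. A sketch of the syn_syn form (the function and the hand-written fragments are ours, not JAPE-1's code):

    # Sketch: realise the syn_syn template from pre-generated fragments.
    def syn_syn(fragment1: str, fragment2: str, constructed_np: str) -> str:
        return (f"What do you get when you cross {fragment1} "
                f"with {fragment2}? {constructed_np.capitalize()}.")

    print(syn_syn("a sheep", "a kangaroo", "a woolly jumper"))
    # What do you get when you cross a sheep with a kangaroo? A woolly jumper.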
Post-production checking

To improve the standard of the jokes slightly, some simple checks are made on the final form. The first is that none of the lexemes used to build the question and punchline are accidentally identical; the second is that the lexemes used to build the nonsense noun phrase and its meaning do not build a genuine common noun phrase.
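In outline, the two checks could look like this (a sketch; we read the first check as forbidding any lexeme from appearing on both sides of the riddle, which is one plausible interpretation):

    # Sketch of the post-production checks on a finished riddle.
    def passes_checks(question_lexemes: set, punchline_lexemes: set,
                      fake_np: str, real_noun_phrases: set) -> bool:
        # Check 1: no lexeme may be used in both question and punchline.
        if question_lexemes & punchline_lexemes:
            return False
        # Check 2: the "nonsense" noun phrase must not already be a
        # genuine common noun phrase in the lexicon.
        if fake_np in real_noun_phrases:
            return False
        return True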
The evaluation procedure

An informal evaluation of JAPE-1 was carried out, with three stages: data acquisition, common knowledge judging and joke judging. During the data acquisition stage, volunteers unfamiliar with JAPE-1 were asked to make lexical entries for a set of words given to them. These definitions were then sifted by a "common knowledge judge" (simply to check for errors and excessively obscure suggestions), entered into JAPE-1's lexicon, and a substantial set of jokes was produced. A different group of volunteers then gave verdicts, both quantitative and qualitative, on these jokes. The use of volunteers to write lexical entries was a way of making the testing slightly more rigorous. We did not have access to a suitable large lexicon, but if we had hand-crafted the entries ourselves there would have been the risk of bias (i.e. humour-oriented information) creeping in.

JAPE-1 produced a set of 188 jokes in near-surface form, which were distributed in batches to 14 judges, who gave the jokes scores on a scale from 0 ("Not a joke. Doesn't make any sense.") to 5 ("Really good"). They were also asked for qualitative information, such as how the jokes might be improved, and whether they had heard any of the jokes before.

This testing was not meant to be statistically rigorous. However, when it comes to analyzing the data, this lack of rigour causes some problems. Because there were so few jokes and joke judges, the scores are not statistically significant. Moreover, there was no control group of jokes. We suspect that jokes of this genre are not very funny even when they are produced by humans; however, we do not know how human-produced jokes would fare if judged in the same way JAPE-1's jokes were, so it is difficult to make the comparison. Ideally, with hindsight, JAPE-1's jokes would have been mixed with similar jokes (from (Webb 1978), for example), and then all the jokes would have been judged by a group of schoolchildren, who would be less likely to have heard the jokes before and more likely to appreciate them.

The results of the testing are summarised in Figure 7. The average score over all 188 jokes JAPE-1 produced from the lexical data provided by volunteers is 1.5 points. Most of the jokes were given a score of 1. Interestingly, all of the nine jokes that were given the maximum score of five by one judge were given low scores by the other judge: three got zeroes, three got ones, and three got twos. Overall, the current version of JAPE-1 produced, according to the scores the judges gave, "jokes, but pathetic ones". The top end of the output is definitely of Crack-a-Joke book quality, and some (according to the judges) existed already as jokes, including:

    What do you call a murderer that has fibre? A cereal killer.
    What kind of tree can you wear? A fir coat.
    What kind of rain brings presents? A bridal shower.
    What do you call a good-looking taxi? A handsome cab.
    What do you call a perforated relic? A holey grail.
    What kind of pig can you ignore at a party? A wild bore.
    What kind of emotion has bits? A love byte.

[Figure 7: The point distribution over all the output; a histogram of number of jokes (y-axis, 0 to 62.5) against points scored (x-axis, 0 to 5).]

It was clear from the evaluation that some schemata and templates tended to produce better jokes than others. For example, the use_syn template produced several texts that were judged to be non-jokes, such as:

    What do you use to hit a waiting line? A pool queue.

The problem with this template is probably that it uses the definition constructed by the schema inappropriately. The schema-generated definition is "nonsense", in that it describes something that doesn't exist; nonetheless, the word order of the punchline does contain some semantic information (i.e. which of its words is the object and which word describes that object), and it is important for the question to reflect that information. A more appropriate template, class_has_rev, produced this joke:

    What kind of line has sixteen balls? A pool queue.

which the judges gave an average of two points.

Another problem was that the definitions provided by the volunteers were often too general for our purposes. For example, the entry for the word "hanger" gave its class as device, producing jokes like:

    What kind of device has wings? An aeroplane hanger.

which scored half a point.

Conclusions

This evaluation has accomplished two things. It has shown that JAPE-1 can produce pieces of text that are recognizably jokes (if not very good ones) from a relatively unbiased lexicon. More importantly, it has suggested some ways that JAPE-1 could be improved:

• The description of the lexicon could be made more precise, so that it is easier for people unfamiliar with JAPE-1 to make appropriate entries. Moreover, multiple versions of an entry could be compared for "common knowledge", and that common knowledge entered in the lexicon.
• More slots could be added to the lexicon, allowing the person entering words to specify what a thing is made of, what it uses, and/or what it is part of.
• New, more detailed templates could be added, such as ones which would allow more complex punchlines.
• Templates and schemata that give consistently poor results could be removed.
• The remaining templates could be adjusted so that they use the lexical data more gracefully, by providing the right amount of information in the question part of the riddle.
• Schema-template links that give consistently poor results could be removed.
• JAPE-1 could be extended to handle other joke types, such as simple spoonerisms and sub-word puns.

If even the simplest of the trimming and ordering heuristics described above were implemented, JAPE-1's output would be restricted to good-quality punning riddles. Although there is certainly room for improvement in JAPE-1's performance, it does produce recognizable jokes in accordance with a model of punning riddles, which has not been done successfully by any other program we know of. In that, it is a success.
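As an illustration only (JAPE-1 did not include this; the scores and threshold below are hypothetical), the simplest trimming heuristic above might amount to no more than:

    # Sketch: drop schema-template pairings whose jokes score poorly on
    # average; the judge scores per pairing here are invented.
    scores = {
        ("jumper", "syn_syn"): [2, 1, 3],
        ("lotus", "use_syn"): [0, 0, 1],
    }
    THRESHOLD = 1.0   # arbitrary cut-off on the 0-5 judging scale

    def kept_pairings():
        return [pair for pair, judged in scores.items()
                if sum(judged) / len(judged) >= THRESHOLD]

    print(kept_pairings())   # [('jumper', 'syn_syn')]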
Acknowledgments

We would like to thank Salvatore Attardo for letting us have access to his unpublished work, and for his comments on the research reported here.

References

Attardo, S., and Raskin, V. 1991. Script theory revis(it)ed: joke similarity and joke representation model. Humor 4(3):293–347.

Attardo, S. 1994. Linguistic Theories of Humour. Berlin: Mouton de Gruyter.

Binsted, K., and Ritchie, G. 1994. A symbolic description of punning riddles and its computer implementation. Research Paper 688, University of Edinburgh, Edinburgh, Scotland.

Ephratt, M. 1990. What's in a joke. In Golumbic, M., ed., Advances in AI: Natural Language and Knowledge Based Systems. Springer Verlag. 43–74.

Minsky, M. 1963. Steps towards artificial intelligence. In Feigenbaum, E., and Feldman, J., eds., Computers and Thought. McGraw-Hill. 406–450.

Minsky, M. 1980. Jokes and the logic of the cognitive unconscious. Technical report, Massachusetts Institute of Technology, Artificial Intelligence Laboratory.

Palma, P. D., and Weiner, E. J. 1992. Riddles: accessibility and knowledge representation. In Proceedings of the 15th International Conference on Computational Linguistics (COLING-92), volume 4. 1121–1125.

Pepicello, W., and Green, T. 1984. The Language of Riddles. Ohio State University Press.

Townsend, W., and Antworth, E. 1993. Handbook of Homophones (online version).

Webb, K., ed. 1978. The Crack-a-Joke Book. Puffin.