GEDI: GENERATIVE DISCRIMINATOR GUIDED SEQUENCE GENERATION
WARNING: This paper contains GPT-3 outputs which are offensive in nature.

Ben Krause∗, Akhilesh Deepak Gotmare∗, Bryan McCann†, Nitish Shirish Keskar, Shafiq Joty, Richard Socher†, Nazneen Fatema Rajani
Salesforce Research
{bkrause,akhilesh.gotmare}@salesforce.com

∗ Equal contribution. † Work performed while at Salesforce Research.

Abstract

While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate. This is especially problematic because datasets used for training large LMs usually contain significant toxicity, hate, bias, and negativity. One promising approach to address this is to use discriminators to guide decoding from LMs, but existing methods for this are too slow to be useful in practice for many applications. We present GeDi as a significantly more efficient discriminator-based approach for guiding decoding. GeDi guides generation at each step by computing classification probabilities for all possible next tokens via Bayes rule, normalizing over two class-conditional distributions: one conditioned on the desired attribute, or control code, and another conditioned on the undesired attribute, or anti control code. We find that GeDi gives controllability on par with or better than previous controllable generation methods. GeDi results in significantly faster generation speeds than the only previous method that achieved comparable controllability in our experiments. We also show that GeDi can make GPT-2 and GPT-3 significantly less toxic while maintaining linguistic fluency, without sacrificing significantly on generation speed. Lastly, we find that training GeDi on only three topics allows us to controllably generate new topics zero-shot from just a keyword.

1 Introduction

Natural language generation has seen great progress with the advent of Transformers (Vaswani et al., 2017) and large-scale training (Radford et al., 2017, 2018, 2019; Brown et al., 2020). Large language models (LMs) like GPT-2 (Radford et al., 2019) and GPT-3 (Brown et al., 2020) are able to learn the distribution of their training set well enough to generate realistic text. However, simply imitating the distribution of the training data during generation has many drawbacks (Bender et al., 2021): large-scale text training sets are crawled from the web, which is imbued with toxicity, bias, and misinformation. Methods for controlling generation are valuable for making LMs trained on such data safer and more useful for downstream applications.

Existing approaches to controlling LMs have limitations. Class-conditional LMs (CC-LMs) such as CTRL (Keskar et al., 2019) attempt to control text generation by conditioning on a control code, which is an attribute variable representing a data source. However, using a specific control code can reduce sample diversity across prompts, as samples will generally resemble the data source of the control code.

Another approach for controlling LMs is to use discriminators to guide decoding, but existing methods for doing so are very computationally intensive. Weighted decoding (Holtzman et al., 2018) requires feeding candidate next tokens into a discriminator, and thus scales linearly in computation with the number of tokens to be re-weighted. Plug and Play LM (Dathathri et al., 2020, PPLM) applies up to 10 updates to the generating LM's latent states per time step using gradients from a discriminator, also making it many times slower than generating from the LM directly.

We present GeDi¹ ² as a significantly more efficient algorithm for discriminator-guided decoding. Our proposed method uses class-conditional LMs as generative discriminators (GeDis) to steer language generation towards desired attributes. We use GeDis to compute classification likelihoods for all candidate next tokens during generation using Bayes rule, saving many thousand-fold in computation as compared with using a standard (non-generative) discriminator of the same size to compute this for large vocabulary sizes. We then show how these likelihoods can guide decoding from large language models via weighted decoding.

¹ Pronounced "Jedi". ² Code available at https://github.com/salesforce/GeDi

Models can assign probabilities to sequences by iteratively predicting a distribution over the next token given the previous tokens. Generating from language models requires iteratively sampling from Pθ(xt | x<t).
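The Bayes-rule step described above can be sketched as follows. This is an illustrative toy, not the released implementation: given next-token log-probabilities from the CC-LM under the desired and anti control codes, the classification probability of every candidate token is obtained with only element-wise operations over the vocabulary, i.e., just two forward passes regardless of vocabulary size. The toy numbers mirror the example in Figure 1.

```python
import math

def next_token_class_probs(log_prior_pos, log_prior_neg, logp_pos, logp_neg):
    """For each candidate next token v, compute P(c = desired | x_<t, x_t = v)
    via Bayes rule from two class-conditional next-token distributions.
    Only element-wise operations over the vocabulary are needed."""
    probs = []
    for lp, ln in zip(logp_pos, logp_neg):
        a = log_prior_pos + lp   # log P(x_<t, x_t = v, c) up to a constant
        b = log_prior_neg + ln   # same for the anti control code
        m = max(a, b)            # subtract max for numerical stability
        probs.append(math.exp(a - m) / (math.exp(a - m) + math.exp(b - m)))
    return probs

# Toy 4-token vocabulary with the next-token probabilities from Figure 1.
logp_pos = [math.log(p) for p in (0.09, 0.01, 0.2, 0.05)]
logp_neg = [math.log(p) for p in (0.01, 0.09, 0.2, 0.05)]
pc = next_token_class_probs(0.0, 0.0, logp_pos, logp_neg)
# pc == [0.9, 0.1, 0.5, 0.5] -- tokens both control codes predict equally
# get 0.5; tokens the desired code prefers get probabilities near 1.
```

Tokens where the two class-conditional distributions agree carry no attribute signal (probability 0.5), which is exactly what allows domain bias shared by both control codes to cancel out.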
[Figure 1: A toy example of how GeDi-guided decoding uses Bayes rule to efficiently compute classification probabilities for possible next tokens at each generation timestep using only element-wise operations. These classification probabilities can then be used to guide generation from a language model (e.g., GPT-2) to achieve attribute control across domains. If a class-conditional language model was trained on movie reviews for sentiment control, its direct class-conditional predictions will be biased towards predicting movie-review words (illustrated by the next-word prediction of "cinematic"). However, the bias towards movie reviews can be canceled out by contrasting the predictions of opposing control codes via Bayes rule.]

[Figure 2: A toy example of using a language model with a discriminator head to guide next-token generation. This requires feeding each word in the vocabulary into the discriminator to compute the probability that the resulting generation would have positive sentiment, and using these probabilities to guide the base language model (e.g., GPT-2) towards positive sentiment. This requires |V|/2 times the amount of computation to compute the final hidden states of the network as compared with GeDi, if computing for the full vocabulary and using the same neural architecture for both methods.]

Equation (4) is normalized over all xt in the vocabulary to obtain Pw(xt | x<t).
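The cost comparison in Figure 2 reduces to simple counting; the sketch below states the assumption explicitly: a discriminator head must re-run the network once per candidate token, while GeDi runs it once per control code.

```python
def forward_passes_per_step(vocab_size, num_classes=2):
    """Forward passes through a same-sized network needed to score every
    candidate next token for one binary attribute at one decoding step."""
    disc_head = vocab_size   # append each candidate token and re-encode
    gedi = num_classes       # one pass per control code; Bayes rule after
    return disc_head, gedi

d, g = forward_passes_per_step(50257)  # GPT-2 BPE vocabulary size
# d / g = 50257 / 2, the |V|/2 factor from Figure 2 -- a many
# thousand-fold saving when scoring the full vocabulary.
```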
that this does not give strong attribute control.

In practice, applying Equation (5) to long sequences often results in poorly calibrated distributions later in the sequence that assign classification probabilities of 1 or 0 to all candidate next words, which provides no useful signal. We addressed this by normalizing probabilities by the current sequence length.

Algorithm 1 GeDi-guided decoding
Inputs: base LM PLM, CC-LM Pθ, vocabulary V, posterior mixing weight ω, decoding scheme
1: P(x|c) ← 1
2: P(x|c̄) ← 1
3: for t = 1, ..., N do
4:   pLM ← [PLM(xt = v | x<t)]v∈V
     …
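One decoding step with the length normalization described above can be sketched as follows. This is an illustrative sketch of Algorithm 1 under assumed equal class priors and a greedy decoding scheme, not the released implementation; the function name and log-space arithmetic are ours.

```python
import math

def gedi_step(log_p_lm, log_p_pos, log_p_neg, run_pos, run_neg, t, omega):
    """One greedy step of GeDi-guided decoding (sketch of Algorithm 1).
    run_pos / run_neg hold the running log P(x_<t | c) and log P(x_<t | c_bar);
    dividing by t length-normalizes them so the class posterior stays
    calibrated instead of saturating at 0 or 1 on long sequences."""
    scores = []
    for v in range(len(log_p_lm)):
        a = (run_pos + log_p_pos[v]) / t   # length-normalized class log-likelihoods
        b = (run_neg + log_p_neg[v]) / t
        m = max(a, b)
        # Bayes rule: log P(c | x_<t, x_t = v), assuming equal priors
        log_post = (a - m) - math.log(math.exp(a - m) + math.exp(b - m))
        scores.append(log_p_lm[v] + omega * log_post)  # posterior mixing weight
    v_star = max(range(len(scores)), key=scores.__getitem__)
    return v_star, run_pos + log_p_pos[v_star], run_neg + log_p_neg[v_star]

log = math.log
v, rp, rn = gedi_step(
    [log(0.4), log(0.4), log(0.2)],  # base-LM next-token log-probs
    [log(0.7), log(0.1), log(0.2)],  # CC-LM under the desired control code
    [log(0.1), log(0.7), log(0.2)],  # CC-LM under the anti control code
    0.0, 0.0, t=1, omega=1.0)
# v == 0: the token favoured by the desired control code wins even
# though the base LM rates tokens 0 and 1 equally.
```

Raising ω sharpens the attribute preference at the cost of fluency; ω → 0 recovers the base LM.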
2017; Yu et al., 2017; Hu et al., 2017) or using a discriminator to guide decoding (Ghazvininejad et al., 2017; Holtzman et al., 2018; Dathathri et al., 2020). Keskar et al. (2019) train a CC-LM with predefined control codes placed at the start of every sequence. GeDi also uses CC-LMs, but instead of generating from them directly, GeDi uses them as discriminators to guide decoding from another language model. This is much more computationally efficient than previous methods for discriminator-guided decoding. Holtzman et al. (2018) apply discriminators to re-weight a beam search, requiring all candidate tokens to be passed through the discriminator, scaling linearly with the number of re-scored tokens. PPLM (Dathathri et al., 2020) trains an attribute model on top of a language model's last hidden layer and backpropagates gradients to update the hidden states of the model. This is computationally intensive because it requires multiple forward and backward passes for each generation step. For instance, applying PPLM with 10 update steps as done in Dathathri et al. (2020) requires an additional factor of 20-fold computation (10 forward passes, 10 backward passes) as compared to base LM generation at the first decoding timestep. This factor also grows as the sequence length increases, since PPLM updates the previously stored keys and values. GeDi in comparison only adds constant overhead that is independent of the size of the base LM, and this constant will be minimal if the GeDi is significantly smaller than the base LM.

GeDi also relates to the rational speech acts framework for computational pragmatics (Frank and Goodman, 2012; Goodman and Stuhlmüller, 2013), where a "listener" model and a "speaker" model interactively generate a sequence such that the listener can recover the input. GeDi most closely relates to distractor-based pragmatics (Andreas and Klein, 2016; Cohn-Gordon et al., 2018; Shen et al., 2019), where a single model processes a true input and a distractor input, and uses Bayes rule to produce text that fits the true input but not the distractor input. GeDi differs from previous pragmatics-based approaches in that it trains a separate class-conditional language model (which acts as the listener) on a single attribute, allowing that attribute to be isolated, and uses it to guide generation from a separate language model (which acts as the speaker).

Other previous works seek to understand and address toxicity and hate speech in language generation. RealToxicityPrompts (Gehman et al., 2020) gives an automatic evaluation of toxicity using generations from different language models on a set of webtext prompts. Gehman et al. (2020) also test methods for mitigating toxicity, and find that applying PPLM was more effective than simpler decoding-based detoxification methods such as swear-word filters. Xu et al. (2020) develop a human-in-the-loop method for adversarially probing toxic responses in conversational agents, and train a model to give preset responses when encountering potentially unsafe probes. Other work has focused on removing gender bias from language models (Bordia and Bowman, 2019; Dinan et al., 2020; Bolukbasi et al., 2016). Related to the problem of addressing toxicity in generation is toxicity detection, which can be performed using the Perspective API or a classifier trained on a labelled toxicity dataset such as the Jigsaw Toxic Comment Classification Dataset (Borkan et al., 2019). Toxicity detection is difficult, as toxicity labelling is subjective and often has poor annotator agreement (Waseem, 2016; Ross et al., 2017). Additionally, existing toxicity classifiers are often biased in that they overestimate the toxicity of text that mentions sexual orientations or racial minorities (Dixon et al., 2018; Sap et al., 2019; Hutchinson et al., 2020).

5 Experiments

We experiment with GeDi-guided decoding for sentiment, detoxification, and topic control. We fine-tune GPT2-medium (345M parameters) (Radford et al., 2019) using the loss in Equation (3) with control codes specific to each task to form a class-conditional language model. We use these CC-LMs as GeDis to guide generation from GPT2-XL (1.5B parameters), and from GPT-3 (Brown et al., 2020) in our detoxification experiments. All experiments were performed using adaptations of Huggingface Transformers (Wolf et al., 2020).

We include experiments with greedy decoding with a repetition penalty (Keskar et al., 2019) (conditioning on varying prompts to give diversity across generations), which we found to give the best-quality generations, and with top-p sampling (Holtzman et al., 2020). Our hyper-parameter settings for GeDi-guided generation are given in Appendix C.1. We also perform ablation studies in Appendix D, and find that combining both the weighted decoding and filtering heuristics appears to be beneficial, although it is not critical to the success of the method,
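The combination of weighted decoding and filtering referred to in the ablations can be sketched as follows. The cutoff rule and all numeric values here are illustrative assumptions, not the paper's hyper-parameters: weighted decoding rescales each next-token probability by its class posterior raised to a power ω, and filtering zeroes out tokens whose posterior for the desired attribute is too low.

```python
def gedi_reweight(p_lm, p_class, omega=1.0, tau=0.8):
    """Illustrative weighted decoding + filtering over one next-token
    distribution. p_lm: base-LM probabilities; p_class: P(attribute | token)
    from Bayes rule; omega and tau are assumed, not the paper's values."""
    # Weighted decoding: scale each token by its class posterior^omega.
    scores = [p * (c ** omega) for p, c in zip(p_lm, p_class)]
    # Filtering: drop tokens whose class posterior falls below the cutoff,
    # unless that would remove every candidate.
    kept = [s if c >= tau else 0.0 for s, c in zip(scores, p_class)]
    if sum(kept) == 0.0:
        kept = scores
    z = sum(kept)
    return [s / z for s in kept]

p = gedi_reweight([0.5, 0.3, 0.2], [0.9, 0.2, 0.85], omega=1.0, tau=0.8)
# Only the two tokens with posterior >= 0.8 survive and are renormalized;
# the middle token is filtered to probability 0.
```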
and that applying a very small LSTM (Hochreiter and Schmidhuber, 1997) discriminator that can match the efficiency of GeDi is not as effective for controlling generation.

Model | Generation time (sec/token)
GPT2-XL | 0.060
GeDi-guided (w/ GPT2-XL) | 0.095
PPLM (w/ GPT2-XL) | 3.116

Table 1: Average generation time in seconds per token for generating sequences of length 256 on a V100 GPU.

5.1 Controlling sentiment of generations from book prompts

We experiment with GeDi-guided decoding from GPT-2 for sentiment control using CC-LMs fine-tuned on IMDb movie reviews. We noticed that, while direct generation from CC-LMs could effectively control the sentiment of movie reviews, it struggled to generalize to out-of-domain prompts, and would generally try to convert prompts into movie reviews. However, when we used this same model as a GeDi to guide sampling from GPT-2, we were able to effectively control the sentiment of a wide variety of topics.

To experimentally verify that GeDi can generalize the concepts of "positivity" and "negativity" beyond its training domain, we evaluate on a task where models conditionally generate text from the start of book chapters from Bookcorpus (Zhu et al., 2015); each prompt is at least 150 characters and ends on the first word break after the minimum length. We run human evaluation on generations from 50 different book prompts from 14 different models, including raw GPT2-XL with both top-p sampling (p = 0.9) and greedy decoding (repetition penalty = 1.2), and the following models with both positive and negative sentiment:
1. GPT2-XL guided by GeDi, greedy decoding (repetition penalty of 1.2).
2. GPT2-XL guided by GeDi, top-p sampling with p = 0.9 (repetition penalty of 1.05).
3. PPLM (w/ GPT2-XL), greedy decoding (repetition penalty of 1.2).
4. PPLM (w/ GPT2-XL), top-p sampling with p = 0.9.
5. CC-LM trained on movie reviews (same model used as GeDi, but with direct CTRL-style generation), greedy decoding (repetition penalty of 1.2).
6. CTRL (Keskar et al., 2019) using control codes for Amazon review sentiment, greedy decoding (repetition penalty of 1.2).

CTRL was applied using the control codes corresponding to positive and negative Amazon reviews used during training by Keskar et al. (2019). The PPLM discriminator was trained on SST-5 as in Dathathri et al. (2020), with the step-size parameter retuned for GPT2-XL (since Dathathri et al. (2020) used GPT2-medium). We found that it was more than 30× faster to guide GPT2-XL with a GeDi as compared with PPLM (assuming 10 update steps as used in Dathathri et al. (2020) and in our experiments), as shown in Table 1.

Amazon Mechanical Turk annotators rated the generated text on sentiment, how book-like the text was, fluency, and whether or not the text resembled an Amazon review or movie review (since CTRL was trained on Amazon reviews and GeDi was trained on movie reviews). Instructions given to annotators are provided in Appendix G. The results of the experiment are given in Table 2. Using GeDi to guide GPT2-XL generated book-like and linguistically fluent text while giving strong control over the tone. In the greedy setting, GeDi gave roughly equivalent positive sentiment control and statistically significantly stronger negative sentiment control compared with PPLM (p < 0.01 by a two-tailed Wilcoxon signed-rank test). In the top-p setting, GeDi achieved statistically significantly stronger sentiment control than PPLM for both positive and negative sentiment (p = 0.01 and p = 0.005 for positive and negative sentiment, respectively). p-values for all significance tests are given in Appendix E. We include samples from all greedy decoding models in Tables 11, 12, and 13.

CTRL struggled to control tone/sentiment in this setting because its training domain for sentiment was Amazon reviews, and direct generation from the CC-LMs that we used as GeDis failed to generate book-like text because their training domain was movie reviews. According to our annotators, 27% of CTRL samples resembled Amazon reviews, and 61% of CC-LM samples resembled movie reviews (Amazon and movie review resemblance percentages were less than 5% for samples from all other models). This is a critical drawback of CTRL-style generation: the model can only reliably generate text and control attributes within the training domain corresponding to the control code. Samples that illustrate this are given in Table 14.

Model | Positivity | Book-like ↑ | Fluency ↑ | Label fidelity ↑ | Perplexity score ↓
GeDi-guided-pos (greedy) | 3.73 | 4.18 | 4.43 | 96% | 12.8
GeDi-guided-pos (top-p) | 3.82 | 4.17 | 4.35 | 100% | 17.3
PPLM-pos (greedy) | 3.70 | 4.31 | 4.37 | 76% | 14.0
PPLM-pos (top-p) | 3.47 | 4.24 | 4.00 | 66% | 21.4
CC-LM-pos (greedy) | 3.13 | 3.18 | 3.83 | 62% | 14.7
CTRL-pos (greedy) | 2.85 | 3.76 | 3.99 | 48% | 9.7
GPT2-XL (greedy) | 3.16 | 4.45 | 4.35 | - | 10.4
GPT2-XL (top-p) | 2.89 | 4.45 | 4.16 | - | 13.8
CTRL-neg (greedy) | 2.87 | 3.59 | 4.07 | 48% | 9.7
CC-LM-neg (greedy) | 2.30 | 2.70 | 3.68 | 76% | 14.3
PPLM-neg (top-p) | 2.56 | 4.15 | 4.03 | 62% | 32.3
PPLM-neg (greedy) | 2.57 | 4.31 | 4.21 | 78% | 15.8
GeDi-guided-neg (top-p) | 2.04 | 4.01 | 3.88 | 98% | 26.7
GeDi-guided-neg (greedy) | 2.15 | 4.21 | 4.06 | 96% | 14.2

Table 2: Human and automatic evaluation for sentiment on book text generation (rated for positivity, book resemblance, and fluency, all on a scale of 1-5). For human evaluation, we average three annotations on generations from 50 prompts for each model, where prompts are from the start of book chapters and are a minimum of 150 characters. For automatic evaluation, we use a RoBERTa classifier trained on SST-2 (Socher et al., 2013) to measure label fidelity (how often the sample is classified as having the same label as the control code), and measure the perplexity of generations under GPT-2 to compute perplexity scores. We compare using a CC-LM as a GeDi to guide GPT2-XL (GeDi-guided) vs. direct class-conditional generation (CC-LM). GeDi gives the strongest control over sentiment. PPLM also gives strong sentiment control, but results in generation 30× slower.

Discriminator-guided methods GeDi and PPLM result in text rated more book-like that very rarely, if ever, reverts back to the domain that the discriminator was trained on. However, as compared with PPLM, GeDi generated 30× faster, with sentiment control that was on par with or better than PPLM in all settings.

5.2 Detoxifying GPT-2 and GPT-3

We test GeDi's ability to detoxify language generation. We train a CC-LM on the Jigsaw Toxic Comment Classification Dataset (Borkan et al., 2019), which contains text samples labeled as "toxic" or "non-toxic". The "toxic" label indicates the presence of profanity, obscenity, threats, insults, or identity hate. We train the model on an even split of toxic and non-toxic examples, with "clean" and "dirty" control codes to specify non-toxic and toxic text, respectively. For evaluation, we use generations conditioned on RealToxicityPrompts (Gehman et al., 2020). We consider two toxicity evaluations: one based on automatic toxicity evaluations from a large number of prompts following Gehman et al. (2020), and one using human annotations on a smaller number of trigger prompts that tend to lead to especially toxic generations from LMs. We experiment with the same models as in the previous section (except for pretrained CTRL, which does not have a detoxification control code), but also add results using:
1. GPT-3 using the OpenAI API, greedy (repetition penalty of 1.2).
2. GPT-3 using the OpenAI API, guided by GeDi, greedy (repetition penalty of 1.2).
We add details of how we apply GeDi to GPT-3 in Appendix B.

For our large-scale automatic evaluation, we select 5000 prompts from RealToxicityPrompts at random and draw generations from each model. Following Gehman et al. (2020), we measure the expected toxicity score and toxicity probability separately for generations from toxic and non-toxic prompts using the Perspective API (https://www.perspectiveapi.com/), which is a toxicity classifier that returns a probability between 0 and 1 that the submitted text is toxic. The expected toxicity is given by the average classification probability under Perspective's toxicity classifier of continuations from a given model, whereas the toxicity probability is the fraction of generations that the Perspective API classifies as having a toxicity probability greater than 0.5. For models that use sampling, we draw 10 generations from each prompt, and use the most toxic continuation as evaluated by the Perspective API to measure all statistics, following the expected max toxicity scores and probabilities used by Gehman et al. (2020). The results are given in Table 3. GeDi reduced the toxicity of GPT-2 and GPT-3 and gave a stronger detoxification effect as compared with PPLM (the reductions in expected toxicity of GeDi vs. PPLM, GeDi vs. GPT-2, and GeDi vs. GPT-3 were strongly statistically significant in all comparisons by a paired-sample t-test). The advantage
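The two automatic metrics can be sketched as follows, assuming Perspective-API toxicity scores have already been collected per continuation and grouped by prompt, following the expected-max protocol described above.

```python
def toxicity_metrics(scores_per_prompt):
    """scores_per_prompt: list of lists; each inner list holds the
    Perspective toxicity scores (0-1) for the continuations drawn from
    one prompt (10 entries for sampling models, 1 for greedy decoding).
    Per the expected-max protocol, only the most toxic continuation per
    prompt is counted."""
    worst = [max(scores) for scores in scores_per_prompt]
    # Expected toxicity: mean worst-case score across prompts.
    expected_toxicity = sum(worst) / len(worst)
    # Toxicity probability: fraction of prompts whose worst continuation
    # crosses the 0.5 threshold.
    toxicity_probability = sum(s > 0.5 for s in worst) / len(worst)
    return expected_toxicity, toxicity_probability

et, tp = toxicity_metrics([[0.1, 0.7, 0.2], [0.05, 0.1, 0.2], [0.9, 0.3, 0.4]])
# worst-per-prompt = [0.7, 0.2, 0.9] -> expected toxicity 0.6,
# toxicity probability 2/3
```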
of GeDi over PPLM was especially pronounced in the case of top-p sampling, where PPLM generated at least one toxic sample (out of 10 samples per prompt) from a non-toxic prompt more than 3 times as often, suggesting that GeDi is more robust to worst-case scenarios when applying sampling.

Model | Expected toxicity ↓ (toxic prompt) | Expected toxicity ↓ (non-toxic prompt) | Toxicity probability ↓ (toxic prompt) | Toxicity probability ↓ (non-toxic prompt)
GPT2-XL (top-p, most toxic of 10 per prompt) | 0.79 (0.14) | 0.35 (0.23) | 0.98 | 0.25
GeDi-guided GPT-2 (top-p, most toxic of 10 per prompt) | 0.71 (0.16) | 0.21 (0.14) | 0.89 | 0.04
PPLM (top-p, most toxic of 10 per prompt) | 0.75 (0.14) | 0.30 (0.19) | 0.94 | 0.15
GPT2-XL (greedy) | 0.67 (0.18) | 0.17 (0.16) | 0.79 | 0.05
GeDi-guided GPT-2 (greedy) | 0.61 (0.21) | 0.12 (0.11) | 0.67 | 0.01
PPLM (greedy) | 0.63 (0.19) | 0.14 (0.12) | 0.71 | 0.02
CC-LM (greedy) | 0.69 (0.19) | 0.17 (0.18) | 0.83 | 0.10
GPT-3 da-vinci (greedy) | 0.67 (0.18) | 0.17 (0.16) | 0.79 | 0.05
GeDi-guided GPT-3 (greedy) | 0.61 (0.22) | 0.11 (0.10) | 0.69 | 0.01

Table 3: RealToxicityPrompts automated toxicity evaluation. We measure the expected toxicity score (with standard deviation given in parentheses) and toxicity probability of continuations from toxic (Perspective toxicity score > 0.5) and non-toxic (score < 0.5) prompts for 9 models. Generations from 5000 prompts were used (1054 toxic, 3946 non-toxic, approximately matching the ratios used by Gehman et al. (2020)). For models that use top-p sampling, we measure the expected toxicity and toxicity probability of the most toxic sample out of 10 generations per prompt. For greedy models we simply average these metrics across prompts. GeDi significantly reduced the toxicity of GPT-2 and GPT-3 and resulted in a stronger detoxification effect as compared with PPLM.

We also applied human evaluation to measure toxicity using a smaller number of prompts that probe LMs to generate toxic text. To identify strong triggers, we selected a subset of prompts with Perspective API toxicity probabilities between 0.3 and 0.5 that were also classified as non-toxic by a RoBERTa toxicity classifier trained on the Jigsaw dataset. We used GPT2-XL to draw 32 samples from each prompt, and selected the 100 prompts with the highest average toxicity probability over their 32 completions according to the RoBERTa toxicity classifier. Our goal with this procedure was to identify prompts that are non-toxic, but have a high probability of causing language models to generate toxic text.

We ran human evaluation to measure toxicity and linguistic fluency [1: very low fluency, 5: very high fluency]. Results are given in Table 4, and generations from the evaluated models are given in Table 15. GeDi significantly reduced the toxicity of GPT-2 and GPT-3 (p < 0.001 by a 2-proportion z-test in all settings). GeDi resulted in similar toxicity as compared with PPLM for greedy decoding and was significantly less toxic than PPLM for sampling (p = 0.02), while also achieving 30× faster generation speeds.

Model | Toxicity ↓ (human eval) | Fluency ↑ (human eval)
GPT2-XL (top-p) | 49% | 4.10
GeDi-guided GPT-2 (top-p) | 16% | 4.07
PPLM (top-p) | 30% | 4.19
GPT2-XL (greedy) | 60% | 4.32
GeDi-guided GPT-2 (greedy) | 27% | 4.47
PPLM (greedy) | 28% | 4.41
CC-LM (greedy) | 37% | 4.19
GPT-3 da-vinci (greedy) | 57% | 4.32
GeDi-guided GPT-3 (greedy) | 21% | 4.23

Table 4: Human evaluation of toxicity on 100 trigger prompts. We collect 3 annotations of toxicity labels (where we classify each sample based on majority) and linguistic fluency scores (scale of 1-5) for each model. We find that GeDi is effective for detoxifying GPT-2 and GPT-3 while maintaining fluency.

5.3 Extending GeDi to the multi-class setting

To experiment with multi-class GeDi, we use the AG news topic classification dataset (Zhang et al., 2015), which has 4 topics (World, Sports, Business, and Science/Tech). In order to test GeDi's ability to generate never-before-seen classes zero-shot, we trained 4 different CC-LMs, each on only 3 of the 4 AG news classes, with one class held out. We then compare direct (CTRL-style) generation from CC-LMs with GeDi-guided decoding from GPT-2, on topics included in training and on held-out (zero-shot) topics. To evaluate topic relevance, we use a RoBERTa classifier trained on all 4 AG news topics to estimate the topic of each generation. We obtain generations by conditioning on short (minimum 30 characters, ending on a space) prompts from the Multi-News dataset (Fabbri et al., 2019), and report results in Table 5.
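The label-fidelity metric used in Tables 2 and 5 can be sketched as follows; the keyword-based classifier below is a toy stand-in for the trained RoBERTa model, an assumption of this sketch for the sake of a runnable example.

```python
def label_fidelity(generations, control_codes, classify):
    """Fraction of generations whose predicted class (from an external
    classifier, e.g. a RoBERTa topic model) matches the control code
    used to generate them."""
    matches = sum(classify(g) == c for g, c in zip(generations, control_codes))
    return matches / len(generations)

# Toy stand-in classifier keyed on a single word, for illustration only.
toy_classify = lambda text: "Sports" if "match" in text else "Business"

lf = label_fidelity(
    ["a great match today", "stocks fell", "the match was close"],
    ["Sports", "Business", "Business"],
    toy_classify,
)
# 2 of the 3 control codes are recovered by the classifier -> 2/3
```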
Topic | Model | Trained on class (label fidelity) | Zero-shot (label fidelity)
World | GPT2-XL | - | 22%
World | GeDi-guided | 72% | 30%
World | CC-LM | 53% | 28%
Sports | GPT2-XL | - | 6%
Sports | GeDi-guided | 91% | 62%
Sports | CC-LM | 49% | 12%
Business | GPT2-XL | - | 4%
Business | GeDi-guided | 55% | 36%
Business | CC-LM | 35% | 10%
Science | GPT2-XL | - | 68%
Science | GeDi-guided | 83% | 84%
Science | CC-LM | 59% | 50%

Table 5: Automatic label fidelity on topics, measured by how often a RoBERTa classifier's label matches the control code used to generate the sample. We trained 4 different CC-LMs, each with 1 class held out, and considered direct CTRL-style generation (CC-LM) and GeDi-guided decoding from these models. "Trained on class" label fidelity averages the label fidelities from the 3 models trained with the given class as one of the training classes. The "zero-shot" label fidelity for each class uses generations from the model trained on the other 3 classes, using a zero-shot control code for the desired class. We include results from raw GPT2-XL to show how much GeDi and CC-LM are influencing generation. We find that GeDi influences generation more effectively than CC-LM when conditioning on both training classes and held-out classes.

GeDi generated topics included in training with a higher label fidelity than CTRL-style generation from a CC-LM. Unlike CC-LM, GeDi was able to bias generation towards never-before-seen zero-shot control codes that were held out from training. GeDi's ability to generalize to new control codes zero-shot gives the ability to generate text corresponding to many topics and subtopics. This ability likely emerges because generative classifiers can classify unseen topics zero-shot from learned word embeddings (Yogatama et al., 2017), and GeDi uses a generative classifier to guide generation. While GPT-3 can also generate topics zero-shot by conditioning on prompts such as "Write an article about sports:", zero-shot generation with GeDi does not need to be an article or carry any other constraints that would come about from the prompt. We provide examples of zero-shot topic generation with GeDi in Table 6.

Topic: Space
"In a shocking finding NASA have announced the discovery of a mysterious object orbiting our nearest neighbour, Proxima Centauri. Advertisement Researchers have spent years studying the strange object in space as they are increasingly becoming more convinced that it's real. In fact, some researchers are starting to wonder if this new discovery may prove to be one of the greatest scientific discoveries of recent years."

Topic: Fire
"In a shocking finding police believe two fire crews, including a senior paramedic, were deliberately set alight as part of a revenge plot. It comes as a huge investigation into an apparent conspiracy in which arsonists targeted at least three other London fire engines in just one night on Friday and Saturday night."

Topic: History
"In a shocking finding historians believe to be 'unprecedented' British documents have been unearthed which reveal the true history of King Richard II and show that he was not only the son of Godfrey of Gloucester, but also descended from King Henry VIII. Richard, whose father was executed for his crimes in 1483, became King in 1485 after defeating John Balliol in a battle at Bosworth."

Table 6: Controlling the topic of generation zero-shot with GeDi (greedy decoding). This topic GeDi was trained on only three classes: science, sports, and business. The topics of Space, Fire, and History were not part of the GeDi training set. The boldfaced string "In a shocking finding" indicates the context provided to the language model, followed by its generation.

6 Conclusion

We present GeDi as an approach for controllable generation that uses generative discriminators to classify candidate next tokens on the fly during inference, making it far more efficient than previous methods that use discriminators to guide decoding. GeDi achieves stronger controllability of sentiment than PPLM while also giving a generation speed more than 30× faster. GeDis trained on 3 topics can also controllably generate new topics zero-shot from just a keyword. We also show that GeDi significantly reduces the toxicity of GPT-2 and GPT-3 without noticeably sacrificing linguistic fluency. GeDi moves towards unifying natural language generation with classification, and suggests that we may be able to efficiently generate text that corresponds to any attribute that we can accurately classify. This could have broad implications for improving text generation systems by making them more controllable.
References

Jacob Andreas and Dan Klein. 2016. Reasoning about pragmatics with neural listeners and speakers. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1173–1182, Austin, Texas. Association for Computational Linguistics.

Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. Association for Computing Machinery, New York, NY, USA.

Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems.

Shikha Bordia and Samuel R. Bowman. 2019. Identifying and reducing gender bias in word-level language models. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 7–15, Minneapolis, Minnesota. Association for Computational Linguistics.

Daniel Borkan, Lucas Dixon, Jeffrey Sorensen, Nithum Thain, and Lucy Vasserman. 2019. Nuanced metrics for measuring unintended bias with real data for text classification. In Companion Proceedings of the 2019 World Wide Web Conference, pages 491–500.

Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems.

Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, and Jie Fu. 2021. CoCon: A self-supervised approach for controlled text generation. ICLR.

Reuben Cohn-Gordon, Noah Goodman, and Christopher Potts. 2018. Pragmatically informative image captioning with character-level inference. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 439–443, New Orleans, Louisiana. Association for Computational Linguistics.

Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, and Rosanne Liu. 2020. Plug and play language models: A simple approach to controlled text generation. ICLR.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.

Emily Dinan, Angela Fan, Adina Williams, Jack Urbanek, Douwe Kiela, and Jason Weston. 2020. Queens are powerful too: Mitigating gender bias in dialogue generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8173–8188, Online. Association for Computational Linguistics.

Lucas Dixon, John Li, Jeffrey Sorensen, Nithum Thain, and Lucy Vasserman. 2018. Measuring and mitigating unintended bias in text classification. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pages 67–73.

Alexander Fabbri, Irene Li, Tianwei She, Suyi Li, and Dragomir Radev. 2019. Multi-News: A large-scale multi-document summarization dataset and abstractive hierarchical model. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1074–1084, Florence, Italy. Association for Computational Linguistics.

Angela Fan, David Grangier, and Michael Auli. 2018. Controllable abstractive summarization. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pages 45–54, Melbourne, Australia. Association for Computational Linguistics.

Jessica Ficler and Yoav Goldberg. 2017. Controlling linguistic style aspects in neural language generation. In Proceedings of the Workshop on Stylistic Variation, pages 94–104, Copenhagen, Denmark. Association for Computational Linguistics.

Michael C. Frank and Noah D. Goodman. 2012. Predicting pragmatic reasoning in language games. Science, 336(6084):998–998.

Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, and Noah A. Smith. 2020. RealToxicityPrompts: Evaluating neural toxic degeneration in language models. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3356–3369, Online. Association for Computational Linguistics.

Marjan Ghazvininejad, Xing Shi, Jay Priyadarshi, and Kevin Knight. 2017. Hafez: An interactive poetry generation system. In Proceedings of ACL 2017, System Demonstrations, pages 43–48, Vancouver, Canada. Association for Computational Linguistics.

Noah D. Goodman and Andreas Stuhlmüller. 2013. Knowledge and implicature: Modeling language understanding as social cognition. Topics in Cognitive Science, 5(1):173–184.

Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.

Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, and Yejin Choi. 2020. The curious case of neural text degeneration. ICLR.

Ari Holtzman, Jan Buys, Maxwell Forbes, Antoine Bosselut, David Golub, and Yejin Choi. 2018. Learning to write with cooperative discriminators. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1638–1649, Melbourne, Australia. Association for Computational Linguistics.

Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, and Eric P. Xing. 2017. Toward controlled generation of text. In Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 1587–1596, International Convention Centre, Sydney, Australia. PMLR.

Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, and Stephen Denuyl. 2020. Social biases in NLP models as barriers for persons with disabilities. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5491–5501, Online. Association for Computational Linguistics.

Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, and Richard Socher. 2019. CTRL: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858.

Andrea Madotto, Etsuko Ishii, Zhaojiang Lin, Sumanth Dathathri, and Pascale Fung. 2020. Plug-and-play conversational models. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2422–2433, Online. Association for Computational Linguistics.

Alec Radford, Rafal Jozefowicz, and Ilya Sutskever. 2017. Learning to generate reviews and discovering sentiment. arXiv preprint arXiv:1704.01444.

Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training.

Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9.

Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, and Richard Socher. 2019. Explain yourself! Leveraging language models for commonsense reasoning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4932–4942, Florence, Italy. Association for Computational Linguistics.

Björn Ross, Michael Rist, Guillermo Carbonell, Benjamin Cabrera, Nils Kurowsky, and Michael Wojatzki. 2017. Measuring the reliability of hate speech annotations: The case of the European refugee crisis. In Proceedings of NLP4CMC III: 3rd Workshop on Natural Language Processing for Computer-Mediated Communication (Bochum), Bochumer Linguistische Arbeitsberichte, vol. 17, pages 6–9.

Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith. 2019. The risk of racial bias in hate speech detection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1668–1678, Florence, Italy. Association for Computational Linguistics.

Sheng Shen, Daniel Fried, Jacob Andreas, and Dan Klein. 2019. Pragmatically informative text generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4060–4067, Minneapolis, Minnesota. Association for Computational Linguistics.

Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1631–1642, Seattle, Washington, USA. Association for Computational Linguistics.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.

Zeerak Waseem. 2016. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proceedings of the First Workshop on NLP and Computational Social Science, pages 138–142, Austin, Texas. Association for Computational Linguistics.

Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander Rush. 2020. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online. Association for Computational Linguistics.

Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, and Emily Dinan. 2020. Recipes for safety in open-domain chatbots. arXiv preprint arXiv:2010.07079.
Dani Yogatama, Chris Dyer, Wang Ling, and Phil Blun- som. 2017. Generative and discriminative text clas- sification with recurrent neural networks. arXiv preprint arXiv:1703.01898. Lantao Yu, Weinan Zhang, Jun Wang, and Yong Yu. 2017. Seqgan: Sequence generative adversarial nets with policy gradient. In Thirty-First AAAI Confer- ence on Artificial Intelligence. Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text clas- sification. In Advances in neural information pro- cessing systems, pages 649–657. Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhut- dinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. In The IEEE International Con- ference on Computer Vision (ICCV). Daniel M Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B Brown, Alec Radford, Dario Amodei, Paul Chris- tiano, and Geoffrey Irving. 2019. Fine-tuning lan- guage models from human preferences. arXiv preprint arXiv:1909.08593.
A GeDi filtering heuristic

In addition to Equation 4, we also used an additional filtering heuristic that was beneficial for steering generation more aggressively. This heuristic, inspired by top-p sampling (Holtzman et al., 2020), samples from the set Vm ⊆ V which contains the minimum number of tokens possible from the head of the distribution for Pθ(c|xt, x<t) …

C Additional model and hyper-parameter details

C.1 Hyper-parameters for GeDi guided generation

GeDi used ρ = 0.7 and ω = 30 for sentiment, ρ = 0.8 and ω = 30 for GPT-2 detoxification, and ρ = 0.8 and ω = 90 for GPT-3 detoxification …
… enough to efficiently classify every candidate next token.

For the weighted decoding setting, we set ρ = 0, which turns off filtering, and tune ω to give a similar perplexity score to the combined heuristic (higher ω results in more aggressive steering and generally gives better attribute control and a worse perplexity score). For the filtering setting, we set ω = 0 to turn off weighted decoding, and tune ρ to give a similar perplexity score to the combined heuristic (higher ρ results in more aggressive filtering and generally gives a worse perplexity score and higher label fidelity). For evaluation, we measure label fidelity according to an external classifier, and perplexity scores under GPT-2-XL, using the prompts corresponding to the experiments in Tables 2 and 4 for sentiment and detoxification respectively. For tuning, we use prompts from IMDb to condition on for sentiment generations, and an additional 100 trigger prompts (that do not overlap with the evaluation prompts) for detoxification. We tune the hyperparameters ρ and ω to give a good trade-off between label fidelity (as measured by RoBERTa) and perplexity scores.

For the LSTM discriminator, we train a unidirectional LSTM with 600 hidden units, use mean pooling, and tune the training learning rate to give the best held-out accuracy. The LSTM discriminator is then used to guide generation by applying a forward pass for each candidate token across the full vocabulary, and applying Equations 4 and 7 to guide generation. This results in generation that is slightly slower than GeDi (assuming we batch the LSTM forward passes across the vocabulary), and in higher memory usage. Results are given in Table 7 for sentiment and Table 8 for detoxification.
Both the filtering and weighted decoding methods are able to control generation on their own, but the combined heuristic appears to perform slightly better for detoxification, and may be more robust to settings where one method or the other does not work as well in isolation. Using a lightweight LSTM discriminator to guide generation gave weaker control over sentiment and detoxification than using GeDi.
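The nucleus-style filtering heuristic discussed above can be sketched as follows. This is one plausible reading of the heuristic (the exact rule in the paper may differ): candidates are ranked by their class probability Pθ(c|x), and the smallest prefix of that ranking whose cumulative probability under the LM reaches ρ is kept; everything else is zeroed out and the distribution renormalized. The function name and inputs are illustrative.

```python
def filter_tokens(lm_probs, class_probs, rho=0.8):
    """Sketch of GeDi-style nucleus filtering over candidate next tokens.

    lm_probs: next-token probabilities from the (guided) LM.
    class_probs: per-token probability of the desired attribute, P(c | x).
    Keeps the smallest set of tokens, ranked by class probability, whose
    cumulative LM probability is at least rho; zeroes out and renormalizes.
    """
    # Rank candidate tokens by how strongly they indicate the desired class.
    order = sorted(range(len(lm_probs)), key=lambda i: class_probs[i], reverse=True)
    keep, cum = set(), 0.0
    for i in order:
        keep.add(i)
        cum += lm_probs[i]
        if cum >= rho:  # minimal head of the ranking with >= rho LM mass
            break
    kept = [p if i in keep else 0.0 for i, p in enumerate(lm_probs)]
    z = sum(kept)
    return [p / z for p in kept]
```

With a small ρ only the tokens most indicative of the attribute survive (aggressive steering); ρ close to 1 keeps nearly the whole distribution, which matches the reported trade-off between label fidelity and perplexity.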
Model | Label fidelity ↑ | Perplexity ↓
GeDi-guided (combined heuristic, ρ = 0.7, ω = 20) | 96% | 13.5
GeDi-guided (weighted decoding heuristic, ρ = 0, ω = 600) | 86% | 13.6
GeDi-guided (filtering heuristic, ρ = 0.7, ω = 0) | 95% | 13.3
Lightweight LSTM discriminator greedy (combined heuristic, ρ = 0.8, ω = 30) | 73% | 16.6

Table 7: Sentiment label fidelity and perplexity scores for the weighted decoding heuristic (ρ = 0), the filtering heuristic (ω = 0), and the combined weighted decoding and filtering heuristic, compared with guiding generation using a lightweight LSTM discriminator.

Model | Toxicity (RoBERTa) ↓ | Perplexity ↓
GeDi-guided greedy (combined heuristic, ρ = 0.8, ω = 30) | 8% | 10.9
GeDi-guided greedy (weighted decoding heuristic, ρ = 0, ω = 150) | 13% | 10.8
GeDi-guided greedy (filtering heuristic, ρ = 0.85, ω = 0) | 24% | 10.7
Lightweight LSTM discriminator greedy (combined heuristic, ρ = 0.8, ω = 30) | 18% | 10.9

Table 8: Toxicity and perplexity scores for the weighted decoding heuristic (ρ = 0), the filtering heuristic (ω = 0), and the combined weighted decoding and filtering heuristic, compared with guiding generation using a lightweight LSTM discriminator.

E Statistical significance tables for human evaluation experiments

Model 1 | Model 2 | p-value positivity | p-value book resemblance | p-value fluency
GeDi-pos greedy | GPT2-XL greedy | 4E-05 | 0.16 | 0.44
GeDi-pos top-p | GPT2-XL top-p | 2E-07 | 0.04 | 0.09
GeDi-pos greedy | PPLM-pos greedy | 0.99 | 0.49 | 0.47
GeDi-pos top-p | PPLM-pos top-p | 0.01 | 0.72 | 0.01
GeDi-pos greedy | CCLM-pos greedy | 3E-4 | 2E-05 | 3E-05
GeDi-pos greedy | CTRL-pos greedy | 2E-06 | 0.06 | 8E-4
GPT-2 greedy | GPT-2 top-p | 0.07 | 0.65 | 0.05
GeDi-neg greedy | GPT2-XL greedy | 2E-07 | 0.04 | 0.01
GeDi-neg top-p | GPT2-XL top-p | 4E-07 | 0.001 | 0.06
GeDi-neg greedy | PPLM-neg greedy | 0.005 | 0.33 | 0.35
GeDi-neg top-p | PPLM-neg top-p | 5E-4 | 0.19 | 0.21
GeDi-neg greedy | CCLM-neg greedy | 0.46 | 2E-07 | 0.01
GeDi-neg greedy | CTRL-neg greedy | 1E-4 | 0.001 | 0.67

Table 9: Statistical significance p-values for sentiment results in Table 2.
We use a Wilcoxon signed-rank test for paired measures, since all models generate from the same set of prompts (and because a non-parametric test is appropriate for an ordinal scale). All p-values are 2-tailed and compare the aligned models in the first two columns for positivity, book resemblance, and fluency.

Model 1 | Model 2 | p-value toxicity | p-value fluency
GeDi gpt2 greedy | GPT2 greedy | 3E-06 | 0.005
GeDi gpt3 greedy | GPT3 greedy | 2E-07 | 0.16
GeDi gpt2 top-p | GPT2 top-p | 6E-07 | 0.64
GeDi gpt2 greedy | PPLM greedy | 0.87 | 0.14
GeDi gpt2 top-p | PPLM top-p | 0.02 | 0.18
GeDi gpt2 greedy | CCLM greedy | 0.13 | 3E-04
GPT-2 greedy | GPT-2 top-p | 0.12 | 0.006

Table 10: Statistical significance p-values for toxicity results in Table 4. For toxicity, we use a 2-proportions z-test. For fluency, we use a Wilcoxon signed-rank test for paired measures as in Table 9. All p-values are 2-tailed and compare the aligned models in the first two columns for toxicity and fluency.
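For reference, the 2-proportions z-test used for the toxicity comparisons can be computed with nothing beyond the standard library. The sketch below uses a pooled standard error and a two-tailed normal p-value; the counts in the usage example are made up for illustration and are not the paper's actual sample sizes.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-proportion z-test (pooled variance), two-tailed.

    x1/n1 and x2/n2 are, e.g., counts of generations labeled toxic
    out of the total generations for two models.
    Returns (z statistic, two-tailed p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value from the standard normal CDF via erf.
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical usage: 8/100 toxic generations vs. 30/100.
z, p = two_proportion_z_test(8, 100, 30, 100)
```

The Wilcoxon signed-rank tests for the paired fluency ratings would ordinarily be run with a statistics package (e.g. scipy.stats.wilcoxon) rather than by hand.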
F Generation Samples

F.1 Sentiment/tone samples

Model: GeDi-guided (negative) (greedy)
Generation: 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down on it while she made herself a sandwich. "What are you doing here? I thought we were supposed to be having a private meeting!" She snapped angrily through the mouthpiece of her headset before taking a bite out of it with an angry huff that sounded like someone had taken their toothbrush and tried to use it as an axe blade instead. "You're just wasting your time here." I was pretty sure she was trying not to laugh when she said those last two words, which is really hard because they sounded like they came straight out of some bad horror movie script written by somebody who'd seen The Evil Dead twice in quick succession and then decided "Hey, why not?" It wasn't until after my third or fourth attempt that I finally managed something resembling humor: "And this is why you should never ever ever ever ever call me again..." * * * "Oh hey guys." A voice boomed from somewhere off-screen (probably behind me). "Sorry if there aren't any other important things going on around here..

Model: GeDi-guided (positive) (greedy)
Generation: 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit as she began speaking with me about my work with them. I have been working with these girls since I started this position at the end of last year, which is a little over two years ago, when I got my start in this field here at the school and also at The Ranch Academy where I am now located in Arizona where most of these girls are being educated on an intensive basis while they are here before going back home to their families and communities once they get older." It is a really unique experience because there are very few places like it anywhere in the country that can provide young girls so much support and care from day one until they reach adulthood--from birth right up to graduation day!" I think it is something that every family deserves! It's really exciting how things have worked out so far! It has been amazing seeing how each child reacts when you take your time with them every day, helping them feel comfortable talking more about their feelings as they grow physically as well as emotionally...and just having a great time watching all of them develop both physically and mentally along the way....

Model: CC-LM (negative) (greedy)
Generation: 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down.i'm sorry i couldn't get a better seat.this is one of those movies that you can watch over and over again without ever getting bored or annoyed by it. it's not really about the characters but rather how they interact with each other in this movie - what happens when two people meet at a party? what happens when they go on vacation together? how do we make friends while traveling alone? these are questions that are answered here as well as many others throughout the movie (and yes, there is an ending too).the acting was good enough but nothing special - just average at best.there were some funny moments though...1) jennifer: i don't think anyone would have thought me capable of such stupidity! 2) tracy:
Model: GPT-2 (greedy)
Generation: 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down on it as she sat down herself, a little awkwardly at first but then settled into the chair with a smile on her face and an arm around my shoulder.

Model: PPLM (negative) (greedy)
Generation: 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down on it as she walked over to a chair across from me and sat down next to me with her feet crossed on the floor behind my desk. I'm sorry I didn't have time today but I wanted you all here so we could get this off our plates before we move forward with anything else that's important at work or school or wherever else you are right now! She looked up at me and then asked: You know what? You're not going anywhere until you've cleared this up! And then she turned back around towards Tracy who was sitting there looking away from us again while still holding onto her phone like it was a bomb about to go off in her lap, which it wasn't because no one had ever been killed by an iPhone explosion before (although they were very rare).

Model: PPLM (positive) (greedy)
Generation: 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit down on it as she sat herself down across from us with a smile on her face. "I'm very excited about this year's crop," I told her as we sat across from each other at the front of my desk, "and I think you'll see that there are some really great things happening in our industry." She smiled back at me and told me that she was looking forward to seeing what else we could do together over the next few years--a sentiment I shared wholeheartedly with everyone who had come into contact with my company during this past year or so--but then quickly added: "But first let's get started! We're going to have a lot more fun working together!" And then she got up and walked over towards one of those glass windows where people were sitting around talking quietly while they worked (which is something that happens every day).

Table 12: Generation with raw GPT-2 and PPLM when controlling the sentiment/tone of the text (greedy decoding). Boldfaced indicates the context provided to the language model, followed by its generation. Like GeDi, PPLM was able to achieve domain transfer of sentiment. However, GeDi led to 30x faster generation.
Model: CTRL (negative) (greedy)
Generation: 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit. I sat down in one of the chairs and she sat in another. We were all staring at her. Her hair was a mess from her morning shower. She looked like she had just gotten out of bed. Her eyes were red as if they had been crying. The only thing that made me feel better about it was that I knew what it was. It wasn't my fault. It wasn't my fault. But now here we were, sitting around her, talking about how bad things were. How could this be happening again? What did you do to make your life so miserable? Why didn't you tell me? Why didn't you try harder or talk more with me? You never told me. You never tried harder. And now here we are, sitting here, talking about it. Tracy, why don't you tell me what happened? What happened between you? Did anything happen? Did anything ever happen between us? Because I'm not going to. I'm not going to. If you're telling me, then it's because something happened. Something bad. Something really bad. And if you're not, then it's because nothing happened. Nothing ever happened between us. So why are we here? What's wrong with us? Is there something wrong with our relationship? Are we... is there something wrong? Is there..." Tracy stopped herself before she said any more. "I don't know. It's just... I can't explain it. It's too much. I've been thinking a lot lately. Too much." She looked up at me.

Model: CTRL (positive) (greedy)
Generation: 21 "Hello, Jennifer, Bella." Tracy said. "Thanks so much for coming in." She dragged another chair out to the front of her desk and motioned for us all to sit. I sat down in one of the chairs and she sat in another. We were all staring at her. Her hair was a mess from her morning shower. She looked like she had just gotten out of bed. Her eyes were red as if they had been crying. The only thing that made me feel better about it was that I knew what it was. It wasn't my fault. It wasn't my fault. But now here we were, sitting around her, talking about how bad things were. How could this be happening again? What did you do to make your life so miserable? Why didn't you tell me? Why didn't you try harder or talk more with me? You never told me. You never tried harder. And now here we are, sitting here, talking about it. Tracy, why don't you tell me what happened? What happened between you? Did anything happen? Did anything ever happen between us? Because I'm not going to. I'm not going to. If you're telling me, then it's because something happened. Something bad. Something really bad. And if you're not, then it's because nothing happened. Nothing ever happened between us. So why are we here? What's wrong with us? Is there something wrong with our relationship? Are we... is there something wrong? Is there..." Tracy stopped herself before she said any more. "I don't know. It's just... I can't explain it. It's too much. I've been thinking a lot lately. Too much." She looked up at me.

Table 13: Controlling sentiment/tone of the text with CTRL (greedy decoding). Boldfaced indicates the context provided to the language model, followed by its generation. CTRL is unable to control the sentiment of book text because it was trained on Amazon reviews.