Socially Aware Bias Measurements for Hindi Language Representations
Vijit Malik (1), Sunipa Dev (2), Akihiro Nishi (2), Nanyun Peng (2), Kai-Wei Chang (2)
(1) Indian Institute of Technology Kanpur (IIT-K)  (2) University of California, Los Angeles
{vijitvm}@iitk.ac.in  {akihironishi}@ucla.edu  {sunipa, violetpeng, kwchang}@cs.ucla.edu

arXiv:2110.07871v1 [cs.CL] 15 Oct 2021

Abstract

Language representations are an efficient tool used across NLP, but they are rife with encoded societal biases. These biases are studied extensively, but with a primary focus on English language representations and biases common in the context of Western society. In this work, we investigate the biases present in Hindi language representations, such as caste and religion associated biases. We demonstrate how biases are unique to specific language representations based on the history and culture of the region they are widely spoken in, and also how the same societal bias (such as binary gender associated biases), when investigated across languages, is encoded by different words and text spans. With this work, we emphasize the necessity of social awareness, along with linguistic and grammatical artefacts, when modeling language representations, in order to understand the biases encoded.

1 Introduction

Language models and representations (Pennington et al., 2014; Bojanowski et al., 2017; Devlin et al., 2018) are commonly used across the world to aid in a variety of daily applications including machine translation (Kunchukuttan et al., 2017), information retrieval (Rao and Devi, 2018), chatbots (Bhagwat et al., 2019), sentiment classification (Kumar et al., 2019) and more. A large body of work is thus focused on building robust language representations for different languages, including those which have few resources available.
Recent work has demonstrated how these representations capture and propagate societal biases such as those related to gender (Bolukbasi et al., 2016), race (Caliskan et al., 2017), and nationality (Dev and Phillips, 2019). It is important to note that languages reflect the culture and history of the regions they are popularly used in, and as we switch between languages, the notion of bias, the types of societal biases, and what is considered discrimination or fairness also change accordingly. This key difference, however, is not reflected in the effort made towards detecting, identifying and mitigating biases in language representations, with the majority of efforts predominantly in the English language and in the context of Western society (Bolukbasi et al., 2016; Nangia et al., 2020). Some recent work tackles the challenges of societal biases in language representations coming from various cultures and languages such as Arabic, French, Spanish and German (Lauscher et al., 2020; Chávez Mulsa and Spanakis, 2020; Kurpicz-Briki, 2020; Zhou et al., 2019). Ghosh et al. (2021) and Sambasivan et al. (2020) explore biases and algorithmic fairness concerning non-Western contexts in machine learning, and Pujari et al. (2019) investigate gender bias in Hindi language representations. However, it is still unclear whether Hindi language representations capture other biases that are unique to Indian society and have the potential to cause great harm, such as caste and religion related biases.

In this work, we make three main contributions: (i) using a careful study of the social and cultural composition of Indian society, we highlight and devise measures to detect and quantify different societal biases present in Hindi language representations, (ii) we discuss the gendered nature of the Hindi language and the implications it has on bias detection techniques, and (iii) we demonstrate how translation of word lists or one-to-one mapping of bias measures across languages is insufficient to detect biases meaningfully. In particular, word lists are commonly used both for detection and mitigation of biases in word representations. Even when detecting the same societal bias in a different language, directly translating the words from English into the target language whose representations are being evaluated does not suffice, as the words may not exist, may not be commonly used, or may not be associated with the same sentiment or bias (Section 3). Sentiments (good, bad, evil, etc.) too can be encoded by distinct sets of objects and words (Section 3). Along with this social awareness, the differences in language structure, the changes in grammar, etc., all play important roles in understanding biased associations and distinguishing them from meaningful ones, which we demonstrate in this work.
2 Language and Culture Considerations

The perception and interpretation of biases are sensitive to societal constructs, socio-cultural structures, and the historical compositions of different regions of the world. Since languages are culture and region specific, the socio-cultural differences need to be studied when trying to understand the biases and potential harms of language representations globally. Consider the example of Hindi, where along with gender bias, other biases such as those based on caste and religion are also rampant. Caste especially is unique to the Indian peninsula and is not usually considered when analysing biases in languages in Western studies. Similarly, region and culture specific biases are also present with respect to religion, occupation or geographical location (Thorat and Newman, 2007). Additionally, there are several key linguistic differences between English and Hindi. For example, pronouns in English indicate the gender of the person: from 'He is good at sports.', one can easily infer that the subject is male. However, in Hindi, pronouns do not reveal gender: in 'veha khelne me achha hai', the pronoun 'veha' is gender neutral. In Hindi, one uses cues from the form of adjectives or verbs in the sentence to infer the gender of the subject, which too in different cases can be gender-neutral. This, however, does not imply a lack of gender bias in Hindi, but rather makes it harder to detect and mitigate.

2.1 Gender in Hindi

The syntactical use of gender in Hindi is layered and significantly different from English in several ways (Hall, 2002). These differences affect the ways biases are likely to be encoded.
Gendered verbs: Verbs in Hindi can be gendered depending upon the tense of the sentence. In the past tense and the perfect tenses (present, past and future), which are built upon the past tense, the gender of the verb is determined by the gender of the subject. For example: 'veha fal lene gaya/gayi' (He/She went to buy fruits).

Gendered adjectives: Adjectives in Hindi also provide gender cues towards the subject. However, not all adjectives change form according to the gender of the subject. For example, the adjective 'gharelu' (domestic) is used the same whether a man or a woman is domestic. On the other hand, an adjective like 'good' in English, when translated to Hindi, is either 'achha' for 'a good man' or 'acchhi' for 'a good woman'.

Gendered titles: Some titles for entities in Hindi can be gendered. For example, 'a school principal' when translated to Hindi can be 'pradhanacharyaa' or 'pradhanacharya' depending upon whether the principal is female or male, respectively. Another example is 'adhyapak' (male teacher) and 'adhyapika' (female teacher).

Gender instantiations of inanimate nouns: The instantiation of grammatical gender for inanimate nouns when used in a sentence is an important aspect of Hindi. Note that these instantiations also depend upon the dialect spoken. As Hall (2002) pointed out, the word 'dahi' (yoghurt) is assigned feminine forms of verbs and adjectives in Western dialects of Hindi, whereas it is assigned masculine forms in Eastern dialects.

2.2 Caste, Religion, and Occupation Biases

In the past, discrimination based on attributes of caste, religion, and occupation has been prominent in India (Banerjee and Knight, 1985; Deshpande, 2010). While illegal, some of these biases are still common and, as a result, languages and text resources reflect the same. Caste is a form of social stratification unique to Hinduism and Indian society. It involves the concept of hereditary transmission of style of life in terms of occupation, status and social interaction, and is thus commonly associated strongly with the last names used by persons [1]. Originally, caste was decided by the occupation of the individual; however, it was later transformed into a hereditary form of status and became associated with strong biases pertaining to the purity,
goodness, intelligence, etc. of individuals, leading to social exclusion. While discrimination on the basis of caste is illegal in the country, stereotypes and marginalization are still rampant and are thus reflected commonly in Hindi corpora. India is a secular country, but due to historical clashes there are biases against the relatively minority religious population practicing Islam as opposed to the majority religion of Hinduism. These biases are associated with perceptions (good, bad, evil, etc.) of the different populations. Although biases against other religions are also present, we especially focus upon Islam and Hinduism since these are the most prominent. As some last names are highly associated with their religion in India, we use last names as a proxy to study such a bias. Effectively, our study evaluates bias against people with last names historically associated with a certain religion. In addition to these biases, the gaps between rural and urban India in terms of education and poverty have led to a discrepancy in perception (positive versus negative) between urban and rural occupations. The Hindi language, which is popularly spoken in different parts of the country, may thus reflect similar biases in its representations.

[1] https://en.wikipedia.org/wiki/Caste

3 Measuring Biases in Hindi Language Representations

To quantify bias in English embeddings, Caliskan et al. (2017) propose the Word Embedding Association Test (WEAT), which measures the stereotypical association between two sets of target concepts (X and Y) and two sets of attributes (A and B). The WEAT test statistic is a difference between sums over the respective target concepts,

    s(X, Y, A, B) = \sum_{x \in X} s(x, A, B) - \sum_{y \in Y} s(y, A, B)    (1)

where the quantity s(w, A, B) is calculated using cosine similarity as follows:

    s(w, A, B) = \frac{\sum_{a \in A} \cos(w, a)}{|A|} - \frac{\sum_{b \in B} \cos(w, b)}{|B|}    (2)

The significance of the test statistic is measured by a non-parametric permutation test in which s(X, Y, A, B) is compared to s(X_i, Y_i, A, B), where (X_i, Y_i) denotes a random, equally-sized split of the terms in X ∪ Y. A larger WEAT effect size indicates a larger bias.

An extension of WEAT to sentence encoders was proposed by May et al. (2019) as SEAT (Sentence Embedding Association Test). Analogous to the word lists in WEAT, SEAT uses sentence lists in which the sentences are bleached so as to focus upon the target word whose bias we want to measure when working with contextualized embeddings.
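As a concrete illustration of Equations 1 and 2, the sketch below computes the association score and the test statistic from a word-to-vector lookup; the `emb` dictionary and the word lists are placeholders rather than the authors' released code.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B, emb):
    # s(w, A, B): mean similarity of w to attribute set A minus mean similarity to B (Eq. 2).
    return (np.mean([cosine(emb[w], emb[a]) for a in A])
            - np.mean([cosine(emb[w], emb[b]) for b in B]))

def weat_statistic(X, Y, A, B, emb):
    # s(X, Y, A, B): summed associations of targets X minus those of targets Y (Eq. 1).
    return (sum(association(x, A, B, emb) for x in X)
            - sum(association(y, A, B, emb) for y in Y))
```

Given target and attribute lists such as those in Appendix B, `weat_statistic` yields the raw test statistic; the effect size and permutation p-value reported in our tables are defined in Appendix A.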
Further, in Sec- x∈X y∈Y tion 2.1, we see that there are four specific gen- (1) dered word groups in Hindi, all of which are mean- where the quantity, s(w, A, B) is calculated using ingfully gendered and important to be encoded cosine similarity as follows: and retained in our representations. For each P P such group, we construct an independent “Mean- a∈A cos(w, a) cos(w, b) s(w, A, B) = − b∈B ingful Encoding (ME)” WEAT test (Dev et al., |A| |B| 2020)(word lists in Appendix B). The importance (2) of this is two-fold: (i) it allows us to verify if mean- The significance of the test statistic is measured ingful gendered information is encoded in our rep- by the non-parametric permutation test in which resentations, and (ii) compare with biased associ- the s(X, Y, A, B) is compared to s(X , Y, A, B), ations (measured by WEAT) to gauge the overall where (X , Y) denotes a random, equally-sized split of the terms in X ∪ Y. A larger WEAT ef- 2 We use the Google-Translate API to obtain the transla- fect size indicates a larger bias. tions.
3.2 Caste Bias

To evaluate Hindi representations for the presence of caste bias, we build two WEAT tests and two corresponding SEAT tests using last names that are statistically more associated with stereotypically lower and upper castes. For lower castes, we randomly sample lower caste names from the list of scheduled castes provided by the Department of Social Justice and Empowerment in India [3]. Our first test detects bias in occupations: whether some occupations are associated with upper castes (the upper strata of castes) and some with lower castes. Note that some caste names in Hindi also denote occupations themselves; for example, 'kumhar' means both a lower caste and the occupation of pottery. We made sure that our target word lists contain no such ambiguous entities.

[3] http://socialjustice.nic.in/UserView/index?mid=76750

Another WEAT test we build is based upon the association of positive and negative adjectives with caste names. For the Translated version, we take Hindi translations of the words used by Caliskan et al. (2017) for detecting racial bias. For the Language-Specific test, we curate a new word list of adjectives based on words popularly used as positive or negative in Hindi (Thorat and Newman, 2007).

Table 2 highlights that there is significant caste related perception bias. For both WEAT and SEAT tests, the measured biases are over 1.2 for GloVe embeddings. The results in Table 1 compare the Translated and Language-Specific adjective word lists. For the WEAT test, the bias measured by Translated word lists is less than half of that measured by Language-Specific word lists, emphasizing the importance of creating socially-aware Language-Specific word lists which correlate better with the society and culture the language is associated with.
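As a usage illustration, the caste test could be plugged into the WEAT helpers sketched in Section 3 roughly as follows; the truncated transliterated word lists (excerpts of Appendix B) and the `load_hindi_glove` loader are hypothetical placeholders.

```python
# Placeholder loader: any mapping from Hindi words to 300-d GloVe vectors will do.
emb = load_hindi_glove("hi-glove-300d.txt")

# Transliterated excerpts of the Appendix B lists (targets: caste names, attributes: adjectives).
X = ["thakur", "brahmin", "rajput", "kshatriya"]   # stereotypically upper-caste names
Y = ["dalit", "shudra", "bhangi", "chamaar"]       # stereotypically lower-caste names
A = ["ameer", "gyani", "veer", "sundar"]           # positive adjectives
B = ["neech", "ghrana", "kamzor", "gareeb"]        # negative adjectives

print(weat_statistic(X, Y, A, B, emb))
```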
3.3 Religion Bias

We investigate religion associated biases in Hindi embeddings by constructing two WEAT and two SEAT tests. Our first test is based upon associating positive and negative adjectives with religious entities: one attribute list consists of Hindu religious entities and the other of Muslim religious entities. In our second test, we associate the adjectives with last names; this stems from the distinct last names commonly used by the two populations (see Appendix B). Similar to caste bias detection, we experiment with the Translated and Language-Specific adjective lists. Additionally, we evaluate whether correctly associated religious information is learnt by the representations (for example, the mosque being the place of worship in Islam should be associated with it in the representations). To ascertain this, we create a meaningful encoding (ME) test for religious information (word lists in Appendix B).

In Table 2 we see, using the WEAT and SEAT scores for 300-dimensional GloVe embeddings, that significant religious bias with respect to positive and negative perception is detected. Table 1 compares the bias measured with the Translated and Language-Specific adjective word lists. Similar to earlier observations, here as well we measure significantly larger bias with the Language-Specific word lists.
Table 1: Comparison of word lists obtained by direct English-to-Hindi translation of the word lists in Caliskan et al.'s work versus Language-Specific word lists. WEAT and SEAT bias measurements (with p-values in parentheses) over Hindi GloVe embeddings.

Attribute  | Description                            | WEAT Translated | WEAT Lang-Specific | SEAT Translated | SEAT Lang-Specific
Gender     | maths, arts vs male, female            | 0.94 (0.02)     | 1.12 (0.01)        | 0.87 (0.00)     | 1.14 (0.00)
Gender     | science, arts vs male, female          | 0.27 (0.31)     | 1.13 (0.02)        | 0.18 (0.17)     | 1.03 (0.00)
Caste      | adjectives vs caste                    | 0.72 (0.00)     | 1.52 (0.00)        | 0.74 (0.00)     | 1.40 (0.00)
Religion   | adjectives vs religion terms           | 1.05 (0.00)     | 1.28 (0.01)        | 1.04 (0.00)     | 1.20 (0.00)
Religion   | adjectives vs lastnames                | 0.93 (0.00)     | 1.55 (0.00)        | 0.95 (0.00)     | 1.41 (0.00)
Occupation | adjectives vs urban, rural occupations | -0.08 (0.59)    | 1.58 (0.00)        | -0.13 (0.88)    | 1.42 (0.00)

Table 2: Bias measurements (with p-values in parentheses) for Gender, Caste, Religion and Occupation bias using WEAT and SEAT tests with Language-Specific word lists. For SEAT over ELMo and RoBERTa we only provide the measured bias. BM stands for Bias Measuring test and ME for Meaningful Encoding test.

Attribute  | Type | Description                            | WEAT GloVe  | SEAT GloVe  | SEAT ELMo | SEAT RoBERTa
Gender     | BM   | maths, arts vs male, female            | 1.12 (0.01) | 1.14 (0.00) | 0.17      | -0.01
Gender     | BM   | science, arts vs male, female          | 1.13 (0.01) | 1.03 (0.00) | 0.14      | 0.33
Gender     | BM   | adjectives vs male, female             | 1.21 (0.02) | 1.19 (0.00) | 1.37      | 0.49
Gender     | ME   | gendered verbs vs male, female         | 1.87 (0.00) | 1.84 (0.00) | 1.66      | 0.47
Gender     | ME   | gendered adjectives vs male, female    | 1.70 (0.00) | 1.63 (0.00) | 1.78      | 0.38
Gender     | ME   | gendered entities vs male, female      | 1.14 (0.01) | 1.12 (0.00) | 1.77      | -0.22
Gender     | ME   | gendered titles vs male, female        | 1.92 (0.00) | 1.86 (0.00) | 1.64      | 1.31
Caste      | BM   | occupations vs caste                   | 1.44 (0.00) | 1.26 (0.00) | 0.89      | 0.89
Caste      | BM   | adjectives vs caste                    | 1.52 (0.00) | 1.40 (0.00) | 0.48      | -0.02
Religion   | BM   | adjectives vs religion terms           | 1.28 (0.01) | 1.20 (0.00) | 0.75      | 0.42
Religion   | BM   | adjectives vs lastnames                | 1.55 (0.00) | 1.41 (0.00) | 1.02      | 0.56
Religion   | ME   | religious entities vs religion         | 1.75 (0.00) | 1.69 (0.00) | 1.23      | 0.23
Occupation | BM   | adjectives vs urban, rural occupations | 1.58 (0.00) | 1.42 (0.00) | 1.12      | 0.77

3.4 Rural v/s Urban Occupation Bias

In addition to the other cultural biases, we detect bias between urban and rural occupations, which is prevalent in Indic society, with urban occupations seen as better, richer, more desirable, and even of a higher social status. For this, we develop a WEAT test and a corresponding SEAT test. The attribute lists consist of urban occupations and rural occupations, and the target lists consist of polarized adjectives (Appendix B specifies the word lists). Table 2 illustrates, with WEAT and SEAT scores, the biased associations of perception between urban and rural occupations. For 300-dimensional GloVe embeddings, the WEAT test measures bias as ≈ 1.6 and SEAT as ≈ 1.42. We observe that the measured bias is greater than 1.00 for ELMo as well, which highlights the presence of occupation associated bias in the contextualized embeddings too.
4 Conclusion

Biases are inherently complex, as are their encodings in language representations. Their detection and mitigation, in order to be holistic, has to take into account a multitude of factors: the language and its grammar, the regions it is spoken in, the laws and policies therein, as well as the history and culture of the region. In this paper, we demonstrate how a predetermined set of biases and a peripheral evaluation consisting of a narrow perspective of how biases manifest themselves is not sufficient to achieve fair representations across the globe.

References

Biswajit Banerjee and John B Knight. 1985. Caste discrimination in the Indian urban labour market. Journal of Development Economics, 17(3):277–307.

Varad Bhagwat, Mrunali N Nagarkar, Pooja Paramane, and Shrikant Jindam. 2019. Review of chatbot system in Hindi language.

Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146.
Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. arXiv preprint arXiv:1607.06520.

Aylin Caliskan, Joanna J Bryson, and Arvind Narayanan. 2017. Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334):183–186.

Rodrigo Alejandro Chávez Mulsa and Gerasimos Spanakis. 2020. Evaluating bias in Dutch word embeddings. In Proceedings of the Second Workshop on Gender Bias in Natural Language Processing, pages 56–71, Barcelona, Spain (Online). Association for Computational Linguistics.

Manali S Deshpande. 2010. History of the Indian caste system and its impact on India today. CalPoly Student Research.

Sunipa Dev, Tao Li, Jeff M Phillips, and Vivek Srikumar. 2020. OSCaR: Orthogonal subspace correction and rectification of biases in word embeddings.

Sunipa Dev and Jeff Phillips. 2019. Attenuating bias in word vectors. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 879–887. PMLR.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

Sayan Ghosh, Dylan Baker, David Jurgens, and Vinodkumar Prabhakaran. 2021. Detecting cross-geographic biases in toxicity modeling on social media.

Prasad Gundugurti, KL Vidya, and V Sriramya. 2015. The Indian "girl" psychology: A perspective. Indian Journal of Psychiatry, 57:S212–5.

Kira Hall. 2002. "Unnatural" Gender in Hindi, pages 133–162. Oxford University Press.

Saurav Kumar, Saunack Kumar, Diptesh Kanojia, and Pushpak Bhattacharyya. 2020. "A passage to India": Pre-trained word embeddings for Indian languages. In Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), pages 352–357.

Yaman Kumar, Debanjan Mahata, Sagar Aggarwal, Anmol Chugh, Rajat Maheshwari, and Rajiv Ratn Shah. 2019. BHAAV: A text corpus for emotion analysis from Hindi stories. arXiv preprint arXiv:1910.04073.

Anoop Kunchukuttan, Pratik Mehta, and Pushpak Bhattacharyya. 2017. The IIT Bombay English-Hindi parallel corpus. arXiv preprint arXiv:1710.02855.

Mascha Kurpicz-Briki. 2020. Cultural differences in bias? Origin and gender bias in pre-trained German and French word embeddings. In Proceedings of the 5th Swiss Text Analytics Conference (SwissText) & 16th Conference on Natural Language Processing (KONVENS).

Anne Lauscher, Rafik Takieddin, Simone Paolo Ponzetto, and Goran Glavaš. 2020. AraWEAT: Multidimensional analysis of biases in Arabic word embeddings. In Proceedings of the Fifth Arabic Natural Language Processing Workshop, pages 192–199, Barcelona, Spain (Online). Association for Computational Linguistics.

Chandler May, Alex Wang, Shikha Bordia, Samuel R. Bowman, and Rachel Rudinger. 2019. On measuring social biases in sentence encoders. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 622–628, Minneapolis, Minnesota. Association for Computational Linguistics.

Nikita Nangia, Clara Vania, Rasika Bhalerao, and Samuel R. Bowman. 2020. CrowS-Pairs: A challenge dataset for measuring social biases in masked language models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1953–1967, Online. Association for Computational Linguistics.

Jeffrey Pennington, Richard Socher, and Christopher D Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.

Arun K. Pujari, Ansh Mittal, Anshuman Padhi, Anshul Jain, Mukesh Jadon, and Vikas Kumar. 2019. Debiasing gender biased Hindi words with word embedding. In Proceedings of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence, ACAI 2019, pages 450–456, New York, NY, USA. Association for Computing Machinery.
Pattabhi RK Rao and Sobha Lalitha Devi. 2018. EventXtract-IL: Event extraction from newswires and social media text in Indian languages @ FIRE 2018 - an overview. In FIRE (Working Notes), pages 282–290.

Nithya Sambasivan, Erin Arnesen, Ben Hutchinson, and Vinodkumar Prabhakaran. 2020. Non-portability of algorithmic fairness in India. ArXiv, abs/2012.03659.

Sukhadeo Thorat and Katherine S Newman. 2007. Caste and economic discrimination: causes, consequences and remedies. Economic and Political Weekly, pages 4121–4124.
Pei Zhou, Weijia Shi, Jieyu Zhao, Kuan-Hao Huang, Muhao Chen, Ryan Cotterell, and Kai-Wei Chang. 2019. Examining gender bias in languages with grammatical gender. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5276–5284, Hong Kong, China. Association for Computational Linguistics.

Appendix

A WEAT

Let X and Y be equal-sized sets of target concept embeddings and let A and B be sets of attribute embeddings. Let cos(a, b) denote the cosine similarity between vectors a and b. The test statistic is a difference between sums over the respective target concepts,

    s(X, Y, A, B) = \sum_{x \in X} s(x, A, B) - \sum_{y \in Y} s(y, A, B)    (3)

where the quantity s(w, A, B) is calculated using cosine similarity as follows:

    s(w, A, B) = \frac{\sum_{a \in A} \cos(w, a)}{|A|} - \frac{\sum_{b \in B} \cos(w, b)}{|B|}    (4)

The amount of bias in WEAT is analyzed by the effect size d, calculated as:

    d = \frac{\mathrm{mean}_{x \in X}\, s(x, A, B) - \mathrm{mean}_{y \in Y}\, s(y, A, B)}{\mathrm{stddev}_{w \in X \cup Y}\, s(w, A, B)}    (5)

To compute the test significance (p-value), the X and Y lists are merged, and 10,000 permutations P of the combined list are generated. The i-th permutation in P is split into a new pair of lists X_i and Y_i. The test statistic is then used to calculate the p-value as:

    p = \frac{\sum_{i \in P} \mathbb{1}\left[ s(X_i, Y_i, A, B) > s(X, Y, A, B) \right]}{|P|}    (6)
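The effect size and permutation test can be sketched as follows, reusing the `association` and `weat_statistic` helpers from the earlier snippet; the number of permutations and the use of NumPy's default standard deviation are assumptions, not details taken from the paper.

```python
import random
import numpy as np

def effect_size(X, Y, A, B, emb):
    # Effect size d (Eq. 5): difference of mean associations, normalised by the
    # standard deviation of associations over all target words.
    s_x = [association(x, A, B, emb) for x in X]
    s_y = [association(y, A, B, emb) for y in Y]
    return (np.mean(s_x) - np.mean(s_y)) / np.std(s_x + s_y)

def permutation_p_value(X, Y, A, B, emb, n_perm=10_000):
    # One-sided permutation test (Eq. 6): fraction of random equal-sized re-splits
    # of X ∪ Y whose test statistic exceeds the observed one.
    observed = weat_statistic(X, Y, A, B, emb)
    pooled = list(X) + list(Y)
    exceed = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        if weat_statistic(pooled[:len(X)], pooled[len(X):], A, B, emb) > observed:
            exceed += 1
    return exceed / n_perm
```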
B WEAT word lists

Refer to Table 7, Table 8, Table 9 and Table 10.

C Debiasing

There have been notable advances towards debiasing embeddings along the direction of gender bias. Both Bolukbasi et al. (2016) and Dev and Phillips (2019) propose using linear projection to debias word embeddings, but the former additionally equalizes word pairs about the attribute (e.g., gender) axis. In this work we use the more general approach of linear projection, as it can be adapted to several biases apart from gender.

In the method of linear projection, every word w ∈ W is debiased to w' by being projected orthogonally to the identified bias vector v_B:

    w' = w - \langle w, v_B \rangle v_B    (7)

In the case of hard debiasing, we require a list of equalizing pairs and a list of words not to debias, in Hindi. However, direct translation of the word lists to Hindi did not always make sense, since some words like 'she' and 'he' have overlapping translations: both pronouns are rendered as 'veha' in Hindi. This overlap also occurs the other way round: the word grandfather can be either 'nana' (maternal grandfather) or 'dada' (paternal grandfather).

For languages with grammatical gender, Zhou et al. (2019) proposed to determine a semantic gender direction. To obtain the semantic gender direction (d_s), the grammatical gender component (d_g) is removed from the computed gender direction (obtained from PCA over gendered word pairs, d_PCA), making the semantic gender direction orthogonal to grammatical gender:

    d_s = d_PCA - \langle d_PCA, d_g \rangle d_g    (8)

We use this orthogonalized gender direction to perform linear debiasing. We refer to this method as LPSG (Linear Projection with Semantic Gender).
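A minimal sketch of Equations 7 and 8 over NumPy vectors; normalising the directions to unit length is an assumption made so the projections behave as expected, not something stated explicitly in the paper.

```python
import numpy as np

def unit(v):
    # Normalise a direction vector to unit length (assumed for the projections below).
    return v / np.linalg.norm(v)

def linear_project_out(w, v_b):
    # Eq. 7: remove from word vector w its component along the bias direction v_b.
    v_b = unit(v_b)
    return w - np.dot(w, v_b) * v_b

def semantic_gender_direction(d_pca, d_grammatical):
    # Eq. 8 (LPSG): make the PCA-derived gender direction orthogonal to the
    # grammatical-gender direction before using it for linear debiasing.
    d_g = unit(d_grammatical)
    return unit(d_pca - np.dot(d_pca, d_g) * d_g)
```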
C.1 Debiasing Binary Gender

The first step in debiasing using linear projection is to identify a bias subspace/vector. We experiment with different settings to identify the gender vector in Hindi, including (i) the single gender-specific word pair direction {naari - nar}, and (ii) PCA over a list of paired gender-specific words (in the form {m_i - f_i}). For more results with other gender directions, refer to the Appendix. The word lists used in the experiment are provided in Appendix C.

For hard debiasing, we considered two types of gender definition word lists. In one list we included only the gender definitional pairs translated to Hindi from the original English lists (after some modifications to remove ambiguous translations). In another experiment, we added pairs of gendered verbs to the list as well.

Hindi is a language with grammatical gender. As introduced in Section 2.1, there are four special gendered word groups along which we want to preserve the information. The directions for gendered verbs (d_v), adjectives (d_a), titles (d_t) and entities (d_e) were calculated by conducting PCA over the respective word lists (Appendix C). For the LPSG method, we provide results of orthogonalizing the semantic gender direction with respect to the verb and adjective directions. In another experiment, we orthogonalize the semantic gender direction with respect to all four directions.

Table 3 demonstrates how different gender subspaces affect the WEAT effect sizes in both bias measuring and information retention tests. Note that the single direction {naari - nar} debiases best on the maths and arts test. Pairwise PCA over gendered words debiased the science and arts test quite significantly, with an effect size of only 0.001 after debiasing. Hard debiasing is not able to debias the first two tests in the WEAT setting; however, it reduces the effect sizes in the case of SEAT (see Table 4). In both WEAT and SEAT tests for neutral adjectives vs gendered words, hard debiasing performs best against all other methods. Although hard debiasing works competitively, it comes with the downside that it does not retain the gendered information in our information retention (IR) tests. Both LPSG variants were able to debias competitively while at the same time being best at retaining the gendered information in the IR tests.

C.2 Debiasing Caste

Similar to debiasing gender, for caste we first begin with determining the caste direction from two words, one stereotypically considered upper caste and one lower: {ghasiya - pandit}. We also try the direction {ghasiya - desai}. Since castes do not occur in pairs, a set of word pairs cannot be meaningfully constructed as done with binary gender in English (Bolukbasi et al., 2016). Hence, we compose lists of stereotypically upper and lower castes, and conduct PCA over the combined list to obtain the vector of caste bias. Refer to Appendix C for the word lists used in the experiment. In Table 5 we can observe that linear debiasing using the single direction {ghasiya - desai} is unable to debias competitively when compared with the other two methods. Note that the single direction {ghasiya - pandit} is able to debias better than PCA over the list of caste names.
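The two ways of picking a bias direction described above (PCA over paired differences for gender, PCA over a combined word list for caste) could look roughly as follows; the use of scikit-learn's PCA and the specific word lists are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def direction_from_pairs(pairs, emb):
    # Gender-style subspace: top principal component of the difference vectors m_i - f_i
    # computed over paired gendered words.
    diffs = np.stack([emb[m] - emb[f] for m, f in pairs])
    return PCA(n_components=1).fit(diffs).components_[0]

def direction_from_wordlist(words, emb):
    # Caste-style subspace: top principal component of the (mean-centred) embeddings
    # of a combined list of, e.g., stereotypically upper- and lower-caste names.
    vecs = np.stack([emb[w] for w in words])
    return PCA(n_components=1).fit(vecs).components_[0]
```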
C.3 Debiasing Religion

In order to mitigate religious biases in Hindi, we make use of the fact that in Indian culture the religion of a person is generally identifiable by their last name. We thus utilize last names to determine the direction of bias. We use both (i) a single pair of common last names {acharya - nasir}, and (ii) a set of Hindu and Muslim entities. Another religion direction is calculated by combining the word lists of Hindu and Muslim last names and conducting PCA over them; we call this religion bias direction d_last. The word lists are provided in Appendix C.

In the Hindi language, various religious entities are inherently associated with a particular religion; for example, "Bible is to Christianity as Bhagwad Gita is to Hinduism" is not bias. To accommodate such cases, we again take motivation from Zhou et al. (2019): we obtain a direction d_ent from the entities word lists (Appendix C) and keep the religion direction calculated from Hindu and Muslim last names, d_last, orthogonal to it:

    d'_last = d_last - \langle d_last, d_ent \rangle d_ent    (9)

We believe that if we debias words using d'_last as the bias direction, we should be able to preserve the knowledge tested by the religion information retention test while debiasing competitively.

In Table 6, we see that linear debiasing by conducting PCA over a list of religious entities is not able to debias much in any of the tests. The same can be observed for linear debiasing using the single pair of common last names {acharya - nasir}. However, if we linearly debias using PCA over a list of last names, we are able to debias significantly. Although the information retention WEAT effect size is lower than with the previous methods, those methods did not even affect the religion bias, which is our primary goal. Zhou's variant for religion debiasing performs well since it debiases competitively and also retains a greater amount of the necessary religion information. Refer to Appendix B for the word lists used in the test.

D SEAT

We define a list of bleached sentences for each part-of-speech type of target word as follows:

• Hindi-SEAT-name: 'yeha hai', 'veha hai', 'vahan hai', 'yahan hai', ' yahan hai', ' vahan hai', 'iska naam hai', 'uska naam h'

• Hindi-SEAT-common-nouns: 'yeha hai', 'veha hai', 'vahan hai', 'yahan hai', ' yahan hai', ' vahan hai', 'vo hai', 'ye hai'

• Hindi-SEAT-verbs: 'yeha hai', 'veha hai', 'vo hai', 'ye hai', 'vahan hai', 'yahan hai'

• Hindi-SEAT-adjectives: 'yeha hai', 'veha hai', 'vo hai', 'ye hai'

In other words, if the target word is an adjective, we use the Hindi-SEAT-adjectives list of bleached sentences with each WEAT target word.
Table 3: Debiasing results for gender across linear projection, Bolukbasi's hard debiasing (HD), and the LPSG variants. We provide WEAT effect sizes with p-values of the test in parentheses. PCA for linear projection was done on gendered word pairs; the LPSG columns orthogonalize the semantic gender direction with respect to the verb & adjective directions or all four gendered-group directions. BM rows are bias-measuring tests and IR stands for Information Retention. All tests are vs male, female.

Type | Description       | Original    | LP: naari-nar | LP: PCA     | HD: gen. words | HD: gen. words, verbs | LPSG: w/o verbs & adj | LPSG: w/o all dir.
BM   | maths, arts       | 1.12 (0.01) | 0.44 (0.20)   | 0.77 (0.06) | 1.48 (0.00)    | 1.13 (0.00)           | 0.95 (0.03)           | 1.04 (0.02)
BM   | science, arts     | 1.13 (0.02) | 0.50 (0.18)   | 0.00 (0.49) | 1.66 (0.00)    | 1.57 (0.00)           | 0.24 (0.28)           | 0.42 (0.21)
BM   | adjectives        | 1.22 (0.02) | 0.96 (0.06)   | 0.82 (0.08) | 0.37 (0.27)    | 0.49 (0.20)           | 0.94 (0.05)           | 0.94 (0.049)
IR   | gendered verbs    | 1.87 (0.00) | 1.87 (0.00)   | 1.79 (0.00) | 1.12 (0.01)    | -1.18 (0.99)          | 1.85 (0.00)           | 1.85 (0.00)
IR   | gendered adj      | 1.70 (0.00) | 1.63 (0.00)   | 1.66 (0.00) | 1.19 (0.00)    | 0.78 (0.05)           | 1.71 (0.00)           | 1.75 (0.00)
IR   | gendered entities | 1.14 (0.01) | 1.01 (0.02)   | 0.99 (0.02) | 0.66 (0.08)    | 0.27 (0.28)           | 1.13 (0.00)           | 1.18 (0.00)
IR   | gendered titles   | 1.92 (0.00) | 1.91 (0.00)   | 1.89 (0.00) | 1.19 (0.00)    | 1.59 (0.00)           | 1.92 (0.00)           | 1.91 (0.00)

Table 4: Debiasing results for gender across the same methods as Table 3. We provide SEAT effect sizes with p-values of the test in parentheses. PCA for linear projection was done on gendered word pairs. IR stands for Information Retention. All tests are vs male, female.

Type | Description       | Original    | LP: naari-nar | LP: PCA     | HD: gen. words | HD: gen. words, verbs | LPSG: w/o verbs & adj | LPSG: w/o all dir.
BM   | maths, arts       | 1.14 (0.00) | 0.64 (0.00)   | 0.78 (0.00) | 0.76 (0.00)    | 1.09 (0.00)           | 0.96 (0.00)           | 0.99 (0.00)
BM   | science, arts     | 1.03 (0.00) | 0.55 (0.00)   | 0.07 (0.33) | 0.70 (0.00)    | 0.91 (0.00)           | 0.26 (0.06)           | 0.38 (0.02)
BM   | adjectives        | 1.19 (0.00) | 0.98 (0.00)   | 0.80 (0.00) | 0.23 (0.30)    | 0.34 (0.31)           | 0.94 (0.00)           | 0.92 (0.00)
IR   | gendered verbs    | 1.84 (0.00) | 1.83 (0.00)   | 1.67 (0.00) | 0.31 (0.33)    | -0.70 (0.70)          | 1.80 (0.00)           | 1.78 (0.00)
IR   | gendered adj      | 1.63 (0.00) | 1.58 (0.00)   | 1.54 (0.00) | 0.45 (0.17)    | 0.36 (0.33)           | 1.63 (0.00)           | 1.67 (0.00)
IR   | gendered entities | 1.12 (0.00) | 1.02 (0.00)   | 0.99 (0.00) | 0.42 (0.14)    | 0.45 (0.17)           | 1.13 (0.00)           | 1.16 (0.00)
IR   | gendered titles   | 1.86 (0.00) | 1.85 (0.00)   | 1.75 (0.00) | 0.15 (0.41)    | 0.90 (0.22)           | 1.82 (0.00)           | 1.80 (0.00)

Table 5: Debiasing results for caste across different methods of choosing the caste subspace. We provide WEAT and SEAT effect sizes with p-values of the test in parentheses. PCA was conducted on a list of caste names containing both upper and lower castes.

Test | Description (vs caste) | Original    | LP: ghasiya-desai | LP: ghasiya-pandit | LP: PCA
WEAT | occupations            | 1.44 (0.00) | 1.34 (0.00)       | 0.78 (0.09)        | 1.21 (0.02)
WEAT | adjectives             | 1.52 (0.00) | 1.51 (0.00)       | 1.31 (0.01)        | 1.33 (0.00)
SEAT | occupations            | 1.26 (0.00) | 1.17 (0.00)       | 0.67 (0.00)        | 0.89 (0.00)
SEAT | adjectives             | 1.40 (0.00) | 1.36 (0.00)       | 1.18 (0.00)        | 1.18 (0.00)
Table 6: Debiasing results for religion across different methods of choosing the religion subspace and LPSG. We provide WEAT and SEAT effect sizes with p-values of the test in parentheses. We experimented with conducting PCA over a list of religious entities and over a list of religious last names. IR stands for Information Retention.

Test      | Description                    | Original    | LP: acharya-nasir | LP: PCA entities | LP: PCA lastnames | LPSG: w/o entities
WEAT      | adjectives vs religion terms   | 1.28 (0.01) | 1.28 (0.01)       | 1.28 (0.01)      | 0.91 (0.04)       | 0.92 (0.06)
WEAT      | adjectives vs lastnames        | 1.55 (0.00) | 1.57 (0.00)       | 1.55 (0.00)      | 0.71 (0.10)       | 0.71 (0.11)
WEAT (IR) | religious entities vs religion | 1.75 (0.00) | 1.61 (0.00)       | 1.72 (0.00)      | 1.54 (0.00)       | 1.59 (0.00)
SEAT      | adjectives vs religion terms   | 1.20 (0.00) | 1.22 (0.00)       | 1.19 (0.00)      | 0.85 (0.00)       | 0.85 (0.00)
SEAT      | adjectives vs lastnames        | 1.41 (0.00) | 1.43 (0.00)       | 1.41 (0.00)      | 0.70 (0.00)       | 0.68 (0.00)
SEAT (IR) | religious entities vs religion | 1.69 (0.00) | 1.52 (0.00)       | 1.65 (0.00)      | 1.43 (0.00)       | 1.50 (0.00)
Table 7: Word lists for the Gender WEAT and SEAT tests.

Math, Arts vs Gender-specific words
  Math words: ganit, beejganit, jyamiti, kalan, sameekaran, ganna, sankhya, yog
  Arts words: kavita, kala, nritya, sahitya, upanyas, raag, naatak, murti
  Male gendered words: purush, aadmi, ladka, bhai, pati, chacha, maama, beta
  Female gendered words: mahila, aurat, ladki, behen, patni, chachi, maami, beti

Science, Arts vs Gender-specific words
  Science terms: vigyan, praudyogiki, bhautik, rasayan, prayogshala, niyam, prayog, khagol
  Arts terms: kavita, kala, naach, nritya, sahitya, upanyas, raag, naatak
  Male gendered words: bhai, chacha, daada, beta, purush, pati, aadmi, ladka
  Female gendered words: behen, chachi, daadi, beti, mahila, patni, aurat, ladki

Adjectives vs Gender-specific words
  Stereotypically male adjectives: krodhit, shramik, takatwar, nipun, veer, sahsi, diler
  Stereotypically female adjectives: sundar, sharm, aakarshak, manmohak, madhur, gharelu, kamzor
  Male gendered words: purush, aadmi, ladka, bhai, pati, chacha, maama, beta
  Female gendered words: mahila, aurat, ladki, behen, patni, chachi, maami, beti

Gendered verbs vs Gender-specific words
  Male verbs: gaya, aaya, khelta, baitha, leta, rehta, deta, padhta
  Female verbs: gayi, aayi, khelti, baithi, leti, rehti, deti, padhti
  Male gendered words: purush, aadmi, ladka, bhai, pati, chacha, maama, beta
  Female gendered words: mahila, aurat, ladki, behen, patni, chachi, maami, beti

Gendered adjectives vs Gender-specific words
  Male adjectives: accha, bura, ganda, lamba, chota, meetha, neela, bada, pehla
  Female adjectives: acchi, buri, gandi, lambi, choti, meethi, neeli, badi, pehli
  Male gendered words: purush, aadmi, ladka, bhai, pati, chacha, maama, beta
  Female gendered words: mahila, aurat, ladki, behen, patni, chachi, maami, beti

Gendered titles vs Gender-specific words
  Male titles: adhyapak, shishya, vidvan, saadhu, kavi, chhatr, pradhanacharya, mahoday
  Female titles: adhyapika, shishyaa, vidushi, saadhvi, kavitri, chhatra, pradhanacharya, mahodaya
  Male gendered words: purush, aadmi, ladka, bhai, pati, chacha, maama, beta
  Female gendered words: mahila, aurat, ladki, behen, patni, chachi, maami, beti

Gendered entities vs Gender-specific words
  Male entities: pajama, ghada, kurta, phool, kapda, pahiya, yantra, putla, taala
  Female entities: almaari, chadar, poshaak, bijli, buddhi, tasvir, ghadi, raakhi, kameez
  Male gendered words: purush, aadmi, ladka, bhai, pati, chacha, maama, beta
  Female gendered words: mahila, aurat, ladki, behen, patni, chachi, maami, beti

Table 8: Word lists for the Caste WEAT and SEAT tests.

Occupations vs Caste
  Stereotypically upper-caste occupations: vyapar, jameendar, sunar, guru, munim, chikitsak, pandit
  Stereotypically lower-caste occupations: safai, dhobi, mallah, maali, naai, mochi, machuara
  Upper-caste names: thakur, brahmin, rajput, kshatriya, arya, jaat, baniya, kayastha
  Lower-caste names: dalit, shudra, bhangi, chamaar, valimiki, harijan, chuhda, jatav

Adjectives vs Caste
  Upper-caste adjectives: ameer, gyani, veer, taakatvar, sundar, ucch, sahsi
  Lower-caste adjectives: neech, ghrana, ganda, kamzor, gareeb, agyani, nirbal
  Upper-caste names: thakur, brahmin, rajput, kshatriya, arya, jaat, baniya, kayastha
  Lower-caste names: dalit, shudra, bhangi, chamaar, valimiki, harijan, chuhda, jatav
Table 9: Word lists for the Religion WEAT and SEAT tests.

Adjectives vs Religion terms
  Positive adjectives: shikshit, veer, ucch, sahsi, shant, dayalu, safal
  Negative adjectives: neech, ghrana, ashikshit, hinsak, krodhi, nirdayi, atyachaari
  Hindu religion terms: hindu, bhagwan, geeta, brahmin, pandit, mandir, ram, vrat
  Muslim religion terms: musalman, allah, quran, shiya, sunni, masjid, muhammad, roza

Adjectives vs Religion lastnames
  Positive adjectives: shikshit, veer, ucch, sahsi, shant, dayalu, safal
  Negative adjectives: neech, ghrana, ashikshit, hinsak, krodhi, nirdayi, atyachaari
  Hindu lastnames: sharma, verma, agrawal, gupta, chauhan, bansal, mittal, singh, chaudhary
  Muslim lastnames: yusuf, malik, khan, ansari, sheikh, abdullah, ahmad, pathan, mirza

Religious entities vs Religion
  Hindu religion terms: bhagwan, geeta, brahmin, pandit, mandir, ram, vrat
  Muslim religion terms: allah, quran, shiya, sunni, masjid, muhammad, roza
  Hindu religion: hindu, hindutva
  Muslim religion: musalman, islam

Table 10: Word lists for the Rural v/s Urban Occupations WEAT and SEAT tests.

Adjectives vs Rural and Urban Occupations
  Positive adjectives: ameer, gyani, veer, takatvar, sundar, ucchh, sahsi
  Negative adjectives: neech, ganda, ghrana, kamzor, gareeb, agyani, nirbal
  Urban occupations: banker, vyavsayi, engineer, vakeel, vaigyanik, chaalak, abhineta, manager
  Rural occupations: lohar, jalvahak, kisaan, gwala, charwaaha, kumhar, jameendar, julaha