Insiders and Outsiders in Research on Machine Learning and Society
Yu Tao (Stevens Institute of Technology, ytao@stevens.edu) and Kush R. Varshney (IBM Research – T. J. Watson Research Center, krvarshn@us.ibm.com)

arXiv:2102.02279v1 [cs.CY] 3 Feb 2021
Copyright © 2021, by the authors. All rights reserved.

Abstract

A subset of machine learning research intersects with societal issues, including fairness, accountability and transparency, as well as the use of machine learning for social good. In this work, we analyze the scholars contributing to this research at the intersection of machine learning and society through the lens of the sociology of science. By analyzing the authorship of all machine learning papers posted to arXiv, we show that compared to researchers from overrepresented backgrounds (defined by gender and race/ethnicity), researchers from underrepresented backgrounds are more likely to conduct research at this intersection than other kinds of machine learning research. This state of affairs leads to contention between two perspectives on insiders and outsiders in the scientific enterprise: outsiders being those outside the group being studied, and outsiders being those who have not participated as researchers in an area historically. This contention manifests as an epistemic question on the validity of knowledge derived from lived experience in machine learning research, and predicts boundary work that we see in a real-world example.

1 Introduction

Research on the theory and methods of machine learning has led to the ability of technological systems to grow by leaps and bounds in the last decade. With this increasing competence, machine learning is increasingly being employed in real-world sociotechnical contexts of high consequence. People and machines are now truly starting to become partners in various aspects of life, livelihood, and liberty.

This intersection of machine learning with society has fueled a small segment of research effort devoted to it. Two such efforts include research on (1) fairness, accountability and transparency of machine learning (FAccT), and (2) artificial intelligence (AI) for social good. The first of these focuses on the imperative 'do no harm,' or nonmaleficence, with a special focus on preventing harms to marginalized people and groups caused or exacerbated by the use of machine learning in representation and decision making. The second focuses on using machine learning technologies as an instrument of beneficence to uplift vulnerable people and groups out of poverty, hunger, ill health, and other societal inequities.

In this paper, we focus on who is conducting this research at the intersection of machine learning and society through the lens of the sociology of science. The theoretical foundation for our investigation is the concept of insiders and outsiders in the research enterprise (Merton 1972). In the social sciences and humanities, researchers are considered insiders if they are members of the community being studied (and thus have lived experience of that community) and outsiders otherwise. (Formal and natural sciences typically do not study communities of people, but the societal aspects of research on machine learning and society do.) A different perspective says that members of groups that have been historically underrepresented in a field of study are outsiders. These two notions, illustrated in Figure 1 and Figure 2, may be at odds. Researchers being insiders from one perspective and at the same time outsiders from the other perspective raises contention in the production of knowledge, including in the epistemic validity of knowledge arising from lived experience. The social construction of whether scientific knowledge arising from lived experience is valid or invalid is an instance of boundary work (Gieryn 1983).

To analyze researchers in machine learning and society from the theory of insiders and outsiders, we first empirically show that machine learning researchers from underrepresented backgrounds, compared to researchers from overrepresented backgrounds, are more likely to study the societal aspects of machine learning than they are to study aspects of machine learning that are more divorced from society. Recognizing the inadequacy of binary gender categories, we nevertheless take binary gender as one sensitive attribute. (Women are underrepresented and men are overrepresented.) Recognizing the inadequacy of the social constructs of coarse race and ethnicity categories, we also take race/ethnicity as another sensitive attribute. (Blacks and Hispanics are underrepresented, and whites and Asians¹ are overrepresented.) We also examine the intersection of gender with race/ethnicity.

¹Asians may be disadvantaged in certain considerations like career mobility in a United States context, but are considered overrepresented here in the worldwide machine learning research context.

Next, we extrapolate beyond what the empirical analysis is able to tell us by critically examining the factors that may have led to the current state. We also predict the character of the boundary work that may arise in machine learning and society. Finally, through a short case study, we confirm that there is at least one example in which the theorized epistemic contention has arisen in real life.

The remainder of the paper is organized as follows. After providing a brief recapitulation of research on machine learning and society in Section 2, we dive into the theory of insiders and outsiders in knowledge production in Section 3. In Section 4, we discuss the participation of underrepresented groups in science and technology with a focus on computer science. Section 5 presents the empirical work; it is conducted on submissions to arXiv, a preprint server that hosts a large fraction of machine learning research papers. Section 6 analyzes the sociology of knowledge production in the area of machine learning and society using the theory of insiders and outsiders, and boundary work. We concretize this analysis in Section 7 through a brief case study. Section 8 summarizes and concludes.

Figure 1: Researchers with lived experience relevant for the topic of inquiry have traditionally been seen as insiders. We hypothesize that the topic of machine learning and society is being conducted at a greater rate by those with lived experience of marginalization.

Figure 2: Researchers from overrepresented groups have traditionally been insiders. Machine learning and society is part of a field with underrepresentation of women and of racial/ethnic minority groups.

2 Research on Machine Learning and Society

As discussed in the introduction, two movements with a societal focus have arisen alongside the growth of research and development of machine learning technologies: FAccT and AI for social good. We briefly summarize these movements in this section, and also in Figure 3.

Figure 3: A hierarchical representation of the different topics that constitute research on machine learning and society.

Ethical AI, responsible AI, trustworthy machine learning, and FAccT all refer to the cross-disciplinary theory and methods for understanding and mitigating the challenges associated with unwanted discrimination, lack of comprehensibility, and lack of governance of machine learning systems used in applications of consequence to people's lives such as employment, finance, and criminal justice (Varshney 2019). Broadly speaking, there have been three different kinds of research in this area (Kind 2020), including (1) philosophical contributions on ethical principles for AI (Jobin, Ienca, and Vayena 2019; Whittlestone et al. 2019); (2) technical contributions on bias metrics and mitigation (Menon and Williamson 2018; Kearns et al. 2019), explainability and interpretability algorithms (Du, Liu, and Hu 2019; Bhatt et al. 2020), and factsheets as transparent reporting mechanisms (Arnold et al. 2019; Mitchell et al. 2019); and (3) contributions bringing forth a social justice angle by adapting theories of feminism, decoloniality, and related traditions (Buolamwini and Gebru 2018; Mohamed, Png, and Isaac 2020). Research in the first category, ethical principles for AI, tends not to overlap with research on machine learning methods and algorithms. The second and third categories, technical contributions and social justice perspectives, most certainly do intersect with other machine learning research.

Algorithmic fairness research has two main branches. The first is concerned with allocation decisions like loan approval, pretrial detention judgement, and hiring (Barocas and Selbst 2016). The program of research is to define mathematical notions of fairness, audit existing systems with respect to those notions, and develop bias mitigation algorithms that optimize for those notions while maintaining fidelity to the learning task. The second branch of algorithmic fairness is concerned with representational issues, for example in information retrieval, natural language understanding, and dialogue systems (Blodgett et al. 2020). Here the program of research mainly revolves around defining the problem itself, since there are many forms of unwanted representational bias, ranging from stereotypes encoded into pronouns and occupations, to slurs, offensive language, and hate speech, to poorer understanding of dialects and accents of marginalized groups. In both branches, reasons for machine learning models to exhibit systematic disadvantage towards marginalized groups include prejudice of human annotators who label training data, undersampling of marginalized group members in training data, and subjective biases of data scientists in problem specification and data preparation. Research in both branches can span the spectrum from completely formal applied mathematics to wholly social science with calls for justice, i.e., from the second to the third kind of FAccT research. Regardless of where on the spectrum it falls, algorithmic fairness research tends always to be considered part of the machine learning and society nexus.

Explainable and interpretable machine learning, in which the goal is for a person to understand how a machine learning model makes its decisions, has several methodologies appropriate for different contexts and different personas consuming the explanations (Hind 2019). One use for explainability is to reveal unwanted biases in machine learning models, but doing so is not reliable (Dimanov et al. 2020). To date, the majority of the research has leaned towards the formal and mathematical. Calls to ground explainability in social psychology and cognitive science (Miller 2019) have started to bring a greater social science character to the topic. Nevertheless, many interpretability researchers do not consider their methodological work to have a societal aspect, and their papers are not abundant at FAccT-specific venues.

On the other hand, the framing of efforts to increase the transparency of machine learning lifecycles does incorporate a societal angle. For example, in factsheets—a tool and methodology for transparently reporting information about a machine learning model as it is specified, created, deployed, and monitored—the reported information can include the intended use of the model as well as quantitative test results on accuracy, fairness, and other performance indicators. It is useful to individuals impacted by the machine learning system (especially those from marginalized groups) and to regulators charged with ensuring the system behaves according to laws and societal values.

Whereas FAccT is concerned with preventing societal harms, AI for social good takes the opposite track and uses the technology to benefit society, especially those at the margins (Chui et al. 2018; Varshney and Mojsilović 2019). The working paradigm is to pair data scientists with social change organizations to work towards the 17 Sustainable Development Goals (SDGs) ratified by the member states of the United Nations in 2015, which include 'no poverty,' 'zero hunger,' 'good health and well-being,' 'quality education,' 'gender equality,' and twelve others. The form of this pairing may be data science competitions, weekend volunteer events, longer-term volunteer-based consulting projects, fellowship programs, corporate philanthropy, specialized non-governmental organizations, innovation units within large development organizations, or data scientists employed directly by social change organizations. Some projects require research and some are more application oriented. The ones that require research and whose results are published fall squarely within the intersection of machine learning and society.

3 Researchers' Roles in Knowledge Production: Insider/Outsider Status and Boundary Work

The insider/outsider discussion in the social sciences and humanities addresses the role of the researcher as an insider (i.e., a member of the community being studied) as opposed to an outsider in affecting research approach, relationship with participants, and/or findings. The insider doctrine of Merton (1972) highlights the insider's exclusive access (the strong version) or privileged access (the weaker version) to knowledge and the outsider's exclusion from it. Researchers are considered as insiders or outsiders based on their ascribed status (e.g., gender, race, nationality, cultural or religious background) or group membership. The strong version asserts that the insider and the outsider cannot arrive at the same findings even when they examine the same problems; the weaker version argues that insider and outsider researchers would focus on different research questions. The combined version argues that the researcher needs to be an insider in order to understand the community and also to know what is worth understanding or examining about the community (Merton 1972).

However, structurally speaking, it is hard to completely distinguish the insider from the outsider because we all occupy a combination of different statuses, including sex, age, class, race, occupation, and so on. The insider knowledge that is accessible only to individuals who occupy a highly complex set of statuses is limited to a very small group, and this way of knowledge production and sharing is not sustainable. Similarly, social scientists like Karl Marx recognize the value of political, legal, and philosophical theories in economics. Another limitation of the insider doctrine is that it takes a static perspective and does not recognize that our statuses and life experience evolve over time, which shifts our status as an insider or an outsider. In the meantime, the outsider, while not being able to completely transcend existing beliefs and social problems, has the advantages of using less bias in examining social issues and bringing new perspectives to solving issues taken for granted by insiders. The interaction of insiders and outsiders makes intellectual exchange possible, and Merton argues that we could integrate both sides in the process of seeking truth.

Extending Merton's and other scholars' thoughts on the insider/outsider debate, Griffith (1998) also believes that the researcher occupies a particular social location, and her knowledge is situated in particular sets of social relations.
However, the insider status is just the beginning but not the end of the research process. Reflecting on her own research experience in mothering work for schooling, she and her collaborator, who were both single mothers, started as insiders (mothers). However, they had to cross social and conceptual boundaries to include only mothers from two-parent families (and thus become outsiders in the research process), as the two-parent family is the ideological norm perceived by the schools and society. In other words, researchers are rarely insiders or outsiders but oftentimes insiders and outsiders at the same time, and research is constructed between the researcher and many Others.

Dwyer and Buckle (2009) argue that both the insider and the outsider statuses have pros and cons, so what is important is not the insider or the outsider status but "an ability to be open, authentic, honest, deeply interested in the experience of one's research participants, and committed to accurately and adequately representing their experience." In fact, researchers occupy the 'space between.' Challenging the dichotomy and the static nature of insider versus outsider status, the 'space between' recognizes the evolving nature of the researcher's life experience and knowledge on the research topic as well as her relationship to participants.

When there are insiders and outsiders in scientific research, there is also a boundary between them. Specifically, the boundary delineates what is considered 'science' and what is considered 'non-science' in a particular subfield. Boundary work attempts to shape or disrupt the boundary of what is considered valid knowledge (Gieryn 1983, 1999). Research reveals two types of boundary work: symbolic and social boundaries. Symbolic boundaries are formed when members agree on the meaning and definition of the field and obtain a collective identity. Social boundaries enable members' access to material and non-material resources (e.g., status, legitimacy, and visibility) (Lamont and Molnár 2002; Grodal 2018).

For example, Grodal (2018) details how core communities who entered the nanotechnology field early expanded the boundaries of the field by enlarging the definition of the field and associating new members. Peripheral communities, including service providers, entrepreneurs, and university scientists, self-claimed membership during the expansion phase due to newly available material and cultural resources. Later on, while some peripheral communities continued to associate themselves with nanotechnology, the core communities, realizing their collective identity being threatened and resources being restricted because of the enlarged symbolic boundaries, contracted boundaries by restricting the definition. Also, some peripheral communities, not identifying strongly with the more restrictive collective identity, self-disassociated and focused on other fields of interest. In this process, the insiders or the core communities entered the field earlier and had a vested interest to protect, while the outsiders or the peripheral communities entered the field later and had a weak association with the field. The insiders had more power than the outsiders in defining the boundaries of the field and making certain types of work and research legitimate.

4 Participation in Computer Science: The Outsider Status

Participation of Women

In science, women have been at the "Outer Circle" for a long time. Historically, women faced multiple barriers in entering a scientific career, and even those who were able to become a scientist were not allowed into the inner circles of the emerging scientific community (Zuckerman, Cole, and Bruer 1991). While women's representation, experience, and advancement in science have increased over time, many of them continue to face barriers, especially at the cultural and structural levels (Zuckerman and Cole 1975; Rosser 2004; Hill, Corbett, and St Rose 2010; National Research Council of the National Academies 2010; Ceci et al. 2014). This is especially true in computer science (CS), where, unlike other scientific fields, women's participation has been consistently low, with some fluctuations. Hayes (2010) records the changes of women's representation in CS over multiple decades: women represented 11% of all CS bachelor's degree recipients in 1967; this percentage peaked at 37% in 1984 and then declined to only 20% in 2006. For comparison, women represented 44% of all bachelor's degree recipients in 1966 and 58% in 2006, and other STEM fields also witnessed steady increases in this period. Despite the rapid growth of the computer and mathematical science workforce, women's proportion declined from 31% in 1993 to 27% in 2017. However, the silver lining is that among workers with a doctoral degree in these occupations, women's share increased from 16% in 1993 to 31% in 2017 (National Academies of Sciences, Engineering, and Medicine 2020).

Multiple factors that oftentimes reinforce each other contribute to women's low representation relative to men's in CS at different life stages. Earlier research reports individual factors, such as a lack of early exposure to and experience with computing, women students' inaccurate perceptions of their low quantitative abilities, and a lack of self-confidence despite their good performance and computer knowledge level. Other research has also focused on social, cultural, and structural factors, which are much harder to change. For instance, women's perceptions of their abilities and the field of computing could be affected by the 'chilly classroom' with male students' unfriendly reactions and professors' lack of attention to them; a lack of role models and mentoring; stereotypes against women and against the people, work involved, and values of CS; and the perceived mismatch of women's career orientation to help people and society and what they think CS could offer. Combating these barriers could increase women's representation in CS or lower their attrition from CS (Gürer and Camp 2002; Beyer, Rynes, and Haller 2004; Beyer and DeKeuster 2006; Cohoon 2006; Kim, Fann, and Misa-Escalante 2011; Cheryan, Master, and Meltzoff 2015; Lehman, Sax, and Zimmerman 2016; Cheryan et al. 2017).

Policy recommendations and college intervention programs have been made and established to change the cultural and institutional environment in order to recruit and retain more women students and professionals in CS. Some of the recommendations were repeatedly made in different time periods, reflecting a reluctance to change over time. They include involving women students in research both at the undergraduate level and early in their graduate study, actively countering stereotypes and misperceptions of CS, and highlighting and showing women students the positive social impact that scientists can make and the diverse group of scientists making social impacts in their fields (Cuny and Aspray 2002; National Academies of Sciences, Engineering, and Medicine 2020). Successful college intervention programs in increasing the number and percentage of women CS students and their sense of belonging all tackled the culture of CS and the institution instead of changing the (women) students. These efforts changed the stereotypes of CS by creating introductory CS courses to be inclusive of a diverse student body, providing role models and mentoring to women students, providing research experience, and exposing students to a wide range of applications of CS in solving societal issues (Roberts, Kassianidou, and Irani 2002; Muller 2003; Wright et al. 2019; Frieze and Quesenberry 2019; National Academies of Sciences, Engineering, and Medicine 2020).

Participation of Racial and Ethnic Minorities

In addition to gender, race/ethnicity also shapes scientists' representation and experience in science as well as their outsider and insider statuses. Among racial/ethnic minorities in a United States context, while Asians tend to be overrepresented in science, the other groups (blacks, Hispanics, and American Indians or Alaska natives) are considered underrepresented minorities (URMs) due to their low representation in scientific fields, despite their growth over time. For instance, URMs made up 9% of workers in computer science and mathematics occupations in 2003, and this percentage increased to 13% in 2017 (Khan, Robbins, and Okrent 2020). While their participation increased over time, it was still lower than their representation in the general population, confirming their persistent "outsider" status in science. Similar trends hold in a world context, with Asians and whites overrepresented compared to black, Hispanic, and indigenous people.

Research on race and science finds that racial/ethnic minorities, especially URMs, tend to be less likely to publish their research, receive research grants, get recognition for their work, and get promoted, but more likely to work in institutions with fewer resources and more likely to be marginalized in formal and informal scientific communities than their white counterparts. Research also reveals some improvement in their representation in scientific fields as well as in their career experience and outcomes over time, but the progress is slow relative to the growth of the scientific workforce (Pearson 1985, 2005; Ginther et al. 2011; Ginther 2018; Tao and McNeely 2019). In the meantime, an increasing number of studies employ intersectionality as the research framework that indicates power relations and social inequalities to examine the double disadvantages that minority women scientists suffer from due to both their gender and race, e.g., (Malcom, Hall, and Brown 1976; Malcom and Malcom 2011; Collins 2015; Metcalf, Russell, and Hill 2018). While minority women scientists of different racial/ethnic groups differ from each other in their career experience and outcomes, they all tend to fare less well than comparable white women as well as men of the same racial/ethnic group (Malcom, Hall, and Brown 1976; Malcom and Malcom 2011; Pearson 1985; Ong et al. 2011; Tao 2018; Tao and McNeely 2019), revealing the persistent intersectional effect of race and gender.

Status and Career/Research Focus

Broadly speaking, women² tend to be more engaged than their male peers in relatively new, interdisciplinary scientific fields (e.g., environmental studies) that are oftentimes more contextual and problem-based than traditional fields, may not have an existing gender hierarchy, and are not well-embedded in the structure of academia or knowledge production, providing more opportunities for women to build the discipline (Rhoten and Pfirman 2007). While some women shy away from technical fields because they do not see the social engagement of these fields, e.g., (Carter 2006), those who choose technical fields do so not only because of the excitement of solving technical problems, but also the potential of addressing issues concerning them and positively impacting people's lives, which is consistent with their interpersonal and career orientations (Silbey 2016; Bossart and Bharti 2017). Women CS majors choose computing in the context of what they could do for the world with computing—they would like to use the computer in the broader context of education, medicine, music, communication, healthcare, environmental studies, crime prevention, etc. While they also enjoy exploring the computer, the main factor reported by men, women are more likely than their male peers to address the broader social context (Fisher, Margolis, and Miller 1997; Carter 2006; Hoffman and Friedman 2018). While few women in AI were at the "outer circle" in its initial stage, they were attracted to it when it started to develop in the 1980s and 1990s because it was more cognitive than other areas of CS and there were fewer existing stereotypes to fight against (Strok 1992). The intersection of machine learning and society makes careers in machine learning meaningful to them (Hoffman and Friedman 2018).

²In this part of the paper and later, we discuss women more than Hispanics and blacks, not because of differing experiences, but because of a dearth of published literature on the analogous experience (Spertus 1991).

5 Empirical Study of Knowledge Production

As discussed in Section 4, women and URMs tend to select fields in which they perceive they can help people and society. The intersection of machine learning and society provides exactly that opportunity to make social impact. Therefore, we hypothesize that women and URMs are more likely to contribute to research in machine learning and society rather than machine learning without a direct societal component.

We performed the following analysis to test our hypothesis. On September 19, 2020, we downloaded the full col-
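The grouping of authors into the three categories used in the tables that follow ('no cs.cy', 'both', 'only cs.cy') can be sketched as below. This is an illustrative reconstruction, not the authors' released code: the author names and paper category sets are hypothetical, and the reading that 'both' means an author with papers of both kinds is an assumption.

```python
# Illustrative sketch: bucket arXiv authors into the three categories used in
# the paper's tables, based on whether each of their machine learning papers
# carries the cs.CY (Computers and Society) category. Data are hypothetical.

def bucket_author(paper_categories):
    """paper_categories: one set of arXiv categories per paper by the author.
    Returns 'no cs.cy', 'only cs.cy', or 'both'."""
    has_cy = any('cs.CY' in cats for cats in paper_categories)
    has_non_cy = any('cs.CY' not in cats for cats in paper_categories)
    if has_cy and has_non_cy:
        return 'both'
    return 'only cs.cy' if has_cy else 'no cs.cy'

# Hypothetical authors and the category sets of their papers
authors = {
    'A': [{'cs.LG', 'stat.ML'}, {'cs.LG'}],  # no societal papers
    'B': [{'cs.LG', 'cs.CY'}, {'cs.CV'}],    # both kinds of papers
    'C': [{'cs.CY'}],                        # only societal papers
}

buckets = {name: bucket_author(papers) for name, papers in authors.items()}
print(buckets)  # {'A': 'no cs.cy', 'B': 'both', 'C': 'only cs.cy'}
```

Each bucketed author then contributes their soft (probabilistic) gender and race/ethnicity scores to the category averages reported in the tables.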
Table 1: Average soft classification score of race/ethnicity and gender for three categories of authors.

              Asian     Hispanic   Black     White     Male
  no cs.cy    0.370     0.077      0.057     0.497     0.791
  both        0.367     0.073      0.055     0.504     0.777
  only cs.cy  0.266     0.097      0.071     0.566     0.726
  slope      -0.0430    0.0077     0.0055    0.0298   -0.0293
  p-value     0         0.0062     0.0241
Table 2: Average soft classification score of race/ethnicity among estimated males for three categories of authors.

              Asian     Hispanic   Black     White
  no cs.cy    0.335     0.078      0.059     0.528
  both        0.343     0.076      0.057     0.525
  only cs.cy  0.247     0.097      0.073     0.583
  slope      -0.0345    0.0072     0.0051    0.0223
  p-value

Table 4: Average soft classification score of gender among estimated Asians for three categories of authors.

              Male
  no cs.cy    0.738
  both        0.732
  only cs.cy  0.696
  slope      -0.0183
  p-value
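The 'slope' rows in these tables can be read as the trend of the average soft classification score across the ordered author categories (no cs.cy = 0, both = 1, only cs.cy = 2). A minimal ordinary-least-squares sketch over the three category means is below; this is an assumption about the computation, since the reported slopes and p-values are presumably estimated over individual authors, so a three-point fit will not reproduce the reported values exactly.

```python
# Minimal sketch: least-squares slope of the average soft score across the
# ordered author categories (no cs.cy = 0, both = 1, only cs.cy = 2).
# Assumption: the paper's reported slope is presumably fit over individual
# authors; this version over the three means only illustrates the trend.

def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# 'Male' column of Table 1: means for no cs.cy, both, only cs.cy
male_means = [0.791, 0.777, 0.726]
print(f"{ls_slope([0, 1, 2], male_means):.4f}")  # -0.0325
```

The fitted value has the same sign as the reported slope of -0.0293, consistent with male scores declining as author categories move toward cs.CY-only work.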
Table 6: Average soft classification score of gender among estimated blacks for three categories of authors.

              Male
  no cs.cy    0.825
  both        0.871
  only cs.cy  0.721
  slope      -0.0345
  p-value     0

Table 7: Average soft classification score of gender among estimated whites for three categories of authors.

              Male
  no cs.cy    0.822
  both        0.801
  only cs.cy  0.739
  slope      -0.0375
  p-value     0

As a result, it is hard to get into the inner circle of the community. However, through the empirical study of Section 5, we know that women and URMs are overrepresented in research on machine learning and society as compared to plain machine learning research.

Analysis of the Current State

Despite being at the center of building the field of machine learning and society research, women's (and URMs') experience in the workplace reflects their overall struggles in society. Similar to women in some other scientific fields, women computer scientists tend to be more subject to stereotyping, less likely to be full professors or in senior research and technical positions, less recognized for their work and paid less, more likely to be subject to overt discrimination and harassment, more likely to face pressures in balancing work and life, and more likely to be marginalized than their male peers (Strok 1992; Simard and Gilmartin 2010; Rosser 2004; Tao 2016; Fox and Kline 2016; Khan, Robbins, and Okrent 2020). These "outsider" disadvantages provide them with the insider perspective when conducting algorithmic fairness and other socially-oriented machine learning research. In fact, women (and URM) scientists' lived experience and consequent insider status place them in a unique position to formulate questions and conduct research at the intersection of machine learning and society.

The finding that women and underrepresented minorities are more likely to work on machine learning and society research should not be interpreted as meaning that all insiders conduct only machine learning research without the social aspect and all outsiders conduct only machine learning and society research. However, this finding is consistent with literature that reveals women's and underrepresented minorities' preference for conducting and applying research in a broader context—one that goes beyond the technical. As insiders of social inequality, they bring their lived experience and the new perspective into a field where they have been outsiders. Now in the late 2010s and early 2020s, machine learning and society represents an area of AI that is relatively new, interdisciplinary, not well-embedded in the structure of academia, and without existing hierarchies, and thus one with an opportunity for women and URMs to build, which they are doing.

While our empirical analysis finds that women of different racial/ethnic groups tend to behave more similarly to each other than to their male counterparts, we also find that the women's groups differ from each other, confirming the intersectionality perspective and that some groups are not purely insiders or purely outsiders. We would like to highlight Asian women, who are in a unique position in machine learning (or science as a whole) because Asians are overrepresented but women are underrepresented in science. Asian men tend to behave similarly to their white counterparts in their career outcomes, but Asian women tend to behave more like other women's groups, making the gender gap among Asians greater than that among some other racial/ethnic groups (Tao 2015, 2018; Tao and McNeely 2019). Being insiders in machine learning on the one hand (Asians) and being outsiders on the other hand (women) could possibly constrain some of their choices because they may receive inconsistent expectations and experience multilevel barriers. In the meantime, the Asian and Asian American cultures tend to emphasize technical expertise and the instrumental value of education to fight their marginal status and to achieve upward social mobility in American society (Xie and Goyette 2003; Min and Jang 2015). The emphasis on technical aspects and some structural barriers they experience in their careers may suggest that Asian women pursue machine learning occupations due to the technical and financial aspects more than the social impact of such occupations that is more likely to be highlighted by other women's groups.

In addition, the findings reveal complicated issues of power and inequality in the ML community, which reflect societal inequality. Both at the personal level, e.g., in terms of exposure to and experience with computing at an early age, and at the cultural and structural levels, e.g., in terms of experiences in computing classes and the workplace, statuses (e.g., as women or racial/ethnic minorities) affect our lived experience and opportunities to pursue a career in science. When entering science, our lived experience could impact our research focus. While women and URMs are not outsiders to ML in the sense of being less technically competent, they are outsiders as historically underrepresented groups that have not been successful in penetrating into the inner circle. As a result, they are not in a position of power but are disadvantaged in various ways. In the meantime, being insiders to the experience of inequality, they use their technical expertise to address and provide solutions to persistent social inequality. In this sense, they are empowering not only themselves as underrepresented groups but also the ML community by raising the awareness and impact of ML research with social implications.

Epistemic Conflict

Outsiders' entrance into the field could be shaped by existing barriers and policed by the insiders. Once outsiders
enter a field, they have another challenge of making legiti- it may not happen soon and there may be backlashes. mate the research that they prefer but somehow diverge from the mainstream. Although research driven by lived experi- 7 Case Study ence (including the third category of FAccT research that Let us see if our boundary work predictions from Section brings in feminist, post-colonial, and other related critical- 6 hold in a specific case study. In June 2020, the soft- theoretic thoughts) may be celebrated within the intersec- ware for an image super-resolution algorithm (Menon et al. tion of machine learning and society, it is questioned out- 2020) was posted on GitHub and soon discovered to alter side of the intersection on epistemic grounds. According the perceived race/ethnicity of individuals whose downsam- to Haraway (1988), knowledge is situated and embodied in pled face images were presented as input (Johnson 2020a; specific locations and bodies, and the multidimensional and Kurenkov 2020; Vincent 2020). Examples of input black, multifaceted views and voices, from both those in power and Hispanic, and Asian face images yielded white-looking re- those with limited voices, combine to make science. Never- sults. About these results, Facebook machine learning re- theless, despite scholarship supporting lived experience not searcher Yann LeCun commented on Twitter: “ML systems being in conflict with scientific objectivity, the common re- are biased when data is biased. This face upsampling system frain summarized by the feminist and postcolonial episte- makes everyone look white because the network was pre- mologist Sandra Harding is as follows: “‘Real sciences’ are trained on FlickFaceHQ, which mainly contains white peo- supposed to be transparent to the world they represent, to ple pics. Train the *exact* same system on a dataset from be value neutral. 
They are supposed to add no political, so- Senegal, and everyone will look African.” cial, or cultural features to the representations of the world In response, Google machine learning researcher Timnit they produce.” In other words, ways of knowing that do not Gebru pointed to the video of her recently completed tutorial follow the (Western) scientific method are not seen by prac- (with Emily Denton) Fairness Accountability Transparency titioners as scientific (Harding 2006), despite scholarly crit- and Ethics in Computer Vision with the comment: “Yann, icisms to this perspective. The implication in the context of I suggest you watch me and Emily’s tutorial or a number machine learning and society is that critical-theoretic work of scholars who are experts in this are. You can’t just re- based on lived experience as the source of knowledge will duce harms to dataset bias. For once listen to us people from be discounted in mainstream machine learning: insider re- marginalized communities and what we tell you. If not now search by outsiders is precarious. during worldwide protests not sure when.” She also posted: “I’m sick of this framing. Tired of it. Many people have tried Boundary Work Predictions to explain, many scholars. Listen to us. You can’t just reduce Based on the sociology theory, we may predict two possible harms caused by ML to dataset bias.” A back and forth de- futures, both involving boundary work. The first is a sever- bate ensued on Twitter with many interlocutors taking sides ing of the connection between mainstream machine learn- and offering inputs. ing research and societally-relevant applications and gover- Let us analyze what happened using the insider/outsider nance, i.e., the expulsion of machine learning and society understanding of research on machine learning and society from mainstream machine learning. The second is the ex- that we have developed in this paper. 
LeCun is a white male, pansion of machine learning research to include knowledge Chief AI Scientist at Facebook, and Turing award winner— from lived experience, while overcoming tendencies for ex- a person likely without lived experience of marginalization pulsion and the protection of autonomy that many insiders and a clear insider in mainstream machine learning research. of machine learning research may have. Gebru is a black female, co-lead of the Ethical Artificial In- The future that emerges among the two possibilities could telligence Team at Google at the time—a person with lived depend on what insiders perceive as legitimate, as in the experience of marginalization and thus an insider in algo- case of nanotechnology. (The insiders hold epistemic au- rithmic fairness research, but an outsider in machine learn- thority both due to their entrenched status and the power ing overall. that comes from their identity (race/ethnicity, gender, etc.) Although Gebru et al. (2019) say: “Of particular concern (Pereira 2012, 2019).) In addition, another factor may shape are recent examples showing that machine learning mod- the future of machine learning and society research: sus- els can reproduce or amplify unwanted societal biases re- tainability of the ML field as a thriving site of research to flected in datasets,” which is consistent with LeCun’s argu- continuously attract the next generation of scholars, includ- ment, Gebru’s comments in the debate point to her holding a ing women and underrepresented minorities. When the out- stance consistent with Merton’s insider doctrine of lived ex- siders conduct their machine learning and society research, perience providing privilege (bordering on exclusivity) for they raise awareness of the broader context of technical is- conducting research on machine learning bias. Additionally, sues. 
While machine learning and society research is still her epistemic perspective appears to be that such lived ex- a small portion of machine learning research overall, it has perience is a valid source of knowledge for “many schol- been growing. Led by “the outsiders,” this line of inquiry is ars.” The repeated call to listen to scholars is an attempt at increasingly being addressed and published. Based on this expansion boundary work. On the other hand, LeCun’s per- trend and considering that the field could benefit from both spective epitomizes a boundary and epistemic authority that insiders and outsiders’ perspectives, we have reasons to be- leaves lived experience out of machine learning research; lieve that machine learning and society research will trans- scientifically-derived knowledge is valid knowledge. At the form and sustain ML knowledge and practice, even though end, some white males in positions of power also joined
the debate and offered allyship, which may have expanded tension over the boundaries of valid knowledge in machine the epistemic boundary of machine learning just a little bit. learning and society. Specifically, the epistemic question that What may have appeared at first glance to be a personal war arises is whether lived experience is a valid source of knowl- of words was in fact an example of boundary work in prac- edge. Instances of expansion and expulsion boundary work tice, manifested as contention between two insiders of their are predicted and verified in a case study. own respective domains. If one takes the normative stance that expansion bound- After we completed the first draft of this paper in Oc- ary work is preferable to expulsion, then the resolution of tober 2020, there was further contention involving Timnit the epistemic contention calls for facilitation by researchers Gebru in December 2020 (Johnson 2020b). She was dis- with ascribed status that is not marginalized and who lean missed from her research position at Google by Jeff Dean towards including knowledge derived from lived experience and Megan Kacholia, who are both white and in positions within the boundary of machine learning research. of power. The dismissal was widely argued in a public man- This paper illuminates a few avenues for future research. ner. Among others, one of the factors was Gebru’s reluctance One such direction is to dive deeper into the topics of situ- to remove Google authors from or withdraw the paper “On ated knowledge, feminist epistemology, and boundary work the Dangers of Stochastic Parrots: Can Language Models to better understand how the field of machine learning and Be Too Big? a ” (Bender et al. 2021) from the ACM FAccT society may evolve and to understand strategies for directing Conference. One of the main parts of this paper conducts a that evolution in a beneficial way. 
Another future direction is critical analysis of large language models through the lens to study the impact of papers in machine learning and soci- of decolonizing hegemonic views (Srigley and Sutherland ety produced by teams containing only members with lived 2018), which is a prototypical example of the third, social experience with marginalization, containing only members justice, angle to FAccT research mentioned in Section 2. without lived experience with marginalization, and contain- Dean and Kacholia’s criticism of the paper was the inclusion ing both types of researchers, to understand whether work of the decolonial perspective at the expense of a technical- that bridges the epistemic divide of formal science and crit- only analysis that would include a discussion of techniques ical theory is more valuable than other pieces of work. to mitigate representation bias, which correspond to the sec- ond kind of FAccT research mentioned in Section 2. The first 9 Acknowledgments author of the paper, Emily Bender, posted on Twitter: “The The authors thank Delia R. Setola for her assistance and Tina claim that this kind of scholarship is ‘political’ and ‘non- M. Kim for her comments. scientific’ is precisely the kind of gate-keeping move set up to maintain ‘science’ as the domain of people of privilege only.” This case is also one of epistemology and further il- References lustrates how boundary work can engender extreme conflict Arnold, M.; Bellamy, R. K. E.; Hind, M.; Houde, S.; Mehta, and the tendency for expulsion boundary work. S.; Mojsilović, A.; Nair, R.; Natesan Ramamurthy, K.; Together, these cases illustrate a little bit of expansion Olteanu, A.; Piorkowski, D.; Tsay, J.; and Varshney, K. R. boundary work and a healthy dose of expulsion boundary 2019. FactSheets: Increasing Trust in AI Services Through work, both as predicted by the theory of insiders and out- Supplier’s Declarations of Conformity. IBM Journal of Re- siders. 
search and Development 63(4/5): 6. Barocas, S.; and Selbst, A. D. 2016. Big Data’s Disparate 8 Conclusion Impact. California Law Review 104: 671. In this paper, we have studied the sociology of researchers Bender, E.; Gebru, T.; McMillan-Major, A.; and Shmitchell, creating new knowledge in the area of machine learning. We S. 2021. On the Dangers of Stochastic Parrots: Can Lan- have analyzed the intersection of general machine learning research with FAccT and AI for social good—collectively guage Models Be Too Big? a . In Proceedings of the ACM machine learning and society—using Merton’s concepts of Conference on Fairness, Accountability, and Transparency. insiders and outsiders. Although these concepts are usually Beyer, S.; and DeKeuster, M. 2006. Women in Computer only applied in studying the social sciences and humanities Science or Management Information Systems Courses: A rather than the natural sciences and formal sciences, there is Comparative Analysis. In Cohoon, J.; and Aspray, W., eds., a clear insider in terms of lived experience in machine learn- Women and Information Technology: Research on Under- ing and society: a member of a marginalized group. Just as representation. MIT Press. importantly, these same marginalized groups have low his- Beyer, S.; Rynes, K.; and Haller, S. 2004. Deterrents to torical participation and representation in computer science, Women Taking Computer Science Courses. IEEE Technol- the parent field of machine learning and are thus outsiders ogy and Society Magazine 23(1): 21–28. in a different way. Through an empirical study, we find that researchers from marginalized groups are overrepresented in Bhatt, U.; Xiang, A.; Sharma, S.; Weller, A.; Taly, A.; Jia, conducting research on machine learning and society. There- Y.; Ghosh, J.; Puri, R.; Moura, J. M.; and Eckersley, P. 2020. fore, we have a situation in which the same group takes the Explainable Machine Learning in Deployment. 
In Proceed- insider role in terms of lived experience and the outsider role ings of the ACM Conference on Fairness, Accountability, in terms of historical participation. This situation leads to and Transparency, 648–657.
Blodgett, S. L.; Barocas, S.; Daumé, III, H.; and Wallach, Dwyer, S. C.; and Buckle, J. L. 2009. The Space Between: H. 2020. Language (Technology) is Power: A Critical Sur- On Being an Insider-Outsider in Qualitative Research. In- vey of “Bias” in NLP. In Proceedings of the 58th Annual ternational Journal of Qualitative Methods 8(1): 54–63. Meeting of the Association for Computational Linguistics, Fisher, A.; Margolis, J.; and Miller, F. 1997. Undergraduate 5454–5476. Women in Computer Science: Experience, Motivation and Bossart, J.; and Bharti, N. 2017. Women in Engineering: In- Culture. ACM SIGCSE Bulletin 29(1): 106–110. sight into Why Some Engineering Departments Have More Success in Recruiting and Graduating Women. American Fox, M. F.; and Kline, K. 2016. Women Faculty in Comput- Journal of Engineering Education 8(2): 127–140. ing: A Key Case of Women in Science. In Branch, E. H., ed., Pathways, Potholes, and the Persistence of Women in Sci- Buolamwini, J.; and Gebru, T. 2018. Gender Shades: Inter- ence: Reconsidering the Pipeline, 41–55. Lexington Books. sectional Accuracy Disparities in Commercial Gender Clas- sification. In Proceedings of the Conference on Fairness, Frieze, C.; and Quesenberry, J. L. 2019. How Computer Accountability, and Transparency, 77–91. Science at CMU is Attracting and Retaining Women. Com- munications of the ACM 62(2): 23–26. Carter, L. 2006. Why Students With An Apparent Aptitude for Computer Science Don’t Choose to Major in Computer Gebru, T.; Morgenstern, J.; Vecchione, B.; Vaughan, J. W.; Science. ACM SIGCSE Bulletin 38(1): 27–31. Wallach, H.; Daumé, III, H.; and Crawford, K. 2019. Ceci, S. J.; Ginther, D. K.; Kahn, S.; and Williams, W. M. Datasheets for Datasets. arXiv preprint arXiv:1803.09010v4 2014. Women in Academic Science: A Changing Land- . scape. Psychological Science in the Public Interest 15(3): Gieryn, T. F. 1983. Boundary-Work and the Demarcation 75–141. 
of Science from Non-Science: Strains and Interests in Pro- Chang, H.-C. H.; and Fu, F. 2020. Elitism in Mathematics fessional Ideologies of Scientists. American Sociological and Inequality. arXiv preprint arXiv:2002.07789 . Review 48(6): 781–795. Chen, J.; Kallus, N.; Mao, X.; Svacha, G.; and Udell, M. Gieryn, T. F. 1999. Cultural Boundaries of Science: Credi- 2019. Fairness under Unawareness: Assessing Disparity bility on the Line. University of Chicago Press. When Protected Class Is Unobserved. In Proceedings of the Ginther, D. K. 2018. Using Data to Inform the Science ACM Conference on Fairness, Accountability, and Trans- of Broadening Participation. American Behavioral Scientist parency, 339–348. 62(5): 612–624. Cheryan, S.; Master, A.; and Meltzoff, A. N. 2015. Cul- Ginther, D. K.; Schaffer, W. T.; Schnell, J.; Masimore, B.; tural Stereotypes as Gatekeepers: Increasing Girls’ Inter- Liu, F.; Haak, L. L.; and Kington, R. 2011. Race, Ethnicity, est in Computer Science and Engineering by Diversifying and NIH Research Awards. Science 333(6045): 1015–1019. Stereotypes. Frontiers in Psychology 6: 49. Cheryan, S.; Ziegler, S. A.; Montoya, A. K.; and Jiang, L. Griffith, A. I. 1998. Insider/Outsider: Epistemological Priv- 2017. Why Are Some STEM Fields More Gender Balanced ilege and Mothering Work. Human Studies 21(4): 361–376. than Others? Psychological Bulletin 143(1): 1. Grodal, S. 2018. Field Expansion and Contraction: How Chui, M.; Harryson, M.; Manyika, J.; Roberts, R.; Chung, Communities Shape Social and Symbolic Boundaries. Ad- R.; van Heteren, A.; and Nel, P. 2018. Applying AI for So- ministrative Science Quarterly 63(4): 783–818. cial Good. Technical report, McKinsey & Company. Gürer, D.; and Camp, T. 2002. An ACM-W Literature Re- Cohoon, J. M. 2006. Just Get Over It or Just Get On With It: view on Women in Computing. ACM SIGCSE Bulletin Retaining Women in Undergraduate Computing. In Cohoon, 34(2): 121–127. 
J.; and Aspray, W., eds., Women and Information Technol- Haraway, D. 1988. Situated Knowledges: The science Ques- ogy: Research on Underrepresentation. MIT Press. tion in Feminism and the Privilege of Partial Perspective. Collins, P. H. 2015. Intersectionality’s Definitional Dilem- Feminist Studies 14(3): 575–599. mas. Annual Review of Sociology 41: 1–20. Harding, S. 2006. Science and Social Inequality: Feminist Cuny, J.; and Aspray, W. 2002. Recruitment and Retention and Postcolonial Issues. University of Illinois Press. of Women Graduate Students in Computer Science and En- gineering: Results of a Workshop Organized by the Com- Hart, K. L.; and Perlis, R. H. 2019. Trends in Proportion of puting Research Association. ACM SIGCSE Bulletin 34(2): Women as Authors of Medical Journal Articles, 2008–2018. 168–174. JAMA Internal Medicine 179(9): 1285–1287. Dimanov, B.; Bhatt, U.; Jamnik, M.; and Weller, A. 2020. Hayes, C. C. 2010. Computer Science: The Incredible You Shouldn’t Trust Me: Learning Models Which Conceal Shrinking Woman. In Misa, T. J., ed., Gender Codes: Why Unfairness From Multiple Explanation Methods. In AAAI Women Are Leaving Computing, 25–49. John Wiley & Sons. Workshop on Artificial Intelligence Safety, 63–73. Hayfield, N.; and Huxley, C. 2015. Insider and Outsider Per- Du, M.; Liu, N.; and Hu, X. 2019. Techniques for Inter- spectives: Reflections on Researcher Identities in Research pretable Machine Learning. Communications of the ACM with Lesbian and Bisexual Women. Qualitative Research in 63(1): 68–77. Psychology 12(2): 91–106.
Hill, C.; Corbett, C.; and St Rose, A. 2010. Why So Few? Malcom, S. M.; Hall, P. Q.; and Brown, J. W. 1976. The Women in Science, Technology, Engineering, and Mathemat- Double Bind: The Price of Being a Minority Woman in Sci- ics. AAUW. ence. Report of a Conference of Minority Women Scientists, Hind, M. 2019. Explaining Explainable AI. ACM XRDS Arlie House, Warrenton, Virginia. Technical Report 76-R-3, Magazine 25(3): 16–19. American Association for the Advancement of Science. Hoffman, S. F.; and Friedman, H. H. 2018. Machine Learn- Menon, A. K.; and Williamson, R. C. 2018. The Cost of ing and Meaningful Careers: Increasing the Number of Fairness in Binary Classification. In Proceedings of the Con- Women in STEM. Journal of Research in Gender Studies ference on Fairness, Accountability and Transparency, 107– 8(1): 11–27. 118. Menon, S.; Damian, A.; Hu, S.; Ravi, N.; and Rudin, C. Hofstra, B.; Kulkarni, V. V.; Galvez, S. M.-N.; He, B.; Ju- 2020. PULSE: Self-Supervised Photo Upsampling via La- rafsky, D.; and McFarland, D. A. 2020. The Diversity– tent Space Exploration of Generative Models. In Proceed- Innovation Paradox in Science. Proceedings of the National ings of the IEEE/CVF Conference on Computer Vision and Academy of Sciences 117(17): 9284–9291. Pattern Recognition, 2437–2445. Jobin, A.; Ienca, M.; and Vayena, E. 2019. The Global Land- Mercer, J. 2007. The Challenges of Insider Research in Ed- scape of AI Ethics Guidelines. Nature Machine Intelligence ucational Institutions: Wielding a Double-Edged Sword and 1(9): 389–399. Resolving Delicate Dilemmas. Oxford Review of Education Johnson, K. 2020a. AI Weekly: A Deep Learning Pioneer’s 33(1): 1–17. Teachable Moment on AI Bias. VentureBeat . Merton, R. K. 1972. Insiders and Outsiders: A Chapter in the Johnson, K. 2020b. Google AI Ethics Co-Lead Timnit Ge- Sociology of Knowledge. American Journal of Sociology bru Says She Was Fired over an Email. VentureBeat . 78(1): 9–47. 
Kearns, M.; Neel, S.; Roth, A.; and Wu, Z. S. 2019. An Em- Metcalf, H.; Russell, D.; and Hill, C. 2018. Broadening the pirical Study of Rich Subgroup Fairness for Machine Learn- Science of Broadening Participation in STEM through Criti- ing. In Proceedings of the ACM Conference on Fairness, cal Mixed Methodologies and Intersectionality Frameworks. Accountability, and Transparency, 100–109. American Behavioral Scientist 62(5): 580–599. Kerstetter, K. 2012. Insider, Outsider, or Somewhere Miller, T. 2019. Explanation in Artificial Intelligence: In- Between: The Impact of Researchers’ Identities on the sights from the Social Sciences. Artificial Intelligence 267: Community-Based Research Process. Journal of Rural So- 1–38. cial Sciences 27(2): 7. Min, P. G.; and Jang, S. H. 2015. The Concentration of Khan, B.; Robbins, C.; and Okrent, A. 2020. The State of Asian Americans in STEM and Health-Care Occupations: US Science and Engineering 2020. National Science Foun- An Intergenerational Comparison. Ethnic and Racial Stud- dation, January 15. ies 38(6): 841–859. Kim, K. A.; Fann, A. J.; and Misa-Escalante, K. O. 2011. Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, Engaging Women in Computer Science and Engineering: L.; Hutchinson, B.; Spitzer, E.; Raji, I. D.; and Gebru, T. Promising Practices for Promoting Gender Equity in Un- 2019. Model Cards for Model Reporting. In Proceedings dergraduate Research Experiences. ACM Transactions on of the ACM Conference on Fairness, Accountability, and Computing Education 11(2): 1–19. Transparency, 220–229. Kind, C. 2020. The Term ‘Ethical AI’ is Finally Starting to Mohamed, S.; Png, M.-T.; and Isaac, W. 2020. Decolonial Mean Something. VentureBeat . AI: Decolonial Theory as Sociotechnical Foresight in Arti- ficial Intelligence. Philosophy & Technology 33: 659–684. Kurenkov, A. 2020. Lessons from the PULSE Model and Discussion. The Gradient . Muller, C. B. 2003. 
The Underrepresentation of Women in Engineering and Related Sciences: Pursuing Two Comple- Labaree, R. V. 2002. The Risk of ‘Going Observationalist’: mentary Paths to Parity. In Pan-Organizational Summit on Negotiating the Hidden Dilemmas of Being an Insider Par- the US Science and Engineering Workforce: Meeting Sum- ticipant Observer. Qualitative Research 2(1): 97–122. mary, 119–126. National Academies Press. Lamont, M.; and Molnár, V. 2002. The Study of Boundaries National Academies of Sciences, Engineering, and in the Social Sciences. Annual Review of Sociology 28(1): Medicine. 2020. Promising Practices for Addressing the 167–195. Underrepresentation of Women in Science, Engineering, Lehman, K. J.; Sax, L. J.; and Zimmerman, H. B. 2016. and Medicine: Opening Doors. National Academies Press. Women Planning to Major in Computer Science: Who Are National Research Council of the National Academies. They and What Makes Them Unique? Computer Science 2010. Gender Differences at Critical Transitions in the Education 26(4): 277–298. Careers of Science, Engineering, and Mathematics Faculty. Malcom, L. E.; and Malcom, S. M. 2011. The Double Bind: National Academies Press. The Next Generation. Harvard Educational Review 81(2): Ong, M.; Wright, C.; Espinosa, L.; and Orfield, G. 2011. In- 162–172. side the Double Bind: A Synthesis of Empirical Research