Proceedings of the 55th Hawaii International Conference on System Sciences | 2022

Intersectional Identities and Machine Learning: Illuminating Language Biases in Twitter Algorithms

Aidan Fitzsimons
Duke University
aidan.fitzsimons@alumni.duke.edu

URI: https://hdl.handle.net/10125/79690 | ISBN: 978-0-9981331-5-7 | (CC BY-NC-ND 4.0)

Thanks to Dr. Jay Pearson and Dr. Kenneth Rogerson for their guidance, support, and editing to make this project possible.

Abstract

Intersectional analysis of social media data is rare. Social media data is ripe for identity and intersectionality analysis, with wide accessibility and easy-to-parse text data, yet it provides a host of its own methodological challenges regarding the identification of identities. We aggregate Twitter data that was annotated by crowdsourcing for tags of "abusive," "hateful," or "spam" language. Using natural language prediction models, we predict the tweeter's race and gender and investigate whether these tags for abuse, hate, and spam have a meaningful relationship with the gendered and racialized language predictions. Are certain gender and race groups more likely to be predicted if a tweet is labeled as abusive, hateful, or spam? The findings suggest that certain racial and intersectional groups are more likely to be associated with non-normal language identification. Language consistent with white identity is most likely to be considered within the norm, and non-white racial groups are more often linked to hateful, abusive, or spam language.

1. Introduction and intersectionality

"Intersectionality" was coined by Kimberlé Crenshaw [1] while reviewing various legal precedents that were relevant to both race and gender. The framework of intersectionality is a complicated one, but its core concepts are simple: intersectionality argues that there is a special kind of identity that is born out of a multitude of identities, and that stands on its own. In other words, the specific identity of being a black lesbian has its own set of experiences that cannot be represented merely by the sum of a woman's experiences, a queer person's experiences, and a black person's experiences.

Intersectionality is not a content specialization, nor is it automatically synonymous with feminist-, queer-, or POC-based scholarship [2]. Intersectionality covers the multiplicity of identities that one person, or a collective of people, may possess. Intersectionality aims to represent the intersection of identities in a way that respects each of them individually and how each of them interacts with one another. It is the question of how specific identities co-act in comparison to other groups that makes something intersectionality research.

Within intersectionality, in-group diversity will always be a present factor no matter how many different strata we consider. Considering black lesbians, for example (at the intersection of gender, race, and sexual orientation), the diversity within the sample will always be great because of the implicit differences not covered by the intersectional labels which we are considering. In the example case of black lesbians, we fail to consider how that identity interacts with socioeconomic status, ability status, national origin, immigration status, and so many more strata that affect an individual's daily life and are core to their being. This is not to say that this work is futile, but instead to note that it is essential that the intersectional researcher acknowledge what they fail to capture.

1.1. Quantitative intersectionality

Weber and Parra-Medina have asked: "How can a poor Latina be expected to identify the sole—or even primary—source of her oppression? How can scholars with no real connection to her life do so?" [3]. This is precisely why intersectionality is important. Quantitative intersectionality is a subsection of the intersectionality literature that looks to use big data methods and statistical testing to represent intersectional identities. More specifically, quantitative intersectionality research in the fields of
psychology [4], political science [5], and epidemiology [6] have emerged by comparing intersectional groups' experiences with healthcare, behavioral science, the legal system, and political structures at large.

Attempting to quantify intersectionality provides a host of methodological challenges. The greatest challenge comes in the availability of information; when researchers collect demographic information about their participants, it is often to ensure that demographic factors are not relevant to their research work. In other words, researchers often collect just enough demographic information to rule out that different identities experience a phenomenon in a certain way [7]. Most researchers, especially prior to the last 15 or so years, viewed identities as confounding factors rather than drivers of main effects. Ultimately, intersectionality research goes against many of the assumptions implicit in numerical analysis and requires a great deal of care on the part of the researcher to collect meaningful variables related to lived experience prior to analysis.

McCall [8] took on the task of separating the broad field of intersectionality research by methodology. Based on the population studied and the adherence to social labels, we have three methodological categories: the intracategorical, the intercategorical, and the anticategorical. The intercategorical approach, the focus of this project, is born out of a different central interest than the other two approaches; while the intracategorical and anticategorical question the validity of the labels used in research, the intercategorical approach highlights real differences in lived experiences quantitatively between multiple identity categories. It considers relationships of inequality between different existing social categories, as imperfect as those categories may be, and attempts to quantify differences in lived experiences between them. The bulk of quantitative intersectionality studies, especially those in the domain of public health, fall into this category.

The study that will follow attempts to quantify differences in multiplicative oppression for different intersectional identity groups. As such, this work is inherently comparative, and thus falls into the intercategorical framework most cleanly. However, we fall short of a truly intersectional analysis (section 2.2) because of the constraints of data availability; this failure to aptly address the multiplicity of identities that one may hold is reflective of the challenges and constraints of intersectional analysis.

Since the inception of intercategorical analysis, a great deal of nuance has been added to the field, both in the questions that researchers are asking and in the methods employed to answer these more complex questions. A variety of studies fall into two main categories regarding their approach: either they focus on comparing one stratified category (e.g., young black woman) to the broader population, or they use a multitude of categories to compare across a variety of strata. These studies have employed a variety of methods, from a simple bivariate analysis [9, 10] to regression models [11] to more complex multi-level models that use techniques like tree classification to analyze intersections based on heterogeneity [12]. Some methods use even more complex models with mixed methods to analyze many high-dimensional interactions to speak to a variety of intersectional identities at once [13].

2. Social media and intersectional analysis

Since McCall's initial classification of different intersectionality methods, the emergence of platforms like Facebook, Twitter, and Instagram has created a wealth of data with which to analyze the human experience. Particularly on platforms like Twitter, users share their thoughts about nearly anything, and thus there is a great deal of information to be gleaned from public posts on social media platforms.

A recent focus of technology ethics literature has been the study of how algorithms may be based on biased training data. In a 2019 study published in Science, Ziad Obermeyer and colleagues studied discrimination in a health prediction algorithm that has real implications for millions of Americans [14]. This interest has extended to the algorithms that big companies use, but since companies like Twitter and Facebook are privately held corporations, many of their activities are kept behind lock and key.

To do this project, we need to know the gender and racial identities of the users. This raises some challenges, as users may not share this information or may choose to represent or highlight a variety of identities. While some researchers would argue that pursuit of identification is unethical because of privacy concerns, and we would tend to agree for the sake of people's opportunity to self-identify, a body of research has developed to use different characteristics of text or Twitter profiles to predict race and gender. There are both non-machine learning and machine learning techniques for gender and race prediction.

2.1. Racial/gender identity on social media

Using a non-machine learning approach, a group of researchers in 2011 attempted to understand who
makes up the Twitter community in the United States and found that Twitter usage is highly dependent on locational and demographic characteristics [15]. The authors note undersampling of certain minority groups, such as black people in the rural South, and oversampling of urban white people. However, much of this analysis was done on an extremely faulty assumption: since Twitter users' demographics are private, these authors used last name as a proxy for race. This method, while popular, is extremely imprecise.

Another work, getting closer to the heart of this study, tries to quantify privileges for different intersectional identities on Twitter [16]. To identify users' race and gender, they use a system called Face++ to process Twitter users' profile images. Face++ is an advanced image processing algorithm that can identify race and gender with moderate certainty.

For machine learning, a variety of studies have attempted to quantify intersectional bias in one way or another using Twitter data. The study, titled "White, Man, and Highly Followed: Gender and Race Inequalities in Twitter," uses the Face++ algorithm to determine users' identity factors and tries to identify which identity groups receive privilege on Twitter in the form of followers and interactions. They find that white men are more likely to interact with one another on Twitter, and, as the most populous group on American Twitter, their interaction with one another causes higher incidence and follower rates for white men on Twitter. They also find that racial groups stay within their own "icebergs," to an extent, as the greatest interaction for nearly every intersectional identity group is within their own identity. This is an extremely interesting characteristic of Twitter users' behavior, and one that will inform much of the coming analysis.

Another study [17], closer to what we will seek to quantify, uses a "Bag of Words" method to flag and mark hateful terms within Twitter data. The study uses incidences of specific political interest in an identity factor to track identity discourse on Twitter for hatred espoused against certain groups. For example, they used the re-election of Barack Obama to track racial hate and the coming out of professional basketball player Jason Collins to track intersectional black/queer hate. They build models for each incident and quantify intersectional discrimination on Twitter through a by-event lens.

Another study that looks in detail at the discrimination of abusive language on Twitter will be foundational to our work going forward in this paper. In fact, the scholars that produced this paper published their annotated dataset, and that's what we'll be using in the coming work to do an analysis of our own. Antigoni-Maria Founta and her colleagues used crowdsourcing techniques to annotate over 80,000 individual tweets for abusive behavior [18]. This dataset provides the basis for a lot of research about abusive behavior on Twitter, including this one, as described in the methods section. Of course, human-created labels for hateful or abusive speech leave the opportunity for human error and bias in the data outcomes, and that's what we'll be investigating. Training data can have an impact on the systems that they train, and if a human group (with ever-present biases) creates training data, they may train the system to reproduce their own biases without even knowing it.

To address these concerns, researchers [19] have suggested methodological changes like assessing how opinionated an individual worker may be. While much of crowd-marking research has addressed objective marking criteria (e.g., identifying whether an image has a car) and issues of fatigue and lack of interest, this issue is particularly pressing when it comes to subjective judgements, like whether a tweet may be identified as spam, hateful, or abusive rather than normal. It is unclear whether Founta and colleagues took much precaution in identifying underlying biases of their crowd-marking coders [20, 21].

2.2. Challenges in intersectional data analysis

There are challenges when attempting to find and analyze data to address intersectional questions. The availability of identity-linked data on social media is sparse given concerns of privacy, and when we use technical methods to address a lack of availability of identity-linked data, we bring in new confounding factors.

This study was limited by timeframe and funding. Because of these limitations, we sought a dataset with a few specific criteria: first, a measure of interest that describes something meaningful about a person's experience (i.e., health outcomes, technological bias, etc.); second, a dataset with many identity factors for analysis. We failed to find a publicly accessible dataset that met both criteria and wondered how other entities (in academia, business, and the like) predict identity factors about intersectional identity.

Using machine learning and natural language processing to predict identity factors is a common, easily accessible approach to dealing with these uncertainties. However, NLP techniques have not advanced to the point of reliable predictions for identity strata outside of race and gender. A machine learning approach changed the issues considered in three major ways.
First, a lack of predictive methods for different identity strata has limited our ability to address the issues of intersectionality. Intersectionality aims to provide as full a picture as possible, yet our approach, even with 100% identification reliability (which has not been achieved), addresses only two major identity factors. Second, in adopting a machine learning approach, the possibility of inaccurate identification becomes a major factor. This challenges whether NLP is an effective solution for addressing questions of identity. Though imperfect, these methods are being used by scholars and potentially corporations, and real decisions are being made based on these outcomes. Finally, how NLP algorithms are used after their publication is nearly impossible to track. While scholars are clearly using these predictive models [22], private companies do not publicize their techniques. Because scholarly work is publicly accessible, we expect that NLP techniques are being used privately as well.

In this study, we will be using the gender and racial predictors described in Sap et al. and Blodgett et al. [23] (described in the methodology) and cross-referencing them with the training data created by Founta. We are interested in whether this training data, created by humans, has an indication of bias within it and if outcomes are different for folks of a certain predicted gender and race. While none of these systems is perfect, they are being used in practice in research and almost certainly in business, and thus it is important to evaluate if the data created by Founta and the prediction algorithms created by Sap and Blodgett could reproduce racist or sexist results.

3. Methodology

In this study, we seek to quantify intersectional oppression between different demographic groups with respect to Twitter behavior. Specifically, we are interested in evaluating models that use Natural Language Processing to predict race and gender and assessing whether tweets that are flagged as abusive, hateful, or spam are more likely to be attributed by the model to certain racial and gender groups. We will construct our dataset using a variety of open-source research publications whose authors have made their work accessible to the public for research purposes. As described above, three studies will form the basis for our analysis: Founta, Sap, and Blodgett.

We will seek to answer questions addressing how the labeling that Founta produced can help us predict race and gender. Specifically, we will evaluate the following hypotheses:

• Predicted race of tweet author will have a meaningful interaction with the likelihood of being labeled as spam, abusive, or hateful.
• Predicted gender of tweet author will have a meaningful interaction with the likelihood of being labeled as spam, abusive, or hateful.
• Predicted intersectional identity of tweet author (race and gender) will have a meaningful interaction with the likelihood of being labeled as spam, abusive, or hateful.

3.1. Why Twitter Data?

In our analysis, we decided to use Twitter data for several reasons. Given its wide accessibility and short-form language, Twitter data is ripe for big data analysis. Twitter's identity within the social media landscape not only allows for significant free expression but also encourages different identity groups to speak to one another in the language most comfortable to their community. The kind of discourse we see on Twitter can be broad or isolated to a specific identity group (see note [16]), and that makes for raw and honest language ideal for our purposes. The academic field has taken a recent interest in Twitter data and how others perceive different tweets with different language styles, and the publicly accessible data allows for a great deal of statistical power.

The dataset comes from a study that takes crowdsourcing annotations to a large scale. Founta sought to create reliable labels based on human interpretation of different tweets. Specifically, the goal of the study was to 1) provide a reliable labeling with meaningful differences between different types of speech that "abuse" the Twitter platform, and 2) create and model a mechanism for crowdsourced annotation on a massive scale. To do so, the researchers recruited a large number of annotators and trained them in the differences between different kinds of abuse speech (abusive, hateful, or spam). For each tweet, between five and 20 reviewers were tasked with classifying it as abusive, hateful, spam, or normal. Once there appeared to be consensus on a tag (typically achieved after five reviewers, but sometimes more), that tag was solidified and attached to the tweet. This procedure was repeated for over 99,000 tweets (N = 99,534) and published for future researchers to gain insight. As noted above, the demographics of a group reviewing a specific tweet may not have been controlled to the degree that the literature recommends, which could lead to particularly biased labelling.
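To make the labeling procedure concrete, the sketch below illustrates one simple way such crowdsourced votes could be consolidated into a single tag per tweet. It is an illustration only, written in R with hypothetical column names (tweet_id, annotation); Founta and colleagues' actual pipeline involved several rounds of annotation and additional validation steps.

```r
# Illustrative sketch of majority-vote consolidation of crowdsourced tags.
# Assumes a data frame `votes` with hypothetical columns `tweet_id` and
# `annotation` (one row per reviewer judgment).
library(dplyr)

consolidate_labels <- function(votes, min_votes = 5, min_agreement = 0.6) {
  votes %>%
    group_by(tweet_id, annotation) %>%
    summarise(n = n(), .groups = "drop_last") %>%   # votes per tag, per tweet
    mutate(total = sum(n), share = n / total) %>%   # agreement within the tweet
    slice_max(share, n = 1, with_ties = FALSE) %>%  # keep the leading tag
    ungroup() %>%
    filter(total >= min_votes, share >= min_agreement) %>%
    select(tweet_id, label = annotation, agreement = share)
}
```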
3.2. Race and gender-predictive modeling

A model that covers a variety of racial categories and is widely accessible for short-text datasets like Twitter will better predict the race of a tweet's author. For these reasons, we were drawn to Blodgett et al. (2016). Blodgett builds on the body of Natural Language Processing studies that attempt to predict race based on certain lexical behaviors. While their research was originally structured around a racial binary (Black/White), they began to find terminology consistent with Hispanic and Asian folks' unique linguistic characteristics, which they separated out. In their pursuit of predicting racial identity, Blodgett provides a rare opportunity to predict folks' identities more accurately than simply along the Black/White binary. It is worth noting that four racial categories are still not encompassing of the multitude of racial identities, and a model limited to them is therefore less likely to attribute authorship to the correct identities. It can also completely misattribute the identities of groups not studied.

Blodgett and colleagues do not compare their model to a dataset with known racial identities. Instead, they turn to well-documented phonological and syntactic phenomena to address the question of model validity. Using lexical-level variation, orthographic variation, phonological variation, and syntactic variation, the authors compare results to U.S. Census data and Twitter demographics to conclude that the model performs effectively for large datasets. Note that Blodgett and colleagues only evaluate efficacy along the Black/White binary, despite including other identities, and cite a lack of sufficient information and research regarding other racialized language.

Similarly, we applied a gender-predicting model from Sap et al. (2014) because of its wide applicability to small text samples, its high number of samples, and the accessibility of its methodology for application to other datasets. Based on Sap's own evaluation techniques, they report a 91.9% accuracy when it comes to predicting gender. To arrive at this accuracy number, using a dataset of Facebook message data with known demographics, they split their data and trained a model based on roughly 90% of their confirmed data. Next, they used the model on the remaining 10% of their data and calculated the number of correct applications of the model over all applications of the model for a 91.9% accuracy figure. They applied this model to a variety of social media platforms that use short-form language, including Twitter, and did not find significant variation from this 91.9% figure when they varied platform. Finally, to make this work accessible, they published their prediction dictionaries on GitHub along with the code implementing their method.

Following the predictions of race and gender on the nearly 100,000 tweets tagged and published by Founta and colleagues, we are interested in seeing how these predicted identities correlate with the labels that Founta added. More specifically, we are interested in evaluating the likelihood that tweets tagged as "abusive," "spam," and "hateful" are predicted to be a certain gender and/or race. Said another way, we're curious if tweets that are flagged for abuse of some kind are predicted, based on machine learning models, to be consistent with certain gendered and/or racialized lexica. It's important to understand here that we're not looking at the other direction: these data do not tell us whether folks of real identities are more likely to be flagged for abusive behavior. Because of the privacy limitations of Twitter data, we can only ask how these labels help us predict race and gender based on studied racial and gender lexical dictionaries. We seek to answer one key question: how might intersectional identities play into the way others perceive their tweets, particularly tweets that are identified as abnormal?

4. Results and discussion

Figure 1 shows the contingency tables for race, gender, and intersectional identity compared to the abuse labels. These visualizations show the cross-tabulated counts of each interactive term; for example, there were only 36 tweets marked as hateful that the algorithms predicted were written by an Asian woman.

Figure 1. Gender and Race Intersection Cross-Tabulated Contingency Bubble Chart

A chi-squared test on each of our contingency tables – for race, for gender, and for intersectional identity – showed that each of our identity factors interacts significantly with the labels that Founta's crowdsourcing method provided. Table 1 reports the results of the chi-squared tests for each contingency table.
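As a rough illustration of this step (not the exact script used for the reported results), the contingency tables and chi-squared tests could be produced in R as follows, assuming a data frame `tweets` with hypothetical columns `label`, `race`, and `gender` holding the crowdsourced tag and the predicted identities.

```r
# Sketch: cross-tabulate predicted identities against the crowdsourced label
# and test for independence. Column names are assumed for illustration.
gender_tab <- table(tweets$gender, tweets$label)
race_tab   <- table(tweets$race,   tweets$label)
inter_tab  <- table(interaction(tweets$race, tweets$gender), tweets$label)

chisq.test(gender_tab)  # gender vs. label
chisq.test(race_tab)    # race vs. label
chisq.test(inter_tab)   # race-gender combination vs. label
```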
Table 1. Contingency Tables' P-Values

Contingency Table     p-value      Interpretation
Gender Table          0.003982     Gender interacts meaningfully with the label provided
Race Table            < 2.2e-16    Race interacts meaningfully with the label provided
Gender & Race Table   < 2.2e-16    Either gender or race, if not the interaction between them, interacts meaningfully with the label provided

Table 1 shows the p-values which allow us to reject the null hypothesis for each of our three tables, meaning that gender, race, and their interaction have a meaningful correlation with the label provided by Founta and colleagues. There are some interesting things to note here, too: since the size of our dataset is extremely large, we're working with a lot of statistical power. Sample size has a meaningful impact on how statistically significant an effect will appear, and seeing that our p-value for the gender table is 0.003982 should indicate to us that the correlation between predicted gender and the label outcomes we're interested in is not particularly strong, especially compared to the predicted race and interaction terms, which both have p-values less than 2.2e-16.

Testing for basic statistical significance using a chi-squared test on contingency tables allows us to reject the hypothesis that demographic predictions have no interaction with the labels that Founta crowdsourced, but it does not allow us to extend much further. If we wish to comment any further about what is going on, we should perform logistic regressions on each of our labels that deviate from the norm (spam, abusive, hateful) and look at the results of the logistic regression and an ANOVA on the logistic regression.

Next, we used R Studio to create and analyze a logistic regression for each of the non-normal labels attached to different tweets. To run logistic regressions on each of these labelling outcomes, we transform our outcome to a binary response. In practice, this looks like a binary marker for each of our labels of interest (spam, abusive, hateful) attached to each tweet. A logistic regression seeks to use the factors included to predict the binary outcome of interest. To include our two factors of interest (race and gender) and their interaction, we build three elements into our model: race, gender, and the interaction term. A logistic model was created for each label of interest and those outcomes were analyzed separately (full results available from authors). We run regressions using Asian, White, Hispanic, and Black folks as the baselines so that we can compare each identity group with one another.

Additionally, to summarize the performance and results of our model, we used an ANOVA to describe each of the model performances. We recognize that running an ANOVA on this data partially violates the assumption that the outcome of an ANOVA is normally distributed, but running an ANOVA is a common way to summarize the effects of multiple characteristics within a categorical variable. Thus, in addition to our regression models, we report ANOVA results using a likelihood-ratio test to speak on a factor level rather than by comparing each individual demographic subgroup, as is done in the logistic regression model (full results available from authors).

To showcase the relationships studied, we'll represent our results in several ways. First, we'll use the ANOVA results to ask which identity strata have meaningful relationships with the labels provided. Consider the following significance table:

Table 2. Tags and Demographic Categories

           Gender    Race    Gender : Race
Spam       NS        ***     NS
Hateful    ***       ***     NS
Abusive    NS        ***     **
Key: *** p < 0.001, ** p < 0.01, * p < 0.05, NS not significant

For all three of our labels, we immediately notice that race has a meaningful relationship with each of the labels provided, and that the statistical threshold for those results is high. In addition to race having a meaningful relationship with all three terms, gender has a meaningful relationship with the hateful tag, and the gender-race interaction has a meaningful relationship with the abusive tag. To investigate how different races, genders, and intersectional identities experience different likelihoods of being ascribed these labels, we need to look at the logistic regression analysis results. Because there are a lot of relationships to evaluate here, for the sake of simplicity we'll first analyze different racial groups in relation to one another, one label at a time. Consider the following table, which summarizes the different relationships and our interpretations based on statistical values.
Table 3. Racial Interaction Summary

Relationship        Tag       Significance    Statistic Comparison
Black / Hispanic    Spam      ***             Z(Black) > Z(Hispanic)
Black / Hispanic    Hateful   ***             Z(Black) > Z(Hispanic)
Black / Hispanic    Abusive   **              Z(Hispanic) > Z(Black)
Black / White       Spam      ***             Z(White) > Z(Black)
Black / White       Hateful   ***             Z(Black) > Z(White)
Black / White       Abusive   ***             Z(Black) > Z(White)
Black / Asian       Spam      ***             Z(Asian) > Z(Black)
Black / Asian       Hateful   ***             Z(Black) > Z(Asian)
Black / Asian       Abusive   ***             Z(Black) > Z(Asian)
Hispanic / White    Spam      ***             Z(White) > Z(Hispanic)
Hispanic / White    Hateful   NS              NS
Hispanic / White    Abusive   ***             Z(Hispanic) > Z(White)
Hispanic / Asian    Spam      ***             Z(Asian) > Z(Hispanic)
Hispanic / Asian    Hateful   ***             Z(Hispanic) > Z(Asian)
Hispanic / Asian    Abusive   ***             Z(Hispanic) > Z(Asian)
White / Asian       Spam      ***             Z(Asian) > Z(White)
White / Asian       Hateful   ***             Z(White) > Z(Asian)
White / Asian       Abusive   ***             Z(White) > Z(Asian)
Key: *** p < 0.001, ** p < 0.01, * p < 0.05, NS not significant

Nearly every racial interaction in Table 3 is highly statistically significant. This outcome could be driven by several factors, including our relative statistical power compared to other studies of this kind or the efficacy of the NLP models in comparison to their predecessors. Even so, these results are surprising and very strong. We believe that our process has been accurate, but further studies should approach this question with scrutiny regarding the extreme significance of our results.

Figure 2. Spam Logistic Regression Results

Figure 3. Abusive Logistic Regression Results

Figure 4. Hateful Logistic Regression Results

For each tag (spam, hateful, abusive), we can rank how strongly associated each racial category is with the label (see Figures 2, 3, 4). Let's look at the hierarchy for, for example, the spam tag: a tweet that has been tagged as spam is least likely to be predicted as written by a Hispanic author and most likely to be predicted as written by an Asian author. The mathematical expressions for each label are, where P(Race) indicates the probability that a tweet will be predicted to be a specific racial identity:

Spam:    P(Hispanic) < P(Black) < P(White) < P(Asian)
Hateful: P(Asian) < P(White) ? P(Hispanic) < P(Black)
Abusive: P(Asian) < P(White) < P(Black) < P(Hispanic)

There are a few interesting things to note here. First, almost every relationship is statistically significant: except for the probability of White versus Hispanic prediction based on the hateful label (represented by a question mark), every hierarchical statement above is (extremely) statistically significant. But even more so, these results are interesting in the way that they reinforce and speak to societal inequities.
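For concreteness, the per-label models and baseline comparisons behind Table 3 and Figures 2-4 could be constructed along the following lines; this is a minimal R sketch under assumed column names (race, gender, label), not the exact script used for the reported results.

```r
# Sketch: binomial logistic regression with race, gender, and their
# interaction for one label, refit with different baselines to obtain the
# pairwise comparisons summarized in Table 3. Column names are assumed.
tweets$spam <- as.integer(tweets$label == "spam")

fit_with_baseline <- function(data, baseline) {
  data$race <- relevel(factor(data$race), ref = baseline)
  glm(spam ~ race * gender, family = binomial, data = data)
}

m_black <- fit_with_baseline(tweets, "Black")
summary(m_black)$coefficients  # z values compare each group to the Black baseline
anova(m_black, test = "LRT")   # factor-level likelihood-ratio summary (cf. Table 2)

# Refitting with the other baselines yields the remaining pairwise contrasts.
m_white <- fit_with_baseline(tweets, "White")
```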
The spam tag is defined by Founta as "posts consisted of related or unrelated advertising/marketing, selling products of adult nature, linking to malicious websites, phishing attempts and other kinds of unwanted information, usually executed repeatedly" [24]. This tag is most strongly connected with the prediction of Asian authorship by comparison to the other racial groups and is least likely to be attributed to Hispanic authorship. The idea that machine learning algorithms attribute what humans consider to be spam to Asian Twitter users more often is representative of a flaw in our racialized model for linguistics.

There are notable differences between the abusive and hateful labels; however, we must again consider the subjectivity of the labelling scheme. The abusive label is applied more generally, described as "any strongly impolite, rude or hurtful language using profanity, that can show a debasement of someone or something, or show intense emotion" [25]. By comparison, the hateful label is applied specifically to hate speech, defined by Founta and colleagues to be "Language used to express hatred towards a targeted individual or group, or is intended to be derogatory, to humiliate, or to insult the members of the group, on the basis of attributes such as race, religion, ethnic origin, sexual orientation, disability, or gender" [26]. This specific distinction is difficult to apply uniformly, as what one individual perceives as generally abusive another may view as specifically hateful. As such, we expected the outcomes for hateful and abusive language to be largely similar. Looking at the hateful tag and how it relates to predictions for different racialized groups, we see that it has the strongest association with Black-predicted language and the weakest association with Asian-predicted language. Again, without explicating various stereotypes that are harmful to different non-dominant ethnic groups, the disproportionate treatment of some racial groups compared to others is alarming.

Finally, looking at the more general abusive tag, we see a similar pattern to the hateful tag. The most notable difference between the two is where the Hispanic-predicted authors lie in comparison to others; while Hispanic-predicted authors fall in the middle on probability correlated to hateful language, they are even more likely than Black-predicted authors to be correlated with the abusive language tag. We won't waste energy lending academic credence to discriminatory stereotypes that disparage people of color, but the fact that natural language prediction models predict that abusive and hateful language comes more often from underrepresented minority groups is particularly troubling.

Beyond the main effect of race that was clearly notable, there were a few interaction terms that were statistically significant, and we should discuss those few notable differentiations. Because the gender terms were inconsistently (and only barely, if at all) significant across different baseline groups, there is not much interesting to conclude from these data; the interaction terms, however, have yielded a few interesting differences that are worth noting. We've represented the results of the interaction terms with significant statistical values in the table below.

Table 4. Race/Gender Interactions with Tags

Relationship                   Tag       Significance    Statistic Comparison
Asian Woman / Black Man        Abusive   *               Z(BM) > Z(AW)
Asian Woman / Hispanic Man     Abusive   **              Z(HM) > Z(AW)
Asian Woman / White Man        Abusive   ***             Z(WM) > Z(AW)
White Woman / Asian Man        Abusive   ***             Z(WW) > Z(AM)
Black Woman / Asian Man        Abusive   *               Z(BW) > Z(AM)
Hispanic Woman / Asian Man     Abusive   **              Z(HW) > Z(AM)
Key: *** p < 0.001, ** p < 0.01, * p < 0.05

There are several interesting patterns of note shown in Table 4. First, intersectional identity is only significant in the context of the abusive tag, not the hateful or spam tags. Next, we notice that each relationship of statistical significance is between an Asian woman and a non-Asian man or between an Asian man and a non-Asian woman. We notice that the Z statistic for each of our non-Asian intersectional identities is greater than that of our Asian intersectional identities when compared on a one-to-one basis. When we try to understand what to take away from Table 4, we see that all statistically significant race/gender interactions have some common threads and are only predictive when it comes to the label of "abusive".
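The intersectional contrasts in Table 4 could likewise be examined by treating each race-gender combination as a single factor and rotating its reference level; the sketch below (again in R, with hypothetical column names and level coding) shows the idea rather than the exact code used.

```r
# Sketch: compare intersectional groups directly by modeling one combined
# race-gender factor and changing its baseline. Assumes race and gender are
# coded with values such as "Asian" and "Woman".
tweets$abusive     <- as.integer(tweets$label == "abusive")
tweets$race_gender <- relevel(interaction(tweets$race, tweets$gender, sep = " "),
                              ref = "Asian Woman")

m_aw <- glm(abusive ~ race_gender, family = binomial, data = tweets)
summary(m_aw)$coefficients  # z values compare each group to the Asian Woman baseline
```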
In summary:

• Race has a meaningful correlation with each of the three tags of interest, and we can form a hierarchy of probability to describe the likelihood of a flagged tweet being attributed by NLP to a certain racial group.
• Except for the relationship between Hispanic-predicted and White-predicted tweets with a hateful tag, all racial differentiations were statistically significant.
• When it comes to abusive tweets specifically, there is not only a racial effect but an intersectional effect when considering Asian-predicted speech.
• Without getting into the weeds of harmful stereotypes, these differences in attribution to certain lexical dictionaries can be troubling.

So, what can we do to combat this attribution of certain negative characteristics and tags on tweets to minority groups? For that discussion, we turn to the question of policy implications and conclusions from this research.

5. Conclusions

Our findings suggest that there are inherent biases within either the natural language processing algorithms or within the training data used to identify abnormal language. While our method of analysis does not allow us to draw conclusions about where this bias is coming from, bias is being reproduced within the construction of the dataset that we have created. The association of white language with normality and non-white language with abuse of the Twitter platform may be driven by either the attitudes of the crowdsourced data creators or the predictive dictionaries used to identify race and gender (or, likely, some combination of the two). While we cannot conclude where this bias is coming from, bias is present in our existing systems of identification. This bias can lead to problematic associations that drive the narrative that whiteness is the norm, and anything outside of whiteness is a deviation from standards.

What do these analyses show on a broader scale? We cannot parse out the biases of the algorithm's creators and trainers compared to real-world differences in behavior, but both the biases and real-world differences likely impact the outcomes we've seen in this study. The directionality of real-world differences would be hard to evaluate given that the models we use are likely clouded with bias created by those who constructed them. While we don't want to explicate the harmful stereotypes that our results would reinforce, there are certainly stereotypes at play in the results that we've seen, particularly for non-White folks. In the results we've seen, tweets that are flagged as normal (or rather, are not flagged) are most likely to be associated with White linguistic traits; is that because dominant American society demands whiteness as a precursor to normality? We have many more questions than answers here, but the results presented today are an interesting step in identifying biases in NLP predictions and in the flagging that goes into creating training sets like Founta's.

The main conclusions here are not inherently intersectional and do not speak directly to intersectional oppression. As stated in the beginning of this paper, statistical analyses maximizing singular effects minimize their interaction terms: according to Bowleg, "when significant main effects exist, the probability of finding significant first order (a two-way interaction) or higher order interactions (three, four, and n-way interactions) decreases because the significant main effects account for the bulk of the variance in the dependent variable" [27]. Perhaps that is why some of our intersectional conclusions were limited. Even so, some interesting intersectional conclusions regarding Asian-identified speech were statistically significant as well, and that's worth thinking about going forward.

For future research, Twitter and NLP are only one piece of the puzzle that makes up algorithm development and intersectional discrimination. In a general sense, intersectional data collection is woefully lacking. The challenge of potentially misattributed data based only on prediction would be mitigated if more information about identity were collected in studies for the purpose of addressing identity effects rather than as a confound to be filtered out. More specific to our results, our analysis is correlational and thus does not allow for any causal conclusions; we would recommend that future research focus on the development of algorithms and evaluate their bias from the construction period onward. Studying the people that make algorithms and training data is a future interest of our own research, and accordingly we believe that funding this type of research would yield interesting and meaningful analyses. Only when we commit to truly and accurately depicting the inequities in this country, and investigate the algorithms that drive that inequity, can we start to combat the dangers of the "black box" of computing and build algorithms that result in fair and equitable outcomes.

References

[1] Crenshaw, K. (2015). Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics. University of Chicago Legal Forum, 1989(1). https://chicagounbound.uchicago.edu/uclf/vol1989/iss1/8

[2] Hancock, A.-M. (2007). Intersectionality as a Normative and Empirical Paradigm. Politics & Gender, 3(2), 248–254. https://doi.org/10.1017/S1743923X07000062

[3] Weber, L., & Parra-Medina, D. (2003). "Intersectionality and Women's Health: Charting a Path to Eliminating Health Disparities," in Gender Perspectives on Health and Medicine. Advances in Gender Research, vol. 7, p. 204.

[4] Warner, L. R. (2008). A Best Practices Guide to Intersectional Approaches in Psychological Research. Sex Roles, 59(5), 454–463. https://doi.org/10.1007/s11199-008-9504-5
[5] Hancock, A.-M. (2007). Intersectionality as a Normative and Empirical Paradigm. Politics & Gender, 3(2), 248–254. https://doi.org/10.1017/S1743923X07000062

[6] Bauer, G. R. (2014). Incorporating intersectionality theory into population health research methodology: Challenges and the potential to advance health equity. Social Science & Medicine, 110, 10–17. https://doi.org/10.1016/j.socscimed.2014.03.022

[7] Bowleg, L., & Bauer, G. (2016). Invited Reflection: Quantifying Intersectionality. Psychology of Women Quarterly, 40(3), 337. https://doi.org/10.1177/0361684316654282

[8] McCall, L. (2005). The Complexity of Intersectionality. Signs: Journal of Women in Culture and Society, 30(3), 1771–1800. https://doi.org/10.1086/426800

[9] Bambas, L. (2005). Integrating Equity into Health Information Systems: A Human Rights Approach to Health and Information. PLOS Medicine, 2(4), e102. https://doi.org/10.1371/journal.pmed.0020102

[10] Lord, S. M., Camacho, M. M., Layton, R. A., Long, R. A., Ohland, M. W., & Wasburn, M. H. (2009). Who's Persisting in Engineering? A Comparative Analysis of Female and Male Asian, Black, Hispanic, Native American and White Students. Journal of Women and Minorities in Science and Engineering, 15(2). https://doi.org/10.1615/JWomenMinorScienEng.v15.i2.40

[11] Agénor, M., Krieger, N., Austin, S. B., Haneuse, S., & Gottlieb, B. R. (2014). At the intersection of sexual orientation, race/ethnicity, and cervical cancer screening: Assessing Pap test use disparities by sex of sexual partners among black, Latina, and white U.S. women. Social Science & Medicine, 116, 110–118. https://doi.org/10.1016/j.socscimed.2014.06.039

[12] Cairney, J., Veldhuizen, S., Vigod, S., Streiner, D. L., Wade, T. J., & Kurdyak, P. (2014). Exploring the social determinants of mental health service use using intersectionality theory and CART analysis. Journal of Epidemiology & Community Health, 68(2), 145–150. https://doi.org/10.1136/jech-2013-203120

[13] Evans, C. R., Williams, D. R., Onnela, J.-P., & Subramanian, S. V. (2018). A multilevel approach to modeling health inequalities at the intersection of multiple social identities. Social Science & Medicine, 203, 64–73. https://doi.org/10.1016/j.socscimed.2017.11.011

[14] Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447–453. https://doi.org/10.1126/science.aax2342

[15] Mislove, A., Lehmann, S., Ahn, Y.-Y., Onnela, J.-P., & Rosenquist, J. (2011). Understanding the Demographics of Twitter Users. Proceedings of the International AAAI Conference on Web and Social Media, 5(1), Article 1. https://ojs.aaai.org/index.php/ICWSM/article/view/14168

[16] Messias, J., Vikatos, P., & Benevenuto, F. (2017). White, man, and highly followed: Gender and race inequalities in Twitter. Proceedings of the International Conference on Web Intelligence, 266–274.

[17] Burnap, P., & Williams, M. L. (2016). Us and them: Identifying cyber hate on Twitter across multiple protected characteristics. EPJ Data Science, 5(1), 1–15. https://doi.org/10.1140/epjds/s13688-016-0072-6

[18] Founta, A., Djouvas, C., Chatzakou, D., Leontiadis, I., Blackburn, J., Stringhini, G., Vakali, A., Sirivianos, M., & Kourtellis, N. (2018). Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior. Proceedings of the International AAAI Conference on Web and Social Media, 12(1), Article 1. https://ojs.aaai.org/index.php/ICWSM/article/view/14991

[19] Hube, C., Fetahu, B., & Gadiraju, U. (2019). Understanding and Mitigating Worker Biases in the Crowdsourced Collection of Subjective Judgments. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Paper 407, 1–12. https://doi.org/10.1145/3290605.3300637

[20] Ibid.

[21] Barbosa, N., & Chen, M. (2019). Rehumanized Crowdsourcing: A Labeling Framework Addressing Bias and Ethics in Machine Learning. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Paper 543, 1–12. https://doi.org/10.1145/3290605.3300773

[22] Sap, M., Park, G., Eichstaedt, J., Kern, M., Stillwell, D., Kosinski, M., Ungar, L., & Schwartz, H. A. (2014). Developing Age and Gender Predictive Lexica over Social Media. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1146–1151. https://doi.org/10.3115/v1/D14-1121

[23] Blodgett, S. L., Green, L., & O'Connor, B. (2016). Demographic Dialectal Variation in Social Media: A Case Study of African-American English. arXiv:1608.08868 [cs]. http://arxiv.org/abs/1608.08868

[24] Founta et al. (2018), p. 495.

[25] Ibid.

[26] Ibid.

[27] Bowleg, L. (2008). When Black + Lesbian + Woman ≠ Black Lesbian Woman: The Methodological Challenges of Qualitative and Quantitative Intersectionality Research. Sex Roles, 59(5), 319. https://doi.org/10.1007/s11199-008-9400-z