DIGITAL LOCKERS Archiving Social Media Evidence of Atrocity Crimes - Human Rights Center
DIGITAL LOCKERS Archiving Social Media Evidence of Atrocity Crimes 2021 Human Rights Center UC Berkeley School of Law
HUMAN RIGHTS CENTER
The Human Rights Center at the University of California, Berkeley, School of Law conducts research on war crimes and other serious violations of international humanitarian law and human rights. Using evidence-based research methods and innovative technologies, we support efforts to hold perpetrators accountable and to protect vulnerable populations. We also train students and advocates to research, investigate, and document human rights violations and turn this information into effective action.
2224 Piedmont Avenue, Berkeley, CA 94720
Telephone: 510.642.0965 | Email: hrc@berkeley.edu
Humanrights.berkeley.edu | Medium.com/humanrightscenter | @HRCBerkeley
Front Cover Photo: A man films a protest against President Sebastián Piñera’s government and police brutality in Santiago, Chile, on February 12, 2021. (Photo by Vanessa Rubilar/SOPA Images/Sipa USA) (Sipa via AP Images).
Design and layout: Nicole Hayward
CONTENTS
INTRODUCTION
RESEARCH QUESTIONS
METHODOLOGY
BACKGROUND
PART I – THE STAKEHOLDERS
Stakeholders
Stakeholder Relationships
PART II – TYPOLOGY OF DIGITAL ARCHIVES
Social Media Platforms as “Accidental Archives”
Traditional Archives
Digital Archives
Model 1: The Legal Compulsion Model
Model 2: The Voluntary Partnership Model
Model 3: The Independent Collection Model
Model 4: The Hybrid Model
PART III – LEGAL, TECHNICAL, AND OPERATIONAL CHALLENGES
Defining Terms and Scope
Legal Compliance
Automated Detection of Graphic Content
DISCUSSION
RECOMMENDATIONS
CONCLUSION
ACKNOWLEDGMENTS
INTRODUCTION
Given the use of social media by people living in areas of armed conflict or severe repression, social media platforms have become accidental and unstable archives for human rights content.[1] The last two decades have witnessed a fundamental shift in how people around the world communicate. During this period, the proliferation of smartphones and the rise of social media platforms have enabled increased identification, collection, and sharing of digital information related to international crimes and human rights violations. Whereas human rights researchers once struggled to find online content relevant to their investigations, today researchers may find themselves drowned in a tsunami of content with potential evidentiary value,[2] as well as utility for the documentation of atrocities more generally—including for advocacy, research, and development of an historical record of world events. With 6,000 tweets generated every second[3] and 500 hours of video content uploaded to YouTube every minute,[4] the challenge is figuring out how to find the “signal” by siphoning out the online “noise,” as well as how to find reliable information buried in a digital environment replete with misinformation and disinformation.[5]
While human rights researchers and investigators have been pioneering new methods for mining online environments for reliable information,[6] they have frequently found themselves in a race against platforms’ efforts to police their websites.[7] Survivors and bystanders often post videos and images to social media platforms with the hope of alerting the world to atrocities on the ground,[8] yet companies’ terms, conditions, and community guidelines prohibit an important subset of this content, such as violent, graphic or sexually explicit imagery.[9] In recent years, these platforms have increased their use of automated tools to detect and remove content that violates their terms of service at a rate that outpaces human investigators.[10] For example, between July and September 2020, Facebook’s algorithm detected and “actioned”[11] 99.5 percent of violent and graphic content before users reported such content.[12] This pace of detection means that human rights actors are increasingly losing the race to identify and preserve information that may have legitimate human rights and historical value before it is removed.[13]
Companies have important reasons for removing certain categories of social media content; for example, propaganda from internationally recognized terrorist groups or material that sexually exploits children. Many platforms’ terms of service reflect concerns about the privacy and security of platform users, negative user experience, as well as their own legal liability.[14] However, content removals remain of deep concern to human rights researchers, legal investigators, and historians, who recognize that the information posted to social media may include critical data for proving the elements of crimes and preventing further abuse—and in some cases may be the only documentation of such events.
This report provides an overview of various models that have previously been used to archive social media content and other digital information. Part I identifies key stakeholders—their missions, values, and interest in the preservation and accessibility of social media content. Part II establishes a typology of archives that have been used as repositories of online digital content. We include at least one case study for each model, and discuss several legal and operational considerations that stakeholders may want to assess when designing one or more ways forward. Each case study is followed by a brief summary highlighting who provides, holds, and can access the content, as well as a brief overview of legal obligations, challenges, and end-uses. Part III provides a summary of the broader legal and technical context that surrounds the debate around what form an evidence locker or other human rights archive should take, drawing on national and international norms to aid future conversations. We conclude with recommendations for next steps.
Importantly, this report does not identify or advocate for a specific model or sketch a single way forward, but simply illustrates some of the ways that previous digital archives have been constructed. It is our hope that this report will advance an ongoing and longstanding conversation among human rights organizations, social media companies, diverse government actors, researchers and others, and inform a collaborative and multidisciplinary effort for ensuring that the preservation of online information with human rights value can more effectively serve the goals of legal accountability and justice—as well as the diverse needs of affected communities worldwide.
[1] “Removals of Syrian Human Rights Content: May 2019,” Syrian Archive, accessed May, 2021, https://syrianarchive.org/en/tech-advocacy/may-takedowns.html.
[2] By evidentiary value, we mean information that may be used to help establish the facts necessary to satisfy the elements of crimes or other legal violations in a court of law.
[3] David Sayce, “The Number of Tweets per Day in 2020.” David Sayce, December, 2019, https://www.dsayce.com/social-media/tweets-day/.
[4] “YouTube for Press.” Blog.YouTube, accessed January, 2021, https://blog.youtube/press/.
[5] For a helpful overview of various forms of “information disorder,” see, Claire Wardle, “Understanding Information Disorder,” FirstDraft, October, 2019, https://firstdraftnews.org/wp-content/uploads/2019/10/Information_Disorder_Digital_AW.pdf?x76701.
[6] Paul Meyers, “How to Conduct Discovery Using Open Source Methods,” in Digital Witness: Using Open Source Information for Human Rights Investigation, Documentation, and Accountability, eds. Sam Dubberley, Alexa Koenig and Daragh Murray. (New York: Oxford University Press, 2020), 168-199.
[7] Dipayan Ghosh, “Are We Entering a New Era of Social Media Regulation?,” Harvard Business Review, January, 2021, https://hbr.org/2021/01/are-we-entering-a-new-era-of-social-media-regulation. See also, Chloe Mathieu Phillips, “Regulating social media: legislation or self-policing?,” The Social Element, November, 2018, https://thesocialelement.agency/regulating-social-media-legislation-or-self-policing
[8] Sharngan Aravindakshan and Radhika Kapoor, “The Potential and Hurdles of Fighting Atrocities in the Age of Social Media,” The Wire, April, 2020, https://thewire.in/tech/social-media-atrocities-evidence. See also Belkis Wille, “‘Video Unavailable’: Social Media Platforms Remove Evidence of War Crimes,” Human Rights Watch, September, 2020, https://www.hrw.org/report/2020/09/10/video-unavailable/social-media-platforms-remove-evidence-war-crimes.
[9] “Community Standards,” Facebook, accessed March, 2021, https://www.facebook.com/communitystandards/introduction; “Your commitments to Facebook and our community,” Facebook Terms of Service, accessed March, 2021, https://www.facebook.com/terms.php; “Community Standards Enforcement Report,” Facebook Transparency, February, 2021, https://transparency.facebook.com/community-standards-enforcement; “The Twitter Rules,” Twitter, accessed March, 2021, https://help.twitter.com/en/rules-and-policies/twitter-rules; “Transparency,” Twitter, accessed March, 2021, https://transparency.twitter.com.
[10] For Twitter see, “An update on our continuity strategy during COVID-19,” Twitter Blog, March, 2020, https://blog.twitter.com/en_us/topics/company/2020/An-update-on-our-continuity-strategy-during-COVID-19.html; “Insights from the 17th Twitter Transparency Report,” Twitter Blog, January, 2021, https://blog.twitter.com/en_us/topics/company/2020/ttr-17.html; Alyssa Newcomb, “Twitter Says A.I. Is Removing Over Half of the Site’s Abusive Tweets Before They’re Flagged,” Fortune, October, 2019, https://fortune.com/2019/10/24/twitter-abuse-tweets/. For Facebook see, Jeff King and Kate Gotimer, “How We Review Content,” Facebook Blog, August, 2020, https://about.fb.com/news/2020/08/how-we-review-content/; James Vincent, “Facebook is now using AI to sort content for quicker moderation,” The Verge, November, 2020, https://www.theverge.com/2020/11/13/21562596/facebook-ai-moderation. For Instagram see, Jacob Kastrenakes, “Instagram now uses AI to block offensive comments,” The Verge, June, 2019, https://www.theverge.com/2017/6/29/15892802/instagram-ai-offensive-comment-filter-launches. For YouTube see, “Protecting our extended workforce and the community,” YouTube Creator Blog, March, 2020, https://youtube-creators.googleblog.com/2020/03/protecting-our-extended-workforce-and.html?m=1; “Featured Policies: Violent Extremism chapter,” Google Transparency Report, January 2020–March 2020, https://transparencyreport.google.com/youtube-policy/featured-policies/violent-extremism?hl=en.
[11] Actioned content includes material on Facebook and Facebook Messenger that was covered with a warning label or removed from the platforms.
[12] “Community Standards Enforcement Report,” Facebook Transparency, accessed January, 2021, https://transparency.facebook.com/community-standards-enforcement#graphic-violence
[13] See, e.g., Wille, “‘Video Unavailable’: Social Media Platforms Remove Evidence of War Crimes.”; Alexa Koenig, “Big Tech Can Help Bring War Criminals to Justice,” Foreign Affairs, November, 2020, https://www.foreignaffairs.com/articles/united-states/2020-11-11/big-tech-can-help-bring-war-criminals-justice.
[14] For more information on the type of content subject to removal from Facebook see, “Community Standards,” Facebook, accessed January, 2021, https://www.facebook.com/communitystandards/introduction. For more information on Twitter’s rules on content in violation that is subject to removal see, “The Twitter Rules,” Twitter, accessed January, 2021, https://help.twitter.com/en/rules-and-policies/twitter-rules. See also, “YouTube Community Guidelines & Policies—How YouTube Works,” YouTube Community Guidelines & Policies, accessed January, 2021, https://www.youtube.com/howyoutubeworks/policies/community-guidelines/ (guidelines on platforms, videos, thumbnails, and links that are subject to removal).
RESEARCH QUESTIONS
Our research was driven by the following questions:
1. What models for archiving digital information—and especially social media content—already exist?
2. How are these models structured, funded, and managed?
3. What lessons can we learn from these models about challenges to and opportunities for preserving and archiving online content for evidentiary and other human rights purposes?
4. What legal, political, technical, financial, and operational challenges are likely to arise in the creation of a digital evidence locker or new legal framework—including how to prospectively identify what subset of online information needs saving?
METHODOLOGY
From January 2020 through June 2020, nine students from diverse departments on the UC Berkeley campus[15] worked with researchers at the Human Rights Center at UC Berkeley School of Law to identify and analyze various social media repositories that might offer valuable insights into opportunities and challenges associated with creating a digital archive—or “evidence locker”—for social media content at risk of deletion. The team identified and analyzed precedents from disparate but related contexts, such as terrorism and human trafficking, and conducted interviews to fill gaps in the desk research. A second team of center staff conducted supplemental research and edited the report between October 2020 and May 2021.
[15] The students came from Global Studies, Public Policy, Development Practice, Information Management, Advanced Law (LLM), Media Studies, Interdisciplinary Studies, Political Science and a multidisciplinary international program.
BACKGROUND
Part I – The Stakeholders
STAKEHOLDERS
The following are key stakeholders who have an interest in the preservation and disclosure of social media content related to human rights violations:
1. Content creators, subjects and users: Individuals who create and/or upload human rights content to social media platforms—as well as those depicted in that content—have an interest in what happens with the material. Importantly, the privacy rights of these individuals are likely to be implicated in any sort of digital locker, as are their interests in legal accountability, advocacy, and creating a record of events.
2. Social media companies: Technology companies that operate social media platforms are third-party intermediaries that have an interest in what happens with their users’ data. These companies must comply with national and international laws, especially those related to data protection and privacy, as well as financial, ethical, and operational constraints. These companies have diverse and sometimes conflicting obligations to their users and their shareholders. Any type of legal mechanism, archive, or digital locker would likely create obligations for social media companies to preserve human rights content and potentially be responsible for its long-term storage and sharing.
3. Inter-governmental organizations: Inter-governmental organizations (IOs) such as United Nations bodies and international courts are mandated to investigate violations of international law. There are several UN investigative mechanisms, fact-finding missions, and commissions of inquiry that investigate violations of international human rights and humanitarian law. In addition, international criminal courts and tribunals investigate violations of international criminal and humanitarian law. These entities have an interest in obtaining user-generated content that can serve as intelligence, lead information, or evidence related to their investigations.
4. Non-governmental organizations: As the largest and most diverse stakeholder group, non-governmental organizations (NGOs) are interested in the preservation of and access to user-generated content on social media platforms for use in human rights documentation, advocacy, research, and reporting. In some cases, NGOs might also be interested in preserving content as potential evidence for future accountability processes. Any sort of mechanism or legal framework that might serve NGOs will need to clearly define which categories of NGOs qualify as human rights NGOs for purposes of these efforts, since the general category of NGO is very broad.
5. Academic researchers: Social science and other academic researchers have an interest in the preservation of and access to human rights content for the purpose of study and establishing an historical record. Such research might be conducted for private or public uses.
For ease of reference, the term “human rights practitioners” is used to collectively refer to IOs, NGOs, and academic researchers, although their interests may vary or even conflict. For example, some individuals and organizations are interested in the content staying on the platform in public view, some are interested in it being preserved but not necessarily made public, and some are interested in having it preserved and shared for a range of end uses.
STAKEHOLDER RELATIONSHIPS
Social media users and human rights practitioners have repeatedly reported their dissatisfaction and concern with technology companies removing human rights-related content from public access without some mechanism for preserving that content outside national law enforcement processes.[16] From removing critical documentation of some of the world’s worst atrocities from the public record to silencing the voices of survivors, these takedowns may distort the information ecosystem in ways that enhance impunity for perpetrators and minimize the possibilities of justice for some of the world’s most egregious crimes—for example, when the original or only video or posting of an event is caught in the dragnet.[17]
At present, content creators and human rights practitioners work with social media companies to address problematic takedowns, but the process is largely informal, inconsistent, and ad hoc. Content creators and human rights practitioners complain that they are often unable to successfully appeal the removal of human rights content by social media companies.[18] Social media companies complain that creators and practitioners don’t fully understand or appreciate the legal and operational constraints within which they’re working. Regardless, current practice is unsustainable. NGOs often feel as though they are functioning as unpaid content moderators for some of the most well-resourced companies in the world and inequities are created among NGOs, some of which have personal relationships with the social media company contacts and some who do not. From their perspective, all too often relevant human rights content is removed and made permanently inaccessible.
Adding to earlier challenges, the rate of content removal has accelerated as a result of replacing human content moderators with algorithms that automate portions of the detection process. The incorporation of machine learning has further increased the pace at which content disappears from public view. While human rights practitioners may scrape,[19] manually download, or otherwise preserve relevant content before its removal, this approach leads to the siloing of content across different groups, making its identification and location by potential end users extremely difficult.[20] From a legal perspective, even when preserved by external parties, the evidentiary value of such information may be diluted, as international judges typically require that social media content be provided directly by the companies when used for court purposes to help ensure its authenticity. Over the past couple of years, social media companies have begun sharing information with various national and intergovernmental institutions in order to support international justice processes.
[16] Wille, “‘Video Unavailable’: Social Media Platforms Remove Evidence of War Crimes.”
[17] Ibid. One example would be videos of extrajudicial killings in Libya allegedly perpetrated by al-Werfalli, for whom an arrest warrant was issued by the ICC. In that case, several videos were removed by Facebook from its platform, but preserved by Bellingcat prior to removal.
[18] According to Human Rights Watch, “Users [on Facebook between January and March 2020] appealed takedowns for 180,100 pieces of “terrorist propaganda” content, 479,700 pieces of “graphic violence” content, 1.3 million pieces of “hate speech” content, and 232,900 pieces of “organized hate” content. Upon appeal, Facebook restored access to 22,900 pieces of “terrorist propaganda” content, 119,500 pieces of “graphic violence” content, 63,600 pieces of “hate speech” content, and 57,300 pieces of “organized hate” content.” See, Ibid.
[19] “Web scraping is a process in which machine readable data is extracted from the HTML lay-out of websites delivered to a user’s browser and storing that data locally. This process takes data that is meant for display on a user’s device and converts it into a format that can be processed. These often require more development time and include less contextual metadata about each unit, but still make efficient ingestion of large quantities of content possible.” Jeff Deutch and Niko Para, “Targeted Mass Archiving of Open Source Information,” in Digital Witness: Using Open Source Information for Human Rights Investigation, Documentation and Accountability, eds. Sam Dubberley, Alexa Koenig, and Daragh Murray, (New York: Oxford University Press, 2020), 273.
[20] Of course, there can also be security and other advantages to having a series of dispersed and unconnected archives.
BACKGROUND Part II – Typology of Digital Archives SOCIAL MEDIA PLATFORMS AS through automation—further exacerbating the “ACCIDENTAL ARCHIVES” clash between the two groups. The deployment of machine learning algorithms to flag and remove People around the world turn to the internet to certain types of posts, such as content deemed vio- share their experiences and bring attention to injus- lent, extremist, or “terrorist,” has resulted in the re- tices, and as a result (and as explained above), social moval of large quantities of content that potentially media platforms have become unintended reposito- offer valuable documentation of alleged human ries of human rights content—both with and with- rights violations, despite human rights advocates’ out potential evidentiary value.21 However, social best efforts to explore options for preserving digital media platforms were not designed to be historical content en masse.22 or evidentiary archives. The companies that oper- It is, of course, in the interest of social media com- ate these platforms must comply with a number of panies and many of their users—including human sometimes conflicting legal and human rights obli- rights advocates—to remove or deprioritize danger- gations and are incentivised by the interests of their ous content or posts, which is the very content that advertisers and shareholders, as well as the differing may be most useful for human rights documenta- perspectives of diverse civil society organizations. tion. Social media platforms do not always pro- These companies also have user guidelines and con- vide human rights practitioners with justifications tent moderation practices that they are expected to or reliable notice of takedowns, further hindering uphold evenhandedly—leading to the need to craft those platforms’ ability to proactively preserve rele- carefully designed policy exceptions. 
The companies vant content.23 Over the last few years, social media decide whether or not content stays up on their plat- forms and how that content is prioritized for viewing. In recent years, while human rights groups have 22 See e.g., Abdul Rahman Al Jaloud, Hadi Al Khatib, Jeff been searching social media for content relevant to Deutch, Dia Kayyali, and Jillian C. York, “Caught in the Net: atrocities, social media companies have increased The Impact of ‘Extremist’ Speech Regulations on Human Rights the speed at which they remove relevant content Content,” Electronic Frontier Foundation, May, 2019, https:// www.eff.org/wp/caught-net-impact-extremist-speech-regula tions-human-rights-content. 23 Social media companies do not provide warning of take- 21 “Removals of Syrian Human Rights Content,” Syrian Ar- downs when the underlying content violates their policies and chive, accessed January, 2021, https://syrianarchive.org/en/lost legal mandates related to dangerous organizations and child -found/may19-takedowns sexual abuse. In addition, none seem to yet have a comprehen- 10
companies, advocacy and research organizations, In “Theories of the Archive across Disciplines,” for- and international courts have explored whether es- mer MIT Library Collections Strategist and scholar tablishing an external repository or designing a new Marlene Manoff describes the complexity of defin- legal framework could ease these tensions. ing the term archive, and how that definition has There are at least four types of digital archives been both “loosening and exploding”27 as the value that could be used to inform the development of a of archives draws increasing attention for a range of mechanism to preserve digital content for human social functions. Further contextualizing the diverse rights cases and/or other purposes.24 Each offers les- role of archives, Emeritus Professor of Archivistics sons that might inform a constructive way forward. at the University of Amsterdam Eric Ketelaar28 pin- However, before diving into those models and to points the preservation and construction of mem- provide context, we first summarize the traditional ory as key elements of archives. Ketelaar explains role of archives in society and some of the common that shifts in archival policy have allowed societies principles that inform their structure and function. to understand why archives are a critical backbone of collective memory, and as such, have a broader goal than joint legal accountability or journalism. TRADITIONAL ARCHIVES Regardless of end use, however, one of the most im- portant characteristics of an archive is that it holds According to the Society of American Archives, “ar- lasting value as a connecting thread through time. chives . . . are permanently valuable records—such Archives, thus, fulfill a crucial role in society. 
The as letters, reports, accounts, minute books, drafts, fi- information they safeguard, organize, and make ac- nal manuscripts, and photographs—of people, busi- cessible allows people to exercise their rights, hold nesses and governments.”25 The definition of records institutions and governments to account, establish is “information created, received, and maintained historical narratives, protect evidence for later le- as evidence and information by an organization or gal processes, and preserve information for future person, in pursuance of legal obligations or in the generations. transaction of business.”26 Archives are the result of Archival science is an ever-changing discipline active roles and processes, and archivists are those that builds upon set methodologies,29 while adapt- who manage, maintain, and preserve the human re- ing to new technologies. As Ketelaar puts it, in “lib- cords that are generated as a product of individuals’ erating the file from the one and only context of the and societies’ day-to-day living. record creator” we allow for different perspectives in which “the subject of the record” can also become sive system for identifying human rights defenders or organi- zations. 27 Marlene Manoff, “Theories of the Archive from Across the 24 In this report, we use the term “human rights cases” to Disciplines,” Libraries and the Academy 4, no. 1 (2004), 9-25, mean all legal cases that allege violations of human rights, hu- doi:10.1353/pla.2004.0015. manitarian and international criminal law. 28 Eric Ketelaar, “Archives as Spaces of Memory,” Journal 25 “What Are Archives?,” Society of American Archivists, of the Society of Archivists 29, no. 1, (2008): 9–27, https://doi September, 2016, https://www2.archivists.org/about-archives. .org/10.1080/00379810802499678. 26 “Technical Committee ISO/TC 46, Subcommittee SC 11. 
29 See, e.g., Yvonne Ng, “How to Preserve Open Source In- ISO 30300:2011(En), Information and Documentation — Man- formation Effectively,” in Digital Witness: Using Open Source In- agement Systems for Records — Fundamentals and Vocabulary. formation for Human Rights Investigation, Documentation and 30300, 2011, 3.1.7.,” Online Browsing Platform, accessed May, Accountability, eds. Sam Dubberley, Alexa Koenig, and Daragh 2021, https://www.iso.org/obp/ui/#iso:std:iso:30300:ed-1:v1:en. Murray, (New York: Oxford University Press, 2020), 259–287. Digital Lockers | 11
a “party to the record.” As social media platforms take root across disparate geographies, multiple individuals are able to contribute to the documen- tation of events with their stories, testimonies, and experiences. Today, societies face the challenge of preserving digital collections with the same rigor, archival au- thority, access, and meaning attached to analog ar- chives, while using information and communication technologies to reach a broader set of actors and create a more “participatory historical culture”30 than pre- viously possible. These are significant challenges and are a subject of much discussion in archival science. Analog archives raise many ethical and philo- sophical considerations that continue to be relevant in the digital age. For example, critical theory con- tributes an extensive literature to archival studies. Much of this is beyond the confines of this report, documents are included. It is crucial to consider but elements of Philosopher Jacques Derrida’s writ- who is selecting the materials for preservation, ings are instructive for creating an archive or other whose voices are represented, and whose are ob- repository that honors human dignity and upholds scured. Historically, archiving was a means to con- human rights. Lecturer and historian of science solidate power,33 and archives were generally created Elizabeth Yale explains that for Derrida “violence and controlled by powerful individuals, groups, and [is] at the heart of archiving: when memories and institutions. 
Whenever an archive is created, power stories are recorded in the archive, alternate possi- dynamics are at play—a fact that the broader hu- bilities, other ways of telling the story, are repressed man rights community should center in its on-going or suppressed.”31 By necessity, archiving involves conversations.34 the exclusion of certain documents, and with that, Yale asserts that “no archive is innocent,”35 that the exclusion of particular voices and understand- regardless of the original intent behind an archive, ings.32 Determining which documents should be archives can be harnessed in multiple and unantic- kept external to the archive is as important as which ipated ways. The same records can be used for ter- ror or justice, depending on who mobilizes them.36 An example of this is the archive of the Ministry of 30 Digital storytelling started in the 1990s at the Center for Digital Storytelling in California which inspired, among others, State Security of the former German Democratic the BBC’s Capture Wales project and BBC’s Northern Ireland Republic (East Germany), known colloquially as Story Finders. Ketelaar, “Archives as Spaces of Memory,” 9–27. the “Stasi” records. In that context, records that 31 Elizabeth Yale, “The History of Archives: The State had been gathered by East Germany’s secret police of the Discipline,” Book History 18, (2015): 334. doi:10.1353/ bh.2015.0007. 32 For a discussion of the ways in which digital technologies and their preservation and use for investigations implicate power, see, 33 Yale. “The History of Archives: The State of the Discipline,” 332. e.g., Alexa Koenig and Ulic Egan, “Power and Privilege: Investigat- 34 Ibid. ing Sexual Violence with Digital Open Source Information,” Jour- 35 Ibid. nal of International Criminal Justice, (2021). 36 Ibid., 346. 12
later helped build the historical record and collective memory of life in East Germany from the 1950s through the 1980s, including revealing the fate of people terrorized during that period.

Similarly, in 2005, delegates from the Guatemalan Procurator for Human Rights discovered the Guatemalan National Police Archives in an abandoned warehouse.37 Originally developed for bookkeeping purposes, these archives are now being used to identify the role of the national police in the Guatemalan Civil War.38 Today, forensic teams are still archiving the documents to use for historical and legal purposes.

Similar examples have played out in relation to the case against Democratic Kampuchea’s (present-day Cambodia’s) 1975–1979 genocide during the Khmer Rouge regime. The Yale Cambodian Genocide Program (CGP), started by Yale’s History Professor Ben Kiernan in 1994, holds more than 100,000 documents, photographs, and maps to support trials of top Khmer Rouge leaders. The intent of the documentation was “to determine who was primarily responsible for the tragedy.”39

Collections of testimonial and documentary evidence have also informed other international trials, including those related to more recent atrocities, such as international crimes in Sudan, Bosnia, Guatemala, East Timor, Iraq40 and Rwanda.41

37 Kate Doyle, “The Guatemalan Police Archives: National Security Archive Electronic Briefing Book No. 170,” The National Security Archive, November, 2005, https://nsarchive2.gwu.edu/NSAEBB/NSAEBB170/index.htm.
38 Kirsten Weld, Paper Cadavers: The Archives of Dictatorship in Guatemala, (Durham: Duke University Press, 2014), https://www.dukeupress.edu/paper-cadavers.
39 Alaina Varvaloucas, “Three decades later, justice for genocide victims,” The Yale Herald, September, 2008, http://gsp.yale.edu/sites/default/files/varvaloucas_yale_herald_09.26.08.pdf.
40 The Iraq Memory Project was founded by Kanan Makiya and funded by the U.S. government to collect and preserve documents from Saddam Hussein’s Ba’athist government of 1968 to 2003. In 1991, founder Makiya and a BBC filmmaker traveled to Iraq to collect and archive documents that had been seized by Iraqi rebels relating to the Ba’ath party, including the Iraqi government’s campaign of ethnic cleansing of Iraqi Kurds. The Iraq Memory Project consists of the following: a Documentation Project, an Oral History Project, a Public Outreach Program, a Research Program, a Liaison and Coordinating Center, and a “Placing the Iraqi Experience” project. The entire collection is available to the public at Stanford’s Hoover Institution, including approximately ten million digitized pages and one hundred video files of the Ba’ath Arab Socialist Party of Iraq. A controversy, however, is who should have custody over the archived material. In 2008, the director of the Iraq National Library and Archive in Baghdad and acting Minister of Culture stated that the documents of the Ba’athist government were unlawfully seized from Iraq and should be returned. Additionally, the Society of American Archivists and Association of Canadian Archivists agree that the archived materials should not be held by a private organization. The two organizations issued a joint statement saying that the gathering of those documents was an act of pillage forbidden by the 1907 Hague Convention and “should be returned to the government of Iraq to be maintained as part of the official records in the National Library and Archives.” See, “ACA/SAA Joint Statement on Iraqi Records,” Society of American Archivists, April, 2008, https://www2.archivists.org/statements/acasaa-joint-statement-on-iraqi-records. For more information about the Iraq Memory Project, see, “About,” Iraq Memory Project, accessed March, 2021, http://www.iraqmemory.com/en/about; Renee Montagne, “Iraq’s Memory Foundation: Content in Culture,” NPR, March, 2005, https://www.npr.org/templates/story/story.php?storyId=4554528; Hugh Eakin, “Iraqi Files in U.S.: Plunder or Rescue?,” The New York Times, July, 2008, https://www.nytimes.com/2008/07/01/books/01hoov.html; “Iraq,” Hoover Institution, March, 2021, https://www.hoover.org/library-archives/collections/iraq.
41 “Interview: Documenting Year Zero,” POV, July, 2003, http://archive.pov.org/thefluteplayer/interview-documenting-year-zero/

These examples demonstrate the challenges that may arise when documents created for one purpose (for example, record-keeping related to employment) are later put to other uses, such as accountability, but
the original documentation wasn’t conducted in ways that were “fit for [later] purpose.” This disjuncture sometimes raises questions around the ethics, legality and/or utility of using information gathered or created for one purpose, for another. This “dual use” also raises the concern: how to keep any archival repository from becoming a means of inappropriate surveillance and other overreach by government actors and others who might be hostile to the human rights concerns of those whose data is included?

From a logistical perspective, Yale advocates for consistently following archival science principles when creating any repository,42 or “thinking archivally.”43 This is a process that requires understanding the context and conditions under which each document was created.44 Archival thinking also “demands that we see archives not only as sources of data to be mined by researchers but also as more than the sum of their parts.”45

Yvonne Ng of WITNESS has explained the basic components of digital preservation, including how archival principles relate to digital documentation. Citing the Reference Model for an Open Archival Information System, which establishes an international standard for archivists, she notes that “while an archive’s preservation strategies must be customized to its circumstances, the nature of its collections, and the needs of its intended users, there are established guidelines that describe [those basic components].”46 These include thinking about how information is bucketed into one of three “packages”: submission information packages (used for transporting information into an archive), archival information packages (the information stored in the archive), and dissemination information packages (the information shared with users).

In addition, various properties related to the content must be protected and preserved, including the item’s authenticity (ensuring that an item remains unchanged), availability (through ongoing existence and retrievability), identity (using a system to make the items identifiable and distinguishable from other items, as with a unique identifier), persistence (the technical integrity and viability of a digital item), renderability (the ability of humans and/or machines to use the digital item), and understandability (a human’s ability to interpret or “understand” the digital item). For legal purposes, archivists should also consider the need to maintain chain of custody (logging who has had access to the digital item and when, and what precautions have been taken to avoid alteration), as well as the importance of keeping working copies that can be modified separate from evidentiary copies that cannot, and the feasibility and sufficiency of both long- and short-term storage.47

Those engaging in the practice of digital preservation must think through every aspect of the process of archiving. In addition to the items listed above, this includes maintaining a sensitivity to context and adherence to diverse principles, and respecting the relative fragility of digital information. They should also consider what might happen if the archive is co-opted by those with ill intent, problematically deployed by malicious actors, or used by those with the best of intentions who may find that the archiving of information results in unintended consequences.

42 Ibid.
43 Yale, “The History of Archives: The State of the Discipline,” 345.
44 Weld, Paper Cadavers: The Archives of Dictatorship in Guatemala, 13.
45 Ibid.
46 Ng, “How to Preserve Open Source Information Effectively.”
47 See, e.g., UC Berkeley Human Rights Center and UN Human Rights Office, “Berkeley Protocol on Digital Open Source Investigations,” UN Office of the High Commissioner for Human Rights, 2020, 202.
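Several of the properties described above (authenticity, identity, chain of custody, and the separation of working copies from evidentiary copies) can be illustrated with a short sketch. The following Python example is only an illustration under stated assumptions: the class names, the identifier format, and the actor labels are all hypothetical, and a SHA-256 digest is assumed as the fixity check. It is not drawn from any archive discussed in this report.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an item's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class CustodyEvent:
    actor: str      # who had access to the item (hypothetical label)
    action: str     # what they did, e.g. "ingest" or "export working copy"
    timestamp: str  # when, as a UTC ISO 8601 string


@dataclass
class EvidentiaryRecord:
    item_id: str    # identity: a unique identifier (hypothetical format)
    digest: str     # authenticity: digest recorded when the item was ingested
    custody_log: list = field(default_factory=list)

    def log(self, actor: str, action: str) -> None:
        """Append a chain-of-custody entry: who, what, and when."""
        now = datetime.now(timezone.utc).isoformat()
        self.custody_log.append(CustodyEvent(actor, action, now))

    def is_unchanged(self, data: bytes) -> bool:
        """Check whether the given bytes still match the ingest digest."""
        return sha256_hex(data) == self.digest


def ingest(item_id: str, data: bytes, actor: str) -> EvidentiaryRecord:
    """Create a record for a newly received item and log the ingest."""
    record = EvidentiaryRecord(item_id=item_id, digest=sha256_hex(data))
    record.log(actor, "ingest")
    return record


# Usage: the evidentiary copy is never modified; edits happen on a working copy.
original = b"example video bytes"
record = ingest("item-0001", original, "archivist-a")

working_copy = bytearray(original)
working_copy[0:7] = b"EXAMPLE"  # an edit made only to the working copy
record.log("analyst-b", "export working copy")

print(record.is_unchanged(original))             # True
print(record.is_unchanged(bytes(working_copy)))  # False
```

A real repository would also need secure storage for the log itself, since a custody log that can be silently edited offers little assurance.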
DIGITAL ARCHIVES

Digital archives contribute additional qualitative and quantitative challenges to the continuity and security challenges of traditional archives. The first difference is the volume of potential content. Because of the extraordinary scale of potentially relevant data, archivists need tremendous resources and storage space to identify and preserve relevant data. In addition, digital archives must adjust for shifting dynamics between the producers of content, the subjects of content, the holders of content (e.g. tech companies, journalists, state actors), and codes of conduct for processing human rights-related content in ways that respect privacy laws.

In this section, we build off the background provided above to provide a typology of four different types of digital “archives” that have previously been used to aggregate and/or preserve digital content, including but not restricted to social media content. We illustrate those examples with a series of case studies, briefly describing each archive’s history, ownership, structure, financing, and user-accessibility, to explore whether any of these examples may inform the feasibility of creating one or more digital evidence lockers. The case studies are discussed and categorized based on the following criteria:

1. Who provides the content: This can be social media companies using their own internal processes or NGOs and other external actors who may scrape or manually download content from platforms. This also includes content creators (for example, those who take video recordings or photographs of human rights content) and uploaders who act as intermediaries between the content creator and repository.

2. Who holds the content: This varies based on:
   a. Whether the social media company holds the information on servers it controls;
   b. Whether the social media company provides the content to an external organization or repository;
   c. Whether an external organization downloads the content and holds the content on their own servers or servers to which they have access; and
   d. What content is included, since the type of content will often dictate where it goes (for example, child sexual exploitation material being held by the National Center for Missing and Exploited Children versus hashes of violent extremist content being shared with the Global Internet Forum to Counter Terrorism).

3. Legal obligation: This variable focuses on whether social media companies are legally required to preserve and/or share the content and/or protect the privacy of the content, versus whether participation is voluntary.

4. Who may access the content: The scope of access varies significantly between the different models. However, the range of possibilities tends to cluster into the three “buckets” identified below:
   a. Private: Only the social media company and/or law enforcement can access the user-generated content;
   b. Subscribers: Members of the public can request and/or otherwise qualify for access (e.g. by registration, payment, or other process); or
   c. Public: Anyone from the general public can access the user-generated content so long as they have access to the internet.

In some instances, a single repository may provide differential access to each of these groups, with varying requirements for how the information can be used.
After considering these variables, we found that the digital archives we examined tended to fall into one of the following types. Each type varies in terms of how it maintains the integrity of the content, which can have a bearing on the ability to authenticate the content at trial. For clarity, we organized the case studies into four models: the Legal Compulsion Model, the Voluntary Partnership Model, the Independent Collection Model, and the Hybrid Model. There is considerable overlap between the case studies we present, and as such, these models are neither exclusive nor exhaustive.

Ultimately, the strengths and weaknesses of each of these models for human rights evidentiary purposes can only be assessed on the basis of articulated end goals, and are thus context-specific. Below, we provide examples of each of the models.

The National Center for Missing & Exploited Children (NCMEC)

The National Center for Missing & Exploited Children (NCMEC) exemplifies a model where social media companies are required to share content with an external repository under the force of law. The private, nonprofit organization, established by the United States Congress in 1984,48 is mandated to help find missing children, reduce child sexual exploitation, and prevent child victimization.49 NCMEC has been heavily supported by the Office of Justice Programs (OJP), a division of the United States Department of Justice. To give a sense of funding and scale, OJP awarded NCMEC $33 million in the 2019 fiscal year to assist its operations.50

48 NCMEC is mandated by 42 U.S.C. §§5771 et seq.; 42 U.S.C. §11606; 22 C.F.R. §94.6.
49 U.S. Department of Justice, “The National Center for Missing and Exploited Children,” Office of Juvenile Justice and Delinquency Prevention, September, 2019, https://ojjdp.ojp.gov/funding/awards/2019-mu-mu-k012.
50 Ibid.
Given its design and public-private partnership structure, NCMEC has access to several databases of content provided by its external partners, including social media companies and the Federal Bureau of Investigation (FBI).51 The following case study focuses on NCMEC’s core responsibilities of managing its CyberTipline and Child Victim Identification Program (CVIP), both of which predate the Internet and had to be adapted to a social media context. The CyberTipline acts as a clearinghouse for complaints of child sexual exploitation and child pornography. The CyberTipline collects information regarding child sexual abuse imagery and distributes this data to relevant law enforcement agencies. CVIP is a central database of images depicting identified child victims.52 Electronic communication service providers are legally required to report images of identified child victims to NCMEC and retain related child sexual abuse imagery information for approximately 90 days under the PROTECT Our Children Act of 2008.53

There are two ways in which child sexual abuse material can enter the CyberTipline. First, members of the public who observe someone accessing or disseminating child sexual abuse imagery can report such an occurrence. Second, electronic communication service providers that detect child sexual abuse imagery on their services can report that information to NCMEC.54 After NCMEC receives the report, an analyst “reviews, augments, and deconflicts” the report.55 For U.S. cases, NCMEC then sends the information to United States federal or state law enforcement. For cases outside the United States, NCMEC works with Interpol, Europol, and national police.56

In 2018, the CyberTipline handled 18.4 million reports, yet the number of individual videos and images that were reported reached as high as 70 million.57 The majority of the reports came from electronic communication service providers. Many companies rely on software such as Microsoft’s PhotoDNA to find and remove images of child exploitation on their platforms.58 Electronic service providers are, however, not mandated to search their platforms for such content, so participation varies significantly. For example, in 2019 Facebook voluntarily submitted over 85 percent of all CyberTipline reports, in part because it is one of few social media companies that actively and voluntarily inspect uploaded content for imagery related to child sexual abuse.59

51 Elie Bursztein, Travis Bright, Michelle DeLaune, David Eliff, Nick Hsu, Lindsey Olson, John Shehan, Madhukar Thakur, and Kurt Thomas, “Rethinking the Detection of Child Sexual Abuse Imagery on the Internet,” Proceedings of the 2019 World Wide Web Conference (WWW ’19), May, 2019, https://elie.net/static/files/rethinking-the-detection-of-child-sexual-abuse-imagery-on-the-internet/rethinking-the-detection-of-child-sexual-abuse-imagery-on-the-internet-paper.pdf.
52 “CyberTipline: Is a Child Being Exploited Online?,” National Center for Missing & Exploited Children, accessed October, 2020, https://www.missingkids.org/gethelpnow/cybertipline; “Privacy Impact Assessment (PIA) Child Victim Identification Program (CVIP) Innocent Images National Initiative (IINI),” Federal Bureau of Investigation, May, 2003.
53 Legislation Sponsored by Senator Biden, Joseph, “S.1738 - 110th Congress (2007-2008): PROTECT Our Children Act of 2008,” Legislation of Library of Congress, October, 2008, https://www.congress.gov/bill/110th-congress/senate-bill/1738.
54 Bursztein et al., “Rethinking the Detection of Child Sexual Abuse Imagery on the Internet.”
55 Ibid.
56 “National Strategy for Child Exploitation Prevention and Interdiction,” U.S. Department of Justice, April, 2016, https://www.justice.gov/psc/national-strategy-child-exploitation-prevention-and-interdiction.
57 Caren Harp, “Visiting Our Partners at the National Center for Missing & Exploited Children,” Office of Justice Programs, August, 2019, https://www.ojp.gov/news/ojp-blogs/2019/visiting-our-partners-national-center-missing-exploited-children; Gabriel J.X. Dance and Michael H. Keller, “Tech Companies Detect a Surge in Online Videos of Child Sexual Abuse,” The New York Times, February, 2020, https://www.nytimes.com/2020/02/07/us/online-child-sexual-abuse.html.
58 Ibid.
59 Ibid.
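Detection and de-duplication tools of the kind mentioned in this case study generally work by comparing a fingerprint of each incoming file against fingerprints of previously identified material. The Python sketch below shows only the simplest version of that idea and rests on assumptions that should be stated plainly: it substitutes an exact SHA-256 match for the robust image hashing that a tool such as PhotoDNA performs (which is designed to match visually similar images even after resizing or re-encoding), and the function names, report identifiers, and sample data are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a file's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def triage(reports, known_digests):
    """Split incoming reports into already-known and new items.

    reports: iterable of (report_id, file_bytes) pairs.
    known_digests: digests of previously identified files.
    An exact-match hash only catches byte-identical copies.
    """
    seen = set(known_digests)  # copy so the caller's set is not modified
    known, new = [], []
    for report_id, data in reports:
        digest = sha256_hex(data)
        if digest in seen:
            known.append(report_id)
        else:
            new.append(report_id)
            seen.add(digest)  # later duplicates in this batch count as known
    return known, new


# Usage with placeholder bytes standing in for reported files.
reports = [("r1", b"file A"), ("r2", b"file B"), ("r3", b"file A")]
known_digests = {sha256_hex(b"file B")}

known, new = triage(reports, known_digests)
print(known)  # ['r2', 'r3']
print(new)    # ['r1']
```

Because a single changed byte produces a different cryptographic digest, exact matching of this kind reduces duplicate review only for unaltered copies, which is one reason practical systems rely on more tolerant hashing.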
The Child Victim Identification Program (CVIP) is a component of the Innocent Images National Initiative, which is part of the FBI’s Cyber Crimes Program.60 CVIP seeks to identify the victims of those who commit sexual exploitation of children, and serves as a central repository for images depicting child victims. Through this program, CVIP assists field offices in their efforts to identify new child pornography victims in CyberTipline reports to reduce duplicate investigative efforts. Any evidence obtained by the field offices is compared with existing datasets via hash values. Since 2002, this program has processed more than 149 million pieces of digital content of alleged child pornography,61 securing the following data associated with victims (if available): identification number, internet nickname, date of birth, age at the time of the photograph, gender, citizenship, nationality, identifying officer name, and identifying officer contact details. Other data may include physical characteristics such as height, weight, hair color, and eye color.

With the advent of social media platforms, NCMEC has been grappling with several challenges that are overwhelming its capabilities. One of the major obstacles is the fast-growing number of child sexual abuse imagery reports and content. In 2017 alone, 9.6 million reports of child sexual abuse imagery were sent to NCMEC, which constituted around 40 percent of NCMEC’s total cases across its history. This challenge is compounded as new child sexual abuse imagery content is constantly surfacing. More than 80 percent of this material is reported only once, with a rate of one million reports submitted per month.62 Scale is a problem, especially given NCMEC’s use of relatively outdated technology and the limitations of manual review of collected content. Another complexity that NCMEC faces is offenders’ adoption or use of new technologies to mask their identities. Offenders have been covering their digital footprints by connecting to virtual private networks (VPNs), deploying encryption techniques, and using the Dark Web. For example, Facebook reported nearly 60 million photos and videos of child sexual abuse imagery in 2019, most of which were found in its private Messenger App.63 These hurdles place overwhelming pressure on NCMEC’s manual review capabilities and, in turn, on law enforcement investigations.64

60 “Privacy Impact Assessment (PIA) Child Victim Identification Program (CVIP) Innocent Images National Initiative (IINI),” Federal Bureau of Investigation Services, May, 2003, https://www.fbi.gov/services/information-management/foipa/privacy-impact-assessments/cvip.
61 Melissa Stroebel and Stacy Jeleniewski, “Global Research Project: A Global Landscape of Hotlines Combating Child Sexual Abuse Material on the Internet and an Assessment of Shared Challenges,” National Center for Missing & Exploited Children, 2015, 3, https://www.missingkids.org/content/dam/missingkids/pdfs/ncmec-analysis/grp.pdf.
62 Bursztein et al., “Rethinking the Detection of Child Sexual Abuse Imagery on the Internet.” Also worth grappling with are the differences in how civil society organizations conceptualize “scale” versus social media companies and what that means for pragmatic responses to the preservation of information at risk of removal that has important human rights value.
63 Kate Duffy, “Facebook’s Encryption Plans Will Make It Harder to Catch Child Sex Abusers, Governments Warn,” Business Insider, October, 2020, https://www.businessinsider.com/facebook-encryption-harder-catch-criminals-child-abuse-2020-10. See also Andy Greenberg, “Facebook Says Encrypting Messenger by Default Will Take Years,” WIRED, January, 2020, https://www.wired.com/story/facebook-messenger-end-to-end-encryption-default/. In its current version Facebook Messenger offers the Secret Conversations feature which allows users to opt into end-to-end encryption.
64 Bursztein et al., “Rethinking the Detection of Child Sexual Abuse Imagery on the Internet.”

Summary:
• End-uses: Help find missing children, reduce child sexual exploitation, prevent child victimization, and distribute information to law enforcement agencies for potential prosecution.
• Who provides the content: The FBI and electronic communication service providers, including