Opening up and Sharing Data from Qualitative Research: A Primer
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
WEIZENBAUM SERIES \17 HANDOUT \ JULY 2021 \ Isabel Steinhardt et al. Opening up and Sharing Data from Qualitative Research: A Primer Results of a workshop run by the research group „Digitalization and Science“ at the Weizenbaum Institute in Berlin on January 17, 2020
Opening up and Sharing Data from Qualitative Research: A Primer Isabel Steinhardt \ International Centre for Higher Education Research (Incher-Kassel), University of Kassel, steinhardt@incher.uni-kassel.de Caroline Fischer \ University of Potsdam, caroline.fischer.ii@uni-potsdam.de Maximilian Heimstädt \ Weizenbaum Institute and Berlin University of the Arts, m.heimstaedt@udk-berlin.de Simon David Hirsbrunner \ Freie Universität Berlin, hirsbrunner@zedat.fu-berlin.de Dilek İkiz-Akıncı \ Research Data Centre for Higher Education Research and Science Studies (RDC- DZHW), ikiz@dzhw.eu Lisa Kressin \ University of Luzern, lisa.kressin@unilu.ch Susanne Kretzer \ Research Data Center (RDC) Qualiservice, skretzer@uni-bremen.de Andreas Möllenkamp \ University of Hamburg, andreas.moellenkamp@uni-hamburg.de Maike Porzelt \ DIPF, Leibniz-Institut for Research and Information in Education, porzelt@dipf.de Rima-Maria Rahal \ Department of Social Psychology, Tilburg University, r.m.rahal@tilburguniversity.edu Sonja Schimmler \ Weizenbaum Institute and Fraunhofer FOKUS, sonja.schimmler@fokus.fraunhofer.de René Wilke \ Technische Universität Berlin, rene.wilke@tu-berlin.de Hannes Wünsche \ Weizenbaum Institute and Fraunhofer FOKUS, hannes.wuensche@fokus.fraunhofer.de ISSN 2748-5587 \ DOI 10.34669/WI.WS/17 EDITORS: The Managing Board members of the Weizenbaum-Institut e.V. Prof. Dr. Christoph Neuberger Prof. Dr. Sascha Friesike Prof. Dr. Martin Krzywdzinski Dr. Karin-Irene Eiermann Hardenbergstraße 32 \ 10623 Berlin \ Tel.: +49 30 700141-001 info@weizenbaum-institut.de \ www.weizenbaum-institut.de TYPESETTING: Roland Toth, M.A. COPYRIGHT: This series is available open access and is licensed under Creative Commons Attribution 4.0 (CC-BY_4.0): https://creativecommons.org/licenses/by/4.0/ WEIZENBAUM INSTITUTE: The Weizenbaum Institute for the Networked Society – The German Internet Institute is a joint project funded by the Federal Ministry of Education and Research (BMBF). It conducts interdisciplinary and basic research on the changes in society caused by digitalisation and develops options for shaping politics, business and civil society. This work has been funded by the Federal Ministry of Education and Research of Germany (BMBF) (grant no.: 16DII121, 16DII122, 16DII123, 16DII124, 16DII125, 16DII126, 16DII127, 16DII128 – “Deutsches Internet-Institut”).
Opening up and Sharing Data from Qualitative Research: A Primer Table of Contents 1 Foreword 5 2 Introduction 5 3 Opening up and sharing data 7 3.1 What is qualitative data? 7 3.2 What is actually meant by opening up and sharing? 7 3.3 What arguments speak in favor of opening up and sharing my data? 8 3.4 What arguments speak against opening up and sharing my data? 8 4 Data management plan 9 4.1 What is a data management plan? 9 4.2 What role does a data management plan play in opening up data? 9 4.3 Where can I find help creating a data management plan? 10 5 Data generation 10 5.1 Where can I find qualitative research data for re-use? 10 5.2 Who is the author of primary, contextual data or collections? 10 5.3 Can the idea of opening up and sharing data distort my data? 11 5.4 What do I have to consider if I want to open up and share my data? 11 5.5 What should I pay attention to, in terms of data protection? 11 5.6 What should I consider regarding obtaining consent? 12
Opening up and Sharing Data from Qualitative Research: A Primer 6 Contextualization and data preparation 12 6.1 Why should I contextualize my data? 12 6.2 How can I contextualize my data? 13 6.3 How should I prepare my data for reuse? 13 6.4 Can I publish contextual data that is generated during data collection and analysis? 14 7 Research data centers and repositories 14 7.1 Where can I archive my data? 14 7.2 What do research data centers (RDCs) do? 14 7.3 How do I find the right RDC for my qualitative data? 15 7.4 When should RDCs ideally be contacted? 15 8 Conclusion 16 9 Literature 16 9.1 Methodological discussion on opening and sharing qualitative data 16 9.2 Qualitative data 19 9.3 Data privacy and copyright 19 9.4 Referencing data 20 9.5 Statements 20 9.6 Biases of the research process 20
Opening up and Sharing Data from Qualitative Research: A Primer \5 1 Foreword The call for free access to research data and mate- primarily to qualitatively researching scientists in rials is becoming louder and louder from the polit- Germany. For this reason, it was initially written ical and scientific communities in Germany. More in German. One year later, we have now decided and more researchers are facing demands to open to translate the handout into English as well. The up qualitative research data for scientific purposes. reasons are twofold: first, we want to make it acces- They often have a general interest in sharing their sible to researchers in Germany with little knowl- data, but are unsure how to proceed. This handout edge of German. Second, we also want to give in- was developed to provide an initial introduction to terested people outside Germany an insight into the opening and sharing qualitative data. It was devel- German system and the German discussion about oped at a workshop held in Berlin in January 2020, opening up and sharing qualitative data. Due to the organized by the research group „Digitization of objectives and the history of its development, the Science“ of the Weizenbaum Institute, together handout focuses on the German context. This in- with its associate researcher Dr. Isabel Steinhardt cludes the literature references and further sources, from the University of Kassel. The workshop in- and the references to research data centers as well volved staff from German research data centers as as legal issues. We have deliberately not included a well as mentees and mentors from the Fellow Pro- contextualization of the German situation in inter- gram Open Science1 who already have experience national discussions in order to keep the handout as with Open Science, qualitative research, and in- short as possible. terdisciplinary research. The handout is addressed 2 Introduction The demand for research data and materials to be qualitative data types, field-specific methodolog- made accessible has become louder in recent years, ical approaches and the relationship between re- driven by political actors and funders, but also searcher and participant. An increasing number of by science itself, e.g. through the Open Science researchers now face demands to open up qualita- movement. The term open science includes vari- tive research data for scientific purposes, but even ous sub-areas such as Open Access, Open Source if they have a general interest in opening up their or Open Educational Resources. In the following, data, they also face the associated challenges, mak- we will focus specifically on the sub-area of Open ing it difficult to know how best to proceed. Data, i.e., the accessibility of data. While the prin- ciple of accessible data has already taken root in The aim of this primer is to answer basic introducto- quantitative social research, qualitative research is ry questions on the subject of opening up and sharing faced with other challenges in making data acces- qualitative research data. By opening up and sharing sible, for which solutions must be found in cooper- data, we mean making data accessible for second- ation with all the relevant actors. These solutions ary use for other researchers. In this guide, we will must, for example, do justice to the specifics of address the differences between different research 1 https://en.wikiversity.org/wiki/Wikimedia_Deutschland/Open_Science_Fellows_Program/Program
Opening up and Sharing Data from Qualitative Research: A Primer \6 fields and their specific requirements in dealing with In addition, the various types of data (e.g. images, qualitative research data, taking into account the dif- videos, interviews, observations, documents) and ferences between different types of data (e.g., inter- the different data collection and evaluation proce- view transcripts, video recordings, observation logs, dures in the various fields must be taken into ac- primary and context data). For a fundamental discus- count if research data is to be opened up and shared. sion of the concept of (qualitative) research data, we The individual researcher also faces the challenge refer the reader to the relevant specialist literature of deciding whether and which of their own data (e.g., Corti 2000; DGS 2019; Hirschauer 2014; Hol- can be shared and be made available for reuse, and lstein & Strübing 2018; Kretzer 2013). how this should be done. To ensure that data is safe to be reused, it is also crucial to be able to fall back Generally, there are reasons that speak in favor of on technical expertise, practical experience, rou- opening up and sharing, but also concerns and risks tines and corresponding infrastructures for open- (Akademie für Soziologie 2019; Birke & May- ing up and sharing (e.g., via research data centers), er-Ahuja 2017; Corti & Thompson 2006; Dunkel, which, however, have not yet been developed for Hanekop & Mayer-Ahuja 2019; Gebel, Rosenbohm all data types and research processes. Qualitative & Hense 2017; Kühn 2006; Laudel & Bielick 2019; data cannot simply be made available for second- Richter & Mojescik 2019; Rosenbohm, Gebel & ary use by following the existing and established Hense 2015; Steinhardt 2018; von Unger 2018). infrastructures for quantitative research data (DGS Arguments in favor of opening up and sharing data 2019; Knoblauch 2013; RatSWD 2015; von Ung- for primary research are citation boosts for citable er 2018). Rather, they require specific (including data sets and research results, the increased visibil- methodological, data protection and research eth- ity of the empirical data collection process, and the ical) requirements for subsequent use, which we possibility of networking. Secondary researchers will discuss in more detail below. It should be em- can use open data to find contrasting data or addi- phasized that the opening up and sharing of qual- tional data for their own research without dedicat- itative research data must not be a unilateral deci- ing resources to the generation of new data for these sion by funders and political actors. purposes. The secondary use of research data can also contribute to further development for the entire In recent years, studies about attitudes towards and research field, for example with regard to method- the feasibility of opening up qualitative data have ological questions, by avoiding “over-questioning”, emerged increasingly, as have examples of second- by reusing data for teaching, and by gradually ac- ary use of qualitative data (including Corti, Day cumulating more extensive and systematic databas- & Backhouse 2000; Corti 2007; Helfen, Hense & es. Since qualitative data cannot be replicated, the Nicklich 2015; Huschka, Knoblauch, Jagodzinski, long-term preservation of this data is of enormous Schumann & Witzel 2015; Oellers & Solga 2013; importance, also for historical reasons. Krügel & Ferrez 2013; Laudel & Bielick 2019; Med- jedović 2011; Medjedović & Witzel 2010; Smioski Despite these benefits, the archiving, provision and 2013). It is clear from these studies that, in addition reuse of data is not yet a common practice in quali- to clarifying the legal requirements, the particular tative social research (Hollstein & Strübing 2018). challenges for opening up and sharing qualitative This is due, on the one hand, to the interwoven na- data are creating as comprehensive a documentation ture of the data collection, processing, analysis and as possible and contextualizing the data to enable interpretative steps of qualitative research and, on reuse (Corti 2000; Hollstein & Strübing 2018). the other hand, to the iterative and alternating epis- temological process that goes hand in hand with This primer is intended to help with the question of the ongoing contextualization of the collected data. how qualitative data can be made openly available
Opening up and Sharing Data from Qualitative Research: A Primer \7 and shared. It was developed by researchers from questions. This was also done to keep the manual various disciplines in cooperation with employees concise and clear. In addition, the primer includes of research data centers. The focus is on answering references, links and an annotated list of literature fundamental and very practical questions, which is at the end of the document as additional resources why this primer has been structured as a catalog of addressing the topics raised here. 3 Opening up and sharing data 3.1 What is qualitative data? process, e.g., socio-demographic questionnaires, postscripts, interview guidelines, and category sys- In principle, it can be debated whether one can and tems. Data management plans can also be referred should speak of “qualitative” and “quantitative” to as context data. These context data are of great data at all. A curve diagram on a sheet of paper importance for an understanding of the collected can, for example, represent a quantitative data set research data and their re-use, but can also be of in- as well as representing a qualitatively understood terest for opening up and sharing in themselves, in- symbol. Whether research data are viewed as qual- dependent of the collected data, which is why they itative or quantitative in the context of a specific are also included in the following explanations. research design does not depend so much on their physical appearance, but on their specific use in the research process (Leonelli 2015). Neverthe- 3.2 What is actually meant by opening up and less, there are types of data that are commonly as- sharing? sociated with qualitative or quantitative research (see, e.g., Baur & Blasius 2019; Flick 2018; Flick, By opening up and sharing qualitative research Kardoff & Steinke 2015). data, we mean making them accessible for sec- ondary use. However, this does not mean simply In this guide, we refer to qualitative data as data that posting as much data as possible on the internet, is produced and used in the context of qualitative re- but rather making it available in a controlled and search processes. Such research data are typically documented manner to the respective research characterized by a high level of situational context in- community, taking into account specific method- formation. They are often less structured, standardized ological and content-related criteria. According- and metrized than is the case with data from quantita- ly, research data centers (RDCs) and repositories tive research (Kitchin 2014). Qualitative data are gen- are used to archive and make available reusable erated, for example, in the context of interviews and research data, usually with different access meth- participant observations and are manifest, among oth- ods (download, remote desktop, on-site) and ac- er things, in the form of audio recordings, transcripts, cess levels (e.g., data access only for scientific field notes, video recordings or photographs. purposes), which go hand in hand with different anonymization requirements and corresponding- Furthermore, we distinguish between raw data (e.g., ly varying potentials for re-use and analysis. For the audio file of an interview or the handwritten re- many primary researchers as well as participants cordings of participant observation) and processed (or field participants, informants, etc.), controlled data (e.g., anonymized transcripts, notes, images). access to the data is a basic requirement in order to In addition, context data is generated in the research even consider opening up the research data.
Opening up and Sharing Data from Qualitative Research: A Primer \8 the context of research projects funded by 3.3 What arguments speak in favor of open- these agencies to be archived and made ing up and sharing my data? available for secondary use. In EU-funded projects such as the Horizon 2020 program \\ Personal benefit: Publicly shared data are of the European Commission, archiving seen as a type of publication that can be has even become a mandatory criterion (the cited. Research data centers, for example, possibility of subsequent use can, however, work with the allocation of Digital Object be excluded under certain circumstances). Identifiers (DOI) in order to guarantee that Researchers who cannot archive their data the data can be permanently referenced. and make them reusable must increasingly In addition, cooperation and collaboration justify why this is not possible. projects can emerge when data is used by other researchers. Opening up and sharing, accordingly, can facilitate the collegial ex- change and learning process and thus also increase the quality of one’s own research. 3.4 What arguments speak against opening \\ Practical research benefit: Secondary use up and sharing my data? of research data has many advantages for the respective research communities, even \\ Sensitivity of the data: Qualitative data can though this argument is still largely ignored be highly sensitive. If, for example, it can- in many fields. In particular, so far only rare not be ensured that the data is fully ano- advantage is taken of the potential gains in nymized, researchers must consider wheth- knowledge from comparative research on er it is possible to open up and share the different data sets, the possibility of longi- processed data, especially when vulnerable tudinal analyses with secondary data or the groups (e.g., refugees, migrants, ill or crim- possibility of combining data sets for a sec- inalized participants, children and adoles- ondary research project with the possible cents, etc.), are concerned. In addition, field addition of new data, but all these have great access can be denied for sensitive topics potential. Using data that has already been if researchers announce that they want to collected can also prevent over-researching share the data. a field (a known problem, e.g., from school \\ Intervention into the research process: If research). Particularly in the case of vul- opening up and sharing data would lead to nerable groups, attempts should be made to an intervention into the research process, protect the field as much as possible. then the advantages and disadvantages \\ Scientific quality and research quality: must be weighed against each other. Usu- Sharing data makes the research process ally, opening up and sharing data does not transparent. This can increase the quality constitute an intervention into the research of research and stimulate method devel- process. An intervention into the research opment. Furthermore, most data collection process could take place, for example, if is publicly funded, which is why the data field access is not possible or if interview should also be made available to the (scien- partners announce that they will not an- tific) general public again (see also the Pub- swer certain questions if the answers can be lic Money, Public Code campaign). opened up and shared by the researchers. An intervention could also arise when re- \\ Demand from funders: Research funders searchers feel inhibited and sense that their increasingly expect the data collected in freedom of research is being obstructed by
Opening up and Sharing Data from Qualitative Research: A Primer \9 the “controlling” influence of subsequent involve a suitable RDC when submitting the data publication, and consequently proceed application (See: When should RDCs ideal- differently with their research, for example ly be contacted?) In the case of projects for by refraining from asking certain questions. which no additional funding can be applied \\ Time and cost: Preparing data so that it can for, e.g. PhD projects, investing the time to be opened up and shared can be time-con- make the data usable still makes sense. On suming and costly. For third-party funded the one hand, one can demonstrate one’s projects, there is the possibility to apply for research process transparently and make a additional funds for the preparation of the contribution that is beneficial to the research data for reuse. It makes sense to contact and community. On the other hand, the data then count as an independent publication. 4 Data management plan 4.1 What is a data management plan? 4.2 What role does a data management plan play in opening up data? A data management plan (DMP) contains informa- tion about the collection, processing, storage, ar- DMPs are a useful tool for any type of empirical chiving, publication and re-use possibilities of re- research project. For qualitative projects, and espe- search data in the context of a research project. When cially for opening up qualitative data, DMPs offer creating a DMP, the aim is to determine these activ- several advantages: ities in as target-oriented, systematic and efficient a manner as possible. The benefit of such formalized \\ Creating a DMP helps researchers consid- planning is to avoid restrictions in the later use of er in advance how the data will be handled data through early and comprehensive planning. during the project. \\ A DMP can and should be adapted over A DMP can span anything from a few paragraphs the course of the project. This openness to several pages. Many third-party funders (BMBF, and flexibility correspond to the iterative DFG, HBS, EU Horizon 2020) require a DMP as knowledge and data collection process in part of a funding application for the allocation of qualitative research projects. funds from certain funding lines. \\ A DMP facilitates work with a repository or Especially with regard to qualitative data, DMPs RDC, through which qualitative data can be are sometimes controversial. It is not uncommon shared. It is therefore advisable to involve for the obligation to create and submit a DMP to an RDC when submitting the application. be seen as too tight a corset, which does not do \\ The DMP makes it easier for primary re- justice to the iterative and corrective qualitative searchers to write a data and methods re- research process. However, DMPs do not have to port or study report, which most RDCs explain every step of the data collection in detail make available to secondary users in order and there is also no need to view them as a static to contextualize the data (Contextualization work product. Instead, DMPs can and should be and Data Preparation). adapted in the course of the research process.
Opening up and Sharing Data from Qualitative Research: A Primer \ 10 4.3 Where can I find help creating a data Tools for creating a data management plan: management plan? \\ DMPonline General information: \\ DMPtool \\ Research Data Centers \\ Research Data Management Organiser \\ Forschungsdaten.info \\ Exemplary plan from the German Institute for Economic Research (DIW Berlin) \\ Research data management at the HU Berlin \\ Checklist for creating a DMP in empirical educational research 5 Data generation Data generation is used here to refer to both the 5.2 Who is the author of primary, contextual collection of primary data (e.g. through interviews, data or collections? video recordings, participant observation) and to the compilation of data (e.g. creating a corpus from Whether research data are copyrighted works is newspaper articles). difficult to judge and must be considered on a case- by-case basis. According to German copyright law, copyright protection arises when a work is created. 5.1 Where can I find qualitative research data As a rule, mental effort that manifests itself in a for re-use? concrete work and shows certain individual char- acteristics is sufficient to meet this definition. Pure In addition to contacting researchers directly, there are information and facts are not protected by copy- a number of other ways to search for available data. right, but can carry ancillary copyrights. In the best- case scenario, questions of copyright and ancillary \\ Accredited research data centers (RatSWD) copyright should be clarified at the beginning of that specialize in qualitative research data the research project (as well as establishing who \\ Repositories of your own institution the copyright co-holders are). Furthermore, con- tractual aspects with the employing institution can \\ General repositories, e.g., Zenodo, OSF, also become relevant at this point if, for example, SocArxive they hold the rights of use to the generated data via \\ Meta-portals in which repositories can be the employment contract. Finding out which claus- found: re3data.org es or contractual agreements with one’s employer refer to data is therefore important. The answers to \\ Meta-portals in which data can be found: these questions largely determine who is entitled or Google Dataset Search, Open Knowledge potentially obliged to archive the data collected or Maps, Science Open, OpenAire make it available for reuse. \\ Subject-specific meta-portals in which data can be found: Verbund Forschungsdaten Bildung \\ DOI providers: Datacite, Crossref
Opening up and Sharing Data from Qualitative Research: A Primer \ 11 5.3 Can the idea of opening up and sharing of the collection and processing of the data to ensure data distort my data? that the data is interpretable. Strictly speaking, surveys such as interviews or even In addition to documentation, data protection (ano- participatory observations are not natural situations nymization) and access to the data (curation) are par- and are therefore always “artificial”, insofar as they ticularly relevant for sharing qualitative primary data are not non-invasive and more or less completely and/or context data (such as interview guidelines, unnoticed observations of natural situations (such transcription rules or video manuals for observation as unobtrusively observing the behavior of people situations). it is therefore essential to clarify in ad- in a shopping street). Announcing to participants vance whether the degree of anonymization required that the data to be collected will later be opened up for publication would still enable a meaningful re- and shared can add to the artificiality of the situation analysis, and whether the data can actually be contex- and trigger a specific reaction by participants. The tualized sufficiently, as well as which access route and researcher must therefore decide whether to obtain access level would be most suitable for the data. consent from the interviewed / observed persons to reuse the data before data collection, or whether this is only done after the survey (so-called “debrief- 5.5 What should I pay attention to, in terms of ing”). Obtaining consent after data collection (and, data protection? if necessary, transcription or processing of the data) has the additional advantage that participants can Data protection applies to personal data of natural then fully appraise which data they consent to shar- persons. The European General Data Protection ing. There may also be research projects in which Regulation (GDPR) applies in Germany. In addi- opening up and sharing must be generally ruled out, tion to the GDPR, other legal bases must be ob- e.g., due to data protection law or research ethics. served. The Federal Data Protection Act (BDSG) is relevant for private institutions as well as for fed- eral public institutions and, to a limited extent, for 5.4 What do I have to consider if I want to public institutions in the federal states. State data open up and share my data? protection laws (LDSG) are relevant for public in- stitutions in the federal states, such as universities. If the decision has been made to make the collected In addition, there are data protection regulations in data available for re-use, it is helpful to decide where special laws such as the school laws of the respec- and how the data will be shared early on in the re- tive federal states, the Federal Statistics Act or the search process. In general, it is recommended that Social Code, which must be observed if applicable. data preparation for archiving and sharing be carried The first thing to do always, however, is to check the out during the research process, so that complex doc- European GDPR, as it takes precedence over other umentation and reconstruction work does not have to legal sources. The data protection law of the federal be carried out shortly before the end of the research government or the federal states is only applicable project. It helps to consult the data management if the GDPR contains escape clauses or loopholes. officers at one’s institution or at an RDC, who can At the beginning of a research project, therefore, it provide helpful information on research data man- should be clarified which data protection require- agement that facilitates data sharing and archiving. ments must be observed during the project. Most Central points that must be considered to enable data universities also have data protection officers and re-use are: Compliance with legal aspects (especial- ethics committees who can be consulted on ques- ly data protection and copyright), consultation with a tions relating to the handling of personal data. research data center / repository, and documentation
Opening up and Sharing Data from Qualitative Research: A Primer \ 12 require parental consent in addition to or in- 5.6 What should I consider regarding obtaining stead of the consent of the child. National consent? escape clauses allow a lower age limit to be set, but no lower than 13 years. To our A declaration of consent means that the interviewed knowledge, the general national laws in or observed person gives informed consent to the Germany (BDSG, LDSG) do not specify collection and use of their data. Informed means age limits lower than the GDPR. However, that the research goal, the research procedure, the more explicit requirements can sometimes rights of the participant and, if applicable, other be found in special laws such as the school uses of the collected data have been explained and laws. When research is carried out with that the participant has understood this information. people under guardianship, the consent of Consent to data collection and consent to opening the guardian must be obtained. up and sharing data do not have to be given in one step. For example, some research data centers rec- \\ If the (anonymized) primary data are to ommend obtaining consent about data sharing after be made available for subsequent use, this data collection, since participants can then better must be listed in the consent form. assess what information they have revealed during \\ Usually, consent for collecting primary data data collection and which information they want to must be obtained before data collection. release for reuse. Consent for sharing (anonymized) primary data can sometimes be obtained after data \\ Obtaining informed consent to open up and collection. This depends on the research share the collected data is necessary ac- context, the sensitivity of the data, and the cording to the GDPR and required from a trust placed in researchers. perspective of research ethics. Consent can also be given orally. Obtaining written con- \\ The data collection and the associated sent is recommended, however, due to the documents used in the field (such as the obligation to provide evidence. informed consent form) are subject to ap- proval in certain fields of investigation \\ Sufficient time should be allowed before (e.g., school). These procedures can be very data collection to inform the participants. It time-consuming (especially in cross-border is not enough to briefly present the infor- projects) and should be planned according- mation on data protection and the consent ly. Often, the informed consent must also sheet. The participants must have under- be presented to the institutional data protec- stood the actual situation. tion officer as part of ethical and data pro- \\ According to the GDPR (see Art. 8 Para. tection review processes. 2 GDPR), participants under the age of 16 6 Contextualization and data preparation 6.1 Why should I contextualize my data? contextualization. Contextualization helps second- ary researchers to understand and use the data in the In order to be able to open up and share qualita- best possible way, so to speak “to slip into the shoes tive data, the data must be enriched with the neces- of the primary researchers”, even though they lack sary contextual information. This process is called direct experience of the primary data collection.
Opening up and Sharing Data from Qualitative Research: A Primer \ 13 6.2 How can I contextualize my data? 6.3 How should I prepare my data for reuse? To facilitate re-use, it is helpful for the secondary In addition to the contextualization of research data researchers to receive the contextual information in through documentation, data preparation is central a bundle, which is why research data centers usu- for its sharing and secondary use. Data preparation ally require a study report, data and methods re- refers to the steps that are necessary to share re- port, and contextualization reports in addition to search data in conformity with research ethics and the primary data, which they then have ready for data protection law, and in a way that is understand- subsequent users. The documentation of the data able and usable. When it comes to interview data, collection context can contain, for example, the fol- this concerns mainly transcription and anonymiza- lowing information: tion. Anonymization is necessary in almost all re- search contexts, not only to protect the people inter- \\ Project structure (e.g., research question, viewed or involved, but also to protect third parties research design) about whom, for example, someone has spoken. In \\ Project details (e.g., research funders, mem- the case of some data, however, anonymization is bers of the research team and difficult to achieve without destroying the analy- sis potential of this data for secondary analyses or \\ professional background and status) for teaching purposes (e.g., in the case of video re- \\ The survey method and the survey instrument cordings). In this case, RDCs offer access channels (e.g., interview guide or experimental setup) with different restriction levels to make such data available in a controlled and legally conform man- \\ Details on how the survey was carried out ner. This possibility should be discussed with the (e.g., how were participants recruited, who respective repository / RDC at an early stage. Then, carried out the data collection, location of a database can be made available with different ac- the data collection, circumstances of the data cess levels (e.g., regarding anonymized transcripts collection, such as the type of data collection as well as underlying audio recordings). or type of protocol, disturbances during the data collection, length of the session). To anonymize raw data (e.g., from interviews, vid- \\ Details on participants (e.g., status, func- eo recordings, observation protocols, images), the tion, socio-demographic data, inclusion of data are put into writing according to certain stan- respondents in the research process through dards and rules (transcription). Documenting these sharing the results with them) transcription processes improves the reusability of \\ Data for connecting data points, e.g., the data, so the following information is helpful: chronological order in the case of several sessions with the same person \\ Which standards and rules were used for the transcription (e.g., which transcription \\ Transcription rules and anonymization pro- rules have been applied – see also bibliog- cedure (see also the following section) raphy) Some RDCs (e.g., RDC Education at DIPF or \\ Who performed the transcription? (e.g., re- Qualiservice) provide researchers with recommen- searchers, student assistants, professional dations for contextualizing qualitative survey data. transcription service) \\ According to which rules or heuristics were parts of the raw data excluded from tran- scription? (e.g., small talk before the start of the official interview)
Opening up and Sharing Data from Qualitative Research: A Primer \ 14 \\ What rules were used for anonymization? or category systems using software such as This does not mean making the anonymiza- MAXQDA or Atlas.ti. Such data can also be tion protocol available, which would en- opened up and shared. Some research communi- able de-anonymization. Rather, the aim is ties are currently discussing whether disclosing to show which identifiers have been ano- “code trees” or “category systems” should be re- nymized in what way, i.e., the anonymiza- quired for theses or during the review processes tion procedure should be described. for publication in scientific journals. This information can be described in the data and methods report/ study report/ contextualization report. However, closed technical standards of the ex- isting software solutions can be problematic. For example, the evaluation programs MAXQDA or 6.4 Can I publish contextual data that is gen- Atlas.ti are not open source, which is why coding erated during data collection and analysis? schemes that are saved and made available in such programs cannot be reused by those who do not On the one hand, contextual data is generated have access to a license. When publishing memos, during data collection, such as when interview care must also be taken to ensure that anonymiza- guidelines, observation protocols or experimen- tion is maintained (e.g., not referring to the par- tal setups are designed. On the other hand, con- ticipant as “he” or “she” if the data ought to be textual data is generated when analyzing qualita- anonymized in a gender-neutral manner). tive data such as memos, codes, code structures 7 Research data centers and repositories 7.1 Where can I archive my data? 7.2 What do research data centers (RDCs) do? Researchers can choose between different options RDCs are facilities at universities or other scien- for archiving their data and making it reusable (e.g., tific institutions that act as service providers to repositories of their own institution, free reposito- improve science‘s access to research data. RDCs ries such as Zenodo, or RDCs). When selecting an advise researchers on questions of data prepara- option that is suitable for the researchers and the tion (e.g., documentation and anonymization), but data, there are a few things to consider: How and they do not usually conduct data preparation them- where can long-term archiving and easy retrieval of selves. They back up research data and ensure their the data be guaranteed? Is the data shared in a data permanent preservation and reusability (long-term protection compliant manner, can access to sensi- archiving). Long-term accessibility is ensured by tive data be controlled, and can participants revoke regularly migrating the data to current data for- the consent they have originally given? Is field or mats, archiving the data with backup copies in a data type-specific support guaranteed? Are my data professional IT infrastructure, and assigning digi- protected from commercial use? In this primer, we tal object identifiers (DOIs) so that the data is per- focus primarily on the functions and offers of RDCs manently referenceable. In addition, RDCs ensure in German-speaking countries, which specialize in transparent and standardized regulation of access to qualitative research data. the data. To do this, on the one hand, when submit- ting the data, they check the sensitivity of the data
Opening up and Sharing Data from Qualitative Research: A Primer \ 15 \\ Research Data Centre for Higher Education and the degree of anonymization carried out by the Research and Science Studies (RDC-DZHW) primary researchers. Then, in consultation with the primary researchers, the appropriate access routes \\ Interdisziplinäres Zentrum für qualita- are selected, matching the sensitivity of the data . tive arbeitssoziologische Forschungsdaten The sensitivity of the data is measured by how high (RDC eLabour) the chance is that the participants could be identi- \\ RDC Qualiservice fied and how great the potential damages would be if the data were to become de-anonymized. On the \\ Research Data Center of the Socio-Eco- other hand, RDCs usually conclude data usage con- nomic Panel tracts with secondary users, in which they obligate themselves, among other things, not to attempt any 7.4 When should RDCs ideally be contacted? de-anonymization of the data. One advantage of us- ing RDCs when opening up and sharing primary Ideally, RDCs should be contacted while the re- research data is that they can provide clarity and search idea is being developed, to discuss data certainty about legal requirements in the process. management and, if necessary, a DMP, and to apply for appropriate resources in the project application. RDCs establish controlled access to the data in dif- During the preparation phase of the data collection, ferent ways. In the case of less sensitive data, re- RDCs can be contacted to check the consent forms searchers can download them after submitting an with regard to data archiving and secondary data application. In the case of sensitive data, some- use. RDCs do not intervene in the research process, times only on-site access is possible and research- but can advise researchers on the steps for publish- ers can only access the data on the premises of the ing data. Seeking this advice early can help avoid RDC. In addition, additional security measures are problems with data archiving and secondary data often taken, e.g., there is no USB access to the on- use (e.g., lack of resources, consent forms making site computers. Cell phones or laptops are often not secondary use difficult or impossible, uncertainty allowed into the room. regarding the required documentation). 7.3 How do I find the right RDC for my qual- itative data? In Germany, the Council for Social and Economic Data is a central body for the accreditation of re- search data centers. A total of seven RDCs are cur- rently accredited and have specialized in different qualitative data types: \\ Research Data Center for Business and Or- ganizational Data (RDC-BO) \\ Research Data Centre for Education (RDC Bildung) at DIPF \\ Forschungsdatenzentrum Archiv für Gesprochenes Deutsch am Institut für Deut- sche Sprache (RDC-AGD)
Opening up and Sharing Data from Qualitative Research: A Primer \ 16 8 Conclusion Heterogeneous data collections of various kinds been established in Germany over the past ten are possible these days, using digital photo or film years, which, in addition to (providing opportuni- cameras, digital sound recording devices and / or ties for the) reuse of research data often also offer through digital research on the computer. Older advice about research data management. data sets, e.g., historical ethnological material col- lections or film recordings, can be photographed or Technical innovations and the growing accessi- transferred to new, digital formats, old texts can be bility of advice about research data management scanned and then searched for using digital tools. improve the ways that data are archived and can Digital and digitized data and materials can thus be be re-used. This is also the case for qualitative made reusable to a large extent. research data. At the same time, research data infrastructures have transformed the originally Against this background, and in the context of often snubbed act of long-term archiving into a mandatory long-term archiving, the (provision promising endeavor for the research communi- of opportunities for the) reuse of research data is ty: by preparing and publishing often laboriously also becoming increasingly relevant. This devel- collected data corpora, both primary researchers opment can be seen to harbor promising poten- (additional attention) and subsequent users (as tial for digital, modern research methods and data. new data collection can be (partially) avoided) as This assessment is mirrored in the initiative for the well as the entire research community (sustain- development of a national research data infrastruc- ability) can benefit. Against this background, we ture (NFDI) by the federal and state governments, hope to have made a practical contribution with which is currently being implemented by the Joint this primer, which helps qualitative researchers Science Conference (GWK) and the German Re- understand the prerequisites, options and the po- search Foundation (DFG). In addition, research tential of opening up and sharing data generated data infrastructures and research data centers have in their research. 9 Literature As already stated in the preface, we wrote this hand- 9.1 Methodological discussion on opening and out from and in the German context. Accordingly, sharing qualitative data we predominantly used German literature in the ori- ginal and did not changed for the English translation. Bambey, Doris, Louise Corti, Michael Diepenbro- ek, Wolfgang Dunkel, Heidemarie Hanekop, Be- tina Hollstein, Sabine Imeri, Hubert Knoblauch, Susanne Kretzer, Christian Meyer, Alexia Mey- ermann, Maike Porzelt, Marc Rittberger, Jörg Strübing und René Wilke. 2018. Archivierung und Zugang zu qualitativen Daten. 267. Berlin: Rat für Sozial- und Wirtschaftsdaten (RatSWD).
Opening up and Sharing Data from Qualitative Research: A Primer \ 17 Birke, Peter und Nicole Mayer-Ahuja. 2017. Se- Eberhard, Igor und Wolfgang Kraus. 2018. Der kundäranalyse qualitativer Organisationsdaten. Elefant Im Raum. Ethnographisches For- In: Liebig, Stefan, Matiaske Wenzel und Sophie schungsdatenmanagement als Herausforderung Rosenbohm (Hrsg.): Handbuch Empirische Or- für Repositorien. Mitteilungen der Vereinigung ganisationsforschung. Wiesbaden: Springer. S. Österreichischer Bibliothekarinnen Und Biblio- 105-126. thekare. 71(1). S. 41–52. Bishop, Libby. 2012. Using Archived Qualitative Gebel, Tobias, Sophie Rosenbohm und Andrea Data for Teaching: Practical and Ethical Consid- Hense. 2017. Der zweite Blick auf qualitative erations. International Journal of Social Re- Interviewdaten. Neue Perspektiven in der Indus- search Methodology. 15(4). S. 341-50. trial Relations-Forschung. Industrielle Bezie- Dunkel Wolfgang, Heidemarie Hanekop und hungen. 24(1). S. 7-30. Nicole Mayer-Ahuja. 2019. Blick zurück nach Gebel, Tobias und Sophie Rosenbohm. 2017. vorn: Sekundäranalysen zum Wandel von Arbeit Datenmanagement und Data Sharing in der Or- nach dem Fordismus. Campus Verlag. ganisationsforschung. In: Liebig, Stefan, Ma- Corti, Louise. 2000. Progress and problems of pre- tiaske Wenzel und Sophie Rosenbohm (Hrsg.): serving and providing access to qualitative data Handbuch Empirische Organisationsforschung. for social research. The international picture of Wiesbaden: Springer. S.157-183 an emerging culture. Forum: Qualitative Sozial- Gebel, Tobias und Alexia Meyermann. 2020. Se- forschung. 1(3). http://dx.doi.org/10.17169/fqs- kundäranalyse qualitativer Interviews – Eine 1.3.1019 Metaanalyse zur Praxis sekundäranalytischer Corti, Louise, Annette Day und Gill Backhouse. Forschung zu Arbeitsorganisationen. In: Rich- 2000. Confidentiality and informed consent: Is- ter, Caroline und Katharina Mojescik (Hrsg.): sues for consideration in the preservation of and Vom Geben und Nehmen. Die Praxis der Aufbe- provision of access to qualitative data archives. reitung und sekundäranalytischen Nutzung von Forum: Qualitative Sozialforschung. 1(3). http:// qualitativen Daten in der Sozialwissenschaft dx.doi.org/10.17169/fqs-1.3.1024 und ihren Nachbardisziplinen. Wiesbaden: Corti, Louise und Paul Thompson. 2006. Second- Springer (in press). ary analysis of archived data. In: Seale, Clive, Gebel, Tobias, Jakob Köster und Mariam Khucha. David Silverman, Jaber F Gubrium und Giampi- 2020. Archivierung und Nachnutzung qualita- etro Gobo (Hrsg.): Qualitative Research Prac- tiver Forschungsdaten im Spannungsfeld von tice: Concise Paperback Edition. SAGE Publi- Nutzbarkeit und Datenschutzanforderungen. Er- cations Ltd, London. S. 297-313. fahrungen und Konzepte aus dem Verbundpro- Corti, Louise. 2007. Re-using archived qualitative jekt eLabour. In: Richter, Caroline und Kathari- data - where, how, why? Archival Science. 7(1). na Mojescik (Hrsg.): Vom Geben und Nehmen. S. 37-54. Die Praxis der Aufbereitung und sekundärana- Dunkel, Wolfgang, Heidmarie Hanekop. 2020. For- lytischen Nutzung von qualitativen Daten in der schungsdatenmanagement und sekundäranalyti- Sozialwissenschaft und ihren Nachbardiszipli- sche Nutzung qualitativer Daten aus der Arbeits- nen. Wiesbaden: Springer (in press). und Industriesoziologie: das Kompetenzzentrum Heaton, Janet. 2008. Secondary analysis of qual- eLabour. In: Richter, Caroline und Katharina itative data: an overview. Historical Social Re- Mojescik (Hrsg.): Vom Geben und Nehmen. Die search. 33. S 33-45. Praxis der Aufbereitung und sekundäranalyti- Heaton, Janet. 2004. Reworking Qualitative Data. schen Nutzung von qualitativen Daten in der So- London: Sage. zialwissenschaft und ihren Nachbardisziplinen. Wiesbaden: Springer (in press).
Opening up and Sharing Data from Qualitative Research: A Primer \ 18 Helfen, Markus, Andrea Hense und Manuel Ni- Kretzer, Susanne. 2013. Arbeitspapier zur Kon- cklich. 2015. Organisierte Ungleichheit in der zeptentwicklung der Anonymisierung/ Pseudo- Leiharbeit? In: Industrielle Beziehungen. 22. S. nymisierung in QualiService. 282-304. Krügel, Sybil und Eliane Ferrez. 2013. Sozialwis- Heuer, Jan-Ocko, Susanne Kretzer, Kati Mozy- senschaftliche Infrastrukturen für die qualitative gemba, Elisabeth Huber und Betina Hollstein Forschung - Stand der Integration von qualita- 2019. Kontextualisierung qualitativer For- tiven Daten bei DARIS (FORS). In: Huschka, schungsdaten für die Nachnutzung – eine Hand- Denis, Hubert Knoblauch, Claudia Oellers und reichung für Forschende zur Erstellung eines Heike Solga (Hrsg.). 2013. Forschungsinfra- Studienreports [für qualitative Interviewstu- strukturen für die qualitative Sozialforschung. dien]. Working Paper 1 (in press). Berlin: SCIVERO Verlag. S. 113-124. Hirschauer, Stefan. 2014. Sinn im Archiv? Sozio- Kühn, Thomas. 2006. Soziale Netzwerke im Fo- logie - Forum der Deutschen Gesellschaft für kus von qualitativen Sekundäranalysen – Am Soziologie. 43(3). S. 300-312. Beispiel einer Studie zur Biografiegestaltung Hollstein, Betina und Jörg Strübing. 2018. Archi- junger Erwachsener. In: Hollstein, Bettina, Flo- vierung und Zugang zu Qualitativen Daten. rian Straus (Hrsg.): Qualitative Netzwerkanaly- RatSWD Working Paper 267/2018. Berlin: Rat se. VS Verlag für Sozialwissenschaften. für Sozial- und Wirtschaftsdaten. https://doi. Kratzke, Jonas und Vincent Heuveline. 2017. For- org/10.17620/02671.35 schungsdaten managen. E-Science-Tage 2017. Huschka, Denis, Hubert Knoblauch, Claudia Oel- Heidelberg: Universitätsbibliothek Heidelberg. lers und Heike Solga. 2013. Forschungsinfra- Kuula, Arja. 2011. Methodological and Ethical Di- strukturen für die qualitative Sozialforschung. lemmas of Archiving Qualitative Data. IASSIST Berlin: SCIVERO Verlag. Quarterly 34(3-4). S. 12-17. Imeri, Sabine. 2018. Ordnen, archivieren, teilen. Laudel, Grit und Jana Bielick. 2019. Forschungs- Forschungsdaten in den ethnologischen Fä- praktische Probleme bei der Archivierung von chern. Österreichische Zeitschrift für Volkskun- leitfadengestützten Interviews. Forum: Qualita- de 72(2). S. 213–243. tive Sozialforschung. 20(2). Jagodzinski, Wolfgang, Karl F. Schumann und Medjedović, Irena (2011). Secondary Analysis of Andreas Witzel. 2015. Archivierung und Sekun- Qualitative Interview Data: Objections and Ex- därnutzung qualitativer Interviewdaten – eine periences. Results of a German Feasibility Study. Machbarkeitsstudie. Abschlussbericht für das Forum: Qualitative Sozialforschung. 12(3). gemeinsame Projekt der Graduate School of So- Medjedović, Irena. 2014. Qualitative Sekundär- cial Sciences an der Universität Bremen und des analyse: zum Potenzial einer neuen Forschungs- Zentralarchivs für Empirische Sozialforschung strategie in der empirischen Sozialforschung. an der Universität zu Köln. DFG-Geschäftszei- Wiesbaden: Springer VS. chen: SCHU 348/8-1 und -2. Medjedović, Irena und Andreas Witzel. 2010. Knoblauch, Hubert. 2013. Einige Anforderungen Wiederverwendung qualitativer Daten. Archi- an Forschungsinfrastrukturen aus der Sicht der vierung und Sekundärnutzung qualitativer qualitativen Forschung. In: Huschka, Denis, Interviewtranskripte. Wiesbaden: VS Verlag für Hubert Knoblauch, Claudia Oellers und Heike Sozialwissenschaften. Solga (Hrsg.): Forschungsinfrastrukturen für die qualitative Sozialforschung. Berlin: SCIVERO Verlag. S. 27-34.
Opening up and Sharing Data from Qualitative Research: A Primer \ 19 Mozygemba, Kati und Susanne Kretzer. 2020. von Unger, Hella. 2018. Forschungsethik, digitale Datenvielfalt im Data-Sharing - eine kooperati- Archivierung und biografische Interviews. In: ve Aufgabe von Forschenden und Forschungs- Lutz, Helma, Martina Schiebel und Elisabeth datenzentrum, In: Lohmeier, Christine und Tuider (Hrsg.): Handbuch Biographieforschung. Thomas Wiedemann (Hrsg.): Datenvielfalt: Wiesbaden: Springer VS. S. 681-693. Potenziale und Herausforderungen. Wiesbaden: Wilke, René, Willi Pröbrock und Helen Pach. Springer VS (in press). 2019. Infrastrukturen für Forschungsdaten der Opitz, Diane und Reiner Mauer. 2005. Experi- qualitativen Sozialforschung. Soziologie. 48(4). ences With Secondary Use of Qualitative Data S. 467-486. - First Results of a Survey Carried out in the Context of a Feasibility Study Concerning Ar- 9.2 Qualitative data chiving and Secondary Use of Interview Data. Forum: Qualitative Sozialforschung. 6(1). Baur, Nina, und Jörg Blasius. 2019. Handbuch Rat für Sozial- und Wirtschaftsdaten (RatSWD). Methoden der empirischen Sozialforschung. 2015. Archivierung und Sekundärnutzung von Wiesbaden: VS Verlag. Daten der qualitativen Sozialforschung. Eine Flick, Uwe. 2018. The SAGE Handbook of Quali- Stellungnahme des RatSWD. Berlin. tative Data Collection. London: Sage. Richter, Caroline und Katharina Mojescik. Vom Flick, Uwe, Ernst v. Kardorff und Ines Steinke. Geben und Nehmen. Die Praxis der Aufberei- 2015. Qualitative Forschung - Ein Handbuch. tung und sekundäranalytischen Nutzung von 11. Auflage. Reinbek: Rowohlt. qualitativen Daten in der Sozialwissenschaft Kitchin, Rob. 2014. The data revolution: Big data, und ihren Nachbardisziplinen. Wiesbaden: open data, data infrastructures and their conse- Springer (in press). quences. Sage. Rosenbohm, Sophie, Tobias Gebel und Andrea Leonelli, Sabina. 2015. What Counts as Scien- Hense. 2015. Potenziale und Voraussetzungen tific Data? A Relational Framework. Philoso- für die Sekundäranalyse qualitativer Interview- phy of Science 82(5). S. 810-821. https://doi. daten in der Organisationsforschung. SFB 882 org/10.1086/684083 Working Paper Series. 43. Bielefeld: DFG Re- search Center (SFB) 882 From Heterogeneities 9.3 Data privacy and copyright to Inequalities. Schaar, Katrin. 2017. Die informierte Einwilligung Gebel, Tobias, Matthis Grenzer, Julia Kreusch, als Voraussetzung für die (Nach-)nutzung von Stefan Liebig, Heidi Schuster, Ralf Tscherwin- Forschungsdaten. Beitrag zur Standardisierung ka, Oliver Watteler und Andreas Witzel. 2015. von Einwilligungserklärungen im Forschungs- Verboten ist, was nicht ausdrücklich erlaubt ist. bereich unter Einbeziehung der Vorgaben der Datenschutz in qualitativen Interviews. Forum: DSGVO und Ethikvorgaben. RatSWD Working Qualitative Sozialforschung. 16(2):23. Paper Series 264. Grenzer, Matthis, Ines Meyer, Heidi Schuster und Smioski, Andrea. 2013. Archivierungsstrategien Tobias Gebel. 2017. Rechtliche Rahmenbedin- für qualitative Daten. Forum: Qualitative So- gungen der Organisationsforschung. In: Wenzel, zialforschung.14(3). Matiaske, Sophie Rosenbohm und Stefan Liebig Steinhardt, Isabel. 2018. Open Science-Forschung (Hrsg.). Handbuch Empirische Organisations- und qualitative Methoden - fünf Ebenen der forschung. Wiesbaden: Springer. S.129-156. Reflexion. In: MedienPädagogik. 32. https://doi. Kuschel, L. 2020. Urheberrecht und Forschungs- org/10.21240/mpaed/32/2018.10.22.X daten. Ordnung der Wissenschaft. S. 43-52.
You can also read