How much is our data worth? - Polski Instytut ...
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
ISBN 978-83-66698-61-1 WARSAW JANUARY 2022 data worth? How much is our
Citations: Grzeszak, J., Łukasik, K., Święcicki, I. (2022), How much is our data worth?, Polish Economic Institute, Warsaw. Warsaw, January 2022 Authors: Jacek Grzeszak, Krystian Łukasik, Ignacy Święcicki (PIE) Cooperation: Agnieszka Wincewicz-Price, Paweł Śliwowski (PIE), Michał Bylicki, Ewa Zawojska (WNE UW) Editing: Annabelle Chapman Graphic design: Anna Olczak Graphic collaboration: Tomasz Gałązka, Joanna Cisek Text and graphic composition: Sławomir Jarząbek Polish Economic Institute Al. Jerozolimskie 87 02-001 Warsaw, Poland © Copyright by Polish Economic Institute ISBN 978-83-66698-61-1
3 Table of contents Key numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Key findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 The value of data in the contemporary economy . . . . . . . . . . . . . . . . . . . 7 What is data? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Digital platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 Review of the literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 The value of data for digital platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 Analysis of the value of data for digital platforms . . . . . . . . . . . . . . . . . . . . . . . 16 What value do Poles assign to their data? . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 Comparison of the results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 Poles’ views on digital platforms and services . . . . . . . . . . . . . . . . . . . . . 23 Overview of the responses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 Awareness of the transaction – data for access to services . . . . . . . . . . . . . . 25 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 Appendix 1. Assessing value using other methods . . . . . . . . . . . . . . . . 33 Appendix 2. Results of the DCE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 List of charts, diagrams, infographics, boxes and tables . . . . . . . . . . 36
4 Key numbers 87% of respondents claim that tech companies know too much about us. of Poles believe that tech 84% companies’ activity should be subject to greater control. PLN 4.025 billion value of Polish users’ data for Google in 2020. PLN 2.196 billion value of data from Poland for Facebook in 2020. 81% of Poles care about what tech companies are doing with their data. PLN 17.07 how much the average Pole is willing to pay so that Facebook does not have access to data aggregated per month on the platform and from other sources. PLN 14.10 how much we are willing to pay so that Google cannot access our data, including that concerning our per month activity on other portals. of respondents believe that no website or app should charge fees 69% for access, and 43% do not want to pay online service providers for better privacy protection.
5 Key findings I n this report, we sought to measure the is significantly higher than that reported by value of the data that Polish Internet us- these companies’ branches for the purpose ers generate on digital platforms (social of statistics and for the tax authorities. For media and Internet search engines). This value Google, monthly revenue from a single Polish was estimated from two sides: firstly, in terms user’s data amounts to PLN 10.16. We therefore of the revenue that Polish users’ data generates estimate that its total revenue in 2020 from all for digital platforms (Facebook and Google) its Polish users’ data was PLN 4.025 billion. and, secondly, in terms of the value that the us- For Facebook, monthly revenue from a Polish ers themselves assign to the data and privacy user’s data amounts to PLN 8.52. This means online. In addition, we conducted a survey that that – according to our calculations – total aims to check Poles’ knowledge of and views on revenue in 2020 from all its Polish users’ data digital platforms. was PLN 2.196 billion. Above all, respondents’ replies indi- In the survey, over half of Internet users cate that the average user expects monetary (63%) agrees with calls for a ban on display- compensation for the current situation, in ing ads based on private persons’ data. This which digital platforms have access to all our step would put an end to behavioural target- data and display personalised adverts. This ing by ads. This would block platforms’ current means that Poles consider the status quo, in business model, in which users pay for a ser- which we pay for digital services using our data, vice with their privacy. In addition, during the undesirable. empirical part of the study, it turned out that According to PEI’s study, the average Pol- respondents value personalized ads over ish user is inclined to pay PLN 17.07 per month non-personalised ones and, in certain cas- to prevent Facebook from having access to es, would expect compensation for the lack data aggregated on the platform and from other of personalized ads. sources. In the case of Google, Poles would be In addition, most respondents are con- ready to pay PLN 14.10 per month to prevent it cerned about digital giants’ growing influence from accessing their data, including their ac- – 84% believe that tech companies’ activity tivity on other Internet portals. should be subject to greater control. Moreo- At the same time, the platform’s business ver, as many as 87% believe that digital plat- model is based on processing this data and gen- forms know too much about us. 77% of Poles erating revenue from personalized adverts that are aware that they are paying for free services are displayed to users. In the case of Facebook, with their data. However, just 38% of respond- ad revenue accounts for 98% of the company’s ents are ready to pay the providers they use for revenue: USD 84 billion globally in 2020 (SEC, better privacy protection. This may be linked to 2020a). At Google, this share is 80%: USD 146 their distrust towards these companies – 76% billion globally in 2020 (SEC, 2020b). do not believe that a paid version of Facebook According to PEI’s calculations, for Google would better protect their rights. In the case of and Facebook, revenue from Polish users’ data Google, this is 73%.
6 Introduction T his study by PEI is the first compre- repeatedly presented with a set of choices be- hensive effort to approach the prob- tween various versions of a service, from which lem of the value of data from two they chose their preferred option. We studied sides. On the one hand, we examined the value users’ inclination to pay for a service in a modi- generated by the largest digital platforms – that fied version – without platforms tracking their is, Facebook and Google – from Polish users’ online activity, with a total lack of access and data. We concentrated on these two companies the inability for data left of the platform to be because they are widely used by Polish Internet processed, without targeted advertising, and users and also constitute a point of reference without creating a psychometric profile of the for many other digital services. Almost 97% of user. Juxtaposing these two perspectives ena- Polish Internet users use Google (GlobalStats, bled us to examine the distribution of benefits 2021) and as many as 89% use Facebook (Data- for the platform and its users. Reportal, 2020). The two most-visited domains The study was conducted in October 2021 in Poland are “google.com” and “facebook.com” in the form of a survey using the CAWI method (Interaktywnie.com, 2020). on a sample of 944 people1. As part of the study, The two companies’ revenue largely we also collected information on Poles’ knowl- comes from using the raw material that is the edge of the ways in which digital platforms work, data generated by users and processed by the their beliefs on given services’ social utility and platform. The writing of posts and rating, com- private usefulness, their opinions on regulation, menting on or searching for content by users their sense of control over their own data, and providers the information the platform needs to their readiness to pay for online services. sell its advertising products. The data generated In the first chapter, we present the value by users is a guarded good and, with a few ex- of data in today’s economy, especially in the ceptions, cannot be exchanged on the market. context of the platform business model’s sharp This is why they do not have a clearly-defined rise in the popularity. We include a review of the price. Yet the lack of price does not mean that literature on the value of data. In the second they have no value. One of the aims of our study chapter, we present the results of analysis of the is to estimate the value of the data that Polish value of Polish users’ data for digital platforms. users generate for global digital platforms. In chapter three, we present the methodology On the other hand, we decided to juxta- and result of our study of the subjective evalu- pose the value that users generate for the plat- ation of data and privacy by Polish Internet us- form with a study of the value that the users as- ers. The fourth chapter contains the results of sign to their own privacy, broadly understood. the survey on Poles’ knowledge and opinions on For this purpose, we carried out a discrete digital platforms. In the final chapter, we discuss choice experiment, in which respondents were the consequences of the PEI’s study. 1 Representative with regards to gender, age group and level of education.
7 The value of data in the contemporary economy What is data? The rapid development of ICT technol- technology progress and computers’ exponen- ogy means that the subject of data occupies tially-increasing computing power, the cost of a central place in analysis of contemporary storing, processing and transferring data has economies. Above all, advancing digitalization fallen dramatically. The resource’s greater avail- has meant that an ever growing number of hu- ability and decrease in the cost of processing man activities leave behind traces in the form it has contributed to the increased demand for of quantified information – data. Moreover, with data-based services. ↘ Box 1. Data as the new oil? The growing role of IT resources means that analogies to traditional and familiar resources are often used to describe them. The most popular comparison is to oil. • Like oil, data needs to be processed (refined) or turned into other raw materials. Raw data does not offer much value to its owner. Data also drives the contemporary economy, and controlling it is becoming almost as important as controlling access to hydrocarbons. • Yet these resources have more differences than similarities: above all, data is irreplaceable. One set of data cannot be replaced by another, as it contains different information. Data is also a non-rivalrous good. This means that., unlike oil, consuming it does not reduce its quan- tity. Data can also be duplicated without losing its quality. With the digitisation of the economy and social life, data resources are constantly growing, unlike the limited and shrinking amount of natural resources. To continue the energy analogy, data can also be compared to renewable materials. There is also an excess of these; the challenge is to collect and process them appropriately and to match supply and demand (in time and space). The key limitation of this comparison is above all the variety of data; although it is available in ever larger amounts, its irreplaceability means that we need to speak of growing diversity. Most probably, there is no good analogy for data based on familiar resources. The ongoing leg- islative work in many countries seeks to define the rules for obtaining, processing and using data shows that new frameworks are needed to describe the new economy, corresponding to its unique needs.
8 The value of data in the contemporary economy Treating data as a factor of production, or resources, the value stems from pos- one of the types of capital, we need to draw at- sessing them and their properties can be tention to a few of its properties: checked before acquiring them; → irreplaceability – each set of data contains → its price tends towards zero – this is be- different information, and sets cannot be cause the platform can estimate the data replaced without losing value. This differ- of every successive user “of the same type” entiates them from traditional resources, based on the data of users who have al- such as energy commodities; ready made their data available (Acemoglu → non-rivalry – a given set of data can be et al., 2019). used multiple times and simultaneously by Data is highly differentiated, not only be- various entities, without losing their prop- cause it comes from different entities. In Table erties. Physical (machines, raw materials) 1, we present the classification of data based on or human resources can only be used in its source, owner and the type of access. Each one place at once; of these types of data is subject to different reg- → its value can only be assessed once it ulations and can be used in different ways in the has been used – when deciding to invest economy. In this report, we focus primarily on in data (collecting and analysing it), there data generated by users and processed by the is no certainty about if and how it will pro- private sector. vide benefits. In the case of traditional ↘ Table 1. Selected kinds of data, based on type Source Personal data Information making it possible to identify an individual who is alive, such as: name and surname, email address, IP address, national identification number, police and medical records. Organisational data Information collected and processed by organisations (in both the public and private sector). It reflects a given institution’s nature. Owner Public sector data Information collected, stored, created and processed by public institutions, such as open public data. Private sector data Information collected, stored, created and processed by private enterprises. Type of access Proprietary data Information protected by intellectual property law, such as patents, trade secrets and copyrights. Public data Information not protected by intellectual property law. Source: prepared by PEI based on: Śledziewska, Włoch (2020).
The value of data in the contemporary economy 9 Data has become a key factor of produc- including software and data, already accounted tion in the contemporary economy, reflected for 84% of the value of companies in the S&P in the ranking of the world’s most valuable 500 index (MIT Technology Review, 2016). The companies. Tech companies – including ones ability to analyse large datasets to optimise whose value is based on collecting, processing a company’s internal processes, increase sales, and monetizing data, like Alphabet (Google) and plan the use of resources better or improve the Meta (Facebook) – currently occupy the most quality of customer service determines a com- places in the top ten companies with the highest pany’s competitive position on the market market capitalisation (CompaniesMarketCap, (Śledziewska, Włoch, 2020). More importantly, as 2021). Two or three decades ago, tech compa- data has become a key factor of production, the nies’ key resource was still hardware, and the platform business model – whose development companies with the top market capitalisation is based on extracting datasets – has become included General Electric or Cisco. Today, hard- more important. In 2020, the total revenue of ware is available as a service, and advantage is Google and Facebook amounted to around USD conferred by intangible assets, which of course 266 billion, which accounts for around 0.3% of includes data. In 2015, intangible resources, global GDP. ↘ Chart 1. Annual revenue of Google (Alphabet) and Facebook (in billions of USD) 200 150 100 50 0 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 Google Facebook Source: prepared by PEI.
10 The value of data in the contemporary economy Digital platforms The platforms’ business model is based model. For example, tractor producer John on acting as an intermediary between two sep- Deere has created a platform connecting pro- arate but complementary groups of custom- ducers of seeds, producers of chemical sub- ers (Doligalski, 2013). The platform not only stances, farmers and hardware sensors. The positions itself between the two sides of the data that the company acquires while acting as market, but also constitutes the infrastructure an intermediary between these groups is used needed for interaction between them to come to improve its products and services provided about. In the cast of digital platforms, this kind to clients, among other things (Srnicek, 2017). of architecture ensures privileged access to The move towards platforms results from the the stream of data constantly generated by fact that they make better use of economies of the platform’s customers. The growing role of scale, are “lean”, create new sources of value the platform model may be visible in how more and use data effectively to create positive feed- non-tech companies are moving towards this back, among other things. ↘ Table 2. Selected types of digital platforms Type of platform Parties How it operates Advertising platforms Users, Obtaining platform users’ data to sell advertising (e.g. search engine, advertisers space. The cost is borne by advertisers purchasing social network) personalised ads. Service platforms Users, service Matching users with service providers. Customers (e.g. Uber, Airbnb) providers (who pay for the platform’s operation) have more choice and service providers have access to a larger customer base and flexibility in managing resources. E-commerce Buyers, Extending the existing market for exchange between (e.g. Amazon, Allegro) sellers sellers and buyers. They earn a percentage of the and sales platforms transaction commission, from the seller (e-commerce) (OLX, Vinted) or the buyer (certain sales). Source: prepared by PEI. The platform model’s popularity stems scaled up at almost no marginal cost. The plat- from characteristics of an economy based on forms use network effects – both direct and in- software and data, connected by the Internet. direct – and skilfully use data to engage users. The latter means that more and more people By combining these effects, platforms can grow and institutions are connected via a network rapidly and monopolise (or oligopolise) the mar- that allows all sorts of communication. The ket. The winner (the monopolist or oligopolist) use of software and data enables activity to be receives a sizeable reward.
The value of data in the contemporary economy 11 ↘ Box 2. Network effects In the case of services, a network effect refers to when the user’s benefit from a given service depends (positively) on the number of other users of the same service. In the case of digital plat- forms, network effects can be direct or indirect. A direct network effect means that, as the number of users increases (in one of the two groups served by the platform), the utility of users in the same group increases. The best examples are a telephone network or social media platform – with every successive user, a given platform’s utility for the other users increases, as they are able to form a larger number of connections. An indirect network effect occurs when the increase in the number of users of one of the groups on the platform increases the utility of another group. For example, as the number of users of a platform offering software increases, the utility for developers putting their solutions on it in- creases. In contrast, the impact of each additional user on the programmes’ users is minimal or zero. In other cases, it can even be negative – on an auction platform, each additional seller increases the benefit for buyers (more choice), but can lower the benefit for other sellers (more competition). Among the top ten companies in terms of Another feature characterising platforms market capitalisation, as many as four [Alphabet are the sources of financing, which are con- (Google), Amazon, Meta (Facebook), Tencent] are structed differently than on the traditional mar- companies that largely owe their position to the ket. Some platforms only charge one of the par- skilful use of the platform model and of data. ties; the one with less price elasticity or that is The change in paradigm when it comes to how more dependent on the availability of users on the private sector generates value may also be the other side (such as Facebook, Google and visible in the number of terms that seek to make the free version of Spotify). The service is there- sense of the changes taking place in the digital fore subsidised for some of its users. Platforms economy. Terms such as “the gig economy”, “the also benefit from unique knowledge about de- sharing economy”, “the attention economy” or mand for products and can modify prices in real “surveillance capitalism” all draw attention to time – like Uber, whose drivers have no influence various aspects of the domination of the plat- over the price of a journey and, by implication, form model of organising a business. their revenue. Chart 2 shows the sharp increase in plat- Here, it is worth noting that the lack of fi- forms’ importance. In Q3 2021, platforms had nancial charges does not mean that the users a 42% share in the top 10 publicly listed com- do not bear any expense for using the service panies with a highest market capitalization in offered by the platform. That expense is the the world. For comparison, ten years ago, this data that they transfer directly to the platform share was zero. Digital platforms are marked in (such as posts on a social media platform, or the red, tech companies in blue, financial institu- choice of link in their search results) or informa- tions in green, the energy and mining industry tion that the platforms obtain without the users in grey, and production and FMCG companies having to do anything (such as their location or in yellow. the model of their device). This data is then used
12 The value of data in the contemporary economy to create a profile of the user, often using sta- not want to reveal (for example, sensitive data tistical techniques that enable him or her to be concerning his or her sexual orientation, politi- assigned characteristics that he or she would cal views or health). ↘ Chart 2. Cumulative values of the top 10 publicly listed companies in the world by stock market capitalisation (as a percentage) Berkshire 100 Hathaway Nvidia TSMC Tencent 80 Facebook Tesla 60 Amazon Google 40 Microsoft 20 Apple 0 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 RED - Google, Amazon, Facebook, Alphabet, Alibaba Group, Tencent BLUE - China Mobile, Tesla, IBM, Nvidia, TSMC, Samsung Electronics, Microsoft, General Electric, Apple YELLOW - Procter & Gamble, Johnson & Johnson, Walmart, Novartis, Hoffmann-La Roche, Nestlé GREEN - JP Morgan Chase, Wells Fargo, ICBC, China Construction Bank, Berkshire Hathaway, Visa, AT&T GRAY - Exxon Mobile, PetroChina, Petrobras, Chevron Corporation, BHP Billiton, Royal Dutch Shell Source: prepared by PEI. Access to this kind of information (includ- For the purposes of this report, we will ing intimate information) enables the platform examine how platforms in which the service to create a psychometric profile of the user and is free for users are financed from advertising adapt advertising to him or her – not only based revenue. The Diagram 1 illustrates this mecha- on demographic data or location, but also based nism: the flow of data and financial resources on on a given person’s views, values and fears. Be- the platforms. havioural products (understood as ads or other Diagram 1 shows the flow of data and mon- messages that use information about the user) ey on advertising platforms: based on sensitive data about the users were 1. By sharing photos, reviewing restaurants, also used during significant political events, liking posts or using an online search en- such as the Brexit referendum or elections in gine, the user generates data for the plat- countries around the world (Cadwalladr, 2020). form. Each time he or she is active online,
The value of data in the contemporary economy 13 he or she produces additional raw un- accuracy of Google search results). On the structured information that results from other hand, the platform uses it to sell ad- any kind of actions online (such as a series vertisers behavioural products. of clicks or moving the cursor). The ser- 3. The advertisers bear the financial cost of vice on the platform is seemingly free for displaying an ad, which depends on the the user; that is, he or she does not cover scope of the data used to define the tar- the cost of the service with money. In this get group. This cost is in principle the plat- sense, the platform subsidises the service form’s only source of income. These kinds on the user’s side. of products allow advertisers to reach 2. On the one hand, this data, which Zuboff groups of users that fit very narrow criteria (2020) calls behavioural data, is used with their marketing message. When users by the platform to improve the quality buy the products being advertised, the ad- of the service it is providing (such as the vertisers make money. ↘ Diagram 1. Flow of data in the platform economy 2 Platform ta m da r Be g tfo ) rm ) ha in la fo pl e vi at p th ic at ou er on n erv ra en ty e e to es l (g tivi ic pr en th rv Ch Ac od se sp of a uc rg e n ts fre es tim tio a fo e e in is th r is tim ad g in g s im p is ax O id bs Su (m Purchase of products and services 1 3 User Advertiser Profiled ads Flow of data Flow of money Source: prepared by PEI.
14 The value of data in the contemporary economy Review of the literature In the literature, the analysis of the value data from their bank account) per month (Prince, of data concentrates on attempts to estimate Wallsten, 2020). Maciej Sobolewski and Michał the impact of free digital services on the size of Paliński (2017) estimate that the value resulting GDP (Bukht, Heeks, 2018; IMF, 2018; Nakamura, from the increase in the level of privacy thanks Samuels, Soloveichik, 2016) or to quantify the to the GDPR amounts to EUR 6.5 per month. impact of free digital services on social welfare From companies’ perspective, the analy- (Brynjolfsson, Collis, Eggers, 2019; Brynjolfsl sis of the value of data focuses on calculat- et al., 2018; Bughin, Manyika, 2013). Another pop- ing the value that stems from open data (Ben- ular approach is estimating the value that users nett Institute, 2020), the higher market value assign to their data or privacy. These goods can- of companies that invest in data (PWC, 2019) or not be exchanged on the market, so survey and the income that comes from using behavioural empirical methods are used to estimate their data to target advertising (Acquisti, Marotta, Ab- value. According to this approach, the average hishek, 2019). On the business size, Shapiro and American is inclined to pay USD 5 per month Aneja (2019) and Facebook (SEC, 2020a) have to protect his data and would want to receive used an approach similar to ours. Yet unlike the USD 80 for access to this day (Winegar, Sunstein, above, this analysis takes into account different 2019). Overall, for the use of different types amounts of revenue, depending on geographical of data, users would demand from USD 1.82 region, and does not generalise by applying the (access to their location) to USD 8.44 (access to American context to other regions.
15 The value of data for digital platforms In our study, when estimating the value advertising market, which is their main source of that the data of a single user in Poland gener- income (eMarketer, 2021) (Chart 3). ates for online search engines and social media The business model of Facebook social platforms, we concentrated on Google and Fa- platform and Google services analysed by us cebook. As noted already, Google and Facebook is based on generating revenue from ads dis- reach the vast majority of Polish Internet users: played to users. In the case of Facebook, ad- almost 97% of them use the Google search en- vertising revenue constitutes 98% of its rev- gine (GlobalStats, 2021) and 89% use the Fa- enue – USD 84 billion globally in 2020 (SEC, cebook social media platform (DataReportal, 2020a). In the case of Google, this share is 80% 2020). Moreover, globally, these two companies – around USD 147 billion globally in 2020 (SEC, together have an almost 50% share of the online 2020b). ↘ Chart 3. Selected companies’ (Internet platforms’) share in global online advertising revenue (as a percentage) 80 70 60 50 40 30 20 10 0 2016 2017 2018 2019 2020 2021 Google Facebook Alibaba Amazon Tencet Source: prepared by PEI based on: Marketer (2021).
16 The value of data for digital platforms Analysis of the value of data for digital platforms In this analysis, we concentrate on calcu- users who use the platform. Here, we used data lating the average monthly revenue generated on the average revenue per user (ARPU) for us- by a Polish user for a given platform. After con- ers in Europe and the total revenue from this re- sidering the approaches available, we chose gion to estimate the number of Facebook users the method used both in independent studies in Europe (the company does not directly report (Shapiro, Aneja, 2019) and in analyses conducted these numbers) (SEC, 2020a). Then we divided by digital platforms (SEC, 2020a). revenue from the region by the number of us- We estimated the average monthly rev- ers and, as in the case of Google, adjusted it for enue generated for Google and Facebook by relative differences in wealth (measured in terms a single user’s data in the same way. of GDP per Facebook user in a given region). In both cases, we obtained the data on Based on these assumptions, we calcu- advertising revenue from the annual financial lated that: reports summitted to the US Securities and Ex- → For Google, average monthly reve- change Commission (SEC). This enabled us to nue from a single Polish user’s data is separate revenue from users’ data (in our study, PLN 10.16. Total revenue from all its Pol- equated with revenue from advertising) from ish users’ data amounted to PLN 4.025 other sources. For Facebook, we were able billion in 2020. to use data on its revenue from Europe2 (SEC, → For Facebook, average monthly rev- 2020a). For Google, we used global data (SEC, enue from a single Polish user’s data is 2020b). PLN 8.52. Total revenue from all its Pol- Next, for Google, this amount was divided ish users’ data amounted to PLN 2.196 by the global number of Internet users (World billion in 2020. Bank, 2020a) to estimate the average revenue Facebook itself used a similar method to from a single Internet user’s data. Google does estimate the annual monthly revenue from a us- not reveal how many people use its services; er’s data. It reports that average monthly rev- we used the number of Internet users as a good enue per user in Europe is USD 4.25 per month proxy. In the third step, we adjusted the reve- (PLN 8.69)3 (SEC, 2020a). However, this number nue from a single user for differences in individ- takes into account all kinds of revenue, not just ual countries’ wealth. For this, we used data on that from advertising activity. the GDP per Internet user in Poland compared Chart 4 shows the increase in average an- to the global average (World Bank, 2020b). On nual revenue from a user’s data globally. The a global and Polish scale, the proportion of In- method used to calculate the revenue in a giv- ternet users who use Google is similar, hence en year is analogous to the one used in our re- we equated the number of Google users with port. For Google, the compound annual growth the number of Internet users in general. rate (CAGR) over the course of the whole dec- In the case of Facebook, there are signifi- ade is 8.16%; for Facebook, it is as high as cant differences in the percentage of Internet 25.23%. 2 In its 10-K report, Facebook includes Turkey and Russia in its Europe region; this was taken into account in all the calculations. 3 Averaged amount from Q1-Q4 2020, adjusted for GDP per Polish Internet
The value of data for digital platforms 17 ↘ Infographic 1. Facebook and Google’s revenues from Polish users’ data in 2020 Number of Facebook and Google users Annual revenues from data in 2020 (in PLN billion) as a proportion of all Internet users in Poland Facebook 2.196 Google 4.025 21.5 million 28.3 million Ad revenues for Facebook and Google in comparison to their total revenues 10.16 PLN Average monthly revenue from a single Polish user’s data for Google 8.52 PLN Facebook Google Average monthly revenue from a single Polish Revenue: USD 85.968 million Revenue: USD 182.527 million user’s data for Facebook Advertisements: 98% Advertisements: 80% Source: prepared by PEI. ↘ Chart 4. Average annual revenue from the user’s data (globally, in USD) 40 35 30 25 20 15 10 5 0 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 Google Facebook Source: prepared by PEI.
18 What value do Poles assign to their data? Methodology Unlike market goods, the price of which Experiment (DCE). The questions in the sur- can be seen in commercial transactions, user vey reveal the value that a given person attrib- preferences for the protection of their privacy utes to goods (or services), where the good is remain hidden. The purpose of the study con- characterised by separately-valued attributes, ducted for the purpose of this report was to and each respondent is asked several times measure how users value the hypothetical pri- to choose the best option available, taking vacy protection on Facebook and in Google’s into account its features and cost. The sec- services. To this end, we used non-market ond part of the study was a survey on Poles’ stated preferences survey in our study. We beliefs about digital services. This was sup- used data obtained from specially-designed plemented by questions testing respondents’ surveys that contain hypothetical choice knowledge of how digital services work and situations in the form of Discrete Choice demographic data. ↘ Diagram 2. Example of a set of DCE options used in our study (version for Facebook) Variant A B C Has access Does not have access Has access to data, Privacy to the data placed and does not analyse also from activity on the platform any of your data beyond Facebook Profiling Does not profile you Does not profile you Profiles you Ads based Ads Ads No ads on your data Cost 5 PLN per month 20 PLN per month 0 PLN per month Source: prepared by PEI.
What value do Poles assign to their data? 19 The presented variants of choice con- goods that they already have (Thaler, 1980). cerned four attributes: privacy, described as The method selected (DCE) allowed us to re- the service provider’s access to data (both duce the measurement inaccuracies in the de- that placed directly on the platform and that clared preferences (Prince, Wallsten, 2020), as obtained from other websites visited); profiling well as to extract values for individual levels of (determining the user’s hidden features based selected attributes. For better results, further, on statistical analysis and his or her activity on in-depth empirical research should be carried the platform); the presence of advertisements out. Introducing an actualising stimulus (for (in general and targeted ones), as well as the example, money) into the study would prob- monthly cost of a given variant. ably reduce the estimates presented. In other When assessing the value of non-market words, when respondents have to bear the goods using models based on declared prefer- real cost of protecting their data, the amount ences, one can test users’ willingness to pay they are willing to pay is usually lower than (WTP) or willingness to accept (WTA) for access that in research based solely on declarations. to or restriction of access to a given good. In Nevertheless, the results of studies compar- our study, WTP was selected partly due to the ing survey methods with empirical ones prove inflated results generated by the WTA method, that the values obtained in both cases corre- which are caused by the “endowment effect”, spond (Bizon, 2016). people’s tendency to assign higher value to ↘ Box 3. Comparison of WTP and WTA WTA (willingness to accept) – the limit sales price, the minimum monetary amount of compen- sation that a person is willing to accept in exchange for giving up a certain good. WTP (willingness to pay) – the limit purchase price, a given person’s inclination to buy a certain good for a certain amount. A possible way to estimate the value of a non-market good in one of the two above approaches is a survey in which respondents are presented with a number of service variants with variation in the values of individual parameters. In our experiment, a total of 18 variants were tested for Facebook and Google. Each participant was shown six sets of three variants each for Facebook and same number for Google. In each case, there was also a choice of the status quo variant, in which we assumed that the platform (Facebook or Google) obtains data both from the service offered and from other sources, creates a user profile, shows him or her profiled ads, and the entire service is free. The DCE survey was conducted in com- a representative group of 944 participants. puter assisted web interview (CAWI) format Extensive qualitative preliminary studies (in- around the end of the third quarter of 2021 on depth interviews, pilot study) helped refine the
20 What value do Poles assign to their data? attribute descriptions and optimise the options A similar methodology is commonly used in re- available. Econometric preference analysis search on the valuation of non-market goods is based on random utility theory (McFadden, (Budziński, 2015; Paliński, 2021; Carson, Czaj- 1974). A random parameters (mixed) logit model kowski, 2014; Train, Weeks, 2005; Sobolewski, (MLX), assuming a variety of preferences among Paliński, 2017). For detailed results of the model, respondents, was used to analyse the data. see Appendix 1. Results Our study shows that the average Pole is of discovering features that the user had not inclined to pay PLN 17.07 per month for Face- previously shared on the platform (for exam- book not to have access to data aggregated ple, about their sexual orientation) based on on the platform and from other sources. Poles their online activity – users were ready to pay would be ready to pay PLN 14.10 per month PLN 3.60 per month in the case of Facebook for Google not to have access to private data, and PLN 1.92 in the case of Google. To avoid including activity on other websites. These advertising on Facebook, they would be willing amounts can be interpreted as the value that to pay PLN 3.81 per month; for Google, this is the average Polish Internet user assigns to ac- PLN 4.34. Interestingly, the results of the sur- cess to his or her data. For partial privacy – that vey show that users attribute positive utility is, only allowing the websites to access data on to targeted ads. If they were to receive a non- the platform or in users’ search history, without personalised advertisement on the platform, access to data on their activity on other web- they would expect compensation of PLN 1.28 sites–respondents were willing to pay PLN 12.35 from Google and PLN 1.04 from Facebook for Facebook and PLN 6.71 for Google. To avoid (although, in the latter case, the result was sta- profiling by the platform – that is, the process tistically insignificant). ↘ Table 3. Willingness to pay a monthly charge for a given attribute (in PLN) Attribute Facebook Google Full privacy 17.07 14.10 Partial privacy 12.35 6.71 Lack of ads 3.81 4.34 Lack of targeted ads -1.04* -1.28 Lack of profiling 3.60 1.92 Note: * statically insignificant result. Source: prepared by PEI.
What value do Poles assign to their data? 21 Significantly, respondents’ replies point the way to a discussion on a change in the to the negative value of the status quo. In oth- model of how digital platforms function. er words, users expect compensation for the In this context, it is also worth drawing current situation, in which a platform has ac- attention to the charges incurred by users of cess to all our data, creates a profile of its us- digital services in Poland (Chart 5). More and ers and displays personalised ads.4 more services are available based on a sub- In this way, the results of our study show scription model, in which users pay for access that Polish Internet users would be willing to and ads are not displayed while they are using pay a monthly charge for a service in the form of the service. Examples include VOD platforms, a search engine or social network (on a similar music streaming services (though a “free” ver- basis to the monthly charge for a streaming ser- sion financed from advertising is often available, vice) if, in exchange, the platforms did not col- too), and – increasingly often – news services lect information about their users. This opens (Grzeszak, 2021). ↘ Chart 5. Monthly charge for selected digital services in Poland (in PLN) Netflix Polsat Box Go Premium Play Station Plus Apple TV+ HBO GO Spotify Premium Tidal Premium "Private Facebook" Player Gazeta Wyborcza Premium "Private Google" Amazon Prime 0 10 20 30 40 50 Source: prepared by PEI. 4 The status quo variant was defined in this way. Users currently have the ability to change their privacy settings, but the vast majority maintain settings that allow platforms to access a wide range of their data.
22 What value do Poles assign to their data? Comparison of the results The value of Polish users’ data for Google case of Facebook even more than half the value) and Facebook estimated based on these two that the survey respondents would be willing to companies’ revenue is significantly lower (in pay for maximum privacy protection. ↘ Table 4. Results of two types of analyses: revenue and DCE Value of Polish users’ data For the companies For Internet users Monthly value of the Value of the data made Facebook average Polish user for PLN 8.52 PLN 17.07 available to Facebook by Facebook the average user Monthly value of the Value of the data made Google average Polish user for PLN 10.16 PLN 14.10 available to Google by the Google average user Source: prepared by PEI. In addition, this is a situation in which plat- Comparing the results of these two analy- forms generate significant revenue from users’ ses opens the way to a discussion on changes data, while at least some of these users obtains in the model that digital platforms use to func- negative utility from the current settings con- tion, which is outlined in the final part of this cerning the use of their data. report.
23 Poles’ views on digital platforms and services Overview of the responses The Polish Economic Institute’s survey that digital services, online ads, and paying for online accompanied the study on the value that users content and services. assign to their data concerned their views on ↘ Chart 6. Respondents’ attitude to selected statements concerning digital services and the companies that provide them (as a percentage) 15 17 29 14 25 A world without social media would be better 44 40 5 2 8 Tech companies’ activity should be subject to greater control 47 40 42 7 Tech companies know too much about us Even if a better, paid version of Google were to be 32 41 8 3 16 established, I do not believe that it would protect my rights better Even if a better, paid version of Facebook were to be 38 38 5 4 14 established, I do not believe that it would protect my rights better 6 8 31 50 6 I don’t care what big tech companies are doing with my data 0 20 40 60 80 100 Definitely agree Probably agree Probably disagree Definitely disagree Hard to say Source: prepared by PEI.
24 Poles’ views on digital platforms and services Most respondents are concerned about of Facebook would better protect their rights (73% digital giants’ growing influence: 84% believe that in the case of Google). Most of respondents (81%) tech companies’ activity should be subject to are not indifferent to what is happening to their greater control, and 87% that these companies data. It should be noted that this study was con- know too much about us. This may be related to ducted before former Facebook employee Franc- the distrust towards specific companies: 76% of es Haugen drew attention to Facebook’s approach respondents does not believe that a paid version to problems generated by social media. ↘ Chart 7. Respondents’ attitude to selected statements concerning paying for digital services and online content (as a percentage) 37 40 6 4 14 I pay for ‘free online services’ with my data 11 27 27 15 20 I am ready to pay the services that I use for better privacy protection 38 31 14 5 12 No website or app should charge for access 0 20 40 60 80 100 Definitely agree Probably agree Probably disagree Definitely disagree Hard to say Source: prepared by PEI. ↘ Chart 8. Respondents’ attitude to selected statements concerning online ads (as a percentage) 30 33 13 4 19 There should be a ban on displaying ads based on private individuals’ data 18 25 27 19 12 I don’t understand the basis on which I am being shown ads for products that I discussed with someone recently 11 32 25 13 19 Online ads recognise my needs well 0 20 40 60 80 100 Definitely agree Probably agree Probably disagree Definitely disagree Source: prepared by PEI.
Poles’ views on digital platforms and services 25 Most Internet users know about the transac- Internet uses agree that there should be a ban tion involving exchanging data for access to digital on displaying adds based on private individuals’ services described in this report. 77% of respond- data. This ban would put a stop to the behav- ents agree with the statement that they actually ioural targeting of ads. As a result, the only way pay for free online services with their data. 10% to adjust ads to users’ needs would be to use disagree and, for 14%, it is hard to say. This state that location from which someone goes online, of affairs seems to be widely accepted. 69% of re- or contextual advertising. However, 43% of re- spondents believe that no website or app should spondents say that the current ads addressed charge for access, and 43% do not want to pay at them respond to their needs, and 19% strug- websites for better privacy protection. gled to answer this question. 46 percent of re- The responses to statements concern- spondents understand the basis on which they ing online ads may seem paradoxical. 63% of are shown ads for products. Awareness of the transaction – data for access to services ↘ Chart 9. Respondents’ attitude to selected statements, by age group (differences with regards to average for the whole population, in percentage points) No website or app should charge I pay for ‘free online services’ for access with my data 12 10 10 8 8 6 6 4 4 2 2 0 -2 0 -4 -2 -6 -4 -8 -10 -6 18-34 years 35-44 years 45-54 years 55-64 years 18-34 years 35-44 years 45-54 years 55-64 years Agree Disagree Hard to say Source: prepared by PEI.
26 Poles’ views on digital platforms and services Older people are more convinced that the services using our data. It is worth emphasising Internet is free: 81% of respondents in the 55-64 here that people in the younger age groups use age group, 11 pp more than in the population as paid online services (such as VOD, music servic- a whole. At the same time, however, older people es, and so on) more often (Grzeszak, 2021), which were more likely to say that they understand the is in line with the lower percentage of respond- mechanism via which we pay for access to online ents who expect the Internet to be entirely free. ↘ Chart 10. Respondents’ attitude to the statement “I pay for ‘free online services’ with my data”, by level of education and earnings (as a percentage) 45 36 5 5 9 Master’s degree 43 32 8 5 12 Bachelor's/ Engineer Other post- 37 41 5 4 13 secondary education 31 45 6 1 16 Secondary education 20 34 16 8 22 Vocational education 32 34 4 9 21 Primary education 0 20 40 60 80 100 35 46 5 3 11 Upper (over 4500 PLN per person) 35 36 8 5 16 Middle (1500-4500 PLN per person) 37 36 6 3 19 Lower (up to 1500 PLN per person) 0 20 40 60 80 100 Definitely agree Probably agree Probably disagree Definitely disagree Hard to say Source: prepared by PEI.
Poles’ views on digital platforms and services 27 People with secondary or university educa- member), are more likely to be aware that we tion, as well as people in the upper class based pay for digital services with our data. on revenue (over PLN 4500 net per household ↘ Chart 11. Respondents’ attitude to the claim “A world without social media would be better”, by declared time per day using Facebook (as a percentage) 10 18 28 14 30 I use Facebook for more than 3 hours a day 42 17 16 10 15 I use it for 1.5-3 hours a day 19 16 23 20 21 I use it for 45 minutes - 1.5 hours a day 9 15 33 18 24 I use it for 15-45 minutes a day 14 24 28 12 22 I use it for 15 minutes a day 10 15 33 15 28 I use it, but not every day 13 15 31 11 30 I do not use Facebook 0 20 40 60 80 100 Definitely agree Probably agree Probably disagree Definitely disagree Hard to say Source: prepared by PEI. Interestingly, people who say they spend memory and that it is possible that people who 1.5-3 hours per day on Facebook tend to have use social media a lot, while having a negative a more critical attitude towards social media. In attitude towards it, systematically underesti- this group, 42% of respondents said that they mated the perceived time they spent on the strongly agree with the statement that a world platform (this would explain the big difference in without social media would be better. However, responses between people who use it for more it is worth emphasising that respondents re- than 3 hours a day and those who use it between ported the time spent using the websites from 1.5 and 3 hours a day).
28 Discussion O ur study draws attention to three key per month for a Facebook service that would issues that should become the foun- protect their privacy and PLN 14.10 for the anal- dation for a discussion on how digital ogous service from Google. At the same time, platforms operate in Poland. both companies generate on average PLN 8.52 Firstly, according to this study, the value and PLN 10.16 per month from a single Polish of the data generated by Polish users is sig- user’s data. This means that, theoretically, an nificantly higher than the amount reported by alternative model for managing platform ser- the companies Facebook Poland and Google vices could be created, with Internet users pay- Poland in their financial reports in the National ing a monthly subscription for a service that pro- Court Register (KRS). In its financial report for tects their privacy and does not display ads. For 2020, Facebook Poland recorded PLN 724.14 example, for a fee of around PLN 10 per month, million in revenue in 2020 and paid PLN 5.2 mil- both sides would feel the benefits; this amount lion in income tax. Meanwhile, in accordance is lower than that declared by respondents, but with our calculations, the value of Polish users’ higher than the platforms’ average monthly rev- data for Facebook amounted to PLN 2.2 billion enue from a single user. in 2020. In its financial report, Google Poland5 Thirdly, Polish Internet users’ knowl- recorded PLN 546.52 million in revenue and edge of how the platform economy works paid PLN 12.8 million in income tax. Our calcula- is surprisingly large. As many as 77% of re- tions show that the value of Polish users’ data spondents know that they are actually pay- for Google amounted to PLN 4 billion in 2020. ing for free online services with their data. At We are therefore speaking of around a three- the same time, as the results of the DCE study fold and sevenfold difference, respectively. It is show, they are dissatisfied with this state of af- worth noticing here that the values calculated fairs. This leads us to the paradox visible in the by us cannot, at the moment, be equated with results of the study: asked directly, most re- the companies’ revenue for the purposes of de- spondents think that websites should be free termining the amount of tax due. The difference and would be unwilling to pay to protect their between the declared revenue and the revenue privacy. Juxtaposed with respondents’ lack of arrived at in our study results from the fact that conviction that paid versions of the services the Polish Economic Institute calculated the would better protect their privacy and sense of value that Polish users’ data generates for each threat caused by online surveillance, the trans- of the companies. In contrast, revenue from ad- formation of the platform model may require vertisers may come from all over the world, not more radical change than introducing a sub- only from Poland. scription model. Secondly, Poles are dissatisfied with the The discussion on this matter remains status quo, in which they pay for digital services open and the subscription model mentioned with their data. They are willing to pay PLN 17.07 earlier is not the optimal solution. Assigning 5 The companies Google Cloud and Google Partners operate in Poland, too, but were not taken into account in this study.
Discussion 29 ownership to the data that users generate can their content. In contrast – apart from the cost be problematic, as it is often difficult to identify of maintaining and developing the service – clearly whom the information generated should Facebook does not produce its own content; it belong to. Moreover, the low bargaining power is attractive because of the content created by of a single user compared to a global corpora- its users. Moreover, not all Poles could afford to tion would mean that data could be sold at over- subscribe to digital services that are currently ly low prices. Privacy can also be thought of as free. In this sense, the advertising-based model an inalienable right that should not be subject to is more democratic: it provides the same service market operations. In addition, in the subscrip- to both richer and poorer users. For these rea- tion model, users would pay to access content sons, further research, based on the Polish Eco- that they themselves produce. This solution dif- nomic Institute’s findings, should consider pos- fers from that currently used by streaming ser- sible models for maintaining digital services in vices, where the fee serves to cover the costs of a way that does not violate Internet users’ right film production or paying musicians who present to privacy.
30 Bibliography Acemoglu, D., Makhdoumi, A., Malekian, A., Ozdaglar, A. (2019), Too Much Data: Prices and Inefficiencies in Data Markets, NBER Working Paper Series, National Bureau of Economic Research, Cambridge. Acquisti, A., Marotta, V., Abhishek, V. (2019), Online Tracking and Publishers’ Revenues: An Empirical Analysis, Workshop on the Economics of Information Security, Boston. Bennett Institute (2020), The Value of Data, Cambridge. Bizon, W. (2016), Ankieta a eksperyment w kontekście WTA i WTP – rola bodźca urealniającego w procesie wyceny danych osobowych, “E-mentor”, nr 5(67). Brynjolfson, E., Collis, A., Diewert, W., Eggers, F., Fox, K. (2018), The Digital Economy, GDP and Consumer Welfare: Theory and Evidence, The Sixth IMF Statistical Forum, http://research.economics.unsw.edu.au/ kfox/assets/session-4_kevin-fox_bcdef__digitaleconomy_imf_nov2018_presentation.pdf [accessed: 29.10.2021]. Brynjolfsson, E., Collis, A., Eggers, F. (2019), Using massive online choice experiments to measure changes in well-being, Proceedings of the National Academy of Sciences, No. 116(15), https://doi.org/10.1073/pnas.1815663116. Budziński, W. (2015), Znaczenie zmienności krańcowej użyteczności kosztu ponoszonego przez konsumenta dla wyceny dóbr nierynkowych, “Ekonomia”, nr 43. Bughin, J., Manyika, J. (2013), The mobile Internet’s consumer dividend, McKinesy, https://www.mckinsey.com/industries/technology-media-and-telecommunications/ our-insights/the-mobile-internets-consumer-dividend [accessed: 29.10.2021]. Bukht, R., Heeks, R. (2018), Defining, Conceptualising and Measuring the Digital Economy, “International Organisations Research Journal”, No. 13, https://www.researchgate.net/publication/ 327356904_Defining_Conceptualising_and_Measuring_the_Digital_Economy [accessed: 29.10.2021]. Cadwalladr, C. (2020), Fresh Cambridge Analytica leak ‘shows global manipulation is out of control’, The Guardian, https://www.theguardian.com/uk-news/2020/jan/04/cambridge-analytica -data-leak-global-election-manipulation [accessed: 29.10.2021]. Carrière-Swallow, Y., Haksar, V. (2019), The Economics and Implications of Data, International Monetary Fund, Washington. Carson, R., Czajkowski, M. (2014), The discrete choice experiment approach to environmental contingent valuation, (in:) Hess, S., Daly, A. (ed.), Handbook of Choice Modelling, Edward Elgar Publishing, Cheltenham. CompaniesMarketCap (2021), Largest Companies by Market Cap, https://companiesmarketcap.com/ [accessed: 29.10.2021]. DataReportal (2020), Digital 2020: Poland, https://datareportal.com/reports/digital-2020-poland [accessed: 12.10.2021]. Doligalski, T. (2013), Platformy wielostronne: istota i rozwój relacji z klientami, “E-mentor”, nr 1(48). eMarketer (2021), Worldwide Digital Ad Spending 2021, https://www.emarketer.com/content/ worldwide-digital-ad-spending-2021 [accessed: 12.10.2021].
You can also read