2021 Price setting in chile: Micro evidence froM consuMer on-line Prices during the social outbreak and covid-19
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Price setting in Chile: Micro evidence from consumer on-line 2021 prices during the social outbreak and Covid-19 Documentos de Trabajo N.º 2112 Jennifer Peña and Elvira Prades
Price setting in Chile: Micro evidence from consumer on-line prices during the social outbreak and Covid-19
Price setting in Chile: Micro evidence from consumer on-line prices during the social outbreak and Covid-19 (*) Jennifer Peña (**) Central Bank of Chile Elvira Prades (***) Banco de España (*) We are very grateful for useful comments to internal seminar participants, both in BoS and the CBCh, and to participants of the Conference on “Non-traditional data and statistical learning with applications to macroeconomics” organized by the Bank of Italy and the Federal Reserve Board and to Isabel Sánchez García. The views expressed herein are those of the authors and do not necessarily reflect the views of Banco de España nor the Central Bank of Chile. (**) Central Bank of Chile (CBCh). E-mail: jpena@bcentral.cl. (***) Bank of Spain (BoS). This paper was written while visiting the Central Bank of Chile under the agreement signed with Banco de España. The hospitality of the Central Bank of Chile is gratefully acknowledged. E-mail: elvira.prades@bde.es. Documentos de Trabajo. N.º 2112 2021
The Working Paper Series seeks to disseminate original research in economics and finance. All papers have been anonymously refereed. By publishing these papers, the Banco de España aims to contribute to economic analysis and, in particular, to knowledge of the Spanish economy and its international environment. The opinions and analyses in the Working Paper Series are the responsibility of the authors and, therefore, do not necessarily coincide with those of the Banco de España or the Eurosystem. The Banco de España disseminates its main reports and most of its publications via the Internet at the following website: http://www.bde.es. Reproduction for educational and non-commercial purposes is permitted provided that the source is acknowledged. © BANCO DE ESPAÑA, Madrid, 2021 ISSN: 1579-8666 (on line)
Abstract In this paper we analyze the price setting behavior in Chile by using scraped data from public websites of the main retailers including supermarkets, a pharmacy retailer and car dealerships. The data collection started in July 2019 and the dataset covers two major recent events: (1) the social outbreak and (2) the state of emergency declaration due to Covid-19, both episodes led to disruptions in the economy. With information on product varieties that accounts for 22 % of the CPI basket, we document several empirical findings as regards price setting behaviour in terms of stickiness, that is, frequency, implied duration and the size of price adjustments. We find that in spite of facing large shocks, prices adjusted very little, at a lower frequency and at a smaller size than prior to these two events. We also find that there was a reduction on product variety availability on-line, a typical feature that also has been found during natural disasters such as earthquakes. The reduction in product availability poses additional difficulties to construct CPI indexes and to properly capture price rigidities, which are relevant for monetary policy. Keywords: on-line price data, CPI, prices stickiness, retail distribution. JEL classification: E01,E31,L81.
Resumen En este trabajo analizamos la fijación de precios en Chile utilizando técnicas de webscraping, donde recolectamos los precios de productos de consumo a través de las páginas web de los principales establecimientos de comercio minorista del país. La base de datos incluye precios de productos de supermercados, farmacias y comercializadoras de automóviles. La recolección de datos se inició en julio de 2019 y contiene información de precios mientras ocurrían dos hechos relevantes: 1) el estallido social, y 2) la declaración del estado de emergencia por el COVID-19. Ambos episodios conllevaron disrupciones en el funcionamiento normal de la actividad. Con información de productos que representan aproximadamente el 22 % de la canasta del índice de precios de consumo (IPC), documentamos el comportamiento de los precios en términos de frecuencia y duración y el tamaño de estos ajustes, indicadores que informan sobre la flexibilidad o la rigidez del ajuste ante perturbaciones de esta magnitud y comparamos con respecto a tiempos normales. Encontramos que, durante estos episodios, la rigidez en el ajuste aumentó, ya que los precios se ajustaron con menor frecuencia y las variaciones, en promedio, fueron de menor tamaño. También encontramos que se produjo una importante reducción en la variedad de productos disponibles durante la pandemia, un hecho que suele ocurrir durante eventos disruptivos, como los desastres naturales. Esta reducción en las variedades supone un reto adicional para la elaboración del IPC, así como una mayor complejidad para capturar las rigideces nominales, que son relevantes para la política monetaria. Palabras clave: precios on-line, índice de precios de consumo, rigideces de precios, distribución minorista. Códigos JEL: E01,E31,L81.
1 Introduction The ability of monetary policy to influence in the short-run real economic activity it much depends on nominal rigidities. This paper intends to contribute to understand from a micro perspective the behavior of price setting in Chile during the 2019 social outbreak and the Covid-19 pandemic.1 The price setting behavior is even more relevant in a context with heightened uncertainty as regards how inflation will 1 Introduction behave over time.2 TheToability of monetary this end policyuse we will make to influence in the of a dataset short-run collected overreal economic internet sitesactivity of mainit much depends on nominal rigidities. This paper intends to contribute to understand retailers in Chile. The use of Big Data by policy makers and researchers became more from a micro popular perspective and accepted the behavior in recent years. Theof main price source settingisinInternet Chile during the Since data sets. 2019 1 social outbreak online shopping and the Covid-19 substitutes, pandemic. or at least The price supplements, offlinesetting behavior shopping, onlineisprices even more relevant in a context with heightened uncertainty as regards how inflation can also be used as a substitute, or supplement, of offline prices. One concern may will 2 behave lie overdifferences on the time. between on-line and off-line prices, the discrepancy between these two channels is expected to be small as in Chile third-party delivery services are To this used widely end weandwill make have use ofpenetration a higher a dataset collected comparedover internet to other sites of countries. 3 main retailers in Chile. The use of Big Data by policy makers and researchers became more popular Onlineanddata accepted in recent have many years. The advantages in main source is the context of Internet data sets. the Covid-19 Since Pandemic online shopping where there are substitutes, or at least lockdowns, mobility supplements, restrictions and offline shopping, online social-distancing rules. prices Data can also be used as a substitute, or supplement, of offline prices. One concern collection over the internet is called web scraping. This technique provides flexibility may lie andon the differences extreme automation.between Thereon-line and off-line is abundant prices, anecdotal the discrepancy evidence that scraped between prices these two channels is expected to be small as in Chile third-party delivery is a potentially useful tool for nowcasting and short-term forecasting of CPI inflation services 3 are widely as well usedsales as retail and variables. have a higher Onlinepenetration compared CPI provides to other with policy-makers countries. a reasonably good “pulse” as to the direction being taken by inflation in real time and at a higher Online data frequency. have many The Billion Prices advantages Project (BPP)in theatcontextMIT isofa the solidCovid-19 evidencePandemic that has where there are lockdowns, been functioning for more than a decade. mobility restrictions 4 and social-distancing rules. Data collection over the internet is called web scraping. This technique provides flexibility andThe extreme automation. objective Thereis istoabundant of this paper anecdotal provide evidence onevidence the behavior that of scraped prices prices/price is a potentially setting in Chile useful based tool fordataset on the nowcasting and short-term collected by the “Online forecasting Prices of CPI inflation Project” by the as well as retail sales variables. Online CPI provides policy-makers Central Bank of Chile. The dataset covers two episodes of particular interest that led with a reasonably good “pulse” asintothe to disruptions theeconomy, directionparticularly being takeninbyterms inflation of thein degree real time of and priceatrigidities, a higher frequency. by using online Theprices. BillionWe Prices wantProject to answer(BPP) at MIT questions suchis as: a solid How evidence that has nominal rigidities 4 been functioning for more than a decade. have changed during the social outbreak and the pandemia episode in Chile? Do we observe product/sectoral differences? What is the role of seasonal/sales discounts in observe 1 product/sectoral differences? What is the role of seasonal/sales discounts in a context iswhere Thereobjective The a of a renewed substantial interest thisshare this paper front of there is to as products provide are haveinitiatives recent evidence beenthedropped to studyoutof from pricing on-line behaviour afrom context a micro where a substantial perspective such as share the of products Price-setting have on Microdata been Analysis behavior dropped out Network prices/price from (PRISMA) on-line and retailers? setting in Chile based on the dataset collected by the “Online Prices Project” by the retailers? the Inflation Persistence Network (IPN) by the European Central Bank. PRISMA is a new ESCB Central Bank ofestablished research network Chile. The in dataset 2018 withcovers two episodes a mandate from the of particular Governing interest Council, thisthat led network to disruptions will In June study in the economy, 2019 thebehaviour price-setting particularly “OnlineofPrices Project” individual in terms firms and of the wasininitiated degree the retail at of price theusing sector Centralrigidities, microBank price In June 2019 the “Online Prices Project” was initiated at the Central Bank by using datasets. of Chile. online The prices. purpose Wewaswanttotofollow-up answer questions price such as: Howfocused developments nominalon rigidities goods of 2Chile. See The (2020) Blanchard purpose was to follow-up price developments focused on goods https://voxeu.org/article/there-deflation-or-inflation-our-future. have typically 3 changed duringatthe available social outbreak supermarkets andweb using the scraping pandemiatechniques; episode in Chile? Do we this provides typically availableonatthesupermarkets For a discussion representativenessusing web scraping of on-line techniques; versus off-line this provides data and implications see flash Cavallo 1 estimates (2017). of CPI available onasathere daily basis, that is, to with higher frequency flash estimates There of is a renewed CPI available interest this fronton a daily are basis, that is, recent initiatives with higher study pricingfrequency behaviour and 4 more timely that the one provided by the national statistical office. fromFor more details amore micro see http://www.thebillionpricesproject.com/. perspective This and timely thatsuch theasonethe provided Price-settingbyMicrodata Analysis the national Network (PRISMA) statistical office. Thisand article describes the Inflation the Network Persistence behavior(IPN)of online prices inCentral by the European Chile, Bank. taking a deeper PRISMA lookESCB is a new into article research describes the behavior network established ofwith online prices from in Chile,Governing taking aCouncil, deeper look into the micro-data that we in 2018collected have a mandate on a dailythebasis 3on between Julythis2019 network till the will micro-data study that price-setting we have behaviour ofcollected individual firms a daily and in basis the between retail sector July using 2019 micro till price November 2020, from the two of the largest retailers of fast moving consumer goods November datasets. 2020, from the two of the largest retailers of fast moving consumer goods (groceries, 2 personal See Blanchard care (2020)care and household equipment), a health retailer and auto https://voxeu.org/article/there-deflation-or-inflation-our-future. (groceries, personal and household equipment), a health retailer and auto dealers 3 in Chile. For aindiscussion We provide on provide evidence on the the representativeness key features on price-setting behaviour. dealers Chile. We evidence on of the on-line versus off-line key features data and implications on price-setting behaviour. see Cavallo (2017). 4 For more details see http://www.thebillionpricesproject.com/. The dataset collected includes products corresponding to expenditure categories The dataset collected includes products corresponding to expenditure categories that cover around 22% of the official Chilean CPI. In spite the short-time span, due that cover around 22% of the official Chilean 3 CPI. In spite the short-time span, due to7the recent start of this project, our sample BANCO DE ESPAÑA DOCUMENTO DE TRABAJO N.º 2112 covers two major events that took to the recent start of this project, our sample covers two major events that took place in the Chilean economy. Along this period two major shocks occurred which place in the Chilean economy. Along this period two major shocks occurred which are unambiguous examples of exogenous and unanticipated aggregate shocks, which are unambiguous examples of exogenous and unanticipated aggregate shocks, which makes them very interesting to analyze how prices are being set and to anticipate
In In June June 2019 2019 the the “Online “Online Prices Prices Project” Project” waswas initiated initiated atat the the Central Central Bank Bank of Chile. The purpose was to follow-up price developments of Chile. The purpose was to follow-up price developments focused on goods focused on goods typically typically available available at at supermarkets supermarkets using using web web scraping scraping techniques; techniques; this this provides provides flash estimates of CPI available on a daily basis, that is, with higher flash estimates of CPI available on a daily basis, that is, with higher frequency frequency and and more more timely timely that that the the one one provided provided byby the the national national statistical statistical office. office. This This article article describes the behavior of online prices in Chile, taking a deeper look describes the behavior of online prices in Chile, taking a deeper look into into the the micro-data that we have collected on a daily basis between July 2019 till micro-data that we have collected on a daily basis between July 2019 till November November 2020, 2020, from from the the two two of of the the largest largest retailers retailers of of fast fast moving moving consumer consumer goods goods (groceries, (groceries, personal personal care care and and household household equipment), equipment), aa health health retailer retailer and and auto auto dealers in Chile. We provide evidence on the key features on price-setting dealers in Chile. We provide evidence on the key features on price-setting behaviour. behaviour. The The dataset dataset collected collected includes includes products products corresponding corresponding to to expenditure expenditure categories categories that that cover around 22% of the official Chilean CPI. In spite the short-time cover around 22% of the official Chilean CPI. In spite the short-time span, span, due due to the recent start of this project, our sample covers two major to the recent start of this project, our sample covers two major events that tookevents that took place place in in the the Chilean Chilean economy. economy. Along Along this this period period two two major major shocks shocks occurred occurred which which are are unambiguous examples of exogenous and unanticipated aggregate shocks, unambiguous examples of exogenous and unanticipated aggregate shocks, which which makes them very interesting to analyze how prices are being set makes them very interesting to analyze how prices are being set and to anticipateand to anticipate what what could could bebe the the implications implications for for CPI CPI developments. developments. Firstly, Firstly, the the 18 18 of of October October (18-O) (18-O) the the social social outbreak outbreak started, started, during during aa week week aa severe severe curfew curfew was was set set and and from from this date on regular demonstrations took place with different intensities. this date on regular demonstrations took place with different intensities. All these All these actions actions led led to to some some disruptions disruptions in in economic economic activity, activity, mainly mainly in in the the retail retail sector. sector. Strikes Strikes and demonstrations dramatically reallocate store traffic and opening hours and demonstrations dramatically reallocate store traffic and opening hours which led to an increase of on-line sales and delivery which led to an increase of on-line sales and delivery services. services. Then, Then, aa second second shock shock took took place, place, this this time time ofof aa global global nature, nature, the the Covid-19 Covid-19 pandemia. pandemia. The first Covid-19 case was registered in Chile in early March The first Covid-19 case was registered in Chile in early March andand by mid-March, in order curb the spread of the virus and to avoid by mid-March, in order curb the spread of the virus and to avoid the collapse the collapse of of the the health health system, system, thethe government government introduced introduced some some measures measures in in order order to to maintain maintain social social distance. distance. To To this this end, end, the the government government announced announced dynamic dynamic andand selective selective quarantines, quarantines, curfews, curfews, the the closure closure of of restaurants, restaurants, malls, malls, schools, schools, universities universities and, and, when possible, stay at home work was encouraged. Few activities when possible, stay at home work was encouraged. Few activities such such as as supermarkets supermarkets and pharmacies where allowed to function normally. And, again, it and pharmacies where allowed to function normally. And, again, it was registered a heightened demand for online shopping. Both episodes was registered a heightened demand for online shopping. Both episodes implied the implied the closure closure of of some some retail retail stores stores and and an an increased increased demand demand forfor delivery delivery services. services. The The pandemia pandemia also put some pressures from the supply side. Under this episode we can also put some pressures from the supply side. Under this episode we can also also expect expect some some supply supply disruptions disruptions asas some some manufacturing manufacturing activities activities require require high high social social interaction. interaction. We We end end up up with with aa dataset dataset following following prices prices onon aa daily daily basis basis for for approximately approximately 47 47 product categories out of the 305 included by the National Statistical product categories out of the 305 included by the National Statistical Office, Office, and 4 and start following around 4, 755 product-varieties that account for 22% of the CPI start following around 4, 755 product-varieties 4 that account for 22% of the CPI Chilean basket. Chilean basket. The The main main stylized stylized facts facts on on the the price price setting setting was was calculated calculated out out for for four differentiated periods: (1) “pre-shocks”, that goes from June 2019 to 18 four differentiated periods: (1) “pre-shocks”, that goes from June 2019 to 18 of October, (2) “social outbreak” from the end of October till mid-March of October, (2) “social outbreak” from the end of October till mid-March and (3) and (3) “pandemia” “pandemia” fromfrom 18 18 March March and and (4) (4) from from the the partial partial uplift uplift of of mobility mobility restrictions restrictions from July till November 2020 (502 days at this point). We start from July till November 2020 (502 days at this point). We start by constructing by constructing common statistics common statistics of of price price setting setting based based onon averages averages across across products products and and time. time. These include measures focused on three dimensions: frequency These include measures focused on three dimensions: frequency of price changes, of price changes, implied implied duration duration and and the the magnitude magnitude of of price price changes. changes. The The main main conclusions conclusions are are that that both both product product availability availability and and the the price price setting setting behavior changed during Covid-19 and the social outbreak episodes. In behavior changed during Covid-19 and the social outbreak episodes. In particular, particular, the frequency the frequency of of price price changes changes fell fell during during both both episodes, episodes, suggesting suggesting that that firms firms were delaying their price adjustments. In sum, after the two shocks prices were delaying their price adjustments. In sum, after the two shocks prices became became moderately BANCO DE ESPAÑA 8 moderately more sticky. more sticky. InIn terms DOCUMENTO DE TRABAJO N.º 2112 terms ofof availability availability we we found found that that there there was was aa significant reduction significant reduction in in online online availability availability of of products products during during pandemia pandemia andand that that they are recovering at a slow pace. This reduction occurred after March they are recovering at a slow pace. This reduction occurred after March 18, the 18, the date in date in which which the the state state of of alarm alarm was was declared declared and and confinement confinement measures measures where where
The main conclusions are that both product availability and the price setting behavior changed during Covid-19 and the social outbreak episodes. In particular, the frequency of price changes fell during both episodes, suggesting that firms were delaying their price adjustments. In sum, after the two shocks prices became moderately more sticky. In terms of availability we found that there was a significant reduction in online availability of products during pandemia and that they are recovering at a slow pace. This reduction occurred after March 18, the date in which the state of alarm was declared and confinement measures where put in place in Chile due to Covid-19. The goods that disappear quickly from the stores are mostly perishable and some nonperishable items that people are likely to Weinend hoard fearupof with futurea supply datasetdisruptions, following prices that isontoasay, daily basis on a focus foressential approximately goods. 47 productduring However, categories out ofoutbreak the social the 305 theincluded by the reduction in National Statisticalwas online availability Office, not and start following significant. around 4, 755 product-varieties that account for 22% of the CPI Chilean basket. The main stylized facts on the price setting was calculated out for Along four differentiated these episodes periods: the CLP (1)also “pre-shocks”, registered athat goes fromagainst depreciation June 2019 to 18 the USD. of October, (2) “social outbreak” from the end of October Therefore, if passed through, we may expect imported goods to become more till mid-March and (3) “pandemia” expensive. To from 18 March explore this,and we (4) havefrom the partial identified the uplift brand offormobility restrictions each product and from July till November 2020 (502 days at this point). We start therefore we can infer the country of origin. This allows us to explore additional by constructing common statistics features such as theofrole price setting based of exchange on averages or rate pass-through across whetherproducts there and have time. been These includedisruptions. supply-chain measures focused on three dimensions: frequency of price changes, 5 implied duration and the magnitude of price changes. This paper is organized as follows. After this introduction, in section 1, we reviewThe themain conclusions literature are that both on microdata product on prices andavailability webscraping andinthe price setting section 2. In behavior section 3changed we describeduringthe Covid-19 and the social data collection method outbreak and theepisodes. In particular, characteristics of the the scrapedfrequency data. of In price section changes 4 we fell reportduring the both main episodes, results onsuggesting that firms product availability, we conclude were frequencydelayingand with sizethe their of preliminary price adjustments. adjustments results Indrawn comparing so4far. sum,allafter the two Finally, periods. shocks prices became in section 5 moderately more sticky. In terms we conclude with the preliminary results drawn so far. of availability we found that there was a significant 5 We are awarereduction that this in assumption online availability of products may be simplistic in a contextduringwhere pandemia and firms some domestic that they are recovering at a slow pace. This reduction occurred after March 18, the might be using imported goods and at the same time we are identifying the country of origin of final goods, that at the same time if involved in global value date in which the state of alarm was declared and confinement measures where chains the true origin of the product 2 be Literature can from other or from multiple Review origins. put in place in Chile due to Covid-19. The goods that disappear quickly from the 2 Literature stores are mostly perishable Review and some nonperishable items that people are likely to Price setting analysis based on official micro data.– The existence of price hoard in fear of future supply disruptions,5 that is to say, a focus on essential goods. rigidities Price setting is a determinant of the oneffectiveness of monetary policy. In general, However, duringanalysis the social based outbreak official micro the reduction data.– in online The existence availability of price was not the literature rigidities is a refers determinantto two of ways the toeffectiveness measure theof degree monetary of price policy.flexibility: In one general, significant. constitutes the literature a macro refers to approach two ways through modelling to measure the asdegreein Gali and Gertler of price flexibility:(2000), one and another constitutes is a macrothe micro approachapproach. through The latter is based modellingaasdepreciation on in Gali and four types Gertler of data Along these episodes the CLP also registered against the(2000), USD. sources: and on surveys another is the directed micro at companies, approach. The onlatter microdata is basedcompiled on fourin each types reporting of data Therefore, if passed through, we may expect imported goods to become more establishment sources: on surveys and for each directed variety of article by the agency in charge of generating expensive. To explore this,atwe companies, have identifiedon microdata the brand compiled for each in each reporting product and the consumer price establishment and indicator, for each on scanner variety of databyand article themoreagency recently in web scraped charge of data. generating therefore we can infer the country of origin. This allows us to explore additional This paper focuses the consumer price on the micro on approach using scraped online data. features such as theindicator, role of exchange scanner ratedata and pass-throughmoreor recently whether web scraped there have data. been This paper focuses supply-chain disruptions. on the5 micro approach using scraped online data. Three studies have analyzed the patterns of price setting of Chile, evidence from 1999-2005 has been Three studies havereported analyzedbythe Medina patterns et of al.price (2007) usingof micro setting Chile, data evidenceprovided from by This the paper national is organized statistical as follows. office, and hasAfter beenthis introduction, recently updated in bysection Canales 1,and we 1999-2005 has been reported by Medina et al. (2007) using micro data provided review López the literature on microdata on prices and webscraping in section 2. In by the (2021) national with information statistical office,upand to has March 2020. been recentlyBothupdated studies by findCanales substantial and section heterogeneity 3 we describe theofdata collection method and divisions. the characteristics ofetthe López (2021) in withfrequency information adjustment up to March across product 2020. Both studies Chaumontfind substantial al. scraped (2011) data. In provideinevidence section with4 we report scanner data the main 2007-2009 results on in divisions. product the supermarket availability, industry, heterogeneity frequency of adjustment across product Chaumont et al. frequency coverage ofand 7.23%sizeofofthe adjustments CPI basket. comparing all 4 periods. Finally, in section 5 (2011) provide evidence with scanner data 2007-2009 in the supermarket industry, coverage 5 of 7.23% of the CPI basket. We are aware that this assumption may be simplistic in a context where some domestic firms Price might be using setting imported analysis goods andbased at the sameon webscraping techniques.– time we are identifying the country This paper of origin of contributes finalPrice to up-date setting goods, that the at the analysis facts same time ifbasedfor Chile involvedon but with webscraping in global online or web value chainstechniques.– scraped the true origin ofThisdata. One paper the product of can the firstother be from contributes one to from or to up-date use multiple online the factsdatafor to origins. construct Chile but with a price onlineindex or web in scraped Chile was Cavallo data. One (2013) of the firstand onedue to to use its good onlineperformance data to construct and it ahas priceproven indextoinbeChile a representative was Cavallo source BANCO DE ESPAÑA 9 (2013) of information. and due to its good DOCUMENTO DE TRABAJO N.º 2112 Online data has5and performance proved it hasto proven providetomany be a advantages representative in particular during the social source of information. distancing Online data has episode. proved The practicemany to provide of social distancing advantages in means staying at home and away from others as much particular during the social distancing episode. The practice of social distancing as possible to help prevent of Covid-19. means staying Online data and at home also away has advantages from othersinasthe muchcontext of natural as possible to help disasters, preventforof
(2011) provide evidence with scanner data 2007-2009 in the supermarket industry, coverage of 7.23% of the CPI basket. Price setting analysis based on webscraping techniques.– This paper contributes to up-date the facts for Chile but with online or web scraped data. One of the first one to use online data to construct a price index in Chile was Cavallo (2013) and due to its good performance and it has proven to be a representative source of information. Online data has proved to provide many advantages in particular during the social distancing episode. The practice of social distancing means staying at home and away from others as much as possible to help prevent of Covid-19. Online data also has advantages in the context of natural disasters, for more details see Cavallo et al. (2014), where they analyze the price setting behavior after the earthquake in Chile in 2010 and in Japan 2011. Notwithstanding, online data have also limitations. Even when retail sales are carried online, their services could stop operating during crisis, so there is no way to scrape data. The applications in empirical literature of online or web scraped data are in general three-folded and include: (1) inflation measurement, (2) inflation forecasting (including nowcasting) and (3) the analysis of micro price setting mechanism. In 2008, the Billion Prices Project was created at MIT. It has remained the largest project focused on web scraping and online prices analysis, for more details see Cavallo Cavallo and and Rigobon Rigobon (2016). (2016). The The usage usage ofof big big data data inin Central Central Banks Banks also also is is becoming a common practice. In 6 becoming a common practice. In addition to the above mentioned PRISMA led addition to the above mentioned PRISMA led by by ECB. ECB. Some Some central central banks banks have have used used big big data data tools tools to to track track prices prices online. online. Macias Macias and Stelmasiak (2019), they evaluate the ability of web scraped data to and Stelmasiak (2019), they evaluate the ability of web scraped data to improve improve nowcasts nowcasts of of Polish Polish food food inflation. inflation. They They also also explore explore product product selection selection and and classification classification problems, problems, their their importance importance in in constructing constructing web web price price indices indices and and other other limitations of online datasets. The Sveriges Riksbank, Hull et al. limitations of online datasets. The Sveriges Riksbank, Hull et al. (2017), (2017), they they developed developed anan automatic automatic internet internet data data collection collection process process to to collect collect sales sales prices prices daily for selected fruits and vegetables from a number of Swedish online daily for selected fruits and vegetables from a number of Swedish online retailers. retailers. Their Their results results indicate indicate that that the the information information from from the the daily daily data data could could increase increase the the precision in short-term inflation forecasts in precision in short-term inflation forecasts in Sweden.Sweden. Once Once wewe have have mentioned mentioned the the penetration penetration that that web web scraping scraping hashas had had in in central central banking as a great ally of big data, we want to use its benefits to banking as a great ally of big data, we want to use its benefits to provide evidence provide evidence on price on price setting setting and and inflation inflation during during Covid-19 Covid-19 pandemic pandemic lockdowns. lockdowns. A A lockdown lockdown implies that implies that many many of of typical typical products products thatthat people people buybuy areare no no longer longer available. available. During this During this period period there there are are changes changes bothboth onon the the supply supply andand on on the the demand demand side side that could be reflected in prices, generating either inflationary pressures or that could be reflected in prices, generating either inflationary pressures or deflation. Cavallo (2020) has found that consumers spend relatively deflation. Cavallo (2020) has found that consumers spend relatively more on food more on food and other and other categories categories with with rising rising inflation, inflation, and and relatively relatively less less on on transportation transportation and and other categories other categories experiencing experiencing significant significant deflation. deflation. TheThe coronavirus coronavirus pandemic pandemic has has radically changed radically changed what what households households are are buying. buying. People People are are spending spending aa relatively relatively more more money on food at grocery stores and less on transportation, healthcare and money on food at grocery stores and less on transportation, healthcare and clothes. Also, since overall spending is down, housing costs now clothes. Also, since overall spending is down, housing costs now represent a larger represent a larger share of share of most most people’s people’s expenditures; expenditures; see see Diewert Diewert andand Fox Fox (2020), (2020), Carvalho Carvalho et et al. al. (2020), Dunn (2020), Dunn etet al. al. (2020) (2020) and and Cavallo Cavallo (2020) (2020) for for evidence evidence ofof dramatically dramatically changing changing consumer expenditure consumer expenditure patterns patterns arising arising from from the the pandemic. pandemic. Similarly, by Similarly, by using using scanner scanner data data from from UK UK supermarkets supermarkets covering covering the the Great Great Lockdown episode Lockdown episode Jaravel Jaravel and and O’Connell O’Connell (2020) (2020) for for fast-moving fast-moving consumer goods consumer goods also report a sharp decline in product variety and a reduction in promotions. also report a sharp decline in product variety and a reduction in promotions. BANCO DE ESPAÑA 10 DOCUMENTO DE TRABAJO N.º 2112 3 3 Data Data description description
3 Data description The “Online Prices Project” was initiated by in June 2019 by the Central Bank of Chile, with the objective of collecting public prices posted online by main retailers in Chile. At the initial stages of the project the aim was to construct an index based on fast-arriving online prices of food and alcoholic beverages and other fast moving consumer goods usually sold in supermarkets. By December 2019, the such asofclothing, scope productsfootwear, medicines was expanded andasstarted well astovehicles. scrap theIn prices this paper we will of other focus product on suchgoods typically as clothing, availablemedicines footwear, in two supermarkets, a pharmacy as well as vehicles. In this and papertenwecar willdealers focus where on goodswe typically have collected for more available than in two a 7year of daily supermarkets, data. a pharmacy and ten car dealers where we have collected for more than a year of daily data. Although online transactions are still a small fraction of all retail sales in most countries, Although we online are confident that the transactions aredata still collected online also a small fraction of allprovides information retail sales in most about offline countries, we prices and product are confident availability. that the In Chile data collected delivery online servicesinformation also provides are widely used aboutand moreover, offline prices anddueproduct to the availability. health crisesInderived from Covid-19 Chile delivery services the switch are widely towards used andon-line shopping moreover, due towasthe more widespread health amongfrom crises derived consumers Covid-19 thantheusual in switch 6 order towardsto avoid on-linesocial contact. shopping was more widespread among consumers than usual in order to avoid social contact.6 3.0.1 Products and retailers selection 3.0.1 Products and retailers selection To select which products to track, we comply the criteria suggested by ?: (1) the product To selectshould which be from atoretailer products track, with a highthe we comply market share criteria and (2)bythis suggested ?: retailer (1) the should exploit different channels to sell their products (multi-channel). product should be from a retailer with a high market share and (2) this retailer We only track official pages of retailers and not of intermediaries. should exploit different channels to sell their products (multi-channel). We only track official pages of retailers and not of intermediaries. With this in mind, for grocery products we track prices from the main largest retailers With in thisChile. in mind, These are theproducts for grocery two main weplayers in thefrom track prices retail thefor main food and largest supermarket suppliesThese retailers in Chile. with operations are the twoalong mainthe country. players in theBoth account retail for more for food and than 70% market share. For health products we obtain supermarket supplies with operations along the country. Both account for more prices from the main pharmaceutical than 70% market retailer in Chile share. for 25%products For health of the market share in we obtain the pharma prices from the market. main For car dealers we have collected ten models of passenger cars pharmaceutical retailer in Chile for 25% of the market share in the pharma market. and sport utility vehicle For car(suv). dealers we have collected ten models of passenger cars and sport utility vehicle (suv). Another element to consider when selecting the goods to track is the definition of variety Anotherand product. element The National to consider when selectingStatistical Officeto(Instituto the goods track is theNacional de definition Estadı́stica of variety and- INE) defines The product. a variety as “aStatistical National good or service Office that forms Nacional (Instituto the basic de or elemental Estadı́sticaunit - INE)of the IPC abasket defines varietyand is defined as “a good oraccording service thatto aforms set oftheattributes basic or or pre-established specifications, such as the brand, description, elemental unit of the IPC basket and is defined according to a set of attributes size, content, packaging and provenance, or pre-established among other specifications, such as specific characteristics”, the brand, description,whilesize, a product content,is the “generic term to refer to a good or service that it has a defined packaging and provenance, among other specific characteristics”, while a product is purpose and for which it has term the “generic an expense to referweight. In terms to a good or serviceof the CPI, that it isa the it has representation defined purpose and of for an elementary which it hasaggregate”. an expense Aweight. product In isterms madeofup theofCPI, varieties, as an it is the example, we of representation have an the Pisco product that has Pisco 35 grades or 40 grades; from 700 elementary aggregate”. A product is made up of varieties, as an example, we have to 1000 cc varieties. the Pisco product that has Pisco 35 grades or 40 grades; from 700 to 1000 cc varieties. 6 For a discussion on the representativeness of on-line versus off-line data and implications see Cavallo 6 (2017). For a discussion on the representativeness of on-line versus off-line data and implications see Cavallo (2017). BANCO DE ESPAÑA 11 DOCUMENTO DE TRABAJO N.º 2112 8 8
3.0.2 Web scraping We have collected information through web scraping techniques conducted in Python using Selenium, Beautiful Soup as well as auxiliary libraries to fetch and process data. We collect online prices every day and we perform data acquisition while minimizing the burden on web stores owner. We build an automated procedure that on a daily basis scans the retailers web and searches for the web code associated with each product variety and collects the price, and other characteristics and stores this information. As illustrated in figure 1 we proceed in three steps: • First step. Each day, between 10:00 p.m and midnight, we download from each selected url the information on the varieties and prices. • Second step. The code finds the relevant information. An example of a product-variety in our data is “Brand Name, Pisco, 1 liter”. The scraping software automatically collects data on every single product on display on the retailer’s website each day. We only extract goods are listed on the website if they are in stock, are available for on-line sale and that are not labelled as “agotado”, that is out of stock. Some out of stock items are labelled as such or they immediately disappear from the website, and only reappear on the day that they are offered for sale again (if ever). We can therefore build statistics to study how the set of goods available for purchase changes over time. • Third step. The code stores the following information for each product: its price, the unit of measure (e.g. kilograms, millimeters,..), the product description, whether it is under a promotion, the Stock Keeping Unit (SKU) and the date. The price includes taxes and sale discounts, to account for the final price paid by the consumer. We do not include shipping/delivery costs, which may vary according to the location of the purchaser and are an exclusive feature of the online purchasing experience. In addition, missing prices in the data were often caused by scraping errors on days when the software fails to automatically start. In sum, the collection of data follows the best practices for statistical collection delaying the access to time frames where little traffic is expected and we use Python given the large amount of information we handle and its flexibility. In table 1 we summarize the coverage by division compared with the official methodology. 3.0.3 Data cleaning and filters When analyzing prices at high frequencies, it is observed that the actual prices they often temporarily move away from a slow-moving trend line called the regular price, but after a temporary price change, the nominal price often returns exactly to its pre-existing level. These distinctive features imply that even though an individual price series has a great deal of 9high-frequency price flexibility (the actual price changes frequently), the series also has a great deal of low-frequency price stickiness (the regular price changes infrequently). Before we can address how to treat sales in constructing a measure of price stickiness, we must first be able to accurately identify sales. Sales or promotions are short-lived price changes that are reverted. Constructing a dependable algorithm for12identifying sales has proven challenging for researchers. Researchers often turn BANCO DE ESPAÑA DOCUMENTO DE TRABAJO N.º 2112 to other methods of detecting temporary price changes, such as asserting that sale prices have a particular shape. A common identifier of sales is V-shaped temporary discounts, similar to those employed by Nakamura and Steinsson (2008). We also
Before we can address how to treat sales in constructing a measure of price stickiness, we must first be able to accurately identify sales. Sales or promotions are short-lived price changes that are reverted. Constructing a dependable algorithm for identifying sales has proven challenging for researchers. Researchers often turn to other methods of detecting temporary price changes, such as asserting that sale prices have a particular shape. A common identifier of sales is V-shaped temporary discounts, similar to those employed by Nakamura and Steinsson (2008). We also to its pre-existing level. These distinctive features imply that even though an apply the running mode filter proposed by Kehoe and Midrigan (2015)’s algorithm individual price series has a great deal of high-frequency price flexibility (the actual categorizes each price shift as either temporary or regular, based on each price’s price changes frequently), the series also has a great deal of low-frequency price relative position to the mode price over a given window of time within a particular stickiness (the regular price changes infrequently). time series; notably, the algorithm differentiates between price increases and price decreases. Before we can address how to treat sales in constructing a measure of price stickiness, we must first be able to accurately identify sales. Sales or promotions are the To account number short-lived price for missing information of observations, changes that forare more we take details reverted. on the theprice Constructing information calculations see the a dependable from previous appendix. algorithm periods. For the price index calculations, if a good disappears and later reappears for identifying sales has proven challenging for researchers. Researchers often turn in our sample,availability.– Product we fill missing prices with the previously available values along until a new to other methods of detectingOne of the most temporary 7 priceremarkable changes, such features as asserting thethat sample sale price is theisdecline observed within 150 in on-line days.availability across division and groups, see figure product prices have a particular shape. A common identifier of sales is V-shaped temporary 2. In the sample discounts, similaroftoour thosetwoemployed supermarkets we observeand by Nakamura a decline Steinsson of 55.7% (2008). on Weaverage also 8 with respect to the average available before March apply the running mode filter proposed by Kehoe and Midrigan (2015)’s algorithm 18. In spite of the different nature of the categorizes eachshock,price shiftwe can compare as either this drop temporary or with regular, the based evidence on eachon natural price’s 4 Empirical disasters provided by findings Cavallo et al. (2014) for Chile and relative position to the mode price over a given window of time within a particular Japan. They show that the earthquake time series; that notably,tookthe place in Chiledifferentiates algorithm in 2010 had between an immediate impact onand price increases product price The data availability. decreases. sampleA was large broken share ofinto goodsfour periods went out labelled of stock as follows: within (1) days. “pre-shocks”, The fall was that gradualgoesbutfromlarger June 2019 (than in Chile up toin18Japan), of October, where the (2) number “social outbreak” of productsfrom mid available October fell To till by account mid-March, 32% in for themissing first two (3) “pandemia” months after information from the the we take 18 March earthquake, till end of recoveringfrom price information June slowly and (4) after previous from that. the periods. “gradual Supply For shocks the uplift” price of mobility originate index in the restrictions calculations, destruction if a good inofdisappears July till November production and later2020. capacity and The reappears the dataset disruption in includes our sample, weprice of supply information fill chains. missing pricesofwith Demand food,thedrinks, shocks are toiletries, linked previously to thecleaning available panic that values products consumers until and a new household may equipment. experience after price is observed within 150 days.natural disasters, 7 with people rushing to the stores to purchase basic necessities and hoarding goods that they fear could become unavailable in the daysWe to end come. up9 with three price series for each product-variety, that is, (1) the raw dataset, (2) the price filtered following Nakamura and Steinsson (2008) (NS) and (3) The the price lowestfiltered availabilityseries of using Kehoe on-line and Midrigan products (2015) (KM). was registered during Our Aprilcollected and by 4 data Empirical allows us to observefindingsthat one of the retailers the end of the month it began to recover, but without reaching the levels shown under study follows a more dynamic pricing strategy. at the beginning of March. We identify retailersinasavailability The recovery Ri , i = 1, 2,occurred 3 and 4. in To food calculate and The data sample was broken into four periods labelled as follows: (1) “pre-shocks”, the different beverages, measures while technology of frequency and white of line priceproducts changescontinued each product at loweris weighted levels, with by that goes from June 2019 up to 18 of October, (2) “social outbreak” from mid respect the7 During numberto those registered atforthe ofpandemia observations, morebeginning of on data detailsgoods the collection. This calculations seeoftheperiod of less appendix. October till themid-March, (3) “pandemia” a substantial amount offrom 18wereMarch till end not being offered June or 10 and taking (4) longer availability coincides with the months of greatest imputation by the INE. from the “gradual uplift” of mobility restrictions in July till November 2020. The to reappear, this parameter has been revised at each update. Product dataset includes availability.– price information One ofofthe most food, remarkable drinks, toiletries,features cleaning along the sample products and is We the also decline have household equipment. in information on-line product as regards availability 10the brands across supplied division and in each groups, supermarket see figure as 2. well In the assample the number of ouroftwo varieties suppliedwebyobserve supermarkets each brand.a decline Weofobserve 55.7% on that there average is aWe with reduction end uptoin respect thethree the with number average price ofseries varieties available before for offered each March by 18.each In 8 product-variety, supplier spite thatof but is,the (1)wedifferent do raw the not observe such nature of(2)the dataset, a decline theshock, in the number we can following price filtered of suppliers. compare Nakamura this drop and This points withSteinsson to the evidence a decline (2008)on(NS) in the natural and varieties disasters (3) possibly provided the price with filtered the aim by series Cavallo etofal.Kehoe using focusing (2014)and their for production Chile Midrigan and(2015) Japan.on (KM). their TheybasicOur products, show that the collected see figure earthquake 3. data allows us to observe that one of the retailers under study follows product that took place in Chile in 2010 had an immediate impact on a more availability. A large share of goods went out of stock dynamic pricing strategy. We identify retailers as Ri , i = 1, 2, 3 and 4. To calculate within days. The fall was Analytical gradual but larger classification in Chile (than of in goods.– Japan), In where the different measures of frequency of price changes each product is weighted by order the to number capture of differentiated products available behaviour fell by 32%we in classify the first product-varieties two months afteraccording the earthquake, to theirrecovering nature: slowly emergency, after 7 whether During it the is perishable, pandemia a nonperishable substantial amount of or goodsdurable. were that. Supply shocks originate in the destruction of production capacity and the not A being first offered group or taking include longer to reappear, of disruption thissupply parameter has been chains. revisedshocks Demand at eachareupdate. linked to the panic that consumers 8 This reduction is computed by comparing the average number of items available before March may experience after natural disasters, with people rushing to the stores to purchase 18, the date in which the state of alarm was declared and confinement measures where introduced basic necessities and hoarding goods that10they fear could become unavailable in the in Chile. 9 to come.9 days During the first days after the state of alarm and given the increased demand in fear of shortages or13theDOCUMENTO BANCO DE ESPAÑA derived uncertainty some retailers focused on essential products, in a way of simplifying the DE TRABAJO N.º 2112 delivery process. The lowest availability of on-line products was registered during April and by 10 During the pandemia the National Statistical Office faced several problems to collect the the end of the month it began to recover, but without reaching the levels shown information of prices and used imputation methods. at the beginning of March. The recovery in availability occurred in food and
You can also read