Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Vol. 31, No. 1, January–February 2012, pp. 96–114 ISSN 0732-2399 (print) ISSN 1526-548X (online) http://dx.doi.org/10.1287/mksc.1110.0678 © 2012 INFORMS Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice Pradeep K. Chintagunta Booth School of Business, University of Chicago, Chicago, Illinois 60637, pradeep.chintagunta@chicagobooth.edu Junhong Chu NUS Business School, National University of Singapore, Singapore 117592, bizcj@nus.edu.sg Javier Cebollada Public University of Navarra, 31006 Pamplona, Spain, cebollada@unavarra.es H ouseholds incur transaction costs when choosing among off-line stores for grocery purchases. They may incur additional transaction costs when buying groceries online versus off-line. We integrate the various transaction costs into a channel choice framework and empirically quantify the relative transaction costs when households choose between the online and off-line channels of the same grocery chain. The key challenges in quantifying these costs are (i) the complexity of channel choice decision and (ii) that several of the costs depend on the items a household expects to buy in the store, and unobserved factors that influence channel choice also likely influence the items purchased. We use the unique features of our empirical context to address the first issue and the plausibly exogenous approach in a hierarchical Bayesian framework to account for the endogeneity of the channel choice drivers. We find that transaction costs for grocery shopping can be sizable and play an important role in the choice between online and off-line channels. We provide monetary metrics for several types of transaction costs, such as travel time and transportation costs, in-store shopping time, item-picking costs, basket-carrying costs, quality inspection costs, and inconvenience costs. We find considerable household heterogeneity in these costs and characterize their distributions. We discuss the implications of our findings for the retailer’s channel strategy. Key words: channel choice; online grocery shopping; transaction costs; plausibly exogenous; hierarchical Bayesian; green shopping History: Received: November 23, 2010; accepted: July 28, 2011; Preyas Desai served as the editor-in-chief and Bart Bronnenberg served as associate editor for this article. Published online in Articles in Advance December 20, 2011. 1. Introduction of transaction costs for a retailer’s marketing strat- Researchers have identified a number of transaction egy. In addition, we explore the “green” aspect of costs such as opportunity costs of time and trans- online grocery shopping by quantifying possible soci- portation costs as possible drivers of a consumer’s etal benefits of such shopping. As with the previous choice of a physical grocery store (Bell et al. 1998, literature (Bell et al. 1998), we focus on store choice Pashigian et al. 2003, Fox et al. 2004, Briesch et al. conditional on a store visit. We use “store choice” 2009). In an online setting, researchers have found and “channel choice” interchangeably to mean choice delivery charges and retailer reputation to be impor- of an off-line or online store. Furthermore, given our tant factors influencing online store choices for non- focus on channel choice conditional on making a grocery items (Smith and Brynjolfsson 2001). When store visit, our money metrics represent the relative consumers choose between online and off-line grocery transaction costs of shopping online versus shopping channels, there could be additional transaction costs off-line. that are specific to this purchase setting. Online and off-line channels provide varying lev- Our objectives in this paper are (1) to identify the els of distribution services that entail different levels transaction costs from the previous literature as well of direct costs and transaction costs on consumers. as any additional transaction costs that could influ- Direct costs refer to the sum of shelf prices (less dis- ence the choice of online and off-line grocery channels, counts, or net of discounts) of the items in the shop- (2) to incorporate these costs into a model of con- ping basket; transaction costs are the costs needed sumers’ choice between online and off-line grocery to transfer market goods in a store into consump- channels, (3) to quantify and provide money metrics tion goods at home. Transaction costs vary from for these costs, and (4) to investigate the implications trip to trip and differ across households. They play 96
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice Marketing Science 31(1), pp. 96–114, © 2012 INFORMS 97 an important role in household choice of shopping choice and category needs).1 A formal treatment of venues. However, many transaction costs are non- all these factors simultaneously poses serious method- monetary and hard to measure and quantify, and ological and computational issues. A key feature of thus it is difficult for retailers to factor them in when our data is that there is very little variation in quan- designing marketing strategies. Because understand- tity choices across channels within a household— ing consumers’ store and channel choice decisions is decomposing the variance in quantity choices within of fundamental importance to retailers, it is of consid- a category into the variance explained by household erable value to managers to understand the monetary fixed effects and channel fixed effects, we find that implications of the various transaction costs. the former explain 87.0%–99.8% of the variance. This The empirical context for quantifying transaction allows us to abstract away from considering purchase costs in channel choice is a unique household panel quantities in the analysis. Next, accounting for each of the categories that a data set from Spain. The data consist of the same household expects to buy on a store visit is nontrivial households choosing between a chain’s off-line stores because there is considerable variation across house- and its online store over six months. Using data from holds in the categories that drive their basket expendi- a single chain may seem limiting because it may tures. Thus, we cannot fix the set of categories under not capture all the purchases of a panelist. This con- analysis as being common across households as in cern, although legitimate, is mitigated in our case, as previous studies, nor can we focus on a smaller sub- the panelists’ annual in-chain expenditures are about set of categories because to account for at least 80% of 80% of grocery expenditures by households of simi- a household’s basket expenditures we need to include lar sociodemographics in the same area. Meanwhile, 30 categories. As our interest in the categories bought focusing on purchases within a chain has several stems from how they influence transaction costs and advantages. First, issues such as store image as choice as the choice of the channel does not affect quan- drivers are no longer relevant. Second, the chain has tities bought, the identities of the specific categories the same prices and promotions online and off-line. bought is not critical. Thus, we can simply summa- Thus, direct costs are identical across channels and rize the information contained in the expected cate- will not affect channel choice. Third, similar assort- gories purchased via metrics that will likely influence ments are available everywhere, so this factor will not transaction costs. Specifically, we focus on the total play a role either. Our empirical context thus allows number of items, the number of perishable items, and us to focus squarely on the role of transaction costs the number of heavy/bulky items (defined in §3.4) in driving channel choice. Conventional reasons for that a household expects to purchase. One benefit of why people shop online, such as lower prices, sales this classification is that transaction costs can be spec- tax avoidance, and so on, are ruled out directly and ified as functions of these numbers. Thus, instead of therefore do not confound our analysis. looking at channel choice and the expected purchase Nevertheless, empirically quantifying transaction in each category, our endogenous variables are chan- costs is a challenging task because several of them, nel choice and the total number of items, the number e.g., costs of putting items into the shopping cart of perishable items, and the number of heavy/bulky items that the household expects to buy. and carrying them home, depend on the specific cat- Given the above set of potentially endogenous vari- egories and their quantities that a household expects ables and given that channel choice depends on the to buy on that shopping trip. These factors need not various expected numbers of items purchased, one be exogenous because unobserved factors that influ- can adopt either a full information approach that ence the choice of shopping channels could also influ- specifies how each of these variables is determined by ence the categories and quantities bought. For exam- a consumer or a limited information approach where ple, if a household member goes out purchasing other numbers of items are instrumented for in the analy- goods for consumption (e.g., clothes), he or she may sis of channel choice. We choose the latter approach decide to combine the trip with a visit to a physical because it not only obviates the need to specify the grocery store. This unobserved factor that influences exact data generating process for these endogenous off-line purchase could also influence the categories variables but also avoids the problem of incorrect purchased (e.g., if the household has limited time to transaction cost estimates in the presence of a mis- spend in the store). Hence, the potential endogeneity specified data generating process. We use as instru- of the channel choice drivers needs to be accounted ments (i) each household’s inventory variables based for. One way to deal with endogeneity is to specify on its own top expenditure categories and (ii) the a full system of equations that characterize channel choice, categories purchased, and associated quanti- 1 Because online and off-line channels have similar assortments, ties (akin to Briesch et al. 2009, who model store brand choice is unlikely correlated with channel choice.
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice 98 Marketing Science 31(1), pp. 96–114, © 2012 INFORMS numbers of items bought by pure online and pure off- of transaction costs that map (not necessarily one- line households of similar demographics to the pan- to-one) onto the set of goods and distribution ser- elists (see §5 for a justification of these instruments). vices (Furubotn and Richter 1997). Stores differ in the We combine these instruments with the Conley et al. goods and distribution services provided, and hence (2010) plausibly exogenous approach in a hierarchical they differ in their direct and transaction costs to con- Bayesian (HB) framework to account for the endo- sumers. For the same level of home production, con- geneity of the numbers of items purchased. A key sumers choose stores with the lowest sum of direct advantage of this approach is that it allows us to work and transaction costs for their shopping basket. with potentially weak instruments. Finally, to pro- Transaction costs play a role in all stages of a vide money metrics for the various transaction costs, consumer’s store choice process. Researchers have we exploit the fact that although direct costs do not applied the notion of transaction costs, in particu- play a role in channel choice, the presence of deliv- lar, the role of time costs and transportation costs, ery costs allows us to compute the marginal utility to analyze consumer’s choice of conventional stores of income. Using the posterior distributions of the (Bell et al. 1998), online stores (Smith and Brynjolfsson parameter estimates, we quantify the distributions of 2001, Lee and Png 2004), and between online/direct various transaction costs across households. retailers and conventional retailers (Liang and Huang The main contributions of our study are as follows: 1998, Balasubramanian 1998, Keeny 1999, Sinai and (1) We integrate the transaction costs identified in the Waldfogel 2004, Forman et al. 2009). Researchers have literature and the additional transaction costs specific identified several types of transaction costs in gro- to the online and off-line grocery setting into a model cery shopping in physical stores. Betancourt (2005) of channel choice. (2) Methodologically, we apply the synthesizes these costs into categories that map into Conley et al. (2010) plausibly exogenous approach to various distribution services: (1) Opportunity costs of a nonlinear and hierarchical setting. (3) We provide time comprise travel time to and from a store and a strategy for reducing the complexity of the channel in-store shopping and waiting time. (2) Transporta- choice analysis by aggregating across categories in a tion costs to and from a store include travel time and manner that helps to simplify the problem substan- costs, and are related to accessibility of store location. tially without losing relevant information. (4) Sub- (3) Psychic costs are costs such as perceived difficulty stantively, we quantify and provide money metrics of use, inconvenience, frustration, annoyance, anx- for the important types of transaction costs in grocery iety, drudgery, dissatisfaction, disappointment, per- shopping that involves online and off-line channels, sonal hassles, shopping enjoyment, or disagreeable which can help retailers to better design marketing social interactions that consumers are subject to in the strategies for the two channels. We also take some store environment (Ingene 1984, Devaraj et al. 2002). preliminary steps to explore the implications of online (4) Adjustment costs as a result of the unavailability of shopping for the environment in terms of reducing products at the desired time or in the desired amount carbon emissions. are costs that arise from additional time and trans- portation costs incurred because of forced search or lower utility associated with altering the consump- 2. Transaction Costs and Online and tion bundle of goods. (5) Search costs are the costs of Off-line Channel Choice time, transport, and other resources in the acquisi- Transaction costs economics (Coase 1937; Williamson tion of information with respect to price, assortment, 1979, 1981; Williamson and Masten 1999) emphasizes physical attributes, or performance characteristics of the role of transaction costs in economic exchanges. the goods provided in different stores. In looking The basic principle of transaction costs economics at online shopping for nongrocery items, Smith and is that agents choose to conduct transactions in a Brynjolfsson (2001) identified two additional costs way that minimizes their transaction costs. Grocery as important transaction costs for choosing online shopping is a canonical context that involves transac- stores: (6) delivery costs and (7) waiting costs for basket tions between retailers and consumers. The basic eco- delivery. nomic function of retailers is to provide consumers When it comes to the choice between online and off- with explicitly priced goods and a set of distribution line grocery stores, there could be additional transac- services, including assortment, assurance of product tion costs as well. Liang and Huang (1998) and Chircu delivery, information, accessibility of location, and and Mahajan (2006) mention these types of costs in ambiance (Betancourt and Gautschi 1986, 1988). These their analysis, and the popular press has discussed goods and distribution services generate customer them in some detail. One such cost that has been iden- value and are essential inputs into household pro- tified is (8) physical costs. Based on the experience of a duction (Becker 1965). To carry out home produc- Webvan customer, Chircu and Mahajan (2006, p. 909) tion, consumers need to incur direct costs and a set note, “He lived on the 4th floor in a building with no
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice Marketing Science 31(1), pp. 96–114, © 2012 INFORMS 99 elevator, and the delivery person carried bulky gro- 3. Data and Observed cery purchases up the stairs for him.” Some online Shopping Patterns blogs (e.g., Nick 2007) note the following when they compare costs across off-line and online channels— 3.1. The Grocery Chain “You have to push a heavy cart” versus “Internet Our data are from a major grocery retail chain in shopping carts weigh 0.003 grams,” “You have to Spain. The data are for one metro area, where the maneuver through crowded aisles” versus “Internet retailer has about 200 physical stores and 1 online aisles can fit any number of people,” and “You have to store. The online store is the retailer’s largest store load your groceries in the car, unload them at home, by revenue. It partners with 18 of the chain’s phys- and carry them into the kitchen” versus “The gro- ical stores for grocery supply. The partner stores cery delivery person does all of this.” Physical costs, were selected based on size (they are the largest therefore, include (a) the costs of picking and putting stores), geographic location (dispersed around the items into the shopping cart, which differ significantly metro area), and service ability such that they individ- across online and off-line stores as the Internet option ually have the ability to fulfill online orders (e.g., have eliminates these costs on consumers; and (b) the costs adequate assortments, personnel, delivery trucks) and of carrying the basket home, which are important collectively can do so efficiently. in markets where many people walk or take public When a consumer makes her first online purchase, transport to the store. she needs to register an account with her loyalty card. A second such cost associated with the Internet Her account information is then stored in the com- channel is (9) quality inspection or product evaluation puter server and retrieved for subsequent orders. To costs (BusinessWeekOnline 2001). Liang and Huang place an online order, a consumer first goes to the (1998), in a nongrocery setting, refer to this as an grocer’s website and logs into her account. She then “examination cost.” The Internet increases the costs browses or searches the aisles, selects the items she of providing a given level of assurance of product wants to buy, and puts them into her basket. After delivery at the desired time, or of the desired qual- filling her basket, she proceeds to checkout. The list ity, as a result of the consumer’s inability to inspect of the basket items is stored in the computer system and acquire the product at the time and place of pur- as “previous shopping lists” and can be used for later chase. Quality inspection costs may play an important “express” orders. The chain uses a centralized online role when the shopping basket contains items such ordering system. After receiving an online order, the as perishables whose qualities vary from purchase ordering system assigns it to one of the partner stores occasion to purchase occasion. Delivery of perish- and notifies the household. The household then has able goods with quality guarantee is considered the two options: it can go to the assigned partner store biggest logistical problem for Internet grocers (Demby to pick up the order for free (such purchases are not 2000). Internet retailers also take special actions to observed in our data) or have the basket delivered guarantee the quality of perishables ordered online: home within some chosen delivery time window (e.g., “If, for any reason relating to the freshness of a 7–9 p.m. Monday) and incur delivery charges.2 Perishable Good, you are less than 100% satisfied, The retailer practices uniform pricing so prices are COOLGREENANDSHADY.COM will arrange for the identical across all its online and off-line stores.3 It re-delivery of your order” (Cool Green & Shady 2011). is a Hi-Lo chain and runs chainwide promotions. Direct costs and transaction costs are therefore a The online product offerings are the same for every- function of the shopping channel, basket (product) one and available in all partner stores. Thus, assort- characteristics, shopping occasion (duration and tim- ments are identical across the partner stores and the ing of the trip) characteristics, and household charac- online store. Assortments in nonpartner stores can dif- teristics. Together, they vary across trips for the same fer slightly because of store size differences and conse- household, depending on when, what, and how much quent differentials in stock-out rates. We were able to it buys. They also differ by households for the same product bundle, depending on household sensitivities 2 Recently, chains such as Farm Stores in the Miami–Dade area to different cost components and valuation of conve- in the United States have started offering online ordering com- nience and time. For each shopping occasion, a house- bined with free drive-through pickup services (see http://www hold trades off these costs and chooses a store with .miamiherald.com/2011/04/10/2159495/farm-stores-offers-online the lowest sum of direct and transaction costs. Con- -grocery.html). 3 sequently, we would expect to see both within- and We confirmed with the management and checked the data that across-household variation in store choice as a func- the retailer indeed practices uniform pricing in all off-line stores and the online store. The retailer used to practice zone pricing in tion of these factors. Accordingly, our model of store its off-line stores with two price zones, but it stopped off-line zone choice needs to account for both direct and transac- pricing prior to our data period (see Chu et al. 2008 and 2010 for tion costs. details).
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice 100 Marketing Science 31(1), pp. 96–114, © 2012 INFORMS verify that 98.3% of the online items ordered were also 3.3. Scanner Panel Data purchased in nonpartner off-line stores in our data. We obtain the complete shopping records of 3,556 This, in addition to confirmation from management, households between May and November 2007. This gave us the assurance that available assortments are is a random sample of the retailer’s online cus- similar across stores (see also §5.5). tomers. The households shop interchangeably in the People often walk or take public transport to buy online and off-line channels (we observe purchases groceries (only 68% of stores have parking lots). in 196 stores). We observe at what time a household About 60% of off-line stores also offer delivery ser- visits the chain, which store it visits, what items and vice. The retailer has the same delivery policies for how much of each item it buys, whether the basket online and off-line orders. It charges E6 for deliv- is home delivered, and delivery charges. We select a ery if the basket is below E100 and E4 if the basket random sample of 1,025 households for model esti- exceeds E100. Delivery is free for golden-card (a pre- mation. Table 1 presents major demographics and mium loyalty card awarded to big spenders) clients store characteristics for all households and for the if the basket exceeds E100. Households with quar- chosen sample. An average household has 3.37 mem- terly in-chain expenditure exceeding E600 are eligi- bers, 0.68 preschool and 0.47 school-age children, ble but need to apply for golden-card membership. 2.14 work-age adults, and 0.07 elders. Of these house- In sum, online and off-line channels have the same holds, 15.32%, 26.34%, and 58.34% live, respectively, prices, price promotions, and delivery policies, as well in low, medium, and high income/economic areas. as similar assortments. We also obtained data on pure online and pure off- 3.2. Store Price Promotions line customers that we use to construct instruments. The retailer provided us with price promotion data, including categories and items promoted, promo- 3.4. Observed Shopping Patterns and tion start and end dates, and depths of price cuts. Implications for Econometric Modeling Promotions occur in vastly differing sets of cate- The households exhibit the following shopping pat- gories. Each day there are on average 85 categories terns (see Table 2) that appear to be consistent with and 419 items on promotion. Promotion durations the transaction costs of shopping at each channel. range from three to eight weeks. Three-week, four- These patterns provide some model-free evidence of week, and five-week promotions account for 22.9%, the importance of transaction costs in channel choice 43.2%, and 33.7% of the promotion cycles, respec- decisions. tively. Such multiple-week promotions are quite dif- First, the availability of the Internet option does not ferent from weekly promotions commonly practiced act as a sorting mechanism whereby certain house- by U.S. grocers. Promotion depths vary substantially holds mainly shop online and others primarily shop across products, ranging from 4.5% to 25.0% with a off-line. All sample households shop in both channels. mean of 8.7%. There is not the “20/80” phenomenon, where 20% of Table 1 Household Demographics and Characteristics of Most Frequented Off-line Stores Entire sample Random sample Mean Std. dev. Mean Std. dev. Family size 3039 2020 3037 2005 No. of preschool children (0–5) 0069 0089 0068 0087 No. of school-age children (6–18) 0044 0085 0047 0089 No. of working adults (19–65) 2018 1089 2014 1074 No. of of elders (65+) 0007 0047 0007 0035 Distance to most frequented off-line store 1013 4001 0099 2061 Percentage of reside in low economic area 15043 15032 Percentage of reside in medium economic area 24096 26034 Percentage of reside in high economic area 59061 58034 Percentage of downtown 41046 39061 Characteristics of most frequented off-line stores Store square footage (m2 5 11480061 11102038 11473063 987001 Percentage of bazaar section 63086 64039 Percentage of butcher shop 91068 90083 Percentage of processed meat 93038 92029 Percentage of fish shop 83005 82005 Percentage of having parking lots 68003 68068 Percentage of home delivery 60044 60068
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice Marketing Science 31(1), pp. 96–114, © 2012 INFORMS 101 Table 2 Characteristics of Household Shopping Behavior by Channel Total Online Off-line Mean Std. dev. Mean Std. dev. Mean Std. dev. Total trips 16018 11067 5056 4022 10061 10098 Mean trip interval (days) 10097 12034 14046 14029 9014 10075 Basket size (E) 82099 82059 155077 83048 44085 50012 Half-year spending (E) 11342060 874050 866056 697073 476004 534010 No. of unique categories per trip 16086 13011 27067 11000 11019 10026 No. of unique perishable categories 4011 4082 4087 5057 3072 4032 No. of unique heavy/bulky categories 5006 4079 9089 3067 2054 3006 No. of unique items per trip 22061 19018 38047 17026 14030 14028 No. of unique perishable items 4069 5073 5055 6060 4024 5015 No. of unique heavy/bulky items 6064 6086 13039 5098 3010 4006 Percentage of perishables in trip expenditure 23021 25034 11069 13014 29024 27097 Percentage of heavy/bulky in trip expenditure 33003 26014 49030 17081 24050 25077 Coefficient of variation: Basket 0049 0035 0024 0016 0073 0032 Coefficient of variation: Trip interval 0073 0025 0052 0025 0086 0032 Channel switching: Household level (%) 58069 33070 68092 31051 48050 32072 the households account for 80% of the online trips or They make major trips to the online channel and fill- expenditures (see Figure 1). The mean ratio of online in trips to the off-line channel (Kahn and Schmittlein trips to total trips is 41.8% (std. dev. = 2407%). The 1989). Online baskets (E155.8) on average are 3.5 times ratio is below 20% for 22% of the households, above as large as off-line baskets (E44.9). Households buy 80% for 11% of the households, and between 20% 28 (11) unique categories and 38 (14) unique items and 80% for 67% of the households. The mean ratio on an online (off-line) trip; and 95% of a household’s of online expenditures to total expenditures is 65.0% online trips have a basket larger than its mean off- (std. dev. = 2409%). The ratio is below 20% for 7% line basket, and 63% of its off-line trips have below- of the households, between 20% and 80% for 58% of average-sized baskets. This suggests that relative to the households, and above 80% for 35% of the house- shopping off-line, transaction costs of shopping online holds. Channel “switching” is common in the data. are low when the basket is large but high when the The online-to-off-line switching probability is 56.5%, basket is small. It also raises the possibility that there exist and the off-line-to-online probability is 29.6%. This unobservable factors that influence both channel choice and suggests that for the same households, transaction number of items purchased. costs of buying online versus off-line vary from trip Third, households also sort their trips to online to trip. Delivery charges are involved for 22% of the and off-line channels based on category character- online trips and 1% of the off-line trips. Because house- istics. Online baskets consist of more heavy/bulky holds visit both channels, the same household’s temporal items, reflecting the convenience benefit of online channel choice decisions can only be explained by trip-level shopping, whereas off-line baskets have more perish- factors, even though demographics may explain the average ables, reflecting the ability of the physical channel to online shopping intensity across households. allow for quality checking. Perishables refer to fresh Second, households sort their shopping trips to produce, meat, seafood, and bakery items in a house- online and off-line channels based on basket size. hold’s basket. Heavy/bulky items refer to bottled, canned, or bagged consumer packaged goods (CPGs) Figure 1 Household Distribution by Shares of Online Grocery (e.g., mineral water, beer, toilet paper) and liquid-rich Expenditure and Trips non-CPGs (e.g., watermelon). Examples of these are 4 20 one-liter bottles of water, 12 rolls of toilet tissue, etc. Grocery $ No. of trips Note that some categories can appear in both heavy 16 and perishable sets (e.g., watermelon). An online (off- 12 line) basket has 13.4 (3.1) unique heavy/bulky items and 5.6 (4.2) perishable items. Heavy/bulky (perish- 8 able) items account for 49.3% (11.7%) of online and 24.5% (29.2%) of off-line trip expenditures. Further- 4 more, a household, on average, buys 29.3 categories 0 exclusively online, 32.4 categories exclusively off-line, and 23.2 categories in both channels. The channel- 90 specific category purchases suggest that the transac- Online shares (%) tion costs of buying the same categories differ by
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice 102 Marketing Science 31(1), pp. 96–114, © 2012 INFORMS Figure 2 Distribution of Shopping Trips by Basket Size 20 Off-line Online 16 12 (%) 8 4 0
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice Marketing Science 31(1), pp. 96–114, © 2012 INFORMS 103 the other two decreased the proportion. This seems online store, implying that the costs of off-line shop- to suggest that promotions of different items differ- ping during office hours may be high compared with entially influence the propensity to visit the differ- online shopping. We include a dummy variable for ent channels, e.g., observing promotions of perishable trips that occurred during office hours (8 a.m.–noon, items encourages customers to visit the off-line store. and 2–6 p.m.). Therefore, we need to account for the differential effects of (2) Transportation costs are directly related to the promotions across channels, even though the economic ben- home–store distance. Consistent with the literature efits of these promotions are the same across channels. (e.g., Bell et al. 1998), we use home–store distance In sum, the observed shopping patterns appear to as a proxy for transportation costs. Thus, home– show that transaction costs of shopping differ by chan- store distance captures the effect of both travel time nel, basket size and composition, shopping occasion, and transportation costs. To more accurately quantify and household characteristics. Our econometric spec- transportation costs and travel time, we need trip- ification tries to incorporate these into the analysis. level mode of transportation used, which is not avail- able to us. Using information at some aggregate level, such as zip-code car ownership, is not useful because 4. Econometric Specification this information does not vary over our short data 4.1. Description of Shopping Costs duration. Furthermore, there is limited variation from A shopping trip involves direct costs and transaction one zip code to another. In the empirical analysis, we costs. As the retailer has uniform pricing in the two interact distance with household income level to cap- channels, direct costs drop out in any discrete model ture differences in time costs across households. of channel choice, so we do not consider them. We (3) Psychic costs are difficult to measure but are now describe how we measure the nine types of trans- closely related to the characteristics of the shopping action costs identified. For a given basket, search costs channels themselves. We use channel dummies and a to find a store are unlikely to play a role in our case, household-, trip-, and channel-specific error term to although in-store search costs will differ, which are reflect these costs. accounted for as follows: (4) Adjustment costs as a result of product unavailabil- (1) Time costs are the opportunity costs of travel ity and (5) in-store search costs: As with psychic costs, time and in-store shopping time. Shopping time is an these costs are channel specific and difficult to mea- important factor in store choice decision. Messinger sure. They may vary over time because of different and Narasimhan (1997) find that the growth of one- products not available in different channels at differ- stop shopping has been a response to the growing ent times and familiarity with product layouts. We demand for timesaving convenience. Pashigian et al. represent them by a household-, trip-, and channel- (2003) examine how firms respond to the increasing specific error term. cost of a shopper’s time in ways that economize on (6) Basket delivery costs: Smith and Brynjolfsson shopping time. Consistent with the literature (e.g., (2001) find consumers are more sensitive to deliv- Bell et al. 1998), we use home–store distance as a ery charges than to product prices when shopping proxy for travel time. As did Hoch et al. (1995), we online. Lewis et al. (2006) find a significant effect of scale distance in the downtown area by 2 to reflect nonlinear shipping and handling fees on purchase its greater congestion. We also scale distance by two incidence and expenditure decisions for dry groceries if the shopping trip occurs at peak weekday hours online. The shadow price of delivery charges differs (7:30–9 a.m. and 6–8 p.m.) to reflect the longer time for online and off-line shopping. For online shop- needed to travel the same distance. ping, delivery charges cover transportation costs to In-store shopping time primarily is a function of the and from the store, opportunity costs of travel time, number of items purchased. We use the number of and physical costs of basket picking and carrying. For basket items as a proxy for in-store shopping time. off-line shopping, they only cover physical costs of Online item selection and checkout differ from that basket carrying. Thus, households may be less sensi- off-line. For the same set of items, a household may tive to delivery charges when shopping online than need less time to shop online than off-line, so we off-line. allow the marginal costs of shopping time to differ by (7) As with adjustment costs, waiting costs for bas- channel. ket delivery are difficult to measure and may vary over The opportunity costs of time also depend on when time because of instances of more urgent need of a a household visits a store. We include a weekday product than others. We represent them by the chan- dummy to capture the higher opportunity costs of nel dummy and by the household-, trip-, and channel- time on weekdays. Furthermore, a consumer may not specific error term. be able to leave her office to buy groceries in a physi- (8) Physical costs of picking items are related to cal store during office hours but can still do so in the the number of items, particularly the number of
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice 104 Marketing Science 31(1), pp. 96–114, © 2012 INFORMS heavy/bulky items purchased. It is a simple mouse + hj7 NHht · dhjt + hj8 Wt + hj9 DCht click for online shopping but can be high for off- + hj10 Pht + hj11 PPht + hj12 PHht + hjt 1 (1) line shopping. We use the number of heavy/bulky items as a proxy for item-picking costs and allow the where FChjt is total transaction costs and dhjt is the marginal costs to differ by channel. The costs of car- home–store distance, which changes from trip to trip rying a basket home are a function of the number of because a household does not always visit the same heavy items in the basket and the distance traveled. off-line store. Variable WKt is a weekday dummy and We use their interaction as a proxy for basket-carrying OHt is an office hour dummy; NIht , NPht , and NHht costs and set them to 0 for online shopping and off- are numbers of items, perishables, and heavy/bulky line shopping with home delivery. The costs of car- items, respectively; and NHht · dhjt is the interaction rying a basket also depend on weather conditions, so between heavy items and distance. Wt is a dummy we allow weather to influence the choice of online for bad weather and DCht is the delivery charge; Pht , and off-line channels differently. PPht , and PHht are the overall promotion index and (9) Costs of inability to verify product quality prior to promotion indices of perishables and heavy/bulky paying are primarily a function of perishables bought items; and hjt stands for unobserved search, adjust- because product qualities such as freshness can vary ment, and waiting costs that we assume are normally a lot from one purchase occasion to the next, whereas distributed. the quality of packaged goods does not change much We constrain the coefficients for the 196 off-line over purchase occasions. We use the number of per- stores to be the same and subtract the costs of shop- ishables bought as a proxy for these costs. Physical ping at the off-line channel from those at the online stores allow households to check product quality channel. Two factors necessitate our assumption on prior to paying; the online store lacks this ability the parameters across off-line stores. First, we do because the actual quality can differ from what is not have a sufficient number of observations for shown on the computer screen. Thus, households may each store to estimate separate effects. Second, even be more likely to visit the off-line channel when buy- if we did, it would be computationally infeasible ing more perishables. to estimate store-specific parameters for each of the Intrinsic channel preferences and an error term 196 stores. At the same time, the assumption is not capture, respectively, the time-invariant (channel restrictive for the following reasons: (1) All off-line image, environmental concerns, etc.) and time-vary- stores belong to the same chain and thus share the ing aspects of unobserved transaction costs. Thus, same or similar chain image, service quality, product given our data, we are able to quantify the impor- layout, returns policy, etc. The effects of other fac- tant types of transaction costs. To account for differ- tors such as store distance and size are accounted for. ences in the perceived benefits of promotions across (2) We estimate coefficients for each household. Even channels for the different products purchased by the though there are 196 stores, 32.9%, 31.1%, and 20.2% households, we include overall promotions, promo- of the households only visit one, two, or three off-line tions of heavy/bulky items, and promotions of per- stores, respectively. (3) For 74.1% of the households, ishable items as drivers of channel choice. Promotions the most frequented off-line stores account for over could result in households choosing the off-line chan- 60% of off-line trips. This combined with household- nel because of lower cherry-picking costs in the off- level parameters should capture some of the het- line stores. erogeneity across stores as well. In §5.5, we try to assess the impact of this assumption. Nevertheless, 4.2. Econometric Model we acknowledge this as a possible limitation of our On each shopping trip, a household compares the analysis. expected costs of different channels and chooses the Next, we drop subscript t for ease of exposi- one with the lowest expected costs to maximize its tion. Define Yh ≡ FCh1 on − FCh1 off , h ≡ h1 on − h1 off , utility. We assume households have rational expec- and h ≡ h1 on − h1 off . Let Xh ≡ {NIht , NPht , NHht , tations over shopping costs, so expected costs are NHht · dhjt } be the vector of numbers of items, per- the same as actual costs. It is reasonable to assume rational expectations on weather, delivery charges, ishable items, heavy/bulky items, and its interaction and travel distance. Given chainwide promotions and with distance; let Gh be other costs in Equation (1) the availability of the Internet option, it is also rea- and Ih be the choice indicator. h represents the subset sonable to assume rational expectations over promo- of h that is associated with the variables Xh , and 1h tions. Household h’s expected total transaction costs represents all other coefficients in h . Thus, we have of shopping at channel j on trip t are the following relationships: F Chjt = hj0 + hj1 dhjt + hj2 WKt + hj3 OHt Yh = Gh 1h + Xh h + h 1 (2) + hj4 NIht + hj5 NPht + hj6 NHht Ih Yh ∼ Binomial Probit0 (3)
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice Marketing Science 31(1), pp. 96–114, © 2012 INFORMS 105 4.3. Endogeneity of Channel Choice Drivers We write out this model as a sequence of condi- Numbers of items, perishables, and heavy/bulky tional distributions and run the Gibbs sampler to get items are household decision variables on trip t the Markov chain Monte Carlo (MCMC) sequence and thus may be endogenous to the channel choice (Chib 2001, Rossi et al. 2005, Conley et al. 2010). The decision because unobserved factors (as discussed posterior distributions are previously) that influence numbers of items bought also likely influence the choice of shopping chan- Yh Gh 1 Xh 1 Zh 1 1h 1 h 1 1 è ∼ Truncated Normal1 nels (i.e., Xh could be correlated with h 5. There are 81h 1 h 9 Yh 1 Gh 1 Xh 1 Zh 1 D1 1 ç1 è1 ã1 V ∼ Normal1 several approaches to address this issue: (i) explic- itly model each of these decisions—a full-information Yh 1 Gh 1 Xh 1 Zh 1 81h 1 h 91 ç1 è1 1 A ∼ Normal1 structural approach, (ii) the control function approach ç Gh 1 Xh 1 Zh 1 81h 1 h 91 è1 ç̄1 Aç ∼ Normal1 of Petrin and Train (2010), and (iii) the plausibly exogenous approach of Conley et al. (2010). These è Yh 1 Gh 1 Xh 1 Zh 1 81h 1 h 91 1 ç ∼ Inverted Wishart1 approaches are similar in nature. The second and third approaches rely on finding exogenous instru- ã 81h 1 h 91 V 1 ã̄1 Aã ∼ Normal1 ments for the endogenous variables, and the first V 81h 1 h 91 v0 1 V0 ∼ Inverted Wishart0 (9) approach requires the presence of some excluded variables. The details are available in the electronic companion The plausibly exogenous approach, of which the to this paper that is available as part of the online conventional instrumental variables technique is a version that can be found at http://mktsci.journal special case, relaxes the exclusion restriction that .informs.org/. instrumental variables have no correlation with the unobservables; thus it allows for the use of less-than- perfect instruments. It works well with both strong 5. Variable Operationalization and and weak instruments, so we adopt this approach. We Robustness Checks find a set of instruments Zh for endogenous variables 5.1. Choice of Instruments Xh and include them into Equation (2) as follows: To resolve the endogeneity issue, we need to find vari- Yh = Gh 1h + Xh h + Zh + h 1 (4) ables that affect the numbers of items, perishables, or Y heavy/bulky items a household buys on trip t but Xh = 4Gh Zh 5 +vh 0 (5) that do not directly influence the choice of the shop- This approach essentially replaces the actual num- ping channel. What would be reasonable instruments bers of items, perishables, and heavy/bulky items for the numbers of items bought? A key benefit of with their expected numbers, which are determined our data is that we have access to purchases in all by exogenous instruments (Gh Zh 5. When = 0, categories (about 700 categories, as defined by the this model reduces to the conventional instrumen- retailer, were bought by the households), including tal variables model. Assume (h 1 vh 5 ∼ N401 è5. Let meat, fish, milk, and produce, that make up a signif- p4Data 1h 1 h 1 1 ç1 è5 be the likelihood of the data, icant proportion of a household market basket. This p41h 1 h 1 ç1 è5 the prior distribution of the model allows us to use each household’s own top expen- parameters, and p( 1h 1 h 1 ç1 è5 the prior distri- diture categories to construct its inventory variables. bution of . The inference is based on the posterior Importantly, these inventory variables influence chan- distribution of (1h 1 h 1 ç1 è5, integrating out : nel choice primarily via influencing the numbers of Z items bought; i.e., lower inventory levels in many cat- p4h 1 1h 1 ç1 è Data5 ∝ 6p4Data h 1 1h 1 1 ç1 è5 egories will result in greater numbers of items being bought on that trip.5 To check whether this is indeed · p 4 h 1 1h 1 ç1 è5p4h 1 1h 1 ç1 è5 d70 (6) the case, we run homogeneous probit models of chan- To overcome the computational burden of this nel choice that, respectively, include (1) inventory lev- model, we cast it in an HB framework as els of all items, heavy/bulky items, and perishables; (2) numbers of items, heavy/bulky items, and perish- B = Dã + U 1 U ∼ N401 V 51 B ≡ 81h 1 h 91 (7) ables; and (3) both sets of variables in (1) and (2); and where D is household demographics and ã is the we look at changes in the model fit. We find that num- second-stage coefficients. The common priors are bers of items have a much larger explanatory power than inventory levels. Furthermore, when numbers ∼ N4 1 A−1 51 vec4ç Vh 5 ∼ N4vec4ç̄51 è22 ⊗ A−1 ç 51 of items are included in the model, two of the three è ∼ IW4v0 1 V0 51 vec4ã V 5 ∼ N4vec4ã̄51 V ⊗ A−1 ã 51 5 We assume that there is no serial correlation in the channel choice V ∼ IW4v0 1 V0 50 (8) equation errors as is typical in these models.
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice 106 Marketing Science 31(1), pp. 96–114, © 2012 INFORMS inventory coefficients are no longer significant (see the 5.2. Operationalization of Promotion Variables electronic companion for details). We therefore con- We first calculate category promotion depths as clude that a household’s numbers of low-inventory weighted averages of price cut depths across all items items, low-inventory perishables, and low-inventory in the category, using each household’s mean item heavy/bulky items are reasonable instruments for our expenditures as weights. We then compute an over- endogenous variables (recall that our method does all promotion index as the weighted average of cate- not require perfect instruments). We first identify the gory promotion depths, using each household’s mean top 30 categories for each household by looking at the basket shares of its top 30 categories as the weights. proportion of that household’s total grocery expendi- Similarly, we compute promotion indices for perish- ture (across all purchase occasions) accounted for by ables and heavy/bulky items as weighted averages of each category purchased. Because household-specific promotion depths. The weights are each household’s top 30 account for 81.1% of the basket value (std. mean basket shares of the respective items among dev. = 904%), using the top 30 is nearly equivalent to its top 30 categories. Thus, the promotion indices are using the entire basket. This aggregation strategy sub- household specific. stantially reduces the complexity of the model to a manageable scale without loss of relevant informa- 5.3. Operationalization of Other Cost Variables tion. Next, we compute category inventory for each We use a weather indicator for rainy, stormy, and trip as in Gupta (1988) and find categories within windy days. The home–store distance refers to zip- each household’s top 30 that have below-average code-level distance between a household’s home and inventories for that household. We then sum up the the store visited on that occasion. On occasions when mean numbers of items across the low-inventory cat- the online channel is visited, the distance for the off- egories, perishables, or heavy/bulky categories for line option is the value for that household’s most each household. frequented physical store.6 The distance is zero for We also have information on the numbers of items online shopping. bought by households that have similar demograph- The delivery charges variable (DCht 5 is defined as ics to the panel households but do not make a channel follows. For all off-line visits entailing delivery, the choice decision, i.e., “pure” online and “pure” off-line variable drops out because the charges are the same households. We compute trip-level mean numbers of in both cases. For off-line visits with no delivery, items (heavy/bulky items, perishables) on each day DCht = 0 for the off-line channel and the amount cor- for pure online and pure off-line shoppers, broken responding to the basket size for the online channel, down by observed demographics, and match them and DCht = 0 for golden-card members if the basket to each sample household. The rationale for using exceeds E100. For online visits, we assume that the these variables as instruments is that variation in off-line option corresponds to no delivery, so DCht = numbers of items bought can be partly explained by 0 for the off-line channel and DCht equals the actual unobserved consumer characteristics. As long as these charges for the online channel. We recognize that this characteristics are correlated with observed demo- is an approximation in that, to deal with this struc- graphics, other households’ purchases will be cor- turally, we need to consider off-line visits with and related with those of the target household. Because without delivery as two options. However, other than these other households do not make a channel choice, delivery charges, there is nothing else in the data that their numbers of items are unlikely correlated with informs us of this choice, so we treat off-line visits as the unobservables that drive channel choice for the a single choice. Furthermore, as noted before, a very households in our panel. small fraction of off-line shoppers avail of the deliv- These two sets of instruments explain, respectively, ery option. 19.1%, 20.4%, and 23.9% of the variation in the num- ber of items, number of perishables, and number 5.4. Demographics and Store Characteristics of heavy/bulky items. Together with demographics, These variables appear in the hierarchy as D in Equa- they explain 19.8%, 23.6%, and 24.7% of the varia- tion (7) and allow us to account for observable house- tion; and in combination with exogenous variables in hold heterogeneity. Demographics include family size, Equation (1), they explain 23.2%, 24.9%, and 28.7% number of children, and the residential area’s eco- of the variation in the endogenous variables (see nomic status or income (low/medium/high). House- Table 3). These instruments appear to have indepen- hold basket characteristics include mean basket shares dent explanatory power. Although we recognize the 6 potential consequences of using multiple weak instru- This is the best approximation as it is the cost for most off-line trips. Households visit their most frequented off-line stores for ments, this is less likely in our case because of the low major trips and fill-in trips. We also used the distance to the clos- correlations between instruments (see the electronic est off-line store (the two are the same for many households) and companion for details). obtained similar results.
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice Marketing Science 31(1), pp. 96–114, © 2012 INFORMS 107 Table 3 Regression of Endogenous Variables on Instruments No. of items No. of perishables No. of heavy/bulky items Estimate Std. error Estimate Std. error Estimate Std. error Intercept −380358 30727 −40121 00675 −340932 10905 Pure online shoppers’ mean no. of items 00103 00019 00013 00003 −00006 00010 Pure off-line shoppers’ mean no. of items 00374 00031 00049 00006 00197 00016 Pure online shoppers’ mean no. of perishables 00477 00082 00248 00015 00058 00042 Pure off-line shoppers’ mean no. of perishables −00386 00102 −00009 00018 −00274 00052 Pure online shoppers’ mean no. of heavy/bulky items 00527 00035 −00005 00006 00404 00018 Pure off-line shoppers’ mean no. of heavy/bulky items −00420 00081 −00051 00015 −00255 00042 Pure online shoppers heavy items · Distance −00443 00075 −00019 00014 −00221 00039 Pure off-line shoppers heavy items · Distance 00706 00246 −00034 00045 00452 00126 No. of low inventory items 00558 00029 00035 00005 00127 00015 No. of low inventory perishables −00071 00094 00350 00017 −00141 00048 No. of low inventory heavy/bulky items −00067 00044 −00036 00008 00281 00023 Low inventory heavy/bulky items · Distance −00533 00081 00025 00015 −00379 00041 Family size −00095 00262 −00028 00047 −00062 00134 Preschool children −10204 00517 −00063 00094 −00921 00264 Parking 50632 00978 10481 00177 20369 00500 Partner store (yes/no) 80682 10100 10159 00199 30495 00562 MBS: Ratio of heavy/bulky a −90750 40710 −10308 00853 220744 20408 MBS: Ratio of perishables 70965 60036 190806 10093 10613 30086 Medium economic status −10117 10411 00069 00255 00521 00721 High economic status −00060 10274 00309 00231 00316 00651 Overall promotion 00392 00281 00106 00051 00052 00144 Promotion of perishables −20949 10077 00260 00195 −10234 00551 Promotion of heavy/bulky 10195 00520 −00029 00094 00604 00266 Delivery charges 40968 00310 −00203 00056 30293 00158 Weekday dummy −00470 10212 −00701 00219 −00216 00620 Bad weather dummy −80911 30115 −00314 00564 −40553 10593 Store distance 00134 00024 00007 00004 00077 00012 Office hours 170477 00942 10139 00170 90004 00481 R2 00232 00249 00287 N 16,796 16,796 16,796 a MBS, mean basket share for each household across all store visits. (across store visits) of perishables and heavy/bulky same zip code are grouped into the same bin, and all items. Characteristics of the most frequented physi- those zip codes with less than 200 observations are cal store include store square footage (a proxy for in- grouped into one bin. We estimated different sets of store shopping time but also correlated with the store parameters for different bins for the homogeneous location—a large mall, etc.) and indicators for seafood probit model. We find that it is not possible to pre- shop, parking lots, delivery service, and Internet store cisely identify these coefficients, and the great major- partnership. ity of the coefficients are not statistically significantly different from each other. 5.5. Robustness Checks To check whether possible differences in store assort- A model without instruments: We run an HB pro- ments play a role in store choice, we estimate an HB bit model of channel choice with the same channel model with households that primarily shop at part- choice drivers, but not accounting for the endogeneity ner stores and stores of the same size as the partner of some channel choice drivers. We find that, consis- stores. We find that the results obtained from this sub- tent with the literature on endogeneity, not accounting sample are very similar to those from the entire sam- for endogeneity leads to biased estimates. Some coef- ple, which alleviates concerns about store assortments ficients (e.g., coefficient of delivery charges) are biased influencing store choice. toward zero, and some coefficients (e.g., weather Households could systematically differ in their dummy) are wrongly signed (see the electronic com- package size choices even within a particular group of items panion for details). (perishables, heavy/bulky items). To account for this Zip-code-specific coefficients: To check how restrictive variation, we also tried a different operationalization it is to constrain all off-line stores to have the same of the number of items by weighting the numbers coefficients, we grouped off-line stores into 20 differ- by pack sizes relative to the smallest size in each ent bins based on their zip-code location—stores in the group. This did not change our results materially.
You can also read