Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice

Page created by William Manning
 
CONTINUE READING
Vol. 31, No. 1, January–February 2012, pp. 96–114
ISSN 0732-2399 (print) — ISSN 1526-548X (online)                                             http://dx.doi.org/10.1287/mksc.1110.0678
                                                                                                                     © 2012 INFORMS

       Quantifying Transaction Costs in Online/Off-line
                   Grocery Channel Choice
                                                    Pradeep K. Chintagunta
            Booth School of Business, University of Chicago, Chicago, Illinois 60637, pradeep.chintagunta@chicagobooth.edu

                                                         Junhong Chu
                      NUS Business School, National University of Singapore, Singapore 117592, bizcj@nus.edu.sg

                                                       Javier Cebollada
                              Public University of Navarra, 31006 Pamplona, Spain, cebollada@unavarra.es

      H     ouseholds incur transaction costs when choosing among off-line stores for grocery purchases. They may
            incur additional transaction costs when buying groceries online versus off-line. We integrate the various
      transaction costs into a channel choice framework and empirically quantify the relative transaction costs when
      households choose between the online and off-line channels of the same grocery chain. The key challenges in
      quantifying these costs are (i) the complexity of channel choice decision and (ii) that several of the costs depend
      on the items a household expects to buy in the store, and unobserved factors that influence channel choice also
      likely influence the items purchased. We use the unique features of our empirical context to address the first
      issue and the plausibly exogenous approach in a hierarchical Bayesian framework to account for the endogeneity
      of the channel choice drivers. We find that transaction costs for grocery shopping can be sizable and play an
      important role in the choice between online and off-line channels. We provide monetary metrics for several
      types of transaction costs, such as travel time and transportation costs, in-store shopping time, item-picking
      costs, basket-carrying costs, quality inspection costs, and inconvenience costs. We find considerable household
      heterogeneity in these costs and characterize their distributions. We discuss the implications of our findings for
      the retailer’s channel strategy.
      Key words: channel choice; online grocery shopping; transaction costs; plausibly exogenous;
        hierarchical Bayesian; green shopping
      History: Received: November 23, 2010; accepted: July 28, 2011; Preyas Desai served as the editor-in-chief and
        Bart Bronnenberg served as associate editor for this article. Published online in Articles in Advance
        December 20, 2011.

1.    Introduction                                                    of transaction costs for a retailer’s marketing strat-
Researchers have identified a number of transaction                   egy. In addition, we explore the “green” aspect of
costs such as opportunity costs of time and trans-                    online grocery shopping by quantifying possible soci-
portation costs as possible drivers of a consumer’s                   etal benefits of such shopping. As with the previous
choice of a physical grocery store (Bell et al. 1998,                 literature (Bell et al. 1998), we focus on store choice
Pashigian et al. 2003, Fox et al. 2004, Briesch et al.                conditional on a store visit. We use “store choice”
2009). In an online setting, researchers have found                   and “channel choice” interchangeably to mean choice
delivery charges and retailer reputation to be impor-                 of an off-line or online store. Furthermore, given our
tant factors influencing online store choices for non-                focus on channel choice conditional on making a
grocery items (Smith and Brynjolfsson 2001). When                     store visit, our money metrics represent the relative
consumers choose between online and off-line grocery                  transaction costs of shopping online versus shopping
channels, there could be additional transaction costs                 off-line.
that are specific to this purchase setting.                              Online and off-line channels provide varying lev-
   Our objectives in this paper are (1) to identify the               els of distribution services that entail different levels
transaction costs from the previous literature as well                of direct costs and transaction costs on consumers.
as any additional transaction costs that could influ-                 Direct costs refer to the sum of shelf prices (less dis-
ence the choice of online and off-line grocery channels,              counts, or net of discounts) of the items in the shop-
(2) to incorporate these costs into a model of con-                   ping basket; transaction costs are the costs needed
sumers’ choice between online and off-line grocery                    to transfer market goods in a store into consump-
channels, (3) to quantify and provide money metrics                   tion goods at home. Transaction costs vary from
for these costs, and (4) to investigate the implications              trip to trip and differ across households. They play
                                                                 96
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
Marketing Science 31(1), pp. 96–114, © 2012 INFORMS                                                                                97

an important role in household choice of shopping                      choice and category needs).1 A formal treatment of
venues. However, many transaction costs are non-                       all these factors simultaneously poses serious method-
monetary and hard to measure and quantify, and                         ological and computational issues. A key feature of
thus it is difficult for retailers to factor them in when              our data is that there is very little variation in quan-
designing marketing strategies. Because understand-                    tity choices across channels within a household—
ing consumers’ store and channel choice decisions is                   decomposing the variance in quantity choices within
of fundamental importance to retailers, it is of consid-               a category into the variance explained by household
erable value to managers to understand the monetary                    fixed effects and channel fixed effects, we find that
implications of the various transaction costs.                         the former explain 87.0%–99.8% of the variance. This
   The empirical context for quantifying transaction                   allows us to abstract away from considering purchase
costs in channel choice is a unique household panel                    quantities in the analysis.
                                                                          Next, accounting for each of the categories that a
data set from Spain. The data consist of the same
                                                                       household expects to buy on a store visit is nontrivial
households choosing between a chain’s off-line stores
                                                                       because there is considerable variation across house-
and its online store over six months. Using data from
                                                                       holds in the categories that drive their basket expendi-
a single chain may seem limiting because it may
                                                                       tures. Thus, we cannot fix the set of categories under
not capture all the purchases of a panelist. This con-                 analysis as being common across households as in
cern, although legitimate, is mitigated in our case, as                previous studies, nor can we focus on a smaller sub-
the panelists’ annual in-chain expenditures are about                  set of categories because to account for at least 80% of
80% of grocery expenditures by households of simi-                     a household’s basket expenditures we need to include
lar sociodemographics in the same area. Meanwhile,                     30 categories. As our interest in the categories bought
focusing on purchases within a chain has several                       stems from how they influence transaction costs and
advantages. First, issues such as store image as choice                as the choice of the channel does not affect quan-
drivers are no longer relevant. Second, the chain has                  tities bought, the identities of the specific categories
the same prices and promotions online and off-line.                    bought is not critical. Thus, we can simply summa-
Thus, direct costs are identical across channels and                   rize the information contained in the expected cate-
will not affect channel choice. Third, similar assort-                 gories purchased via metrics that will likely influence
ments are available everywhere, so this factor will not                transaction costs. Specifically, we focus on the total
play a role either. Our empirical context thus allows                  number of items, the number of perishable items, and
us to focus squarely on the role of transaction costs                  the number of heavy/bulky items (defined in §3.4)
in driving channel choice. Conventional reasons for                    that a household expects to purchase. One benefit of
why people shop online, such as lower prices, sales                    this classification is that transaction costs can be spec-
tax avoidance, and so on, are ruled out directly and                   ified as functions of these numbers. Thus, instead of
therefore do not confound our analysis.                                looking at channel choice and the expected purchase
   Nevertheless, empirically quantifying transaction                   in each category, our endogenous variables are chan-
costs is a challenging task because several of them,                   nel choice and the total number of items, the number
e.g., costs of putting items into the shopping cart                    of perishable items, and the number of heavy/bulky
                                                                       items that the household expects to buy.
and carrying them home, depend on the specific cat-
                                                                          Given the above set of potentially endogenous vari-
egories and their quantities that a household expects
                                                                       ables and given that channel choice depends on the
to buy on that shopping trip. These factors need not
                                                                       various expected numbers of items purchased, one
be exogenous because unobserved factors that influ-
                                                                       can adopt either a full information approach that
ence the choice of shopping channels could also influ-                 specifies how each of these variables is determined by
ence the categories and quantities bought. For exam-                   a consumer or a limited information approach where
ple, if a household member goes out purchasing other                   numbers of items are instrumented for in the analy-
goods for consumption (e.g., clothes), he or she may                   sis of channel choice. We choose the latter approach
decide to combine the trip with a visit to a physical                  because it not only obviates the need to specify the
grocery store. This unobserved factor that influences                  exact data generating process for these endogenous
off-line purchase could also influence the categories                  variables but also avoids the problem of incorrect
purchased (e.g., if the household has limited time to                  transaction cost estimates in the presence of a mis-
spend in the store). Hence, the potential endogeneity                  specified data generating process. We use as instru-
of the channel choice drivers needs to be accounted                    ments (i) each household’s inventory variables based
for. One way to deal with endogeneity is to specify                    on its own top expenditure categories and (ii) the
a full system of equations that characterize channel
choice, categories purchased, and associated quanti-                   1
                                                                        Because online and off-line channels have similar assortments,
ties (akin to Briesch et al. 2009, who model store                     brand choice is unlikely correlated with channel choice.
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
98                                                                                  Marketing Science 31(1), pp. 96–114, © 2012 INFORMS

numbers of items bought by pure online and pure off-             of transaction costs that map (not necessarily one-
line households of similar demographics to the pan-              to-one) onto the set of goods and distribution ser-
elists (see §5 for a justification of these instruments).        vices (Furubotn and Richter 1997). Stores differ in the
We combine these instruments with the Conley et al.              goods and distribution services provided, and hence
(2010) plausibly exogenous approach in a hierarchical            they differ in their direct and transaction costs to con-
Bayesian (HB) framework to account for the endo-                 sumers. For the same level of home production, con-
geneity of the numbers of items purchased. A key                 sumers choose stores with the lowest sum of direct
advantage of this approach is that it allows us to work          and transaction costs for their shopping basket.
with potentially weak instruments. Finally, to pro-                 Transaction costs play a role in all stages of a
vide money metrics for the various transaction costs,            consumer’s store choice process. Researchers have
we exploit the fact that although direct costs do not            applied the notion of transaction costs, in particu-
play a role in channel choice, the presence of deliv-            lar, the role of time costs and transportation costs,
ery costs allows us to compute the marginal utility              to analyze consumer’s choice of conventional stores
of income. Using the posterior distributions of the              (Bell et al. 1998), online stores (Smith and Brynjolfsson
parameter estimates, we quantify the distributions of            2001, Lee and Png 2004), and between online/direct
various transaction costs across households.                     retailers and conventional retailers (Liang and Huang
   The main contributions of our study are as follows:           1998, Balasubramanian 1998, Keeny 1999, Sinai and
(1) We integrate the transaction costs identified in the         Waldfogel 2004, Forman et al. 2009). Researchers have
literature and the additional transaction costs specific         identified several types of transaction costs in gro-
to the online and off-line grocery setting into a model          cery shopping in physical stores. Betancourt (2005)
of channel choice. (2) Methodologically, we apply the            synthesizes these costs into categories that map into
Conley et al. (2010) plausibly exogenous approach to             various distribution services: (1) Opportunity costs of
a nonlinear and hierarchical setting. (3) We provide             time comprise travel time to and from a store and
a strategy for reducing the complexity of the channel            in-store shopping and waiting time. (2) Transporta-
choice analysis by aggregating across categories in a            tion costs to and from a store include travel time and
manner that helps to simplify the problem substan-               costs, and are related to accessibility of store location.
tially without losing relevant information. (4) Sub-             (3) Psychic costs are costs such as perceived difficulty
stantively, we quantify and provide money metrics                of use, inconvenience, frustration, annoyance, anx-
for the important types of transaction costs in grocery          iety, drudgery, dissatisfaction, disappointment, per-
shopping that involves online and off-line channels,             sonal hassles, shopping enjoyment, or disagreeable
which can help retailers to better design marketing              social interactions that consumers are subject to in the
strategies for the two channels. We also take some               store environment (Ingene 1984, Devaraj et al. 2002).
preliminary steps to explore the implications of online          (4) Adjustment costs as a result of the unavailability of
shopping for the environment in terms of reducing                products at the desired time or in the desired amount
carbon emissions.                                                are costs that arise from additional time and trans-
                                                                 portation costs incurred because of forced search or
                                                                 lower utility associated with altering the consump-
2.   Transaction Costs and Online and                            tion bundle of goods. (5) Search costs are the costs of
     Off-line Channel Choice                                     time, transport, and other resources in the acquisi-
Transaction costs economics (Coase 1937; Williamson              tion of information with respect to price, assortment,
1979, 1981; Williamson and Masten 1999) emphasizes               physical attributes, or performance characteristics of
the role of transaction costs in economic exchanges.             the goods provided in different stores. In looking
The basic principle of transaction costs economics               at online shopping for nongrocery items, Smith and
is that agents choose to conduct transactions in a               Brynjolfsson (2001) identified two additional costs
way that minimizes their transaction costs. Grocery              as important transaction costs for choosing online
shopping is a canonical context that involves transac-           stores: (6) delivery costs and (7) waiting costs for basket
tions between retailers and consumers. The basic eco-            delivery.
nomic function of retailers is to provide consumers                 When it comes to the choice between online and off-
with explicitly priced goods and a set of distribution           line grocery stores, there could be additional transac-
services, including assortment, assurance of product             tion costs as well. Liang and Huang (1998) and Chircu
delivery, information, accessibility of location, and            and Mahajan (2006) mention these types of costs in
ambiance (Betancourt and Gautschi 1986, 1988). These             their analysis, and the popular press has discussed
goods and distribution services generate customer                them in some detail. One such cost that has been iden-
value and are essential inputs into household pro-               tified is (8) physical costs. Based on the experience of a
duction (Becker 1965). To carry out home produc-                 Webvan customer, Chircu and Mahajan (2006, p. 909)
tion, consumers need to incur direct costs and a set             note, “He lived on the 4th floor in a building with no
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
Marketing Science 31(1), pp. 96–114, © 2012 INFORMS                                                                                       99

elevator, and the delivery person carried bulky gro-                   3.    Data and Observed
cery purchases up the stairs for him.” Some online                           Shopping Patterns
blogs (e.g., Nick 2007) note the following when they
compare costs across off-line and online channels—                     3.1. The Grocery Chain
“You have to push a heavy cart” versus “Internet                       Our data are from a major grocery retail chain in
shopping carts weigh 0.003 grams,” “You have to                        Spain. The data are for one metro area, where the
maneuver through crowded aisles” versus “Internet                      retailer has about 200 physical stores and 1 online
aisles can fit any number of people,” and “You have to                 store. The online store is the retailer’s largest store
load your groceries in the car, unload them at home,                   by revenue. It partners with 18 of the chain’s phys-
and carry them into the kitchen” versus “The gro-                      ical stores for grocery supply. The partner stores
cery delivery person does all of this.” Physical costs,                were selected based on size (they are the largest
therefore, include (a) the costs of picking and putting                stores), geographic location (dispersed around the
items into the shopping cart, which differ significantly               metro area), and service ability such that they individ-
across online and off-line stores as the Internet option               ually have the ability to fulfill online orders (e.g., have
eliminates these costs on consumers; and (b) the costs                 adequate assortments, personnel, delivery trucks) and
of carrying the basket home, which are important                       collectively can do so efficiently.
in markets where many people walk or take public                          When a consumer makes her first online purchase,
transport to the store.                                                she needs to register an account with her loyalty card.
   A second such cost associated with the Internet                     Her account information is then stored in the com-
channel is (9) quality inspection or product evaluation                puter server and retrieved for subsequent orders. To
costs (BusinessWeekOnline 2001). Liang and Huang                       place an online order, a consumer first goes to the
(1998), in a nongrocery setting, refer to this as an                   grocer’s website and logs into her account. She then
“examination cost.” The Internet increases the costs                   browses or searches the aisles, selects the items she
of providing a given level of assurance of product                     wants to buy, and puts them into her basket. After
delivery at the desired time, or of the desired qual-                  filling her basket, she proceeds to checkout. The list
ity, as a result of the consumer’s inability to inspect                of the basket items is stored in the computer system
and acquire the product at the time and place of pur-                  as “previous shopping lists” and can be used for later
chase. Quality inspection costs may play an important                  “express” orders. The chain uses a centralized online
role when the shopping basket contains items such                      ordering system. After receiving an online order, the
as perishables whose qualities vary from purchase                      ordering system assigns it to one of the partner stores
occasion to purchase occasion. Delivery of perish-                     and notifies the household. The household then has
able goods with quality guarantee is considered the                    two options: it can go to the assigned partner store
biggest logistical problem for Internet grocers (Demby                 to pick up the order for free (such purchases are not
2000). Internet retailers also take special actions to                 observed in our data) or have the basket delivered
guarantee the quality of perishables ordered online:                   home within some chosen delivery time window (e.g.,
“If, for any reason relating to the freshness of a                     7–9 p.m. Monday) and incur delivery charges.2
Perishable Good, you are less than 100% satisfied,                        The retailer practices uniform pricing so prices are
COOLGREENANDSHADY.COM will arrange for the                             identical across all its online and off-line stores.3 It
re-delivery of your order” (Cool Green & Shady 2011).                  is a Hi-Lo chain and runs chainwide promotions.
   Direct costs and transaction costs are therefore a                  The online product offerings are the same for every-
function of the shopping channel, basket (product)                     one and available in all partner stores. Thus, assort-
characteristics, shopping occasion (duration and tim-                  ments are identical across the partner stores and the
ing of the trip) characteristics, and household charac-                online store. Assortments in nonpartner stores can dif-
teristics. Together, they vary across trips for the same               fer slightly because of store size differences and conse-
household, depending on when, what, and how much                       quent differentials in stock-out rates. We were able to
it buys. They also differ by households for the same
product bundle, depending on household sensitivities                   2
                                                                        Recently, chains such as Farm Stores in the Miami–Dade area
to different cost components and valuation of conve-                   in the United States have started offering online ordering com-
nience and time. For each shopping occasion, a house-                  bined with free drive-through pickup services (see http://www
hold trades off these costs and chooses a store with                   .miamiherald.com/2011/04/10/2159495/farm-stores-offers-online
the lowest sum of direct and transaction costs. Con-                   -grocery.html).
                                                                       3
sequently, we would expect to see both within- and                       We confirmed with the management and checked the data that
across-household variation in store choice as a func-                  the retailer indeed practices uniform pricing in all off-line stores
                                                                       and the online store. The retailer used to practice zone pricing in
tion of these factors. Accordingly, our model of store                 its off-line stores with two price zones, but it stopped off-line zone
choice needs to account for both direct and transac-                   pricing prior to our data period (see Chu et al. 2008 and 2010 for
tion costs.                                                            details).
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
100                                                                                               Marketing Science 31(1), pp. 96–114, © 2012 INFORMS

verify that 98.3% of the online items ordered were also                    3.3. Scanner Panel Data
purchased in nonpartner off-line stores in our data.                       We obtain the complete shopping records of 3,556
This, in addition to confirmation from management,                         households between May and November 2007. This
gave us the assurance that available assortments are                       is a random sample of the retailer’s online cus-
similar across stores (see also §5.5).                                     tomers. The households shop interchangeably in the
   People often walk or take public transport to buy                       online and off-line channels (we observe purchases
groceries (only 68% of stores have parking lots).                          in 196 stores). We observe at what time a household
About 60% of off-line stores also offer delivery ser-                      visits the chain, which store it visits, what items and
vice. The retailer has the same delivery policies for                      how much of each item it buys, whether the basket
online and off-line orders. It charges E6 for deliv-                       is home delivered, and delivery charges. We select a
ery if the basket is below E100 and E4 if the basket                       random sample of 1,025 households for model esti-
exceeds E100. Delivery is free for golden-card (a pre-                     mation. Table 1 presents major demographics and
mium loyalty card awarded to big spenders) clients                         store characteristics for all households and for the
if the basket exceeds E100. Households with quar-                          chosen sample. An average household has 3.37 mem-
terly in-chain expenditure exceeding E600 are eligi-                       bers, 0.68 preschool and 0.47 school-age children,
ble but need to apply for golden-card membership.                          2.14 work-age adults, and 0.07 elders. Of these house-
In sum, online and off-line channels have the same
                                                                           holds, 15.32%, 26.34%, and 58.34% live, respectively,
prices, price promotions, and delivery policies, as well
                                                                           in low, medium, and high income/economic areas.
as similar assortments.
                                                                           We also obtained data on pure online and pure off-
3.2. Store Price Promotions                                                line customers that we use to construct instruments.
The retailer provided us with price promotion data,
including categories and items promoted, promo-                            3.4.   Observed Shopping Patterns and
tion start and end dates, and depths of price cuts.                               Implications for Econometric Modeling
Promotions occur in vastly differing sets of cate-                         The households exhibit the following shopping pat-
gories. Each day there are on average 85 categories                        terns (see Table 2) that appear to be consistent with
and 419 items on promotion. Promotion durations                            the transaction costs of shopping at each channel.
range from three to eight weeks. Three-week, four-                         These patterns provide some model-free evidence of
week, and five-week promotions account for 22.9%,                          the importance of transaction costs in channel choice
43.2%, and 33.7% of the promotion cycles, respec-                          decisions.
tively. Such multiple-week promotions are quite dif-                         First, the availability of the Internet option does not
ferent from weekly promotions commonly practiced                           act as a sorting mechanism whereby certain house-
by U.S. grocers. Promotion depths vary substantially                       holds mainly shop online and others primarily shop
across products, ranging from 4.5% to 25.0% with a                         off-line. All sample households shop in both channels.
mean of 8.7%.                                                              There is not the “20/80” phenomenon, where 20% of

         Table 1     Household Demographics and Characteristics of Most Frequented Off-line Stores

                                                                           Entire sample                            Random sample

                                                                    Mean                   Std. dev.           Mean              Std. dev.

         Family size                                                 3039                      2020               3037              2005
           No. of preschool children (0–5)                           0069                      0089               0068              0087
           No. of school-age children (6–18)                         0044                      0085               0047              0089
           No. of working adults (19–65)                             2018                      1089               2014              1074
           No. of of elders (65+)                                    0007                      0047               0007              0035
         Distance to most frequented off-line store                  1013                      4001               0099              2061
           Percentage of reside in low economic area                15043                                        15032
           Percentage of reside in medium economic area             24096                                        26034
           Percentage of reside in high economic area               59061                                        58034
           Percentage of downtown                                   41046                                        39061
         Characteristics of most frequented off-line stores
           Store square footage (m2 5                            11480061                  11102038           11473063            987001
           Percentage of bazaar section                             63086                                        64039
           Percentage of butcher shop                               91068                                        90083
           Percentage of processed meat                             93038                                        92029
           Percentage of fish shop                                  83005                                        82005
           Percentage of having parking lots                        68003                                        68068
           Percentage of home delivery                              60044                                        60068
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
Marketing Science 31(1), pp. 96–114, © 2012 INFORMS                                                                                                                              101

                          Table 2       Characteristics of Household Shopping Behavior by Channel

                                                                                                            Total                        Online                 Off-line

                                                                                                  Mean              Std. dev.   Mean         Std. dev.   Mean        Std. dev.

                          Total trips                                                               16018            11067        5056          4022      10061        10098
                          Mean trip interval (days)                                                 10097            12034       14046         14029       9014        10075
                          Basket size (E)                                                           82099            82059      155077         83048      44085        50012
                          Half-year spending (E)                                                 11342060           874050      866056        697073     476004       534010
                          No. of unique categories per trip                                         16086            13011       27067         11000      11019        10026
                          No. of unique perishable categories                                        4011             4082        4087          5057       3072         4032
                          No. of unique heavy/bulky categories                                       5006             4079        9089          3067       2054         3006
                          No. of unique items per trip                                              22061            19018       38047         17026      14030        14028
                          No. of unique perishable items                                             4069             5073        5055          6060       4024         5015
                          No. of unique heavy/bulky items                                            6064             6086       13039          5098       3010         4006
                          Percentage of perishables in trip expenditure                             23021            25034       11069         13014      29024        27097
                          Percentage of heavy/bulky in trip expenditure                             33003            26014       49030         17081      24050        25077
                          Coefficient of variation: Basket                                           0049             0035        0024          0016       0073         0032
                          Coefficient of variation: Trip interval                                    0073             0025        0052          0025       0086         0032
                          Channel switching: Household level (%)                                    58069            33070       68092         31051      48050        32072

the households account for 80% of the online trips or                                                                They make major trips to the online channel and fill-
expenditures (see Figure 1). The mean ratio of online                                                                in trips to the off-line channel (Kahn and Schmittlein
trips to total trips is 41.8% (std. dev. = 2407%). The                                                               1989). Online baskets (E155.8) on average are 3.5 times
ratio is below 20% for 22% of the households, above                                                                  as large as off-line baskets (E44.9). Households buy
80% for 11% of the households, and between 20%                                                                       28 (11) unique categories and 38 (14) unique items
and 80% for 67% of the households. The mean ratio                                                                    on an online (off-line) trip; and 95% of a household’s
of online expenditures to total expenditures is 65.0%                                                                online trips have a basket larger than its mean off-
(std. dev. = 2409%). The ratio is below 20% for 7%                                                                   line basket, and 63% of its off-line trips have below-
of the households, between 20% and 80% for 58% of                                                                    average-sized baskets. This suggests that relative to
the households, and above 80% for 35% of the house-                                                                  shopping off-line, transaction costs of shopping online
holds. Channel “switching” is common in the data.                                                                    are low when the basket is large but high when the
The online-to-off-line switching probability is 56.5%,                                                               basket is small. It also raises the possibility that there exist
and the off-line-to-online probability is 29.6%. This                                                                unobservable factors that influence both channel choice and
suggests that for the same households, transaction                                                                   number of items purchased.
costs of buying online versus off-line vary from trip                                                                   Third, households also sort their trips to online
to trip. Delivery charges are involved for 22% of the                                                                and off-line channels based on category character-
online trips and 1% of the off-line trips. Because house-                                                            istics. Online baskets consist of more heavy/bulky
holds visit both channels, the same household’s temporal                                                             items, reflecting the convenience benefit of online
channel choice decisions can only be explained by trip-level                                                         shopping, whereas off-line baskets have more perish-
factors, even though demographics may explain the average                                                            ables, reflecting the ability of the physical channel to
online shopping intensity across households.                                                                         allow for quality checking. Perishables refer to fresh
   Second, households sort their shopping trips to                                                                   produce, meat, seafood, and bakery items in a house-
online and off-line channels based on basket size.                                                                   hold’s basket. Heavy/bulky items refer to bottled,
                                                                                                                     canned, or bagged consumer packaged goods (CPGs)
Figure 1         Household Distribution by Shares of Online Grocery                                                  (e.g., mineral water, beer, toilet paper) and liquid-rich
                 Expenditure and Trips                                                                               non-CPGs (e.g., watermelon). Examples of these are 4
20                                                                                                                   one-liter bottles of water, 12 rolls of toilet tissue, etc.
            Grocery $
            No. of trips                                                                                             Note that some categories can appear in both heavy
16
                                                                                                                     and perishable sets (e.g., watermelon). An online (off-
12                                                                                                                   line) basket has 13.4 (3.1) unique heavy/bulky items
                                                                                                                     and 5.6 (4.2) perishable items. Heavy/bulky (perish-
 8                                                                                                                   able) items account for 49.3% (11.7%) of online and
                                                                                                                     24.5% (29.2%) of off-line trip expenditures. Further-
 4
                                                                                                                     more, a household, on average, buys 29.3 categories
 0                                                                                                                   exclusively online, 32.4 categories exclusively off-line,
                                                                                                                     and 23.2 categories in both channels. The channel-
      90

                                                                                                                     specific category purchases suggest that the transac-
                                                Online shares (%)                                                    tion costs of buying the same categories differ by
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
102                                                                                                                                                                                                                                                                  Marketing Science 31(1), pp. 96–114, © 2012 INFORMS

Figure 2    Distribution of Shopping Trips by Basket Size
                   20
                                                                                                                                                                                                                                                                                                                                                                   Off-line
                                                                                                                                                                                                                                                                                                                                                                   Online
                   16

                   12
             (%)

                    8

                    4

                    0
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
Marketing Science 31(1), pp. 96–114, © 2012 INFORMS                                                                            103

the other two decreased the proportion. This seems                     online store, implying that the costs of off-line shop-
to suggest that promotions of different items differ-                  ping during office hours may be high compared with
entially influence the propensity to visit the differ-                 online shopping. We include a dummy variable for
ent channels, e.g., observing promotions of perishable                 trips that occurred during office hours (8 a.m.–noon,
items encourages customers to visit the off-line store.                and 2–6 p.m.).
Therefore, we need to account for the differential effects of              (2) Transportation costs are directly related to the
promotions across channels, even though the economic ben-              home–store distance. Consistent with the literature
efits of these promotions are the same across channels.                (e.g., Bell et al. 1998), we use home–store distance
   In sum, the observed shopping patterns appear to                    as a proxy for transportation costs. Thus, home–
show that transaction costs of shopping differ by chan-                store distance captures the effect of both travel time
nel, basket size and composition, shopping occasion,                   and transportation costs. To more accurately quantify
and household characteristics. Our econometric spec-                   transportation costs and travel time, we need trip-
ification tries to incorporate these into the analysis.                level mode of transportation used, which is not avail-
                                                                       able to us. Using information at some aggregate level,
                                                                       such as zip-code car ownership, is not useful because
4.    Econometric Specification                                        this information does not vary over our short data
4.1. Description of Shopping Costs                                     duration. Furthermore, there is limited variation from
A shopping trip involves direct costs and transaction                  one zip code to another. In the empirical analysis, we
costs. As the retailer has uniform pricing in the two                  interact distance with household income level to cap-
channels, direct costs drop out in any discrete model                  ture differences in time costs across households.
of channel choice, so we do not consider them. We                          (3) Psychic costs are difficult to measure but are
now describe how we measure the nine types of trans-                   closely related to the characteristics of the shopping
action costs identified. For a given basket, search costs              channels themselves. We use channel dummies and a
to find a store are unlikely to play a role in our case,               household-, trip-, and channel-specific error term to
although in-store search costs will differ, which are                  reflect these costs.
accounted for as follows:                                                  (4) Adjustment costs as a result of product unavailabil-
    (1) Time costs are the opportunity costs of travel                 ity and (5) in-store search costs: As with psychic costs,
time and in-store shopping time. Shopping time is an                   these costs are channel specific and difficult to mea-
important factor in store choice decision. Messinger                   sure. They may vary over time because of different
and Narasimhan (1997) find that the growth of one-                     products not available in different channels at differ-
stop shopping has been a response to the growing                       ent times and familiarity with product layouts. We
demand for timesaving convenience. Pashigian et al.                    represent them by a household-, trip-, and channel-
(2003) examine how firms respond to the increasing                     specific error term.
cost of a shopper’s time in ways that economize on                         (6) Basket delivery costs: Smith and Brynjolfsson
shopping time. Consistent with the literature (e.g.,                   (2001) find consumers are more sensitive to deliv-
Bell et al. 1998), we use home–store distance as a                     ery charges than to product prices when shopping
proxy for travel time. As did Hoch et al. (1995), we                   online. Lewis et al. (2006) find a significant effect of
scale distance in the downtown area by 2 to reflect                    nonlinear shipping and handling fees on purchase
its greater congestion. We also scale distance by two                  incidence and expenditure decisions for dry groceries
if the shopping trip occurs at peak weekday hours                      online. The shadow price of delivery charges differs
(7:30–9 a.m. and 6–8 p.m.) to reflect the longer time                  for online and off-line shopping. For online shop-
needed to travel the same distance.                                    ping, delivery charges cover transportation costs to
   In-store shopping time primarily is a function of the               and from the store, opportunity costs of travel time,
number of items purchased. We use the number of                        and physical costs of basket picking and carrying. For
basket items as a proxy for in-store shopping time.                    off-line shopping, they only cover physical costs of
Online item selection and checkout differ from that                    basket carrying. Thus, households may be less sensi-
off-line. For the same set of items, a household may                   tive to delivery charges when shopping online than
need less time to shop online than off-line, so we                     off-line.
allow the marginal costs of shopping time to differ by                     (7) As with adjustment costs, waiting costs for bas-
channel.                                                               ket delivery are difficult to measure and may vary over
   The opportunity costs of time also depend on when                   time because of instances of more urgent need of a
a household visits a store. We include a weekday                       product than others. We represent them by the chan-
dummy to capture the higher opportunity costs of                       nel dummy and by the household-, trip-, and channel-
time on weekdays. Furthermore, a consumer may not                      specific error term.
be able to leave her office to buy groceries in a physi-                   (8) Physical costs of picking items are related to
cal store during office hours but can still do so in the               the number of items, particularly the number of
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
104                                                                                    Marketing Science 31(1), pp. 96–114, © 2012 INFORMS

heavy/bulky items purchased. It is a simple mouse                                 + hj7 NHht · dhjt + hj8 Wt + hj9 DCht
click for online shopping but can be high for off-
                                                                                  + hj10 Pht + hj11 PPht + hj12 PHht + ˜hjt 1      (1)
line shopping. We use the number of heavy/bulky
items as a proxy for item-picking costs and allow the               where FChjt is total transaction costs and dhjt is the
marginal costs to differ by channel. The costs of car-              home–store distance, which changes from trip to trip
rying a basket home are a function of the number of                 because a household does not always visit the same
heavy items in the basket and the distance traveled.                off-line store. Variable WKt is a weekday dummy and
We use their interaction as a proxy for basket-carrying             OHt is an office hour dummy; NIht , NPht , and NHht
costs and set them to 0 for online shopping and off-                are numbers of items, perishables, and heavy/bulky
line shopping with home delivery. The costs of car-                 items, respectively; and NHht · dhjt is the interaction
rying a basket also depend on weather conditions, so                between heavy items and distance. Wt is a dummy
we allow weather to influence the choice of online                  for bad weather and DCht is the delivery charge; Pht ,
and off-line channels differently.                                  PPht , and PHht are the overall promotion index and
    (9) Costs of inability to verify product quality prior to       promotion indices of perishables and heavy/bulky
paying are primarily a function of perishables bought               items; and ˜hjt stands for unobserved search, adjust-
because product qualities such as freshness can vary                ment, and waiting costs that we assume are normally
a lot from one purchase occasion to the next, whereas               distributed.
the quality of packaged goods does not change much                     We constrain the coefficients for the 196 off-line
over purchase occasions. We use the number of per-                  stores to be the same and subtract the costs of shop-
ishables bought as a proxy for these costs. Physical                ping at the off-line channel from those at the online
stores allow households to check product quality                    channel. Two factors necessitate our assumption on
prior to paying; the online store lacks this ability                the parameters across off-line stores. First, we do
because the actual quality can differ from what is                  not have a sufficient number of observations for
shown on the computer screen. Thus, households may                  each store to estimate separate effects. Second, even
be more likely to visit the off-line channel when buy-              if we did, it would be computationally infeasible
ing more perishables.                                               to estimate store-specific parameters for each of the
   Intrinsic channel preferences and an error term                  196 stores. At the same time, the assumption is not
capture, respectively, the time-invariant (channel                  restrictive for the following reasons: (1) All off-line
image, environmental concerns, etc.) and time-vary-                 stores belong to the same chain and thus share the
ing aspects of unobserved transaction costs. Thus,                  same or similar chain image, service quality, product
given our data, we are able to quantify the impor-                  layout, returns policy, etc. The effects of other fac-
tant types of transaction costs. To account for differ-             tors such as store distance and size are accounted for.
ences in the perceived benefits of promotions across                (2) We estimate coefficients for each household. Even
channels for the different products purchased by the                though there are 196 stores, 32.9%, 31.1%, and 20.2%
households, we include overall promotions, promo-                   of the households only visit one, two, or three off-line
tions of heavy/bulky items, and promotions of per-                  stores, respectively. (3) For 74.1% of the households,
ishable items as drivers of channel choice. Promotions              the most frequented off-line stores account for over
could result in households choosing the off-line chan-              60% of off-line trips. This combined with household-
nel because of lower cherry-picking costs in the off-               level parameters should capture some of the het-
line stores.                                                        erogeneity across stores as well. In §5.5, we try to
                                                                    assess the impact of this assumption. Nevertheless,
4.2. Econometric Model
                                                                    we acknowledge this as a possible limitation of our
On each shopping trip, a household compares the
                                                                    analysis.
expected costs of different channels and chooses the
                                                                       Next, we drop subscript t for ease of exposi-
one with the lowest expected costs to maximize its
                                                                    tion. Define Yh ≡ FCh1 on − FCh1 off , ˜h ≡ ˜h1 on − ˜h1 off ,
utility. We assume households have rational expec-
                                                                    and h ≡ h1 on − h1 off . Let Xh ≡ {NIht , NPht , NHht ,
tations over shopping costs, so expected costs are
                                                                    NHht · dhjt } be the vector of numbers of items, per-
the same as actual costs. It is reasonable to assume
rational expectations on weather, delivery charges,                 ishable items, heavy/bulky items, and its interaction
and travel distance. Given chainwide promotions and                 with distance; let Gh be other costs in Equation (1)
the availability of the Internet option, it is also rea-            and Ih be the choice indicator. ‚h represents the subset
sonable to assume rational expectations over promo-                 of h that is associated with the variables Xh , and ƒ 1h
tions. Household h’s expected total transaction costs               represents all other coefficients in h . Thus, we have
of shopping at channel j on trip t are                              the following relationships:

   F Chjt = hj0 + hj1 dhjt + hj2 WKt + hj3 OHt                                     Yh = Gh ƒ1h + Xh ‚h + ˜h 1                     (2)

            + hj4 NIht + hj5 NPht + hj6 NHht                                       Ih — Yh ∼ Binomial Probit0                      (3)
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
Marketing Science 31(1), pp. 96–114, © 2012 INFORMS                                                                                    105

4.3. Endogeneity of Channel Choice Drivers                                 We write out this model as a sequence of condi-
Numbers of items, perishables, and heavy/bulky                          tional distributions and run the Gibbs sampler to get
items are household decision variables on trip t                        the Markov chain Monte Carlo (MCMC) sequence
and thus may be endogenous to the channel choice                        (Chib 2001, Rossi et al. 2005, Conley et al. 2010). The
decision because unobserved factors (as discussed                       posterior distributions are
previously) that influence numbers of items bought
also likely influence the choice of shopping chan-                           Yh — Gh 1 Xh 1 Zh 1 ƒ1h 1 ‚h 1 ƒ1 è ∼ Truncated Normal1
nels (i.e., Xh could be correlated with ˜h 5. There are
                                                                        8ƒ1h 1 ‚h 9 — Yh 1 Gh 1 Xh 1 Zh 1 D1 ƒ1 ç1 è1 ã1 V— ∼ Normal1
several approaches to address this issue: (i) explic-
itly model each of these decisions—a full-information                       ƒ — Yh 1 Gh 1 Xh 1 Zh 1 8ƒ1h 1 ‚h 91 ç1 è1 Œƒ 1 Aƒ ∼ Normal1
structural approach, (ii) the control function approach
                                                                                ç — Gh 1 Xh 1 Zh 1 8ƒ1h 1 ‚h 91 è1 ç̄1 Aç ∼ Normal1
of Petrin and Train (2010), and (iii) the plausibly
exogenous approach of Conley et al. (2010). These                           è — Yh 1 Gh 1 Xh 1 Zh 1 8ƒ1h 1 ‚h 91 ƒ1 ç ∼ Inverted Wishart1
approaches are similar in nature. The second and
third approaches rely on finding exogenous instru-                                     㠗 8ƒ1h 1 ‚h 91 V— 1 ã̄1 Aã ∼ Normal1
ments for the endogenous variables, and the first                                 V— — 8ƒ1h 1 ‚h 91 v—0 1 V—0 ∼ Inverted Wishart0       (9)
approach requires the presence of some excluded
variables.                                                              The details are available in the electronic companion
   The plausibly exogenous approach, of which the                       to this paper that is available as part of the online
conventional instrumental variables technique is a                      version that can be found at http://mktsci.journal
special case, relaxes the exclusion restriction that                    .informs.org/.
instrumental variables have no correlation with the
unobservables; thus it allows for the use of less-than-
perfect instruments. It works well with both strong                     5.       Variable Operationalization and
and weak instruments, so we adopt this approach. We                              Robustness Checks
find a set of instruments Zh for endogenous variables
                                                                        5.1. Choice of Instruments
Xh and include them into Equation (2) as follows:
                                                                        To resolve the endogeneity issue, we need to find vari-
               Yh = Gh ƒ1h + Xh ‚h + Zh ƒ + ˜h 1                  (4)   ables that affect the numbers of items, perishables, or
                                  Y                                     heavy/bulky items a household buys on trip t but
                    Xh = 4Gh Zh 5 +vh 0                           (5)   that do not directly influence the choice of the shop-
   This approach essentially replaces the actual num-                   ping channel. What would be reasonable instruments
bers of items, perishables, and heavy/bulky items                       for the numbers of items bought? A key benefit of
with their expected numbers, which are determined                       our data is that we have access to purchases in all
by exogenous instruments (Gh Zh 5. When ƒ = 0,                          categories (about 700 categories, as defined by the
this model reduces to the conventional instrumen-                       retailer, were bought by the households), including
tal variables model. Assume (˜h 1 vh 5 ∼ N401 è5. Let                   meat, fish, milk, and produce, that make up a signif-
p4Data — ƒ1h 1 ‚h 1 ƒ1 ç1 è5 be the likelihood of the data,             icant proportion of a household market basket. This
p4ƒ1h 1 ‚h 1 ç1 è5 the prior distribution of the model                  allows us to use each household’s own top expen-
parameters, and p(ƒ — ƒ1h 1 ‚h 1 ç1 è5 the prior distri-                diture categories to construct its inventory variables.
bution of ƒ. The inference is based on the posterior                    Importantly, these inventory variables influence chan-
distribution of (ƒ1h 1 ‚h 1 ç1 è5, integrating out ƒ:                   nel choice primarily via influencing the numbers of
                              Z                                         items bought; i.e., lower inventory levels in many cat-
   p4‚h 1 ƒ1h 1 ç1 è — Data5 ∝ 6p4Data — ‚h 1 ƒ1h 1 ƒ1 ç1 è5            egories will result in greater numbers of items being
                                                                        bought on that trip.5 To check whether this is indeed
              · pƒ 4ƒ — ‚h 1 ƒ1h 1 ç1 è5p4‚h 1 ƒ1h 1 ç1 è5 dƒ70 (6)
                                                                        the case, we run homogeneous probit models of chan-
 To overcome the computational burden of this                           nel choice that, respectively, include (1) inventory lev-
model, we cast it in an HB framework as                                 els of all items, heavy/bulky items, and perishables;
                                                                        (2) numbers of items, heavy/bulky items, and perish-
   B = Dã + U— 1        U— ∼ N401 V— 51        B ≡ 8ƒ1h 1 ‚h 91   (7)   ables; and (3) both sets of variables in (1) and (2); and
where D is household demographics and ã is the                          we look at changes in the model fit. We find that num-
second-stage coefficients. The common priors are                        bers of items have a much larger explanatory power
                                                                        than inventory levels. Furthermore, when numbers
ƒ ∼ N4Œƒ 1 A−1
            ƒ 51        vec4ç — Vh 5 ∼ N4vec4ç̄51 è22 ⊗ A−1
                                                         ç 51           of items are included in the model, two of the three
 è ∼ IW4v0 1 V0 51      vec4㠗 V— 5 ∼ N4vec4ã̄51 V— ⊗ A−1
                                                        ã 51
                                                                        5
                                                                         We assume that there is no serial correlation in the channel choice
                       V— ∼ IW4v—0 1 V—0 50                       (8)   equation errors as is typical in these models.
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
106                                                                                 Marketing Science 31(1), pp. 96–114, © 2012 INFORMS

inventory coefficients are no longer significant (see the        5.2. Operationalization of Promotion Variables
electronic companion for details). We therefore con-             We first calculate category promotion depths as
clude that a household’s numbers of low-inventory                weighted averages of price cut depths across all items
items, low-inventory perishables, and low-inventory              in the category, using each household’s mean item
heavy/bulky items are reasonable instruments for our             expenditures as weights. We then compute an over-
endogenous variables (recall that our method does                all promotion index as the weighted average of cate-
not require perfect instruments). We first identify the          gory promotion depths, using each household’s mean
top 30 categories for each household by looking at the           basket shares of its top 30 categories as the weights.
proportion of that household’s total grocery expendi-            Similarly, we compute promotion indices for perish-
ture (across all purchase occasions) accounted for by            ables and heavy/bulky items as weighted averages of
each category purchased. Because household-specific              promotion depths. The weights are each household’s
top 30 account for 81.1% of the basket value (std.               mean basket shares of the respective items among
dev. = 904%), using the top 30 is nearly equivalent to           its top 30 categories. Thus, the promotion indices are
using the entire basket. This aggregation strategy sub-          household specific.
stantially reduces the complexity of the model to a
manageable scale without loss of relevant informa-               5.3. Operationalization of Other Cost Variables
tion. Next, we compute category inventory for each               We use a weather indicator for rainy, stormy, and
trip as in Gupta (1988) and find categories within               windy days. The home–store distance refers to zip-
each household’s top 30 that have below-average                  code-level distance between a household’s home and
inventories for that household. We then sum up the               the store visited on that occasion. On occasions when
mean numbers of items across the low-inventory cat-              the online channel is visited, the distance for the off-
egories, perishables, or heavy/bulky categories for              line option is the value for that household’s most
each household.                                                  frequented physical store.6 The distance is zero for
   We also have information on the numbers of items              online shopping.
bought by households that have similar demograph-                   The delivery charges variable (DCht 5 is defined as
ics to the panel households but do not make a channel            follows. For all off-line visits entailing delivery, the
choice decision, i.e., “pure” online and “pure” off-line         variable drops out because the charges are the same
households. We compute trip-level mean numbers of                in both cases. For off-line visits with no delivery,
items (heavy/bulky items, perishables) on each day               DCht = 0 for the off-line channel and the amount cor-
for pure online and pure off-line shoppers, broken               responding to the basket size for the online channel,
down by observed demographics, and match them                    and DCht = 0 for golden-card members if the basket
to each sample household. The rationale for using                exceeds E100. For online visits, we assume that the
these variables as instruments is that variation in              off-line option corresponds to no delivery, so DCht =
numbers of items bought can be partly explained by               0 for the off-line channel and DCht equals the actual
unobserved consumer characteristics. As long as these            charges for the online channel. We recognize that this
characteristics are correlated with observed demo-               is an approximation in that, to deal with this struc-
graphics, other households’ purchases will be cor-               turally, we need to consider off-line visits with and
related with those of the target household. Because              without delivery as two options. However, other than
these other households do not make a channel choice,             delivery charges, there is nothing else in the data that
their numbers of items are unlikely correlated with              informs us of this choice, so we treat off-line visits as
the unobservables that drive channel choice for the              a single choice. Furthermore, as noted before, a very
households in our panel.                                         small fraction of off-line shoppers avail of the deliv-
   These two sets of instruments explain, respectively,          ery option.
19.1%, 20.4%, and 23.9% of the variation in the num-
ber of items, number of perishables, and number                  5.4. Demographics and Store Characteristics
of heavy/bulky items. Together with demographics,                These variables appear in the hierarchy as D in Equa-
they explain 19.8%, 23.6%, and 24.7% of the varia-               tion (7) and allow us to account for observable house-
tion; and in combination with exogenous variables in             hold heterogeneity. Demographics include family size,
Equation (1), they explain 23.2%, 24.9%, and 28.7%               number of children, and the residential area’s eco-
of the variation in the endogenous variables (see                nomic status or income (low/medium/high). House-
Table 3). These instruments appear to have indepen-              hold basket characteristics include mean basket shares
dent explanatory power. Although we recognize the
                                                                 6
potential consequences of using multiple weak instru-              This is the best approximation as it is the cost for most off-line
                                                                 trips. Households visit their most frequented off-line stores for
ments, this is less likely in our case because of the low        major trips and fill-in trips. We also used the distance to the clos-
correlations between instruments (see the electronic             est off-line store (the two are the same for many households) and
companion for details).                                          obtained similar results.
Chintagunta, Chu, and Cebollada: Quantifying Transaction Costs in Online/Off-line Grocery Channel Choice
Marketing Science 31(1), pp. 96–114, © 2012 INFORMS                                                                                           107

Table 3        Regression of Endogenous Variables on Instruments

                                                                      No. of items              No. of perishables        No. of heavy/bulky items

                                                               Estimate        Std. error    Estimate       Std. error   Estimate        Std. error

Intercept                                                      −380358           30727       −40121           00675      −340932           10905
Pure online shoppers’ mean no. of items                          00103           00019        00013           00003       −00006           00010
Pure off-line shoppers’ mean no. of items                        00374           00031        00049           00006        00197           00016
Pure online shoppers’ mean no. of perishables                    00477           00082        00248           00015        00058           00042
Pure off-line shoppers’ mean no. of perishables                 −00386           00102       −00009           00018       −00274           00052
Pure online shoppers’ mean no. of heavy/bulky items              00527           00035       −00005           00006        00404           00018
Pure off-line shoppers’ mean no. of heavy/bulky items           −00420           00081       −00051           00015       −00255           00042
Pure online shoppers heavy items · Distance                     −00443           00075       −00019           00014       −00221           00039
Pure off-line shoppers heavy items · Distance                    00706           00246       −00034           00045        00452           00126
No. of low inventory items                                       00558           00029        00035           00005        00127           00015
No. of low inventory perishables                                −00071           00094        00350           00017       −00141           00048
No. of low inventory heavy/bulky items                          −00067           00044       −00036           00008        00281           00023
Low inventory heavy/bulky items · Distance                      −00533           00081        00025           00015       −00379           00041
Family size                                                     −00095           00262       −00028           00047       −00062           00134
Preschool children                                              −10204           00517       −00063           00094       −00921           00264
Parking                                                          50632           00978        10481           00177        20369           00500
Partner store (yes/no)                                           80682           10100        10159           00199        30495           00562
MBS: Ratio of heavy/bulky a                                     −90750           40710       −10308           00853       220744           20408
MBS: Ratio of perishables                                        70965           60036       190806           10093        10613           30086
Medium economic status                                          −10117           10411        00069           00255        00521           00721
High economic status                                            −00060           10274        00309           00231        00316           00651
Overall promotion                                                00392           00281        00106           00051        00052           00144
Promotion of perishables                                        −20949           10077        00260           00195       −10234           00551
Promotion of heavy/bulky                                         10195           00520       −00029           00094        00604           00266
Delivery charges                                                 40968           00310       −00203           00056        30293           00158
Weekday dummy                                                   −00470           10212       −00701           00219       −00216           00620
Bad weather dummy                                               −80911           30115       −00314           00564       −40553           10593
Store distance                                                   00134           00024        00007           00004        00077           00012
Office hours                                                    170477           00942        10139           00170        90004           00481
R2                                                               00232                        00249                        00287
N                                                               16,796                       16,796                       16,796
  a
      MBS, mean basket share for each household across all store visits.

(across store visits) of perishables and heavy/bulky                             same zip code are grouped into the same bin, and all
items. Characteristics of the most frequented physi-                             those zip codes with less than 200 observations are
cal store include store square footage (a proxy for in-                          grouped into one bin. We estimated different sets of
store shopping time but also correlated with the store                           parameters for different bins for the homogeneous
location—a large mall, etc.) and indicators for seafood                          probit model. We find that it is not possible to pre-
shop, parking lots, delivery service, and Internet store                         cisely identify these coefficients, and the great major-
partnership.                                                                     ity of the coefficients are not statistically significantly
                                                                                 different from each other.
5.5.   Robustness Checks                                                             To check whether possible differences in store assort-
    A model without instruments: We run an HB pro-                               ments play a role in store choice, we estimate an HB
bit model of channel choice with the same channel                                model with households that primarily shop at part-
choice drivers, but not accounting for the endogeneity                           ner stores and stores of the same size as the partner
of some channel choice drivers. We find that, consis-                            stores. We find that the results obtained from this sub-
tent with the literature on endogeneity, not accounting                          sample are very similar to those from the entire sam-
for endogeneity leads to biased estimates. Some coef-                            ple, which alleviates concerns about store assortments
ficients (e.g., coefficient of delivery charges) are biased                      influencing store choice.
toward zero, and some coefficients (e.g., weather                                    Households could systematically differ in their
dummy) are wrongly signed (see the electronic com-                               package size choices even within a particular group of items
panion for details).                                                             (perishables, heavy/bulky items). To account for this
    Zip-code-specific coefficients: To check how restrictive                     variation, we also tried a different operationalization
it is to constrain all off-line stores to have the same                          of the number of items by weighting the numbers
coefficients, we grouped off-line stores into 20 differ-                         by pack sizes relative to the smallest size in each
ent bins based on their zip-code location—stores in the                          group. This did not change our results materially.
You can also read