Using Market Basket Analysis to Estimate Potential Revenue Increases for a Small University Bookstore
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ Using Market Basket Analysis to Estimate Potential Revenue Increases for a Small University Bookstore Bogdan Hoanca afbh@uaa.alaska.edu Computer Information Systems Kenrick Mock kenrick@uaa.alaska.edu Computer Science University of Alaska Anchorage Anchorage AK 99508, USA Abstract Market basket analysis (MBA) is a widely used technique for identifying affinities among items that customers purchase together. MBA metrics are support, confidence, and lift. We show that support and confidence may include misleading information about the nature of the affinity, and that lift is the most useful metric. Starting with the MBA, we use the product affinities to predict ways to increase revenues, and we estimate the magnitude of the possible increases as a function of customer price sensitivity and affinity saturation level. We also point out limitations of the MBA and suggest ways to overcome them. For the case of a small university bookstore, we identify pairings of items that have revenue-increasing potential. Depending on the customers’ price sensitivity and affinity saturation level, revenues could be increased by as much as $10,000 or as little as $100 for a $ 410,000 starting level. In particular, we identify pairings where customer price sensitivity might be overcome (one-time situations, for example graduation-related purchases). This case study is the first to provide an actual magnitude of the estimate of potential revenue increases. Keywords: market basket analysis, scanner panel data, forecasting revenue increases 1. INTRODUCTION magnitude of such increases for existing businesses. This paper is the first to predict Market basket analysis (MBA) is a well known potential revenue increases based on an MBA technique for uncovering affinity relationships analysis for a small university bookstore. In the among items purchased together. Also known as remainder of the paper we refer to increasing scanner panel data, the technique identifies the revenues (by focusing on the sales price), associations between items or between but the same approach can be used for categories of items that customers tend to optimizing profits (which is done by focusing on purchase together (complements) or between the profit margin). items customers rarely purchase together (substitutes). The paper first introduces the basics of MBA and the limitations of MBA metrics in providing Despite the research on how to best extract and information about possible increases in revenues optimize the item associations, nothing has been (Section 2). In Section 3, we discuss how the written about the overall ability of the MBA to MBA metrics can be used to forecast possible increase revenues (or profits), or on the increases in revenues. In Section 4, we then _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 1 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ present data from the case study on a small (Vindevogel, Poel, & Wets, 2005). Customers university bookstore, and we discuss the often purchase multiple flavors of ice cream or capabilities and the limitations of the augmented multiple competing brands of soft drinks to MBA technique. We present conclusions in address the needs of various family members. Section 5. This makes functional product substitutes into actual complements. Another problem is the 2. MARKET BASKET ANALYSIS enormous number of transactions in typical larger size businesses, which may make it Although the whole reason for conducting a difficult to carry out MBA unless a well designed market basket analysis is to increase revenues, subsample of the transactions set is used a literature review did not uncover any instead of the entire database (Chandra & publications on how or to what extent the Bhaskar, 2011). analysis could lead to potential or actual increases. A possible explanation is the rather The MBA is nonetheless widely used, in part sensitive nature of the MBA results, which are because of its conceptual simplicity. Given a set likely to be used in-house and closely guarded of N transactions involving two or more items, from competitors, rather than made available for the MBA starts by identifying those transactions publication. that involve pairs of items. For example, given items A and B, the number of transactions MBA was first proposed and used in a involving item A is , the number of supermarket in Sweden (Julander, 1992), transactions involving item B is , and the although most authors cite a later paper on number of transactions involving both A and B is techniques to identify association rules in a . These numbers can be obtained via simple database (Agrawal, Imielinski, & Swami, 1993). queries in a database of transactions. The technique is typically used for evaluating product affinities for retail stores, but it can also The pairing may be for synchronous purchases, be used to evaluate affinities for any other types where the two items are bought during the same of choices: purchases of services, menu choices transaction or for asynchronous purchases, at a restaurant (Ting, Pan, & Chou, 2010), where a customer who has purchased item A in students’ choices of elective classes etc. the past might be targeted for purchasing item B. For synchronous applications, sales staff More recently, MBA has been combined with might be directed to suggest additional relevant additional consumer information including in- items (item B), given a set of items the store behavior (Schmitt, 2010), visual effects customer has already selected in this transaction arising from merchandise positioning in the store (item A). For asynchronous applications, the (including adjacency relations) (Chen, Chen, & application is to suggest additional purchases Tung, 2006), and choice experiments via mailed over time, for example by sending coupons to surveys (Swait & Andrews, 2003). get the customer back to the store, sending reminders for consumables given a typical usage Despite the relative simplicity of the MBA, some or maintenance schedule, or sending offers for researchers find it of limited use. The main goal relevant accessories for an item previously of the MBA is to exploit product affinities by purchased. Although the synchronous and the inducing the consumer to purchase additional asynchronous MBA approaches are similar, there unplanned products based on an already are also significant differences. One of the main committed purchase. Some researchers find that differences is in the ability to track purchases unplanned purchasing is less common than over time to a particular customer, which allows generally accepted (Bell, Corsten, & Knox, 2009) for the asynchronous MBA. This customer and that it depends more on the customers’ tracking is typically done via a loyalty program planning habits and efforts at gathering (frequent buyer card, membership store etc). information, and less on marketing efforts. The direction of the affinity itself is sometimes The MBA is based on three metrics: support, counterintuitive, as the MBA often fails to confidence and lift. All three metrics are derived distinguish between complements (products from the transactions record for the business. customers tend to purchase together) and substitutes (products pairs in which customers tend to substitute one product for another, hence not purchase the two together) _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 2 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ Support been purchased. For example, a customer who has already placed item A in her shopping cart The first metric defined for MBA is support, will have a different probability of purchasing which is the probability of an association item B than if she had not decided to purchase (probability of the two items being purchased A. Mathematically, confidence is given by together). Given the number of times items A ⁄ . and B occur together in the same transaction and the total number of transactions as above, Confidence can be used to assess the probability the support is ⁄ . of purchasing item B in the same visit (this is the synchronous approach we mentioned earlier: Although it would seem that a high support is if item A is in the shopping cart, what is the relevant, this metric has several limitations. The probability that the customer will also purchase MBA approach is based on analyzing past data to item B at the same time?) Alternatively, forecast the future sales. Unless the support confidence could be used to assess the chance values are increasing over time, the only that the customer who already owns item A forecast that can be made is that future sales (having purchased it in a previous transaction) will be similar to past ones. High support values may be willing to purchase item B at this time do not indicate a potential for increasing (asynchronous approach). revenues. Again, a high confidence may seem beneficial, Moreover, some items may end up being but confidence by itself does not indicate that purchased together in a large number of additional revenues are possible. On the other transactions, but not because there is an hand, a high confidence may signal to the sales affinity. Although bread and milk are not staff that a customer who has already purchased necessarily consumed together, customers item A is an “easy target” to be sold item B. This purchase both items so often that the pairing is is somewhat misleading, as the example of milk likely to have a high support for most and bread shows: the confidence is high that a supermarkets. The value of affinity relationships customer who already purchased bread will also is when there is an inherent value in purchasing purchased milk, but this is because most the items together, because they achieve some customers will purchase milk anyway. As such, synergistic goal and because the pairing is more not only is there no clear way to increase likely than the chance level. revenues, but the effort of the sales staff to make another sale is unnecessary, as the Finally, the chance of items being purchased customer is highly likely to purchase item B together is governed by randomness. For small anyway. values of support, it is possible that the appearance of an affinity is the result of a small Lift number of coincidences. As an illustration, the author recalls purchasing a highly technical A third metric in the MBA approach is the lift, textbook on Amazon a decade or so ago and ⁄ . Because the lift is a ratio of noticing a suggestion on the website that probabilities, it mitigates some of the problems customers who purchased that book also with the earlier example of milk and bread. purchased the latest Harry Potter volume. While the affinity was deemed statistically significant There are two ways to look at lift. First, lift is the by Amazon’s software at the time, it was clearly ratio of the conditional probability of purchasing just the result of coincidence. To reduce the B to the simple probability of purchasing B. As likelihood of such spurious associations, a such, lift is a measure of how much more likely minimum support value is typically used in the customer is to purchase B now that she calculations. A support value corresponding to intends to purchase A, as compared to a an is often used. customer who is not purchasing A. Confidence If , items A and B have an affinity that may lead to additional sales. Returning to the The next metric typically defined in the MBA is earlier example, assume that the lift for two the conditional probability of an item to be items is . Armed with this information, a purchased, given that another one has already sales person who sees the customer with item A _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 3 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ in her shopping cart knows that the probability . For a given number of transactions to sell her item B is now ten times higher than involving items A and B separately and together, the probability of selling item B to a customer the lift increases with the total number of who is not intent on purchasing item A. A high transactions in the set. As such, the more lift value might also induce a store manager to transactions there are in a supermarket place items together (or far apart, to induce database, the higher the value of lift calculated. customers to walk across the store), to advertise Standardizing (or normalizing) lift circumvents them together, or to bundle them. this problem, as described next. Clearly, if , the effort to sell item B to the Normalized lift customer is higher than the effort to sell it to a customer who has not purchased A. This might A modified metric that addresses the problem indicate that items are substitutes: a customer above is the normalized lift proposed in who intends to purchase A has no need or (McNicholas, Murphy, & O’Regan, 2008). The interest in purchasing B. Alternatively, the items number of transactions involving both items A could be in different segments (a luxury item and B must be smaller than the number of and a discount version). When there is such a transactions involving either item alone: choice, the store manager may choose to . Also, the number of discontinue the item with lower revenues. transactions involving both items has a lower bound (it needs to be a positive or zero The other way to look at lift is as a ratio of three number). When the sum of the numbers of probabilities. Substituting the formula for the transactions for items A and B exceeds the total confidence in the formula for lift leads to number of transactions, some transactions will ⁄ . From probability theory, if by necessity involve both items. Hence the lower items A and B are independent (customers bound for the number of transactions involving purchase one independently of whether they both items is . also purchase the other), the probability of the two being purchased together is These values are used to calculate , so ⁄ . If , items are and . With a bit of algebra, it is easy to show that . With these values, the normalized lift is more likely to be purchased together than just by chance ( ), again indicating the . Substituting the numbers of affinity, whereas if , the items are even transactions leads to the final formula for the less likely to be purchased together than just by normalized lift . chance. In other words, when comparing customer’s choices with the choices or a random purchasing agent (an agent that picks up items This normalized lift can have values in the at random), pairs of items with if will be interval [0, 1]. For small values of N, the less likely to appear in the rational customer’s normalized lift does depend on the total number shopping cart than in that of the random agent. of transactions, but for large values, once , the normalized lift is independent of The discussion above suggests that finding pairs the total number of transactions, . of items with high lift could lead to increased revenues by selling more items. Still, lift by itself The normalized lift also suffers from a major is just an indication of an affinity, but does not limitation: it does not directly show whether an show how much additional revenues can be affinity exists or not. The unnormalized lift is expected. Lift can also be misleading, identifying greater than one when the two items have an as complements products that are actually affinity, which corresponds to a normalized lift substitutes, but are purchased together for . customer dependent reasons (Vindevogel, Poel, & Wets, 2005). Finally, lift is dependent on the number of transactions, which leads to high In conclusion, the best use of the MBA is to first values for infrequently purchased products. use the support to determine whether an Substituting the number of transactions in the association has statistical significance (the probabilities in the formula for lift leads to number of times two items are associated is _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 4 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ large enough not to be due to chance alone). to purchase item B after purchasing item A, but The lift can be used to determine if the affinity is men do not associate items the same way. If positive (complementary pairing) or negative most women already purchase the items (substitution). Finally the normalized lift gives a together, there is little if anything to gain from measure of the strength of the affinity, the association, unless men could also be independent of the number of transactions in the induced to associate the two items. The data set. saturation of an association could be based on any number of demographic or behavioral traits This knowledge about the affinities between of the customer, including previous purchases, pairs of items in the data set does not suggest income etc. The affinity saturation level is how much more revenues could be obtained. In typically unknown, although it can be estimated the next section, we use the MBA metrics to either empirically (for example via customer estimate the additional revenues. surveys) or via thought experiments (e.g., modeling customer behavior). 3. USING MARKET BASKET ANALYSIS TO INCREASE REVENUES Each possible pairing of two SKUs has a different nature, hence also a different affinity saturation. Given a set of transactions, an opportunity for To complicate matters even further, the affinity increasing revenues arises when there are saturation can change over time, for example as affinities among the items in the data set. At the more customers purchase the items together as most basic level, an affinity will allow sales of a result of a successful marketing campaign. one item to drive up sales of another. If selling items A and B, with , an affinity between In the model developed here, the affinity the items would allow selling more of item B, by saturation level is a variable that multiplies the piggybacking on the sales of item A. potential revenue increase. For a saturated affinity, , while for an affinity with While we refer to pairings below, the way to use maximum potential to increase revenues, . the results of an MBA can be multifaceted. The For simplicity, we assume affinity saturation information about product affinities from the level for all the pairings in our data set to be MBA must be used in pricing strategies, in . This means that any actual revenue product placement and in marketing campaigns, increase will be lower than what we estimate if it is to result in any increase in revenues. below in the case study. As we already mentioned in the section about Impact of relative pricing and customer lift, two items with a positive affinity can be price sensitivity packaged together, could be placed in close proximity or could be priced as a bundle (buy Another factor in the ability to increase revenues one item and get the other one for a discount). is the relative pricing of the two items as it Similarly, items with a negative affinity relates to the customer price sensitivity. (substitutes) can either be the object of market Intuitively, customers who already purchased a segmentation or could increase revenues by high priced item might be somewhat willing to discontinuing one of the items, if done properly purchase a lower priced related product. When (Vindevogel, Poel, & Wets, 2005). For simplicity, the situation is reversed, customers are usually the analysis here assumes fixed pricing, and less likely to purchase a higher priced product does not discuss any pricing strategies. MBA can that has an affinity with the lower priced product also be used to adjust prices for individual items they already intend to purchase. We use a linear or for item bundles to account for affinities. model for the likelihood that customers would be willing to purchase more expensive items. The Impact of lift and affinity saturation level probability of making a purchase of item B if the customer is already purchasing A is It is intuitive that the higher the lift, the more , where k is a measure of the price likely that additional sales of item B could be made to a customer who has committed to sensitivity of the customer (while the lower case purchasing item A. On the other hand, the p earlier stands for probability, while upper case potential for additional sales might not be P here refers to price). The customer is not realized if the affinity is saturated for sales of willing to purchase item B if the price of the item item B. For example, assume that women tend is more than k times the price of item A _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 5 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ ( ). This is a statement about the apparel, snacks and office products. The total relative strength of the affinity as compared to sales for the two month period amount to $ the customer’s price sensitivity. If the customer 410,000. To protect confidential information, the was already predisposed to purchase item B, he number of units sold has been multiplied by a or she would do so regardless of the pricing, not randomly generated scaling factor that affects as a result of the affinity between items A and B. only the overall amounts, but not the MBA affinities or the relative increase in revenues. Impact of competing bundling For the data set, 17.8% of the SKUs bring in A final consideration must include the effect of 90% of the revenues and 8% of the SKUs bring multiple pairings of items on the customer’s in 80% of the revenues. This distribution shows choice. Research has shown that too many many more active SKUs than the typical 80/20 choices will distract the customer and might lead distribution where 20% of the SKUs bring in either to no decision or to a suboptimal decision 80% of the revenues. A possible explanation is (Lehrer, 2010). As such, the number of pairings that a university bookstore carries many should be kept relatively small or localized, so specialty books that sell in very small volumes, each individual decision should be made with a but might be carried because they are written by manageable number of choices. faculty or by local authors. Also, some of the SKUs tend to be very detailed, differentiating Overall impact on revenues between various colors of a sunburst T-shirt, which artificially increases the number of SKUs. To summarize, pairing two items makes sense Interestingly enough, transactions for the month when the number of times item A is sold is of February involve much fewer textbook greater than the number of times item B is sold: purchases and show much closer agreement . Second, the two items must have a with the 80/20 rule (17% of the SKUs are positive affinity (they must be complements) responsible for 80% of the revenues). and the affinity must be below saturation level. Finally, the relative pricing of the items must be The MBA can be applied at the SKU level or at within the customer’s price sensitivity range. the category level (i.e., looking for associations for 2% Horizon Organic milk – SKU level – or for With the considerations above, the additional milk – category level). The data set available did revenues from pairing item B with as many of not include categories, the number of SKUs was the item A as possible will be given by too large, and many SKUs were difficult to . The terms in classify. As such, the MBA we carried out was only at the SKU level. The same analysis at the the formula include in order: the additional category level would give additional insights. number of item B sold , the sales price for item B, , the affinity saturation level, , There were 17,231 different pairs of items that and the price sensitivity probability ( from were sold together in at least one transaction, above). Most generally, the affinity saturation but only 88 pairs were sold together in at least level, , and the customer price sensitivity, k, 10 transactions, 250 were sold together in at will be different for each product pairing. In the least 5 transactions and 586 were sold together case study below, we assume these parameters in at least 3 transactions. As mentioned earlier, to be the same for all product pairings in the the higher the number of times items are sold data set, for simplicity. together, the more likely that the affinity is statistically significant. We consider all three The total impact on revenues is the sum of sets of values for comparison purposes, realizing additional revenues for all the possible pairings. that probably only data for the item pairs sold in at least 10 transactions is to be trusted. 4. CASE STUDY AND DISCUSSION The data set included the end of the academic We now apply the MBA approach to a set of cash year, and most of the strong affinities involve register transactions from the UAA Bookstore. graduation paraphernalia. Graduation is a Transaction dates are from February and March unique event in the life of a student (and of the 2010. There are 13,916 transactions involving parents of that student), so customers might be 28,462 line items and 4,332 individual SKUs more inclined to make purchases they would (stock-keeping units), including textbooks, otherwise not be willing to make (recall the _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 6 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ earlier discussion about upselling a more sensitivity coefficient (customers willing to expensive item when the customer is already purchase items priced up to three times higher willing to purchase a less expensive one). than the price of the item they are buying). The data shows all the pairings that have a Support sufficiently large support for statistical significance, regardless of the confidence level. When considering the support, three pairings Indeed, some pairings might lead to high stand out (Table A.1 in the Appendix) with revenue increases despite their lower support close to 0.03 (all other pairings have confidence. less than 0.01 support): Tassels and caps Because most of the pairings in Table A3 in the Tassels and gowns Appendix are related to graduation Caps and gowns paraphernalia, we expect customers to be willing to pay more for certain items that are These pairings are rather expected, and as appropriate for the occasion. For example, explained before, they do not give much customers who rent a bachelor gown might be information about increasing revenues. willing to pay for the almost twice as expensive announcement packs. In general, the customer’s Confidence willingness to pay for more expensive items will depend on the nature of the transaction, but The list of the highest confidence pairings might involve a price sensitivity coefficient as includes several with confidence higher than 0.8, low as zero. but many of the pairings are rather obvious again (masters hood and tassel) as shown in Table 1 below includes possible additional Table A.2 in the Appendix. Many pairings occur a revenues for a series of price sensitivity small number of times, and many prices are coefficients ranging from 0.1 to 3. Because not such that the customer is unlikely to purchase all affinity pairings might be meaningful (given the more expensive item: the first line in Table 2 the sometimes small number of transactions is for an expensive textbook ($200) as the upsell involving a certain pairing), we list the possible item for $0.15 test sheets. Clearly, the affinity is additional revenues for various numbers of coincidental and not significant. minimum transactions per pairing. Lift Table 1. Expected additional revenues for various values of the price sensitivity factor (k) Lift values range from 0.16 for test sheets and and of the minimum number of occurrences of a snacks to 242 for envelope seals and pairing for a statistically relevant affinity announcement cards. Of all the item pairings that involve at least 3 transactions, 567 have lift Minimum number of occurrences of the greater than one, indicating some affinity. Also, k MBA pairing 248 are pairings involving at least 5 3 5 10 transactions, and 88 involve at least 10 transactions. 0.1 $ 287.97 $ 266.09 $ 58.06 0.3 $ 2,974.39 $ 1,925.20 $ 738.32 Normalized lift 1 $ 18,809.62 $ 13,889.18 $ 5,267.52 Normalized lift values range from 0.01 to 1, with 3 $ 52,410.14 $ 40,039.05 $ 18,327.20 38 pairings having normalized lift greater than 0.9. Recall that normalized lift by itself is not an Based on these results, for the pairings we indication of product affinity. Instead, 567 of identified in the context of graduation season, a these normalized lift values are larger than the reasonable expected value for the additional threshold (indicating that there is some affinity revenues would be in the $5,000-$10,000 between the items). range, if customers are really less price sensitive. On the other hand, with price sensitive Overall impact on revenues customers, the revenues gain could be lower than $100. Table A.3 in the Appendix includes the top revenue generator pairings assuming a price _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 7 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ An extension of the analysis presented could acceptable to graduating students as a one-time involve combinations of more than two items. “splurge” for graduation related paraphernalia. MBA can be used to develop association rules between larger groups of SKUs, when such Depending on the customers’ price sensitivity combinations of several SKUs in one transaction and on the saturation level of the affinities are frequent enough to be statistically uncovered, revenues can increase by as much as significant. Because of the small data set and $10,000 or as little as $100, which is 0.025% to the relatively low numbers of combinations of 2.5% of the overall revenues. two SKUs, this paper did not consider combinations of more than two SKUs. Acknowledgement The authors would like to thank Alessandra Vanover and Anna Gage at the The authors are currently working with the UAA Bookstore for making available bookstore on designing a marketing campaign transactional data. based on the MBA results, to evaluate the saturation level for the affinities, as well as the 6. REFERENCES hypothesis of lower customer price sensitivity for special-occasions purchases. Because most of Agrawal, R., Imielinski, T., & Swami, A. (1993). the revenue increases in the model result from Mining Association Rules between Sets of marketing graduation paraphernalia, a test of Items in Large Databases. ACM SIGMOD the model will need to wait for the next International Conference on Management of graduation season. We will report on the results Data (SIGNOD ’93) (pp. 207-216). in a future paper. Washington DC: ACM. 5. CONCLUSIONS Bell, D. R., Corsten, D., & Knox, G. (2009, January). Unplanned Category Purchase Market basket analysis (MBA) is a widely used Incidence: Who Does It, How Often, and technique for identifying affinities among items Why. Knowledge@Wharton . customers purchase together. In this paper we outline the limitations of the MBA metrics in Chandra, B., & Bhaskar, S. (2011). A new identifying sources of revenue growth, and we approach for generating efficient sample suggest ways to overcome the limitations. For from market basket data. Expert Systems the case of a small university bookstore, we with Applications , 38, 1321–1325. quantify the MBA metrics for the most promising Chen, Y.-L., Chen, J.-M., & Tung, C.-W. (2006). pairings of items and we estimate the potential A data mining approach for retail knowledge revenue increase. discovery with consideration of the effect of shelf-space adjacency on sales. Decision The data set covered two months in 2010, and Support Systems , 42 (2006), 1503-1520. included almost 14,000 transactions involving 28,000 line items and close to 4,300 individual Julander, C.-R. (1992). Basket analysis: A new SKUs (stock-keeping units), including textbooks, way of analysing scanner data. International apparel, snacks and office products. Because of Journal of Retail & Distribution Management the specialized nature of the business, 17.8% of , 20 (7), 10-18. the SKUs bring in 90% of the revenues when considering the textbook selling season, and a Lehrer, J. (2010). How we decide. Boston: more typical 80% of the revenue when Mariner Books : Houghton Mifflin Harcourt. considering months in the middle of the semester. McNicholas, P., Murphy, T., & O’Regan, M. (2008). Standardising the lift of an Although close to 18,000 different pairings could association rule. Computational Statistics be identified, only 88 of them occurred sufficient and Data Analysis , 52, 4712–4721. times to be considered statistically significant. All of these 88 pairings showed an affinity (lift Schmitt, J. (2010). Drawing Association Rules >1). Many pairings suggested selling a more between Purchases and In-Store Behavior: expensive item to a customer who has An Extension of the Market Basket Analysis. purchased a less expensive one. Although this Advances in Consumer Research , 37, 899- would not be very feasible in general, our 901. estimate is that such a pairing might be _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 8 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ Swait, J., & Andrews, R. L. (2003). Enriching Analysis. Cornell Hospitality Quarterly , 51 Scanner Panel Models with Choice (4), 492-501. Experiments. Marketing Science , 22 (4), 442-460. Vindevogel, B., Poel, D. V., & Wets, G. (2005). Why promotion strategies based on market Ting, P.-H., Pan, S., & Chou, S.-S. (2010). basket analysis do not work. Expert Systems Finding Ideal Menu Items Assortments: An with Applications , 28, 583-590. Empirical Application of Market Basket _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 9 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ Appendix. Data tables Table A1. Pairings with highest support values SKU1 Description SKU1 SKU2 Description SKU2 SKU1 SKU2 SKU Support price price Trans Trans 1&2 TASSEL GREEN/BGOLD $ 5.00 GRADUATION CAPS $ 7.00 522 471 406 0.029175 GRADUATION CAPS $ 7.00 BACHELOR GOWN $ 28.00 471 353 346 0.024863 TASSEL GREEN/BGOLD $ 5.00 BACHELOR GOWN $ 28.00 522 353 341 0.024504 Table A2. Pairings with highest confidence values (all pairings have confidence = 1) SKU1 Description SKU1 price SKU2 Description SKU2 SKU1 SKU2 SKU price Trans Trans 1&2 BETTELH/INTRO.TO G $ 202.10 TEST SHEETS $ 0.15 4 912 4 PACKAGE B ANNOUNCE $ 107.95 SHIPPING $ 15.00 13 198 13 THANK YOU NOTES 25 $ 11.75 TASSEL GREEN/BGOLD $ 5.00 6 522 6 THANK YOU NOTES 25 $ 11.75 GRADUATION CAPS $ 7.00 6 471 6 MASTERS HOODS $ 28.00 SHIPPING $ 15.00 4 198 4 THANK YOU NOTES 25 $ 11.75 BACHELOR GOWN $ 28.00 6 353 6 CERT. OF APPRECIAT $ 16.75 SHIPPING $ 15.00 9 198 9 THANK YOU NOTES 25 $ 11.75 SHIPPING $ 15.00 6 198 6 ANNOUNCEMENT $ 11.75 SHIPPING $ 15.00 7 198 7 COVER MASTERS HOODS $ 28.00 TASSEL GREEN/BGOLD $ 5.00 4 44 4 COMMENCEMENT $ 1.83 SHIPPING $ 15.00 11 198 11 SCRUBS V-NECK TUNI $ 13.95 PORTFOLIO 2 PKT A $ 0.50 3 19 3 SCRUBS V-NECK TUNI $ 13.95 REPORT COVER ASST $ 0.50 3 17 3 FOWLER/POLICY STUD $ 123.30 SPRING/AMERICAN ED $ 56.25 7 8 7 KIM/LOST NAMES $ 14.25 KATSU/MUSUI'S STOR $ 15.85 3 8 3 TECLA/DIONISIO AGU $ 25.00 TECLA/FERNANDO SOR $ 23.00 4 6 4 GIANTMICROBES GIGA $ 24.95 GIANTMICROBES GIGA $ 24.95 4 6 4 HAYCOX/FRIGID EMBR $ 21.95 HAYCOX/ALASKA AMER $ 18.95 3 4 3 JOHNSON/HOW DO I L $ 17.75 COREY/I NEVER KNEW $ 135.45 3 4 3 REF CHART ALGEBRA $ 5.95 REF CHART ALGEBRA $ 5.95 6 7 6 REF CHART MED TERM $ 5.95 REF CHART MED TERM $ 5.95 3 4 3 ADOBE P/ADOBE ACRO $ 59.99 STEWART/PROFESSION $ 81.60 3 3 3 HAMILTO/TARASCON P $ 24.95 ROTHROC/TARASCON P $ 14.95 3 3 3 MONTGOM/CHICAGO GU $ 12.75 WILHOIT/BRIEF GUID $ 59.75 3 3 3 ROTHROC/TARASCON P $ 14.95 HAMILTO/TARASCON P $ 24.95 3 3 3 STEWART/PROFESSION $ 81.60 ADOBE P/ADOBE ACRO $ 59.99 3 3 3 WILHOIT/BRIEF GUID $ 59.75 MONTGOM/CHICAGO GU $ 12.75 3 3 3 _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 10 www.aitp-edsig.org
Conference for Information Systems Applied Research 2011 CONISAR Proceedings Wilmington North Carolina, USA v4 n1822 _________________________________________________ Table A3. Pairings with the highest possible additional revenues (k = 3, only pairings with additional revenues in excess of $1000 are shown). The highlight is on the rows where the pricing relationship in the affinity is reversed: customers who have already purchased a lower priced item would need to be induced to purchase another more expensive item. SKU1 Description SKU1 SKU2 Description SKU2 SKU1 SKU2 SKU Additional price price Trans Trans 1&2 revenues BACHELOR GOWN $ 28.00 ANNOUNCEMENTS 25PK $ 45.75 353 47 26 $ 3,526.47 BACHELOR GOWN $ 28.00 THANK YOU NOTES 25 $ 11.75 353 6 6 $ 3,506.92 BACHELOR GOWN $ 28.00 RETURN LABELS 50PK $ 13.25 353 11 7 $ 2,428.82 GRADUATION CAPS $ 7.00 THANK YOU NOTES 25 $ 11.75 471 6 6 $ 2,406.65 BACHELOR GOWN $ 28.00 ANNOUNCEMENT COVER $ 11.75 353 7 4 $ 1,998.18 GRADUATION CAPS $ 7.00 ENVELOPE SEALS 25P $ 6.75 471 22 19 $ 1,776.14 GRADUATION CAPS $ 7.00 TASSEL GREEN/BGOLD $ 5.00 471 9 9 $ 1,760.00 TASSEL GREEN/BGOLD $ 5.00 ENVELOPE SEALS 25P $ 6.75 522 22 20 $ 1,687.50 BACHELOR GOWN $ 28.00 ANNOUNCEMENTS 5PK $ 8.95 353 21 13 $ 1,643.45 GRADUATION CAPS $ 7.00 RETURN LABELS 50PK $ 13.25 471 11 7 $ 1,431.40 GRADUATION CAPS $ 7.00 ANNOUNCEMENTS 5PK $ 8.95 471 21 13 $ 1,430.63 BACHELOR GOWN $ 28.00 ENVELOPE SEALS 25P $ 6.75 353 22 15 $ 1,400.94 GRADUATION CAPS $ 7.00 ANNOUNCEMENT COVER $ 11.75 471 7 4 $ 1,372.27 GRADUATION CAPS $ 7.00 TASSEL GREEN/BGOLD $ 5.00 471 44 37 $ 1,367.88 TASSEL GREEN/BGOLD $ 5.00 THANK YOU NOTES 25 $ 11.75 522 6 6 $ 1,313.65 TASSEL GREEN/BGOLD $ 5.00 ANNOUNCEMENTS 5PK $ 8.95 522 21 15 $ 1,291.80 POSTAGE $ 136.34 S/S SCALLOP SEAL $ 29.95 66 9 6 $ 1,054.76 _________________________________________________ ©2011 EDSIG (Education Special Interest Group of the AITP) Page 11 www.aitp-edsig.org
You can also read