MATCHING MARKETS IN ONLINE ADVERTISING NETWORKS: THE TAO OF TAOBAO AND THE SENSE OF ADSENSE

Matching Markets in Online Advertising Networks:
     The Tao of Taobao and the Sense of AdSense

                                                      1
                                     Chunhua Wu

                                   Job Market Paper
                                      October 2011

  1 Chunhua Wu is a Ph.D. candidate in Marketing at Washington University in St. Louis. Contact:
Campus Box 1133, One Brookings Drive, Saint Louis, MO 63130. chunhuawu@wustl.edu.

                                        Abstract

Advertising networks have played an increasingly important role in the online advertising
market in recent years. Critical to the success of an advertising network is the ability
to efficiently match advertisers with publishers. To achieve this goal, some prominent
advertising networks such as Google AdSense rely on sophisticated computer algorithms
to allocate advertisements to web pages. In contrast, other platforms such as the Chinese
Taobao let advertisers and publishers self-select in a two-sided market. Networks also
differ in their pricing schemes: AdSense uses the generalized second price (GSP) auction
while Taobao uses a listed price scheme. In this paper, we study the value of a successful
match between advertisers and publishers, and find that product category and demographics
are the most important determinants in the advertiser value function. A counterfactual
experiment based on the results suggests that the market-based mechanism adopted by
Taobao can generate nearly as much value to advertisers and publishers as a hypothetical
central planner allocation with full information. Another experiment shows that under
GSP the total advertisers’ revenue is more sensitive to the platform’s targeting technology
than the total publishers’ revenue. These findings explain the different strategies adopted
by different advertising networks: networks that profit from the total advertisers’ revenue
prefer the market-based listed price mechanism, while networks that profit from the total
publishers’ revenue may be better off under a GSP auction when they have sufficient
knowledge of the matching value of individual advertisers.

Keywords: Advertising Network, Matching Game, Maximum Score Estimation, Mechanism Design

1     Introduction

Advertising networks, which provide marketplaces for advertisers and online publishers,1
are changing the game in online advertising. In a Businessweek article, Hof (2009) reports
that 30% of the $8 billion spent on online display advertising in 2008 went through advertising
networks and that the share would climb to 50% in 2009. Advertising networks have been
enthusiastically embraced by major Internet players. Google’s AdSense network, for example,
contributed one third of its $28 billion advertising revenue in 2010.2 Apple’s iAd
network delivers millions of in-app advertisements every day to users’ mobile devices.
Social network websites such as Facebook and LinkedIn, content sharing websites such as
YouTube and micro-blogging sites such as Twitter are even more excited about advertising
networks, as they have an enormous number of users who act as volunteer publishers;
moreover, they need not share advertising revenue with them.

      An online advertising network differs from traditional advertising markets in that there
is an independent intermediary or platform. By providing a marketplace and other services
to facilitate matches between advertisers and publishers, the platform aims to extract
economic surplus from either side of the advertising market to maximize its own profit.
In traditional advertising markets such as TV, print, and early online advertising,
publishers and advertisers sought each other out from a crowded mass at high transaction
costs. In the Internet age, however, well-known platforms including Google AdSense
and the Chinese Taobao create advertising networks that bring together advertisers and
publishers. Thousands of small online publishers can now sell advertising space on these
platforms that would otherwise go unreached by advertisers. Large publishers also benefit
from selling advertising slots to a larger number of advertisers. By advertising through
more publishers, advertisers also benefit by reaching a bigger audience across various
fragmented segments at low cost.

   1 Publishers in this paper refers to online content producers who provide content for an
online-browsing audience.
   2 http://investor.google.com

      Matching advertisers with publishers is the central task in an advertising network.
A successful match may be determined by many factors, such as product category match,
geographical match and demographic match with the publisher’s audience. Technology-oriented
advertising networks also promote contextual advertising, with great emphasis on the
semantic match between advertisements and the content of web pages. Our first objective
in this paper is to quantify the value and investigate the determinants of
advertiser-publisher matches.

      Our second research objective concerns the efficiency of mechanism design
in advertising networks. Conditional on the matching outcomes, a platform profits from
the transactions between advertisers and publishers. A common business model for platforms
is to share revenue with publishers. For example, Google takes 49% of the advertising revenue
of the participating publishers in its AdSense network; major Web 2.0 sites that rely on
user-generated content (UGC), such as Facebook and Twitter, harvest 100% of the revenue. In
contrast, Taobao’s Alimama, the advertising network that we study in this paper, aims
to maximize advertisers’ revenue. Instead of sharing profit with publishers, Taobao’s model
is to generate profit from advertisers’ sales revenue in its retail marketplace while providing
free services to advertisers and publishers. The difference in the source of profit has a direct
impact on the mechanisms that platforms adopt to allocate advertising slots. Taobao uses
a market-based distribution mechanism that lets advertisers directly solicit advertising space
from publishers. To maximize their own profits, advertisers may have to experiment with
purchasing from different publishers, and publishers may have to experiment with varying
prices for their advertising slots. This mechanism puts an additional burden on advertisers
and publishers, but it also maximally exploits the private information possessed by both
sides of the market. Other advertising platforms such as DoubleClick and Advertising.com
have also adopted this mechanism. Yet the market-based mechanism is not the only way
to match advertisers and publishers; a platform may instead choose to centrally allocate
advertisers to the “best” advertising slots from the perspective of its own profitability. To
achieve this, content analysis such as Natural Language Processing (NLP) and behavioral
tracking algorithms can be used to map advertisements to publishers’ web pages and
customer segments. Technology-oriented players such as Google and Facebook have
traditionally used a centralized allocation mechanism, and on their platforms advertisers
cannot directly select the exact web pages where they would like to place their
advertisements. More recently, however, these advertising networks have also begun to
involve advertisers in micro-managing the targets of their advertising campaigns. This paper
investigates the efficiency of Taobao’s market-based mechanism in comparison with a
centralized allocation mechanism. In particular, we are interested in understanding how the
comparison is affected by the platform’s knowledge of the various factors that determine the
value of matches between advertisers and publishers.

     With these objectives achieved, we proceed to investigate why advertising
networks differ in their chosen pricing schemes. Traditional players use a real-time bidding
system. For example, Google sells advertisements through AdWords with a cost-per-click
(CPC) based generalized second price (GSP) auction mechanism. Facebook uses a similar
pricing scheme. Other platforms such as Taobao adopt a publisher-listed price scheme
based on cost-per-thousand-impressions (CPM) or cost-per-day. AOL’s advertising platform,
Advertising.com, uses both pricing schemes in practice. Motivated by these observations,
our third objective is to investigate how the adoption of different pricing schemes may be
linked to advertising networks’ objectives.

     In summary, this paper aims to address the following three questions:

   • What are the determinants of a successful advertiser-publisher match?

   • How efficient is a market-based distribution mechanism in the advertiser-publisher
     matching market? And how does the efficiency of a centralized allocation mechanism
     depend on the platform’s knowledge about the determinants of matching?

   • From the platform’s point of view, which pricing scheme leads to a higher profit? How
     does the profitability depend on the knowledge of the matching? Do platforms with
     different business models tend to choose different pricing schemes to maximize profit?

     We estimate a structural model of a two-sided matching market using data from
Taobao.com, which provides a free advertising network to online advertisers and publishers.
We first prove that, despite the complicated competitive relationships among advertisers
and among publishers in the advertising network, an advertiser-publisher stable equilibrium
exists under very general conditions in the current pricing scheme of Taobao. We then
estimate the matching function between advertisers and publishers using necessary
conditions derived from this stable equilibrium. We apply the maximum score estimator
proposed by Manski (1975) and recently extended to matching games by Fox (2010a). Our
results show that the main determinants of the matching value function are product category
and demographic matches, while content semantic matches and geographical matches play
minimal roles. Among demographic variables, targeting gender is more important than
targeting age or income.

     Based on the model estimates, we use counterfactual simulations to address our
second and third research questions. We first study a situation in which Taobao allocates
advertisers to publishers in a centralized fashion. We find that the joint value of
advertisers and publishers depends critically on Taobao’s knowledge of the matching values,
which may be private information of advertisers and publishers. Centralized allocation
can be inferior to the market-based mechanism that Taobao currently adopts when the
information asymmetry is high. A second policy simulation studies the efficiency of various
pricing schemes. We model the targeting technology as the platform’s ability to uncover the
matching value and vary it across simulations. We find that total advertisers’ revenue is
very sensitive to the targeting technology, but total publishers’ revenue is not. Our results
suggest that it is better for platforms such as Taobao, whose objective is to maximize
advertisers’ total revenue, to choose a listed price scheme, while platforms whose objective
is to maximize publishers’ total revenue should adopt a GSP auction when their targeting
technology is sufficiently advanced. This finding resonates with the different pricing
strategies pursued by Taobao and Google AdSense, providing a plausible explanation for the
“Tao” of Taobao and the “Sense” of AdSense.3
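
The intuition behind the centralized-allocation experiment can be illustrated with a toy sketch. It is not the simulation used in the paper: it assumes a small one-to-one assignment problem, match values drawn at random, and a planner who observes each true value plus Gaussian noise. The names best_assignment, planner_value and noise_sd are hypothetical.

```python
import itertools
import random

def best_assignment(values):
    """Brute-force the one-to-one assignment (advertiser i gets slot perm[i])
    that maximizes the sum of values[i][perm[i]]. Only viable for small n."""
    n = len(values)
    return max(itertools.permutations(range(n)),
               key=lambda perm: sum(values[i][perm[i]] for i in range(n)))

def realized_value(true_values, perm):
    """Total true match value generated by a given assignment."""
    return sum(true_values[i][perm[i]] for i in range(len(perm)))

def planner_value(true_values, noise_sd, rng):
    """The planner allocates on a noisy signal of the match values
    (assumed noise model); the value realized is the true one."""
    perceived = [[v + rng.gauss(0.0, noise_sd) for v in row] for row in true_values]
    return realized_value(true_values, best_assignment(perceived))

rng = random.Random(0)
true_values = [[rng.random() for _ in range(5)] for _ in range(5)]
full_info = realized_value(true_values, best_assignment(true_values))

for sd in (0.0, 0.2, 1.0):
    avg = sum(planner_value(true_values, sd, rng) for _ in range(200)) / 200
    print(f"noise sd = {sd}: realized / full-information value = {avg / full_info:.3f}")
```

With zero noise the planner reproduces the full-information optimum; as the noise grows, the realized value can only fall relative to that benchmark, which is the sense in which a centralized allocation depends on the platform's knowledge of matching values.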

        The rest of the paper is organized as follows: Section 2 discusses relevant research
in the marketing and economics literature. Section 3 describes the data and the business
model of our empirical application. Section 4 introduces the model and estimation strategy.
Section 5 presents the main estimation results, followed by policy experiments in Section 6.
Finally, Section 7 concludes.

2       Relevant Research

Our paper is broadly related to two research streams. First, it is closely related to the
Internet advertising literature. Early papers in this stream study the performance of online
display advertising. Drèze and Hussherr (2003) investigate online surfers’ attention using
an eye-tracking device and survey data. They suggest that surfers actually avoid looking at
banner advertisements and that a large part of their processing of banners is probably done
at a pre-attentive level. Chatterjee, Hoffman, and Novak (2003) model consumer response
to banner advertisement exposures using clickstream data. Manchanda, Dubé, Goh, and
Chintagunta (2006) study the impact of repeated banner exposures on customer purchasing
behavior using hazard models. Danaher, Lee, and Kerbache (2010) develop a method that
optimally allocates online advertising schedules across websites. As the technology of
Internet advertising evolves, and especially as firms’ targeting ability increases
substantially, recent papers investigate targeting-related issues. Using data from a large
field experiment, Goldfarb and Tucker (2010) find that matching advertisements to content
and increasing obtrusiveness independently increase purchase intent; however, the two
strategies in combination are ineffective. Zhang and Katona (2011) investigate an
advertising intermediary’s strategy in contextual advertising, where targeting is based on
content matches. Besides traditional online display advertising, the strong growth in search
advertising creates opportunities for researchers. Theoretical papers on search advertising
mainly focus on advertisers’ bidding strategies for keywords and search engines’ platform
designs (Edelman, Ostrovsky, and Schwarz 2007, Katona and Sarvary 2010, Varian 2007).
Empirical research explores diverse topics such as the determinants of click-through rates,
cost per click and other metrics (Ghose and Yang 2009), the spillover dynamics among
keywords (Rutz and Bucklin 2011), the complementarity of organic and sponsored search
(Yang and Ghose 2010), the interplay of users, advertisers and search engine (Yao and Mela
2009), the competition among advertisers (Chan and Park 2010) and the value of customer
acquisition through search advertising (Chan, Wu, and Xie 2011). Powerful advertising
networks have emerged in recent years, yet the buying and selling of advertisements in a
network and the influence of the network intermediary remain mostly unexplored. This paper,
to the best of our knowledge, is the first empirical investigation into the matching effects
and platform mechanism design in advertising networks.

    3 “Tao” is a concept in ancient Chinese philosophy that originates in Daoism, which can be interpreted
as the basic principle of the universe or human activities. We use this term as an analogy for the reasoning
of Taobao’s pricing and market allocation strategies in its business model.

     Second, our paper is closely related to the matching literature. Theoretical papers on
matching games date back to the 1960s. Gale and Shapley (1962) study the college admissions
problem, prove that the problem has a “stable” equilibrium, and outline the “Gale-Shapley”
algorithm to find stable equilibria. Koopmans and Beckmann (1957) study the problem of
matching plants to locations, without and with transportation costs, and translate the
problems into a linear programming problem and a quadratic programming problem,
respectively. Shapley and Shubik (1979) study the one-to-one matching game with
price transfers and show that outcomes in the core are solutions of certain linear
programming problems and that these outcomes correspond exactly to the prices that
competitively clear the market. Becker (1973) studies the marriage market and shows how
sorting is formed in equilibrium. Hatfield and Milgrom (2005) more recently study a
many-to-many matching problem with contracts.

     In contrast, empirical methods for estimating matching games have been developed
only recently. Choo and Siow (2006) use a logit error specification to estimate preferences in
the U.S. marriage market and explore the effects of the legalization of abortion. Sørensen
(2007) uses an augmented likelihood by assuming unobserved payoffs to each side to be
proportional to the total value of the match. Hitsch, Hortaçsu, and Ariely (2010) use revealed
preference data from an online dating website to estimate mate preferences. The advantage of
revealed preference data is that it avoids the need to explicitly specify an equilibrium. In
general, when the matching market is large, a full likelihood approach can be computationally
intractable, or the likelihood function may not even be well defined because of multiple
equilibria. Fox (2010a) proposes a maximum score estimator that uses only necessary
conditions from equilibrium and can handle multiple equilibria. Fox (2010b) gives
identification proofs for estimating matching games with the proposed maximum score
estimator. Several papers follow the maximum score approach. Fox and Bajari (2010) estimate
a many-to-one matching game with complementarity across multiple matches in an application
to FCC spectrum auctions. Yang, Shi, and Goldfarb (2009) study the matching of professional
athletes to teams in the NBA and explore the effect of a maximum wage limit policy.
Baccara, Imrohoroglu, Wilson, and Yariv (2010) explore different levels of network effects
in matching games, with an application to the matching of professors to offices. This paper
also uses the maximum score estimator.
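
To make the idea of estimating from necessary conditions concrete, the sketch below counts how many pairwise "no profitable swap" inequalities a candidate parameter vector satisfies. It assumes the textbook one-to-one case with transferable utility and a linear match value, which is simpler than the many-to-many game with listed prices studied in this paper; the function names, the two covariates and the toy data are all hypothetical.

```python
import itertools

def match_value(beta, x):
    """Linear match value: inner product of parameters and pair covariates."""
    return sum(b * xk for b, xk in zip(beta, x))

def max_score(beta, features, matches):
    """Count pairs of observed matches (a, i), (b, j) for which the observed
    pairing yields at least as much total value as the swapped pairing
    (a, j), (b, i). features maps (advertiser, publisher) -> covariates."""
    score = 0
    for (a, i), (b, j) in itertools.combinations(matches, 2):
        observed = match_value(beta, features[(a, i)]) + match_value(beta, features[(b, j)])
        swapped = match_value(beta, features[(a, j)]) + match_value(beta, features[(b, i)])
        if observed >= swapped:
            score += 1
    return score

# Toy data: 3 advertisers, 3 publishers, covariates = [category match, gender match].
category_match = {(a, p): 1.0 if a == p else 0.0 for a in range(3) for p in range(3)}
gender_match = {(0, 0): 1, (0, 1): 0, (0, 2): 1,
                (1, 0): 0, (1, 1): 1, (1, 2): 0,
                (2, 0): 1, (2, 1): 0, (2, 2): 0}
features = {key: [category_match[key], float(gender_match[key])] for key in category_match}
matches = [(0, 0), (1, 1), (2, 2)]

# The score is invariant to positive rescaling of beta, so the first
# coefficient is normalized to 1 and the second is searched on a grid.
grid = [(1.0, step / 2) for step in range(-4, 5)]
best_beta = max(grid, key=lambda beta: max_score(beta, features, matches))
print(best_beta, max_score(best_beta, features, matches))
```

Because the objective is a step function of the parameters, a grid or global search is used here rather than a gradient method; the number of inequalities grows quickly with the number of observed matches.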

3    Data

In this section, we first describe the business model of Taobao and the data we use. We
then provide some summary statistics showing the match outcome, i.e., which advertisers
buy advertising slots from which publishers in this market.

3.1     Advertising Network on Taobao

We crawl data from Alimama.com, the advertising network affiliated with Taobao.com.
Taobao is the largest online retail platform in the world, with 370 million users, 800 million
products and an annual gross volume of about ¥400 billion (US $60 billion) in 2010,4
surpassing eBay (US $53 billion) and Amazon (US $34 billion).5 Although often perceived
as China’s eBay, Taobao is quite different from eBay in its business model. Taobao charges
neither listing fees nor commissions. Its revenue comes from two sources: the first is
sponsored keyword advertising on Taobao.com; the second is the investment return on the
transaction revenue of its participating online retailers that is held by Taobao. In order to
control online transaction fraud, the group has established a separate subsidiary called
AliPay, which functions like PayPal, except that retailers’ revenue is withheld until the
buyer receives the product and confirms that the transaction is valid. The process usually
takes a week or longer, enabling AliPay to earn interest and returns by investing the
transaction revenue in the financial market.

      The two sides of the advertising network Alimama.com are advertisers and publishers.
Advertisers consist of the participating online retailers on Taobao.com across all
product categories. Publishers are mostly small to medium-sized websites, including
personal blogs, interest group pages, discussion forums and small news portals. Publishers
list detailed descriptions of their websites and of the advertising slots available for
purchase at specific daily prices. Advertisers then make purchase decisions. If no one
purchases a particular advertising slot, Taobao automatically assigns an advertisement to
that slot and the payment is based on the number of actions, e.g., how many people purchase
products through the advertisement. Advertisers purchase advertising slots mainly to
increase the traffic referred to their store pages and ultimately to increase transactions.
Taobao provides rich and transparent information on this advertising network: besides the
detailed characteristics provided by publishers, Taobao also provides each publisher’s daily
traffic statistics and the past transaction history for each slot, including who the buyers
were and what the transaction prices were. Moreover, links to every advertising slot and
advertiser are given, and every publisher and every advertiser can be easily reached through
a real-time communication tool.

   4 ¥ is the symbol for the Chinese Yuan (CNY), the currency of the People’s Republic of China. ¥1 = $0.152 on
Jan 1, 2011.
   5 http://www.foxbusiness.com/markets/2011/01/19/alibaba-group-executive-taobao-transaction-value-cny-billion,
http://investor.eBay.com and http://phx.corporate-ir.net/phoenix.zhtml?c=97664&p=irol-irhome

        Rooted in its business model, the objective of Taobao is to maximize the total
transaction revenue referred by the advertisements. In contrast, other platforms aim to
maximize the total publisher revenue from advertisement transactions. For example, Google
takes 49% of publishers’ revenue from the advertising transactions in its Google AdSense
network. Taobao also differs from Google in its chosen allocation mechanism and
pricing scheme. Because advertisers can select the exact publisher pages for their
advertisements, the allocation of advertisements on Taobao is purely market-based. On Google
AdSense, however, Google plays a centralized role in the allocation in the sense that
advertisers cannot be sure exactly where their advertisements will appear. In terms of the
pricing scheme, Taobao uses a cost-per-day based listed price format while Google adopts the
cost-per-click based generalized second price (GSP) auction that is widely used in its
sponsored search advertising business. GSP is not a pure market mechanism, as the platform
has the power to influence the equilibrium outcome through the assignment of quality scores
to each advertiser.
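
How quality scores let the platform shape a GSP outcome can be seen in a minimal sketch. It assumes the commonly described quality-weighted form of GSP (rank by bid times quality score; each winner pays per click the smallest amount that keeps its rank), which may differ from the exact rules of any real platform. The function name gsp_auction and the bids are hypothetical.

```python
def gsp_auction(bids, quality, n_slots):
    """Quality-weighted GSP sketch.
    bids: advertiser -> cost-per-click bid; quality: advertiser -> score > 0.
    Returns [(advertiser, price_per_click), ...] from the top slot down."""
    ranked = sorted(bids, key=lambda a: bids[a] * quality[a], reverse=True)
    results = []
    for pos in range(min(n_slots, len(ranked))):
        winner = ranked[pos]
        if pos + 1 < len(ranked):
            runner_up = ranked[pos + 1]
            # Smallest bid that still outranks the next advertiser.
            price = bids[runner_up] * quality[runner_up] / quality[winner]
        else:
            price = 0.0  # no competitor below; no reserve price in this sketch
        results.append((winner, price))
    return results

bids = {"A": 2.0, "B": 1.5, "C": 1.0}
equal_quality = {"A": 1.0, "B": 1.0, "C": 1.0}
platform_quality = {"A": 1.0, "B": 2.0, "C": 1.0}

print(gsp_auction(bids, equal_quality, n_slots=2))     # A ranks first
print(gsp_auction(bids, platform_quality, n_slots=2))  # B ranks first
```

With equal quality scores the highest bidder takes the top slot; doubling B's score moves B above A and changes both prices, even though no bid has changed.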

        We collect data from the advertising network on January 1, 2011. The data
record both advertiser and publisher characteristics and the prices of all available
advertising slots, whether sold or unsold. Each publisher may offer multiple advertising
slots displayed across its web pages and each advertiser may purchase multiple advertising
slots. One slot can be sold to only one advertiser on each day. Less than 1% of advertisers
have purchased multiple slots from the same publisher; we exclude those advertisers and the
corresponding publishers with which they have multiple transactions. There are altogether
1,253 publishers with positive revenue, with 5,732 slots and 1,324 corresponding advertisers.
In the model estimation, we select publishers that had been on the platform for more than
one year. This ensures that our sample consists of relatively experienced publishers who
know how to set the right market price. We further choose only those publishers with every
slot priced at least ¥1.0 per day to ensure that our sample consists of advertisers and
publishers who are of high value. This sample selection process leaves us with a subset of
295 publishers with 992 slots and 483 advertisers. Ideally, we would use data on the whole
market in the estimation. However, the estimation time under the maximum score estimator
increases extremely fast with the number of agents on each side. Our estimates from the
selected sample should therefore be read as inferences about a sub-market in which
experienced, high-value advertisers and publishers transact.

      Table 1 provides summary statistics on the size and characteristics of each side. On
average, an advertiser advertises on 1.3 publishers and a publisher provides 3.4 advertising
slots. Two thirds of the advertising slots provided are sold, resulting in a total transaction
amount of ¥3,832. For the advertiser side, we have information on geographical location
(the city where the advertiser lives), product category based on Taobao’s classification, and
store performance (number of items, average price and monthly sales). Advertisers differ a
lot in their store characteristics. Table 4 provides the frequency distribution of advertisers
across product categories; 46% of advertisers are retailers of women’s products. The median
assortment size of the stores is 28 items and the median of the average price across these
stores is ¥90. Advertisers’ median monthly sales are 13.3 thousand CNY, while the maximum
monthly sales are 3 million CNY, roughly 230 times the median. For the publisher side,
variables include geographical location (the city where the publisher is registered), publisher
category based on Taobao’s classification, targeted demographics of the website (gender, age
and income), website visit metrics (PageRank, daily unique visits, average pageviews) and
advertising slots (number of slots, position of slots, prices). Publishers differ significantly
in their ability to attract visitors: the minimum number of daily visits for a publisher is 30
while the maximum is as high as 157,000. The listed prices for advertising slots range from
¥1.0 to ¥50.0 per day. Advertising slots also differ in their sizes, number of competing slots
and positions.6 We run a log-linear regression to investigate the determinants of advertising
slot prices: we regress the log of the advertising slot price on observed slot attributes and
publisher dummies. Results are reported in Table 2. We find that the price of an advertising
slot increases with the size of the slot and when the slot is on the main page, while
it decreases when there are more competing slots and when the slot is at the bottom of
the page. We also go down to the content level: for every publisher and advertiser,
we extract website descriptions and store descriptions to compute a semantic distance as
described in the subsection below. For market outcomes, we have information on which
advertiser purchased which slots and the prices paid.
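
A minimal sketch of such a log-linear price regression with publisher dummies is given below. It assumes NumPy, two illustrative slot attributes (slot size and number of competing slots) and synthetic prices generated without noise, so it only shows the mechanics and says nothing about the estimates in Table 2. The function name fit_log_price is hypothetical.

```python
import numpy as np

def fit_log_price(prices, slot_attrs, publisher_ids):
    """OLS of log(price) on slot attributes plus one dummy per publisher.
    The full set of dummies replaces the intercept.
    Returns (attribute coefficients, {publisher: fixed effect})."""
    y = np.log(np.asarray(prices, dtype=float))
    attrs = np.asarray(slot_attrs, dtype=float)
    pubs = sorted(set(publisher_ids))
    dummies = np.array([[1.0 if pid == p else 0.0 for p in pubs]
                        for pid in publisher_ids])
    X = np.hstack([attrs, dummies])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    n_attrs = attrs.shape[1]
    return coef[:n_attrs], dict(zip(pubs, coef[n_attrs:]))

# Synthetic data (assumed): log price = 0.5*size - 0.3*competing + publisher effect.
publisher_ids = ["p1", "p1", "p1", "p2", "p2", "p2"]
slot_attrs = [[1, 0], [2, 1], [3, 0], [1, 2], [2, 0], [4, 1]]
true_effect = {"p1": 1.0, "p2": 0.2}
prices = [float(np.exp(0.5 * size - 0.3 * competing + true_effect[pid]))
          for (size, competing), pid in zip(slot_attrs, publisher_ids)]

attr_coef, pub_effects = fit_log_price(prices, slot_attrs, publisher_ids)
print(attr_coef, pub_effects)
```

Because the dependent variable is in logs, each attribute coefficient reads approximately as the proportional change in the daily price for a one-unit change in that attribute, holding the publisher fixed.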

3.2     Data on the Matching of Advertisers and Publishers

We now present some data on the factors in an advertiser’s decision of buying advertising
slots from publishers. At Taobao, most publishers produce content on the pages of their
websites before selling advertising slots on pages. Advertisers are able to view the content
through the links set up under each advertising slot when making purchase decisions. This
provides advertisers with useful information on the type of audience a website is able to
attract, and on whether or not their advertising message fits the content of specific web pages.

       To explore the correlation between advertisers and publishers in terms of the content
of their websites, we apply the Latent Semantic Analysis (LSA) algorithm developed in
Natural Language Processing (NLP). LSA is a statistical method for indexing content based
on available vocabulary (Deerwester, Dumais, Furnas, Landauer, and Harshman 1990,
Landauer, Foltz, and Laham 1998). Indeed, Google acquired the company Applied Semantics
to start its AdWords and AdSense programs, and LSA is believed to be the earliest technique
used by Google.7 We use the method to examine whether semantic correlations are
informative about advertisers’ decisions to buy slots from publishers. A detailed description
of how we use LSA to compute the correlations is in Appendix A. Because these are Chinese
websites, we translate the Chinese content into English using Google Translate, a
standard product widely used in translation for its merit of preserving semantic meaning.
The calculated semantic correlations are reported in the upper panel of Table 3. “Full
sample” in the table refers to all potential pairings of advertisers and advertising slots in
our data. We define the “matched sample” as the subset of the full sample where advertisers
are observed to purchase the corresponding advertising slots; the “unmatched sample”
includes the rest. The correlations for possible matches between advertisers and publishers
in our selected sample range from -0.13 to 0.95, out of the maximum possible range of -1.0
to 1.0. The mean correlation for the matched pairs is 0.15, which is statistically
significantly different from the mean of 0.14 for unmatched pairs, with a t-statistic of 2.62
and a p-value of 0.008. However, since the difference in means between matched and
unmatched pairs is small, other factors may be more important in driving the matching
outcomes.

   6 Slot position is a variable we construct as the ratio of the line number at which the advertisement appears
to the total number of lines in the raw HTML file.
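
The kind of computation LSA involves can be sketched as follows. The procedure actually used is described in Appendix A and is not reproduced here; this sketch assumes NumPy, raw word counts without any term weighting, a truncated singular value decomposition, and cosine similarity as the correlation measure. The function names and the four short English descriptions are hypothetical.

```python
import numpy as np

def lsa_vectors(docs, k=2):
    """Build a term-document count matrix, take its SVD, and return one
    k-dimensional latent vector per document (rows of V_k * S_k)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            counts[index[w], j] += 1.0
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    return vt[:k].T * s[:k]

def cosine(u, v):
    """Cosine similarity, which lies in [-1, 1]; 0.0 if either vector is zero."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

docs = [
    "women fashion dress shoes",    # store description (assumed)
    "fashion blog dress trends",    # website description (assumed)
    "digital camera phone review",
    "phone games entertainment",
]
vectors = lsa_vectors(docs, k=2)
print(cosine(vectors[0], vectors[1]))  # store vs. fashion site
print(cosine(vectors[0], vectors[2]))  # store vs. digital review site
```

In this toy corpus the fashion store and the fashion blog share vocabulary and end up close in the latent space, while the store and the digital review site share no words and are essentially orthogonal.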

        We next investigate the effect of geographical distance by measuring the bilateral
geographical distances between advertisers and publishers at the city level. If online
advertisers and publishers only target the local population, geographical distance will
reduce the degree of match because of the difference in targeted consumers. Sample
statistics are reported in the lower panel of Table 3. The t-statistic for the difference
between the matched and unmatched samples is 0.61, with a p-value of 0.54. This suggests
that geographical distance is not a main consideration when advertisers and publishers
match with each other in the online market.
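
The two ingredients of this comparison, a city-to-city distance and a test of the difference in means, can be sketched as below. The text does not say which distance formula or which variant of the t-test is used, so the great-circle (haversine) distance and the unequal-variance (Welch) t-statistic here are assumptions, and the coordinates are approximate and purely illustrative.

```python
import math
import statistics

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def welch_t(sample_a, sample_b):
    """t-statistic for a difference in means, allowing unequal variances."""
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se

# Approximate city-center coordinates, for illustration only.
beijing = (39.9042, 116.4074)
shanghai = (31.2304, 121.4737)
print(haversine_km(*beijing, *shanghai))

# Hypothetical distances (km) for matched and unmatched pairs.
matched = [120.0, 850.0, 1300.0, 40.0]
unmatched = [300.0, 900.0, 1100.0, 60.0, 700.0]
print(welch_t(matched, unmatched))
```

A t-statistic close to zero, as reported for the lower panel of Table 3, indicates that the average distance in matched pairs is not distinguishable from that in unmatched pairs.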

        Finally, we look at the relative frequencies of matches based on other characteristics
of advertisers and publishers. We specifically look at how the product category of
advertisers relates to the demographics of the targeted audience and the website content
characteristics of publishers. This is important information for publishers and advertisers
in most business practices. For example, Google has allowed advertisers to specify the
demographic groups they wish to target since 2008, and Facebook provides similar functions
for advertisers. Table 4 provides the tabulation of frequencies. In this table and the
subsequent analysis, we group the advertisers into five product categories: men’s products
(mainly clothes and shoes), women’s products (mainly clothes and shoes), digital products,
foods and household items (mainly furniture). We also group publishers into five categories
based on the content of the website: fashion, life information, news portal, online
shops/services and entertainment/others.8 Strong evidence of self-selection is shown in this
table. First, at the category level, advertising slots of websites focusing on fashion are
mostly sold to advertisers selling women’s products. Also, digital product retailers tend to
purchase advertisements on entertainment-related websites. Evidence of matching based on the
demographics of websites’ audiences is also observed: women’s products are rarely targeted
to websites with mainly male users; a young population is preferred by retailers of men’s
products, women’s products and digital products, while an older population is preferred for
foods and household products. Also, household item retailers mostly target the relatively
wealthy population. All these statistics provide evidence of selective matching between
advertisers and publishers through the market mechanism at Taobao. To fully quantify the
impacts of the characteristics of advertisers and publishers on the market outcomes, we
develop a structural matching model that helps to study the economic value created by the
matching of advertisers and publishers.

   7 http://searchenginewatch.com/2196001

4       Model

We begin this section by discussing the equilibrium concept in an advertiser-publisher matching
game under general functional forms and proving that a stable equilibrium exists. We then
    8
    Taobao has 52 main categories and more than 4,000 subcategories for stores (advertisers in this appli-
cation) and 23 categories and 125 subcategories for publishers.

make a functional form specification that can be used for empirical estimation and also
discuss the estimation strategy.

4.1    A Conceptual Framework of Stable Matching Equilibrium

We model the market transactions of advertising slots described above as a many-to-many
matching game with transferable utility, in which advertisers ($A$) and publishers ($P$) compete
among themselves on each side of the market. This game is complicated because it involves
numerous differentiated advertisers and publishers: advertisers target different consumers
depending on the products they sell, and publishers attract different types of audience
depending on the content they offer on their websites. Advertising on a publisher’s website may
bring a high profit to one advertiser but a low profit to another. Furthermore, unlike in Fox (2010a), the
payoff from a matched pair cannot be fully specified by a production function ($f_{ij}$) involving
the advertiser and the publisher in the match. In general, each advertiser’s valuation of an
advertising slot depends on which other slots it receives, for a variety of reasons. Because of
the complicated nature of the game, a market equilibrium may not exist, and even when it exists
it may not be unique. In this section we formulate the conditions under which a market
equilibrium exists.

      An allocation is defined as the matching of advertisers to advertising slots, with the
constraint that each advertising slot can be matched to at most one advertiser. At Taobao,
matching is the outcome of self-selection by advertisers through the price mechanism, where
the prices of all advertising slots are listed by publishers. Denote the set of advertisers by $A$,
the set of publishers by $P$, and the set of advertising slots by $S$. Let $\mathcal{M}$ be the collection
of all possible allocations of advertisers to advertising slots. An element $M \in \mathcal{M}$ is a specific
allocation, where $\langle i, k_j \rangle \in M \subset A \times S$ denotes the match of advertiser $i$ to the
$k$-th slot of publisher $j$. A corresponding vector $\mathbf{P}$ denotes the
listed daily prices of all advertising slots, where an element $p_{k_j} \in \mathbf{P}$ is the
price paid to publisher $j$ for its $k$-th slot.

      Denote by $V_i(M)$ advertiser $i$’s total revenue, or value function, from advertising on the
slots it obtains in the allocation $M$. We define an advertiser’s total profit as the difference
between $V_i(M)$ and the total advertising cost of the slots it purchases. That is:

$$\pi_i = V_i(M) - \sum_{k_j : \langle i, k_j \rangle \in M} p_{k_j} \tag{1}$$

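      We assume that $V_i(M)$ satisfies three properties:

   • independent from others: $V_i(M) = V_i(M_i)$, $M_i = \{\langle i, k_j \rangle \mid \langle i, k_j \rangle \in M\}$.

   • monotonic: $V_i(M \cup M') \ge V_i(M)$, $\forall M \cap M' = \emptyset$.

   • decreasing marginal return: $V_i(M' \cup M) - V_i(M') \ge V_i(M'' \cup M) - V_i(M'')$, $\forall M' \subset
     M''$, $M \cap M' = \emptyset$, $M \cap M'' = \emptyset$.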

      The independent from others assumption implies that an advertiser’s value for the set
of slots purchased does not depend on who else takes which of the other slots. In an
advertising network, numerous players on each side make simultaneous decisions,
so the direct competitors are hard to identify a priori and may be less relevant.
However, the competition effect between advertising slots may still be captured in slot-level
attributes, e.g. the number of competing advertisements. The monotonic property is a natural
assumption which requires non-negative marginal valuation. The decreasing marginal
return property further assumes that an advertiser’s marginal return from purchasing more
advertising slots is non-increasing. This property captures a number of institutional realities,
such as the cannibalization effect between advertisements due to audience overlap between
publishers, and the decreasing marginal effect when one advertiser advertises on multiple slots
offered by the same publisher. The case of a quota constraint, as discussed in Fox (2010b), is a
special case of a decreasing marginal return revenue function: once the quota constraint
is reached, $V_i(M_i)$ does not increase with any new advertising slots. A special case of the
value function that satisfies the above three properties is the linear additive function, that is,
$V_i(M_i \cup M_i') = V_i(M_i) + V_i(M_i')$. We also assume that every advertiser has an outside
option “o”. The outside option refers to other marketing opportunities, such as search
advertising, email advertising, and offline channels. We assume that the outside option value
$V_{i0}$ is the same for each advertising slot for an advertiser. For a rational advertiser to
purchase a slot $k_j$, the marginal contribution from this slot must be at least the price
plus the value of the outside option $V_{i0}$, that is, $V_i(M_i) - V_i(M_i \setminus \langle i, k_j \rangle) \ge p_{k_j} + V_{i0}$.
                                                                                                 
        A publisher’s profit is the sum of the prices of the slots it sells:
$\Pi_j = \sum_k p_{k_j} I\big[\sum_i I[\langle i, k_j \rangle \in M] = 1\big]$,
where $I[\cdot]$ is an indicator function that takes the value 1 if the statement is true and
0 otherwise. If $I[\langle i, k_j \rangle \in M] = 1$, slot $k_j$ is sold to advertiser $i$. Any slot can be sold
to only one advertiser; therefore $\sum_i I[\cdot]$ is at most 1, and if $k_j$ is not sold the sum
is 0.
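
The two profit definitions can be illustrated with a small sketch. It assumes the linear additive special case of $V_i$, and all names and numbers (`prices`, `slot_values`, `allocation`) are hypothetical toy inputs rather than data from the paper.

```python
from collections import defaultdict

# Hypothetical toy market. A slot is identified by (publisher, slot index).
prices = {("p1", 1): 30.0, ("p1", 2): 20.0, ("p2", 1): 50.0}
# Stand-alone value of each advertiser-slot match (linear additive special case).
slot_values = {
    ("a1", ("p1", 1)): 45.0,
    ("a1", ("p2", 1)): 80.0,
    ("a2", ("p1", 2)): 25.0,
}
# Allocation M: a set of (advertiser, slot) pairs.
allocation = {("a1", ("p1", 1)), ("a1", ("p2", 1)), ("a2", ("p1", 2))}


def advertiser_profits(allocation, prices, slot_values):
    """pi_i = V_i(M) - sum of prices paid, with V_i additive over slots."""
    profits = defaultdict(float)
    for advertiser, slot in allocation:
        profits[advertiser] += slot_values[(advertiser, slot)] - prices[slot]
    return dict(profits)


def publisher_profits(allocation, prices):
    """Pi_j = sum of listed prices over the slots of publisher j that are sold."""
    sold = set()
    for _, slot in allocation:
        if slot in sold:  # each slot can go to at most one advertiser
            raise ValueError(f"slot {slot} is assigned twice")
        sold.add(slot)
    profits = defaultdict(float)
    for publisher, k in sold:
        profits[publisher] += prices[(publisher, k)]
    return dict(profits)


adv = advertiser_profits(allocation, prices, slot_values)
pub = publisher_profits(allocation, prices)
```

Unsold slots contribute nothing to a publisher's profit, which mirrors the role of the indicator function in the expression for $\Pi_j$.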

        The information structure of the game is as follows. We assume that the properties of
the publishers’ websites are common knowledge, and that each advertiser knows its own value
function $V_i(M)$. This assumption is consistent with the institutional reality that the platform
provides detailed information about each advertising space. In addition, it is possible for
agents to learn about the value function over time and resolve any information asymmetry.
Importantly, we assume that the valuation functions are not known perfectly to the platform.
The platform can either rely on sophisticated algorithms to partially uncover this information
or rely on auctions to elicit the private information of advertisers.

        We propose an equilibrium concept of Advertiser-Publisher Stable Allocation. We de-
fine an allocation as advertiser-publisher stable if the advertisers and publishers in the current
allocation have no incentives to deviate.9 An advertiser-publisher stable allocation is in the
core of this advertiser and publisher many-to-many matching game. Two types of deviation
   9
     It is worth noting that although the allocation is defined at the advertiser-slot level, each slot should not be
treated as an independent player. This is because publishers maximize total profits from slot bundles
rather than the profit of each individual slot.

from $M$ and $\mathbf{P}$ can be potentially profitable. First, as in the classic many-to-many matching
game, either the advertiser or the publisher in a current match can deviate by partially exiting
the market. For example, the advertiser can stop purchasing any slot from the publisher,
and the publisher can refuse the purchase from the advertiser.10 Second, an advertiser can
reallocate its budget to a specific set of advertising slots if it can make every publisher who
sells those slots strictly better off. We define the stable allocation equilibrium based on the
conditions that neither of the above deviations would be profitable.

Definition 1 (Advertiser-Publisher Stable Allocation) An A-P stable allocation consists of
an advertiser-slot allocation $M$ and a price vector $\mathbf{P}$ which satisfy two conditions:

       • Individual Rationality: under the stable allocation $M$, no advertiser can get better off
         by partially exiting the market and no publisher can get better off by partially exiting the
         market. Formally, $\forall \langle i, k_j \rangle \in M$, $V_i(M_i) - V_i(M_i \setminus \langle i, k_j \rangle) - p_{k_j} \ge V_{i0}$ and $\forall k_j$, $p_{k_j} \ge 0$.

       • Incentive Compatibility: there does not exist an advertiser $i$, a new allocation $M'$,
         and a price vector $\mathbf{P}'$ such that:

           – $V_i(M_i') - \sum_{\delta : \langle i, \delta \rangle \in M'} p_\delta' \ge V_i(M_i) - \sum_{\delta : \langle i, \delta \rangle \in M} p_\delta$.

           – $\forall j \in \{ j \mid \langle i, k_j \rangle \in M' \}$, $\sum_{k : \exists a, \langle a, k_j \rangle \in M'} p_{k_j}' \ge \sum_{k : \exists a, \langle a, k_j \rangle \in M} p_{k_j}$.

           – Strict inequality holds for at least one condition.

         That is, there is no deviation in which advertiser $i$ purchases the set of advertising slots
         specified in the allocation $M'$ at the prices specified in the price vector $\mathbf{P}'$ such that
         the advertiser is better off and all the publishers to whom those slots belong make higher profits.

         The first condition in the above definition says that neither the advertiser nor the
publisher in a stable match can get better off by partially exiting the market, i.e., the
advertiser would rather not purchase the slot or the publisher would rather not offer the
  10
    In practice, even if the publisher does not have the option to refuse any purchase request, he can always
achieve this by raising the price to a high enough level.

slot. The second condition states that for any new allocation $M'$ and price vector $\mathbf{P}'$, there must
exist an advertiser-publisher matched pair such that either the advertiser or the publisher
is worse off than in the current stable allocation.
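
As an illustration of the first condition, the sketch below checks individual rationality for a toy allocation. The value function `additive_value`, the data, and the outside option values are all hypothetical; the check itself follows the inequality in Definition 1, comparing each slot's marginal contribution net of price with the outside option value.

```python
# Hypothetical stand-alone match values and listed prices.
STANDALONE = {
    ("a1", ("p1", 1)): 45.0,
    ("a1", ("p2", 1)): 80.0,
    ("a2", ("p1", 2)): 25.0,
}
prices = {("p1", 1): 30.0, ("p1", 2): 20.0, ("p2", 1): 50.0}
allocation = {("a1", ("p1", 1)), ("a1", ("p2", 1)), ("a2", ("p1", 2))}


def additive_value(advertiser, slots):
    """V_i(M_i) under the linear additive special case (an assumption here)."""
    return sum(STANDALONE[(advertiser, s)] for s in slots)


def is_individually_rational(allocation, prices, value_fn, outside_value):
    """Check V_i(M_i) - V_i(M_i minus slot) - p >= V_i0 and p >= 0 for every match."""
    for advertiser, slot in allocation:
        own = frozenset(s for a, s in allocation if a == advertiser)
        marginal = value_fn(advertiser, own) - value_fn(advertiser, own - {slot})
        if prices[slot] < 0:
            return False
        if marginal - prices[slot] < outside_value[advertiser]:
            return False
    return True


# With a zero outside option every match is rational; with an outside option
# of 10, advertiser a2 (net gain 25 - 20 = 5) would rather leave.
rational_zero = is_individually_rational(
    allocation, prices, additive_value, {"a1": 0.0, "a2": 0.0}
)
rational_ten = is_individually_rational(
    allocation, prices, additive_value, {"a1": 10.0, "a2": 10.0}
)
```

The function accepts any `value_fn`, so a non-additive value function satisfying the three properties could be substituted without changing the check.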

      The following existence result can be proved for a general Vi function which is inde-
pendent, monotonic and has decreasing marginal return.

Proposition 1 The A-P stable allocation exists for the advertiser-publisher many-to-many
matching game.

      Further, when advertiser value from different advertising slots is linear additive, that
is, $V_i(M \cup M') = V_i(M) + V_i(M')$, $\forall M \cap M' = \emptyset$, we obtain the following result:

Proposition 2 Under linear additive value function, an A-P Stable allocation is Pareto
optimal.

      Pareto optimality requires that there does not exist another feasible allocation that
makes at least one individual better off without making anyone else worse off. The concept
of Pareto optimality in this paper refers to the welfare of advertisers and publishers, but not
consumers.

      The proofs of the two propositions are in Appendix B. Proposition 1 shows that an
equilibrium A-P allocation exists. But in general, it may not be unique. A stable equilibrium
is assumed to be the market outcome observed in our data, perhaps after an initial period of
experimentation of pricing and matching between advertisers and publishers at the platform.
Since we only select those experienced advertisers and publishers in the sample for model
estimation, this assumption seems quite reasonable. Proposition 2 states that an A-P stable
allocation is Pareto optimal under the linear additive value function assumption. However, it is
worth noting that Pareto optimality does not necessarily imply maximization of the joint profit
of advertisers and publishers. Alternative mechanisms that improve the joint profit may
exist. For example, Taobao may help allocate advertising slots to advertisers, playing the
role of a central planner. Alternatively, it may adopt a hybrid mechanism like Google’s by
assigning quality scores to advertisers to control the allocation outcome. Whether
these mechanisms can improve the advertiser-publisher joint profit, and consequently
Taobao’s profit, remains an open issue that we address in our empirical analysis in later
sections.

4.2     Advertiser Valuation Function

The advertiser’s valuation function $V_i(M_i)$ is general in the above equilibrium conditions.
For empirical estimation purposes, we now make two simplifying assumptions and specify
the functional form in detail. The two assumptions concern the substitution effect between
advertising slots: there is no substitution effect between slots across publishers, and there is
perfect substitution between slots within a publisher. Formally,
$V_i(\langle i, k_j \rangle \cup \langle i, k'_{j'} \rangle) = V_i(\langle i, k_j \rangle) + V_i(\langle i, k'_{j'} \rangle)$ and
$V_i(\langle i, k_j \rangle \cup \langle i, k'_j \rangle) = \max\{V_i(\langle i, k_j \rangle), V_i(\langle i, k'_j \rangle)\}$. We argue that
the absence of substitution across publishers is a reasonable assumption in our application,
because these publishers are dispersed and are not large websites that consumers visit frequently.
Thus, it is unlikely that a consumer would visit multiple publishers in a short period and be
repeatedly exposed to the same advertisement. The perfect substitution assumption for slots
within a publisher is supported by our data, in which no advertiser purchases multiple
slots from the same publisher.

       It is worth noting that the value function under these assumptions is a special case
of the linear additive value function. The perfect substitution assumption puts a
constraint on the number (at most one) of advertising slots an advertiser can get from
each publisher. Thus, the set of feasible allocations is a subset of the general $\mathcal{M}$. Under
each such feasible allocation, the value function is linear additive due to the no substitution
assumption for slots between publishers.
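
The two substitution assumptions can be sketched in code as follows. The stand-alone slot values are hypothetical, and the brute-force check is only an illustration on a toy example, not a proof: it verifies monotonicity and decreasing marginal return for single-slot additions over all subsets of three slots.

```python
from itertools import combinations

# Hypothetical stand-alone values of one advertiser for (publisher, slot index).
toy_values = {("p1", 1): 45.0, ("p1", 2): 30.0, ("p2", 1): 80.0}


def value(slot_values, slots):
    """V_i(M_i): best slot within each publisher, summed across publishers."""
    best = {}
    for publisher, k in slots:
        v = slot_values[(publisher, k)]
        best[publisher] = max(best.get(publisher, 0.0), v)
    return sum(best.values())


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def check_properties(slot_values):
    """Brute-force check of monotonicity and decreasing marginal return."""
    all_slots = frozenset(slot_values)
    monotonic = decreasing = True
    for small in subsets(all_slots):
        for big in subsets(all_slots):
            if not small <= big:
                continue
            if value(slot_values, big) < value(slot_values, small):
                monotonic = False
            for s in all_slots - big:
                gain_small = value(slot_values, small | {s}) - value(slot_values, small)
                gain_big = value(slot_values, big | {s}) - value(slot_values, big)
                if gain_small < gain_big - 1e-12:
                    decreasing = False
    return monotonic, decreasing


within = value(toy_values, {("p1", 1), ("p1", 2)})  # perfect substitution
across = value(toy_values, {("p1", 1), ("p2", 1)})  # no substitution
props = check_properties(toy_values)
```

Two slots from the same publisher are worth only the better of the two, while slots from different publishers add up, which is why restricting attention to allocations with at most one slot per publisher leaves a linear additive value function.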

       We next specify the functional form for an advertiser’s revenue. Let $V_{ik_j} = V_i(\langle i, k_j \rangle)$
be the advertiser revenue when advertiser $i$ advertises through the $k$-th slot of publisher $j$.
The advertiser’s revenue comes ultimately from the product sales referred by advertisements,
which is specified as:

$$\begin{aligned}
V_{ik_j} &= \text{Impressions} \times \Pr(\text{click} \mid \text{impression}) \times \Pr(\text{purchase} \mid \text{click}) \times E(\text{value} \mid \text{purchase}) \\
&= EI_j \times PE_k \times CTR_{ij} \times CR_{ij} \times v_i
\end{aligned} \tag{2}$$

where $EI_j$ is the expected impressions of publisher $j$. We use expected
impressions instead of the raw number of visits because a visit does not necessarily equal
an impression, especially since people may intentionally avoid advertisements when they
browse web pages (Cho and Cheon 2004). $PE_k$ is the positional effect of advertising slot $k$, and
$CTR_{ij}$ is the base click-through rate. The click-through rate generally depends on the match
between the audience of the publisher ($j$) and the advertisement ($i$), and also on the positional effect
of the advertisement. Thus, in the above formulation, $PE_k \times CTR_{ij}$ can be viewed as the
true click-through rate, and this formulation implicitly assumes that the positional effect on the
click-through rate is the same scalar across advertisers. This assumption is consistent
with those usually made in the sponsored search advertising literature (Edelman et al. 2007,
Varian 2007). $CR_{ij}$ is the conversion rate, which we assume to depend on the audience of the
publisher ($j$) and the product ($i$) but not on the position of the advertisement that made the referral.
Finally, $v_i$ is the expected value of a transaction conditional on conversion. We further combine
$CTR_{ij}$ and $CR_{ij}$ and relabel their product as $M_{ij}$, the matching effect
between advertiser $i$ and publisher $j$, which represents the value created when a specific pair of
advertiser and publisher match. This results in the reformulated value function
$V_{ik_j} = EI_j \times PE_k \times M_{ij} \times v_i$, with each component modeled in a log-linear fashion:

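$$\begin{aligned}
\ln EI_j &= Z_j \gamma + \nu_j \\
\ln PE_k &= Y_k \delta + \kappa_k \\
\ln M_{ij} &= W_{ij} \alpha + \varepsilon_{ij} \\
\ln v_i &= X_i \beta + \mu_i
\end{aligned}$$

Thus, the final valuation equation is:

$$V_{ik_j} = \exp(Z_j \gamma + Y_k \delta + W_{ij} \alpha + X_i \beta + \nu_j + \kappa_k + \mu_i + \varepsilon_{ij}) \tag{3}$$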
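
To make the structure of equation (3) concrete, the following sketch evaluates the valuation for one advertiser-slot pair. The coefficient values and covariate names are placeholders chosen for illustration only; they are not estimates from the paper.

```python
import math

# Hypothetical coefficients, one block per component of the value function.
theta = {
    "gamma": {"pagerank": 0.10, "log_unique_ip": 0.50},  # publisher effect Z_j
    "delta": {"log_ad_size": 0.30, "main_page": 0.20},   # positional effect Y_k
    "alpha": {"semantic_relevance": 0.80},               # matching effect W_ij
    "beta": {"log_avg_price": 0.40},                     # advertiser effect X_i
}


def linear_index(coefs, covariates):
    """Inner product of a coefficient block with its covariates."""
    return sum(coefs[name] * x for name, x in covariates.items())


def log_components(theta, z_j, y_k, w_ij, x_i):
    """Deterministic parts of ln EI_j, ln PE_k, ln M_ij and ln v_i."""
    return {
        "ln_EI": linear_index(theta["gamma"], z_j),
        "ln_PE": linear_index(theta["delta"], y_k),
        "ln_M": linear_index(theta["alpha"], w_ij),
        "ln_v": linear_index(theta["beta"], x_i),
    }


def match_value(theta, z_j, y_k, w_ij, x_i, shocks=0.0):
    """Equation (3): exponential of the summed log components plus shocks."""
    parts = log_components(theta, z_j, y_k, w_ij, x_i)
    return math.exp(sum(parts.values()) + shocks)


# Hypothetical covariates for one publisher, slot, pair and advertiser.
z_j = {"pagerank": 4.0, "log_unique_ip": math.log(5000.0)}
y_k = {"log_ad_size": math.log(200.0), "main_page": 1.0}
w_ij = {"semantic_relevance": 0.35}
x_i = {"log_avg_price": math.log(120.0)}

parts = log_components(theta, z_j, y_k, w_ij, x_i)
v_bar = match_value(theta, z_j, y_k, w_ij, x_i)
v_shocked = match_value(theta, z_j, y_k, w_ij, x_i, shocks=math.log(2.0))
```

Because the components enter additively inside the exponential, the value is the product of the four exponentiated components, and a unit increase in the summed stochastic terms scales the value by $e$.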

        The variables included in each part of the value function are listed in Table 5. For the advertiser
effect $X_i$, we include the log of assortment size (number of items), the log of average prices,
and the log of monthly sales. For the publisher effect $Z_j$, the variables include a PageRank score,
the log of daily unique IP visits, and the log of the average number of pageviews. PageRank
is a measure of the importance of websites on the World Wide Web.11 We use the
scores calculated by Google, which are integers from 0 to 10, with a high value representing
high importance. For example, the PageRank for Google.com is 10 and for Taobao.com
is 8. Scores for publishers in our application range from 0 to 6. Unique IP visits
approximate the number of unique individual exposures to the website, and average page
views measure how many pages people view on the particular website, with a high
value indicating more effort and time spent and possibly a larger marketing opportunity. The
advertisement’s positional effect $Y_k$ includes advertisement size, measured as the log
of the square root of the advertisement area, an indicator of whether the advertisement appears
on the main page, and the log of the number of pages on which the advertisement shows up. To capture the
competition between advertisements, we include the log of the number of slots the publisher offers
 11
      For a detailed description, see http://en.wikipedia.org/wiki/PageRank

on the website. Finally, we also include a relative position measure, ranging between 0
and 1, which is the ratio of the line number at which the advertisement appears to the
total number of lines in the raw HTML file. For most web pages, this measure is highly
correlated with the true displayed position, and the order of the advertisements is also
preserved. Lastly, the matching effect $M_{ij}$ includes the variables discussed in the previous
section: a correlation measuring semantic relevance between advertiser and publisher, the log
of the geographical distance between advertiser and publisher, and dummy variables for the
bilateral relationship categories based on product categories and audience demographics.

      For those variables which we use on a log scale, this specification of the value function
implies that the corresponding coefficient estimate represents the elasticity. To see this,
denote the focal variable by $x_1$ and the rest by $X_{1-}$; then
$V_{ik_j} = \exp(X_{1-} \beta_{1-}) \exp(\beta_1 \ln(x_1)) = x_1^{\beta_1} \exp(X_{1-} \beta_{1-})$ and
$\eta_1 = \frac{\partial V_{ik_j}}{\partial x_1} \big/ \frac{V_{ik_j}}{x_1} = \beta_1$. Thus, $\beta_1$ measures the percentage change in value
in response to a one percent change in $x_1$.
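
The elasticity interpretation can be checked numerically. The sketch below uses hypothetical values for $\beta_1$, $x_1$ and the remaining index $X_{1-} \beta_{1-}$, and approximates the derivative with a central finite difference.

```python
import math


def value(x1, beta1, rest_index):
    """V = exp(rest_index) * x1 ** beta1, written in log-linear form."""
    return math.exp(beta1 * math.log(x1) + rest_index)


def elasticity(x1, beta1, rest_index, rel_step=1e-6):
    """(dV/dx1) / (V/x1), with dV/dx1 from a central finite difference."""
    h = x1 * rel_step
    dv_dx = (value(x1 + h, beta1, rest_index) - value(x1 - h, beta1, rest_index)) / (2 * h)
    return dv_dx * x1 / value(x1, beta1, rest_index)


# Hypothetical inputs: beta_1 = 0.7, x_1 = 1200, remaining index = 1.5.
eta = elasticity(1200.0, 0.7, 1.5)
```

The computed elasticity agrees with $\beta_1$ up to numerical error and does not depend on the level of $x_1$ or on the other covariates.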

      The stochastic components ($\nu_j$, $\kappa_k$, $\varepsilon_{ij}$, $\mu_i$) are unobservable to researchers. Advertisers
know these values perfectly when they purchase advertising slots. We assume that each stochastic
component is distributed independently of all the observed attributes and of the
other components. We further assume that the distribution of each stochastic term is
symmetric about 0, so that its mean and median are also 0. These assumptions on the
stochastic components enable us to use the maximum score estimator.

4.3     Estimation Method

Although we have established the conditions under which the Advertiser-Publisher stable
equilibrium exists, in a general empirical setting where there are many advertisers and many
publishers on the two sides of the advertising network, and they are heterogeneous in the value
of matching with each other, it is very difficult to fully specify the sufficient conditions
for equilibrium outcomes. Furthermore, the stable equilibrium is in general not unique.
Therefore the standard maximum likelihood estimation approach cannot be applied without
imposing additional restrictive assumptions (e.g. Sørensen (2007)). For model
estimation we adopt the maximum score approach (Fox 2010a, Manski 1975), which uses only the
necessary conditions derived from equilibrium.

     Three sets of inequalities can be derived from the Advertiser-Publisher stable equilibrium
of allocation $M$ and price vector $\mathbf{P}$:

   • Across publisher pairwise stability: $V_{ik_j} - p_{k_j} \ge V_{ik'_{j'}} - p_{k'_{j'}}$, $\forall \langle i, k_j \rangle \in M$, $\langle i, k'_{j'} \rangle \notin M$.
     This condition is derived as follows. Consider the case in which $i$ purchases the $k'$-th slot
     from $j'$ instead of the $k$-th slot from $j$. Then, taking the price of each slot as fixed, the
     individual rationality condition in Definition 1 implies that
     $V_i(M_i^- \cup \langle i, k'_{j'} \rangle) - p_{k'_{j'}} - \sum_{\delta : \langle i, \delta \rangle \in M_i^-} p_\delta \le V_i(M_i^- \cup \langle i, k_j \rangle) - p_{k_j} - \sum_{\delta : \langle i, \delta \rangle \in M_i^-} p_\delta$,
     where $M_i^- = M_i \setminus \langle i, k_j \rangle$. Given our two assumptions
     regarding the substitution effects between advertising slots, we know that an advertiser
     has no incentive to purchase multiple slots from the same publisher and that the total value
     from advertising slots across multiple publishers is linear additive. That is,
     $V_i(M_i^- \cup \langle i, k_j \rangle) = V_i(M_i^-) + V_i(\langle i, k_j \rangle)$. Substituting this into the condition implied by Definition 1,
     we get the pairwise stability condition $V_{ik_j} - p_{k_j} \ge V_{ik'_{j'}} - p_{k'_{j'}}$, $\forall \langle i, k_j \rangle \in M$, $\langle i, k'_{j'} \rangle \notin M$.
     The condition is valid regardless of whether the alternative advertising slot $k'_{j'}$ is
     currently occupied. If it is not occupied and the profit from advertising on it is
     larger than that from the current slot $k_j$, then the advertiser can be better off by simply
     purchasing $k'_{j'}$ instead of $k_j$. On the other hand, if it is currently occupied by advertiser
     $i'$ and advertiser $i$ also prefers the slot to the current one, then the publisher can at
     least increase the price $p_{k'_{j'}}$ slightly to get better off.

   • Within publisher pairwise stability: $V_{ik_j} + V_{i'k'_j} \ge V_{ik'_j} + V_{i'k_j}$, $\forall \langle i, k_j \rangle \in M$, $\langle i', k'_j \rangle \in M$.
     This condition is a local production maximization condition on publisher $j$. From
     our specification of the value function, it is clear that advertisers have a homogeneous
     preference ranking over the slots within a publisher, with the ranking determined by
     the slot effect. That is, if $V_{ik_j} > V_{ik'_j}$, then $V_{i'k_j} > V_{i'k'_j}$. Without loss of generality, we
     assume that $V_{ik_j} > V_{ik'_j}$ and $V_{ik_j} > V_{i'k'_j}$. If $V_{ik_j} + V_{i'k'_j} < V_{ik'_j} + V_{i'k_j}$, then advertisers
     $i$ and $i'$ would have an incentive to exchange their advertising slots, making both better off
     under a certain transfer. The transfer can be realized when publisher $j$ sets new prices
     for both slots whose sum is no lower than the current sum.

   • Individual rationality: $V_{ik_j} - p_{k_j} \ge V_{i0}$, $\forall \langle i, k_j \rangle \in M$. Here $V_{i0}$ is the outside option
      value if the advertiser does not advertise on this advertising platform. This follows directly
      from the individual rationality condition in Definition 1.

      For estimation, we use the semi-parametric maximum score estimator introduced by
Fox (2010a). This estimator maximizes a score function over the parameter space. The
score value is the total number of equilibrium inequalities that are satisfied under specific
parameters. In our application, we use only the inequalities implied by the across publisher
pairwise stability condition, leaving the inequalities under the other two conditions out of
the score function. The reason is that the other two conditions carry much less information
than the first one: the number of inequalities from the first condition is 643,845, much
larger than the 1,974 and 655 from the other two, respectively. Besides, the second condition is more
restrictive, requiring that advertisers can exchange slots freely or with the help of the publisher,
which involves three agents; and the third condition requires estimating the outside option
value, which brings far more parameters. If we define the deterministic part of the profit
function as $\bar{\pi}_{ik_j} = \bar{V}_{ik_j} - p_{k_j} = \exp(Z_j \gamma + Y_k \delta + W_{ij} \alpha + X_i \beta) - p_{k_j}$ and denote by $\theta$ the
set of parameters to estimate, the estimator is defined as:

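$$Q(\theta) = \frac{1}{N} \sum_i \sum_j \sum_{j'} \sum_{k_j} \sum_{k'_{j'}} I[\langle i, k_j \rangle \in M, \langle i, k'_{j'} \rangle \notin M] \cdot I[\bar{\pi}_{ik_j}(\theta) \ge \bar{\pi}_{ik'_{j'}}(\theta)] \tag{4}$$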

where $I[\cdot]$ is the indicator function and $N$ is the total number of inequality conditions.
Given the assumptions that the stochastic components are independent and symmetric about 0, we can
derive that $\mathrm{Median}(\bar{\pi}_{ik_j} - \bar{\pi}_{ik'_{j'}}) = \mathrm{Median}(\pi_{ik_j} - \pi_{ik'_{j'}})$. This property is equivalent to the
assumption in Manski (1975) that the median of the stochastic components conditional on observed attributes is
0, which is required for maximum score estimation.
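
A minimal sketch of the score function in equation (4) is given below. It is an illustration under stated assumptions rather than the estimation code used in the paper: the data are a hypothetical two-advertiser, two-publisher market, the deterministic value has a single covariate, and alternatives are restricted to slots of other publishers that the advertiser does not hold.

```python
import math


def v_bar(theta, features):
    """Deterministic value exp(x'theta) for one advertiser-slot pair."""
    return math.exp(sum(t * x for t, x in zip(theta, features)))


def score(theta, matches, prices, features):
    """Share of across publisher pairwise stability inequalities satisfied."""
    matched = set(matches)
    satisfied = total = 0
    for advertiser, slot in matches:
        current = v_bar(theta, features[(advertiser, slot)]) - prices[slot]
        for alt in prices:
            same_publisher = alt[0] == slot[0]
            if same_publisher or (advertiser, alt) in matched:
                continue  # only unmatched slots at other publishers count
            total += 1
            alternative = v_bar(theta, features[(advertiser, alt)]) - prices[alt]
            if current >= alternative:
                satisfied += 1
    return satisfied / total if total else 0.0


# Hypothetical data: slots are (publisher, slot index); one covariate per pair.
prices = {("p1", 1): 1.0, ("p2", 1): 1.0}
matches = [("a1", ("p1", 1)), ("a2", ("p2", 1))]
features = {
    ("a1", ("p1", 1)): (1.0,), ("a1", ("p2", 1)): (0.0,),
    ("a2", ("p1", 1)): (0.0,), ("a2", ("p2", 1)): (1.0,),
}

q_positive = score((1.0,), matches, prices, features)
q_negative = score((-1.0,), matches, prices, features)
```

In this toy market a positive coefficient rationalizes both observed matches, while a negative one rationalizes neither, so the score separates the two candidate parameter values even though it is flat in small neighborhoods of each.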

      The identification and consistency properties are discussed in detail in Manski (1975)
and Fox (2010b). One difference between our score estimator and other applications is
that we use the equilibrium transfer data in the profit function specification and thus do
not use the local production maximization condition obtained by summing up two inequality pairs.
Using Monte Carlo experiments, Akkus and Hortaçsu (2006) and Fox and Bajari (2010) both
verified that the maximum score estimator with equilibrium price transfers performs extremely
well and is robust to different distributional assumptions on the error terms. One unit of
measurement in the profit function is equal to ¥1; without price transfers, however, the
parameters are only identified up to a monotonic scale.

      The score function defined above is a step function, so we cannot use derivative-based
optimization routines. Instead, we use the differential evolution (DE) method (Storn and
Price 1997) suggested in the literature. DE is a metaheuristic optimization method,
which does not guarantee that an optimal solution is found. The
algorithm works by iterating over a population of candidates. New candidates are proposed
by applying simple mathematical formulae to current candidates and are kept if the score of
the optimization problem improves.12 The confidence intervals of the parameter estimates
from this score estimator are difficult to derive analytically, and we rely on sub-sampling
to compute them. Fox (2010b) showed that sub-sampling yields consistent estimates of
the standard errors, based on the work of Politis, Romano, and Wolf (1999). We randomly
sample 150 publishers and the corresponding transacted advertisers in each sub-sampling
iteration. We run 200 sub-sampling iterations to derive the 95% confidence intervals and ensure
that each cell of characteristics in bilateral matching has at least one observation such that
  12
     We have also tried the simulated annealing method (Kirkpatrick, Gelatt, and Vecchi 1983), which turns
out to be much less efficient (it takes much longer) in this application.
