Experiments In Market Design - Stanford University

Experiments in Market Design

This (still-missing-a-few-parts, but readable through p61) draft: January 2012

Design:
Noun: the arrangement of elements or details
Verb: to create or construct

1. Introduction

The phrase “market design” has come to include the design not only of marketplaces but also of other economic environments, institutions and allocation rules. And it includes not only the design of new institutions ("design" as a verb) but also renewed attention to how the design of economic institutions ("design" as a noun) influences their performance. It is both one of the oldest and one of the newest areas of experimental economics.

It is one of the oldest because every economic experiment involves the design of an economic environment, and many experiments compare the effects of different designs. And it is one of the newest because only since the 1990s have economists become regularly involved in the detailed design of marketplaces and other economic institutions in ways that have led from the initial conception all the way to the adoption and implementation of practical new designs (and to the beginnings of a new scientific literature of market design). This new usefulness of market design has brought new uses for experiments.

To see that market design has always played a role in experimental economics, note that when Chamberlin (1948) sought to investigate competitive equilibrium, he designed not only the kind of marketplace (pairwise negotiation) that he wished to investigate; he also developed the technique that has since been widely used to induce particular supply and demand conditions, by giving each buyer and seller the prices and quantities at which they could in effect sell to or buy from the experimenter to fulfill any trades they made.[1]

[1] Thus e.g. a participant in an experimental market might be told that he could sell one unit of some good in the market, which would cost him 20 (to be subtracted from his sale price), while another might be told that he would be paid 50 for the first unit he bought (minus his purchase price). So if those two participants happened to transact at a price of 30, the seller would earn 10 while the buyer would earn 20.
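
The induced-value arithmetic just described can be written out as a minimal sketch. The function and parameter names are hypothetical and chosen only for illustration; the numbers (cost 20, redemption value 50, price 30) are the ones in the example above.

```python
def seller_earnings(sale_price, unit_cost):
    """Seller keeps the sale price minus the cost assigned by the experimenter."""
    return sale_price - unit_cost


def buyer_earnings(purchase_price, redemption_value):
    """Buyer is paid the assigned redemption value minus the price paid."""
    return redemption_value - purchase_price


price = 30
print(seller_earnings(price, unit_cost=20))         # 10
print(buyer_earnings(price, redemption_value=50))   # 20
```

Whatever price the two agree on between 20 and 50, their combined earnings are 30, which is how assigned costs and values induce the gains from trade that the experimenter wants to study.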

That is, he not only designed marketplace rules, he designed a whole experimental market, complete with preferences of buyers and sellers. When Dresher and Flood proposed in 1950 to test Nash equilibrium in a challenging environment, they designed the underlying game (the Prisoner's dilemma), the environment in which subjects would encounter it (repeated play against a fixed other player), and the payoffs that would motivate the players. When John Nash subsequently proposed that a different (non-repeated) environment might produce different behavior, he was making a conjecture about how the design of the economic environment (repeated or non-repeated, see Flood, 1952, 1958) would influence the behavior of the participants. Similarly, when Vernon Smith (1962) proposed that competitive equilibrium would be reached more easily in a repeated double auction than in Chamberlin’s non-repeated pairwise negotiations, he was investigating how elements of a market’s design influenced its performance. Many subsequent experiments in these and other lines of investigation have since reported careful within-experiment comparisons focused on such issues of design, and on the more precise hypotheses that arose from series of experiments that built on one another. The first volume of this Handbook reported many series of experiments that resulted from this kind of conversation among experimenters, and between experimenters and theorists.[2]

In the practical market design efforts that will be the main focus of this chapter, there are still conversations among experimenters, and between theorists and experimenters, but those are only parts of a larger conversation. Often the need for a newly designed marketplace is sparked by a market failure, or a new law or regulation, and a new design will require coordination among many parties.
So the conversation is conducted among economists, market participants, regulators, policy makers, and their constituents. Progress still emerges cumulatively from series of investigations, conversations, and debates, but in general not from series of experiments only.

Among the experiments covered in Roth (1995) but worth mentioning again in a discussion of market design are Hong and Plott (1982) and Grether and Plott (1984). Both experiments investigated how the regulation of pricing practices could influence prices, in connection with investigations by the Interstate Commerce Commission and the Federal Trade Commission, respectively. The ICC case concerned whether barge operators should be required to post prices and announce price changes in advance. The FTC case also involved advance notification of price changes, and other contracting policies of four chemical companies that supplied oil refiners with the additives for “leaded” gasoline, as that product was being gradually phased out due to environmental concerns. In both experiments, a simple laboratory environment was created that captured some of the important aspects of the situation, and in both experiments changes in the regulations concerning price announcements and contracts influenced prices. Hong and Plott note that it is difficult to draw general conclusions from such an experiment, but that the results “shift the burden of proof” and put the burden on those who would argue that in the markets of interest in the field the results would be different.

[2] See particularly Roth (1995a), Holt (1995), Kagel (1995), and Ledyard (1995).

Experiments seem to have been most useful in practical design when they are used as complements to other empirical and theoretical work. Used together with other tools, experiments have played multiple roles, not only in designing new marketplaces and institutions, but in helping diagnose and understand market failures and successes, and in communicating results to policy makers. This will make the discussion of experiments in market design somewhat different from the other chapters in this volume and from the discussions of the older strands of market design in the previous volume (Kagel and Roth, 1995).
To put market design experiments in context, it will be necessary to describe at least briefly the problems that a new market design was called on to solve, the technical and political and other obstacles that were faced, and how the experiments were used as complements to other, non-experimental work to bring the effort to a conclusion. The most complete account of the role of experiments in market design can therefore be given in connection with designs that have been adopted and implemented.

But there are lots of barriers to new market designs, and so of course experiments have also played a role in practically motivated design efforts that did not end with the adoption of a new marketplace. Much can nevertheless still be learned from some of the earliest market design experiments, which fall into this category. In Section 2 I’ll discuss early experiments aimed at improving the allocation of airport takeoff and landing slots. This is a subject that experimenters have contributed to for over three decades now, without yet seeing the adoption of an efficient allocation scheme. Some of the issues that arise in connection with airport slots also arose in designing auctions for radio spectrum, discussed in Section 3, and here economists, including experimental economists, were more successful in contributing to design decisions that were ultimately implemented. Section 4 discusses how experiments have played a role in another area of auction design, concerning eBay’s reputation system. Section 5 discusses labor market clearinghouses, a domain in which economists, and experiments, have played a large role in the design and implementation of new marketplaces.

It’s useful to think of economics experiments (and a good deal of economics in general) as being part of three big conversations, which I spoke about in Roth (1995a) as “speaking to theorists,” “searching for facts,” and “whispering in the ears of princes.” Market design is clearly aimed at princes, and their modern day incarnations as businessmen, bureaucrats, politicians, and policy makers of all sorts. It turns out that some of the things that sway princes are the same things that persuade scientists, so that testing theory and discovering and documenting behavioral regularities play a role in bringing new designs from conception to implementation.
But policymaking also involves a rich palette of demonstration, persuasion, and communication in addition to purely scientific concerns, and we’ll see that experiments are also useful for these purposes.

One long anticipated use of experiments, that came to fruition particularly as a generation of computerized auctions came to be deployed, was as test beds used to make sure that the proposed auction designs were usable by bidders, much as wind tunnels are used to test scale models of new aircraft before full size aircraft are actually built (see e.g. Plott 1987 on this). Often this involves using the laboratory infrastructure to test a system after it has been at least partly designed, but before it has been deployed. Another use of experiments that grew in importance as economists became involved in designing markets was as demonstrations of underlying economic principles.

One of the clearest examples of a demonstration experiment to illustrate an economic principle is the auction of a jar of coins to illustrate the winner’s curse in a common value auction. The idea is to concretely model an auction in which bidders do not know with certainty the value of the object being auctioned, such as an auction for the right to drill for oil or to harvest timber at a certain site. Bob Wilson invented a demonstration that has proved very durable, for advising both bidders and auction designers.

In one variation, the demonstrator circulates a closed jar filled with coins among the audience, allowing everyone to examine the jar (but not to open it). After everyone has examined it, a first-price, sealed bid auction is conducted for the value of the coins in the jar. That is, each member of the audience is invited to write down a bid; these bids are collected, and the highest bidder pays his bid and receives in return the value of the coins in the jar (which is typically paid in paper money, for the convenience of the winning bidder, and of the demonstrator who keeps the jar for another day). Sometimes the audience members are additionally asked to write down their estimate of the value of the coins in the jar, for discussion afterwards.

A very common outcome is that the estimates of the value of the coins in the jar vary widely, but are distributed roughly around the actual value. That is, collectively, the estimates are not too bad. But the auction is often won by the bidder who had the highest estimate, and although his bid is typically lower than his estimate, it is almost always above the actual value of the coins. Thus the winning bidder loses money; he suffers from the “winner’s curse.”

The theory of common value auctions was initially explored in Wilson (1967, 1969) and Rothkopf (1969). They initiated the study of models in which n bidders each receive a noisy signal of the common value of the object being sold (e.g. estimates from their geologists of the amount of oil that can be recovered). This signal gives the bidder an estimate of the value, but not as good an estimate as if he could also see the other bidders’ signals. A simple way to appreciate the problem facing such bidders is to suppose that all bidders adopt the same bidding strategy, and that the higher their own signal (i.e. the more valuable their information indicates the object is), the higher they bid. In this case, the bidder with the highest signal will win the auction. But even if all the signals are unbiased, i.e. if they are drawn from a distribution whose mean is the true value, the highest of n such signals (the nth order statistic) will on average be higher than the true value (and much higher if n is large). If bidders understand this, then their bidding strategy should take into account that they must substantially discount the naïve estimate based on their own signal alone, i.e. the naïve estimate that ignores the fact that, if they win the auction, their signal is the highest of n such signals.

A bidder who fails to reduce his bid sufficiently below his signal is likely to find that the value of the object he has won is less than his bid. In auctions in which such mistakes are widespread or persistent, it could be that winning bidders will regularly lose money. This possibility was brought to the attention of the oil industry by Capen, Clapp, and Campbell (1971). But an article in a petroleum journal urging oil companies to bid substantially less than their geologists advised them might be just an attempt to foster collusion, or to gain a bidding advantage over competitors, rather than a description of a widespread mistake.
The jar of coins demonstration was designed to show that this is an easy mistake to make.

In reply to my query about the origin of this demonstration, Wilson writes:

“I recall using it in a series of three lectures at Weyerhauser in Tacoma in about 1970-71 or so and occasionally thereafter such as at the Dept of Interior roughly 1973-4 and then late 70s with oil companies, and of course in classes at the GSB where invariably the students overbid” (Wilson, 4/18/2008 email)

When the demonstration is used to show that the auction design matters, the first-price sealed bid auction is sometimes followed by an ascending oral auction, in which the auctioneer calls out ascending prices and bidders indicate with raised hands whether they wish to continue bidding at that price. When I have used the demonstration for this purpose, I announce at the outset that I will auction the jar in two different ways, and that we will afterwards toss a coin to determine which of the two auctions will determine the winner and the winning price. The value of the jar of coins is revealed only after both auctions have been conducted.

The difference between the sealed bid and the oral auction is that the bidders in the oral auction can see when other bidders drop out, and so a bidder with a high estimate quickly learns that most other bidders had lower estimates. This allows bidders in the oral auction to update their estimates, in a way they cannot in the sealed bid auction.

Preston McAfee conducted such demonstrations during the discussion of the design of the FCC auctions of radio spectrum (to be discussed in Section 3). McAfee used a jar containing 200 M&Ms, and told the bidders that the (unknown number of) M&Ms in the jar would be worth $0.10 each, and he displayed a closed envelope that contained the (unknown) value. He writes:

“I sold the envelope, although I did pass the M&Ms around afterward - I sold a $20 bill for $140; this was the extreme. I ran first a sealed bid, then before revealing the results, an oral auction. The winner's curse was invariably less in the oral auction, but [the] winner in the oral auction also lost money.

“I did it six or seven times to different telecom audiences. The largest - with the $140 winner - was CTIA. I was on the cover of their magazine later.

“One guy in the audience said "it doesn't matter what you bid, later it will be worth three times as much.". I said "you should bid a trillion dollars." Anyone wondering about the telecom meltdown only needs to know that participant's mindset.” [McAfee, 6/11/08 email][3]

There are a number of differences between experiments primarily intended as demonstrations and experiments conducted to carefully test hypotheses (not that the jar of coins demonstrations don’t test, and reject, the hypothesis that the winner’s curse is just a hypothetical mistake that cannot readily be observed).
Among these differences are how much attention is given to controlling the environment (e.g. to making sure that bidders don’t directly communicate their private estimates to one another), how much effort is spent investigating relevant parameters (such as the number of bidders) by systematically varying them, and how much care is taken with the experimental design (e.g. if we want to compare oral auctions with sealed bid auctions, it would be better to run them under identical conditions, instead of running the oral auction with bidders who had just participated in a sealed bid auction). Last but not least, in formal experiments care is taken to collect, analyze and report the data. So, despite the rapid spread of the jar of coins demonstration, particularly as a teaching tool in classes that covered auctions, it was a welcome development when the winner’s curse started to be examined in the laboratory.

[3] CTIA is a wireless industry association originally called the Cellular Telecommunications & Internet Association.

The first paper I know of reporting an experimental examination of the winner’s curse is Bazerman and Samuelson (1983), who literally studied the jar of coins demonstration in the laboratory. They varied the number of bidders and the contents of the jar, and solicited confidence intervals about the value of the jar along with sealed bids. They report that the size of the winner’s curse increases in the number of bidders and the uncertainty about the contents of the jar. Subsequent experimenters have implemented common value auctions in ways that give them more flexibility and control over the signals that bidders receive, and how these are related to the true common value. In a striking series of experiments by Kagel and Levin, the common value (that the winning bidder will receive) is drawn from a distribution known to the bidders, and then each bidder receives a signal independently drawn from a known distribution around this common value (see Kagel 1995 and Kagel and Levin 2002 for surveys of this literature). These experiments have documented new regularities (e.g. concerning the effect of providing an additional public signal in auctions in which the winner’s curse is present), and have in turn inspired some new directions for both empirical investigation (see e.g. Kagel and Levin 1986) and theory (see e.g. Eyster and Rabin 2005).

In short, an experimental demonstration that grew out of theoretical issues related to auction design led in turn to a formal program of experimentation that has raised new issues, some of which are also of concern in market design. But even as a demonstration, the jar of coins experiment helped bring the winner’s curse to the attention not only of bidders but also of auction designers, at the Department of the Interior for offshore oil leases, and to various parties with interests in the design of the FCC’s radio spectrum auctions, in a way that would have been difficult to do from the theoretical literature alone.
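
The order-statistic argument behind the winner's curse (the highest of n unbiased signals exceeds the true value on average, and by more when n is larger) can be illustrated with a small simulation. This is a minimal sketch and not taken from any of the experiments cited: it assumes normally distributed signal noise, a common bidding rule of the form bid = signal minus a fixed amount, and hypothetical parameter values (true value 20, noise standard deviation 5).

```python
import random
import statistics


def simulate_common_value_auction(n_bidders, true_value=20.0, noise_sd=5.0,
                                  shade=0.0, trials=5000, seed=0):
    """Simulate first-price sealed-bid auctions for an object of common value.

    Assumptions (illustrative only): each signal is the true value plus
    independent normal noise, and every bidder bids `signal - shade`.
    Returns (average winning signal, average profit of the winner).
    """
    rng = random.Random(seed)
    winning_signals = []
    winner_profits = []
    for _ in range(trials):
        signals = [rng.gauss(true_value, noise_sd) for _ in range(n_bidders)]
        # Everyone uses the same increasing strategy, so the highest signal wins.
        top_signal = max(signals)
        winning_bid = top_signal - shade
        winning_signals.append(top_signal)
        winner_profits.append(true_value - winning_bid)
    return statistics.mean(winning_signals), statistics.mean(winner_profits)


if __name__ == "__main__":
    for n in (2, 5, 20):
        avg_signal, avg_profit = simulate_common_value_auction(n)
        print(f"n={n:2d}  avg winning signal={avg_signal:6.2f}  "
              f"avg winner profit={avg_profit:6.2f}")
```

With `shade=0` the winner bids his own signal and loses money on average, and the average loss grows with the number of bidders. Raising `shade` shows how large a discount is needed before the winner breaks even under these assumed distributions.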

In the case of the winner’s curse, experiments were used to demonstrate and then to study a phenomenon that was originally controversial among economists, since it is an out of equilibrium phenomenon: at equilibrium, bidders all discount correctly and no one is a predictable victim of the winner’s curse. In the next section we will see that experiments can also be used to demonstrate points that are relatively uncontroversial among economists, but may still need to be communicated to policy makers in an effective way.

2. Some early design experiments: allocation of airport slots

An early attempt to use experiments for practical market design comes from airline deregulation. Many aspects of the airline business that were once regulated have long since been deregulated and opened to decision making by individual airlines and competition among airlines. These include how ticket prices are set, how it is determined which airlines will fly between which cities, and even how new airlines may enter the market. But, at least since 1969 when the Federal Aviation Administration limited the number of takeoff and landing slots at several of the busiest airports, the allocation of these slots by administrative means has been a source of potential inefficiencies. There has been a longstanding interest (which continues today) in replacing these administrative procedures with some kind of market both for allocating slots and for allowing airlines to trade already allocated slots. This might call for a market of considerable complexity and flexibility, as slots at a given airport are complements both to other slots at the same airport and to slots at the other airports at which the corresponding flights will begin or end.

The Civil Aeronautics Board (CAB) commissioned a study by Grether, Isaac, and Plott (1979) to compare the existing system of slot allocation by committee to a simple market.
Grether, Isaac, and Plott (GIP) report that one of the first things they did was sit in on some of the committee meetings (transcripts of these meetings are included in their report). These committees had been formed after a 1969 Federal Aviation Administration (FAA) ruling limiting the number of takeoff and landing slots per hour at the most congested airports (at that time La Guardia, JFK, Washington National, and O'Hare). Each airport had a separate committee with members representing the airlines operating out of that airport. The committee discussions were restricted to slot allocations at a single airport, preventing discussion even of what other airports were involved in a flight that caused demand for a particular slot. The original task of these committees had been to coordinate the trade of slots among the incumbent carriers. But, following the Airline Deregulation Act of 1978, these committees would also have to allocate slots to new entrants. Plott (1985) writes

"The CAB staff became concerned that the committees could be used as a barrier to new competition. I was contacted to study the committees because of my previous [experimental] work on committee behavior [Fiorina and Plott, 1978]."

GIP observed that the committees operated by consensus, that is, unanimity appeared necessary to make a change in existing allocations. However, at the time of their observations, since no committee had failed to reach consensus, it was not completely clear what the FAA would do in case of such a failure. The first laboratory experiments reported by GIP were therefore designed to simply demonstrate that the outcome of a unanimity-rule committee could be highly sensitive to the "default" allocation that would result in the absence of agreement.[4]

Twenty-three laboratory committees (consisting of either 9 or 14 members) were observed under a variety of conditions. Committee members were given initial endowments of "cards" and "flags" of various colors, with instructions on how much combinations of these would be worth to them at the end of the session.
Committees met twice, first to allocate "cards" and then to allocate "flags," with the value of a flag depending on how many cards had been obtained in the first meeting (think of cards and flags as being slots at different airports). In each committee meeting, one of the following three default rules was used to determine what would happen in case the committee defaulted, i.e. in the absence of a unanimous agreement: 1. each committee member would receive his/her initial endowment; 2. each committee member would receive a random allocation unrelated to initial endowments; 3. each committee member would receive an endowment created by taking items at random only from those with large initial endowments and giving those items to those with small or zero initial endowments. (This latter condition was motivated by a belief that the FAA might mandate such adjustments to facilitate entry of new carriers.)

[4] Grether, Isaac, and Plott (1989) write in their introduction to the reprinted report (p.xi) "The experiments ... are described in the report as "demonstrations." Because we used experiments, there was no need to explain the details of a game-theoretic model or a solution concept such as a core. The reader could look at the rules, use some intuition about what might happen, and then look at the data. In this way a level of understanding about the argument could be achieved without resort to complicated theory." In a similar spirit, GIP (1981) speak of the role of these experiments in communicating with policy makers as follows (p166): "This type of evidence will probably be of little value to economists who already have considerable experience with the behavioral properties of a variety of allocation processes. [S]ome decision makers may have no experience with game-theoretic models, and rely on instincts and general theories of a completely different sort."

The results were that, in each condition, the final allocations were close to the expected value of the default allocations. When these were the initial endowments, final allocations were near the initial allocations or somewhat below them for those with large endowments, and when default would result in a random allocation, final allocations were essentially equal, independent of initial allocations. In particular, GIP noted that while the final allocation was sensitive to the default rule, it was not sensitive to the underlying distributions of values that determined the profitability of different combinations and hence the efficient allocation. And the committees were also not able to coordinate efficiently between first and second meetings; in each meeting the outcome was primarily determined by the default rule and the initial endowment in that committee session.

GIP suggested that a more efficient way of allocating slots would be to auction them in each airport in a multi-unit sealed bid auction in which all bidders would pay the lowest accepted bid[5], and then to allow an aftermarket in which units bought in these individual markets could be traded.
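
The pricing rule in this proposal (accept the highest bids until the slots run out, and charge every winner the lowest accepted bid) can be sketched as follows. This is a minimal illustration, not GIP's implementation: the function name, the representation of bids as one (bidder, amount) pair per unit, the tie-breaking by submission order, and the example carriers and amounts are all assumptions made here.

```python
def uniform_price_auction(bids, n_slots):
    """Allocate n_slots identical units to the highest sealed bids.

    bids: list of (bidder, amount) pairs, one pair per unit demanded.
    Returns (winners, price), where price is the lowest accepted bid and
    is paid for every unit sold. Ties are broken by submission order
    (an assumption of this sketch).
    """
    if n_slots <= 0 or not bids:
        return [], None
    # sorted() is stable, so equal bids keep their submission order.
    ranked = sorted(bids, key=lambda bid: bid[1], reverse=True)
    accepted = ranked[:n_slots]
    price = accepted[-1][1]
    winners = [bidder for bidder, _ in accepted]
    return winners, price


# Hypothetical example: three slots, five unit bids from three carriers.
example_bids = [("A", 90), ("B", 70), ("A", 60), ("C", 40), ("B", 30)]
winners, price = uniform_price_auction(example_bids, n_slots=3)
print(winners, price)  # ['A', 'B', 'A'] 60
```

Because every winner pays the same price, a bid above the marginal one determines only whether the bidder wins, not what is paid.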
They conducted several demonstration sessions of laboratory auctions to initially allocate goods, and of oral outcry double auctions to trade them. At least one experimental session paralleled the "cards and flags" condition of their committee experiments, and they noted the increased efficiency of the resulting allocations.[6]

[5] Note to self: GIP suggested that such an auction was roughly dominant strategy incentive compatible. (VI-3 "This particular market organization has the feature that the optimum bidding strategy is for each buyer to bid the maximum that he/she is willing to pay (except possibly for the marginal bids where the strategy is sensitive to the information state of the bidder).")

They also noted that it would be desirable for an actual market for trading takeoff and landing slots at different airports to allow some sort of package bidding for "blocks" of slots, since airlines have economies of scale at a given airport, and since takeoff and landing slots are related to business plans involving which routes an airline will serve. They wrote (GIP, 1979, pVI-8 [square brackets were footnotes in the original])

"Each carrier would register in a central computer the maximum (minimum) price it would pay for (sell) a particular slot. Contingencies such as block provisions [A carrier may want to buy (sell) only if it can acquire (sell) a certain set of slots.] should also be listed. Such contingencies allow carriers to take advantage of interdependencies of operations which occur because of time and size (nonconvexities). By simply asking for a 'print out' each carrier can see the full pattern of offerings at any given time and can activate a transaction through the computer (an 'open book' feature). [The identity of the carrier making an offer (bid) to sell (buy) would not be available to the potential buyers (sellers).] Many techniques exist for summarizing information and allowing participants to be fully aware of the state of the market. [Those desiring further details about such a computerized market should contact the authors.]"

GIP did not, however, report any experiments with a market that allowed package bidding for blocks of slots. Their proposal for auctions of slots followed by trading of slots with package bidding was not adopted.

In 1982, a proposal for directly allocating slots via a package bidding auction was made, and an experiment was reported, by Rassenti, Smith and Bulfin (1982).
Rassenti, Smith and Bulfin (RSB) proposed that all slots should be simultaneously allocated in a “combinatorial” auction that would allow airlines to place bids for packages of slots, with a bidding language that would allow an airline to make bids on multiple packages while specifying e.g. that it wanted one or the other of two packages but not both. The winning bids would then be determined by finding the revenue maximizing set of non-intersecting packages. They noted that this involved solving an integer programming problem.

[6] Note to self: In the preface to the reprint, GIP (1989, page x) note "The relationship between initial endowments (randomly determined) and markets is studied experimentally for the first time in Chapter VI. As can be seen (Figure 26), the market has much more variance than is the case when the distribution of supply is constant over time." This refers to a double oral auction run for 18 periods (with a shift in supply and demand at period 7). Starting in period 14, the initial endowments were randomly permuted between players, so that from period to period players no longer had the same endowment (although aggregate supply and demand were kept constant). Most transaction prices become much further from the equilibrium prices (although the final transactions in each period are close to the equilibrium price).

They further noted that an auction of this kind would not be incentive compatible, that is, “the door is open to the possibility of strategically underbidding the true value of certain packages.” But they conjectured that strategic bidding in this environment is complex, and that this might deter it. In particular, they note
