Research Article Bounded Rationality In Agent-based Models .

Transcription

International Journal of Geographical Information ScienceVol. 20, No. 9, October 2006, 991–1012Research ArticleBounded rationality in agent-based models: experiments withevolutionary programsS. M. MANSON*Department of Geography, University of Minnesota, 414 SST, 267 19th Ave South,Minneapolis, MN 55455, USA(Received 8 October 2005; in final form 1 May 2006 )This paper examines the use of evolutionary programming in agent-basedmodelling to implement the theory of bounded rationality. Evolutionaryprogramming, which draws on Darwinian analogues of computing to createsoftware programs, is a readily accepted means for solving complex computational problems. Evolutionary programming is also increasingly used to developproblem-solving strategies in accordance with bounded rationality, whichaddresses features of human decision-making such as cognitive limits, learning,and innovation. There remain many unanswered methodological and conceptualquestions about the linkages between bounded rationality and evolutionaryprogramming. This paper reports on how changing parameters in one variant ofevolutionary programming, genetic programming, affects the representation ofbounded rationality in software agents. Of particular interest are: the ability ofagents to solve problems; limits to the complexity of agent strategies; thecomputational resources with which agents create, maintain, or expandstrategies; and the extent to which agents balance exploration of new strategiesand exploitation of old strategies.Keywords: Agent-based model; Bounded rationality; Evolutionary programs;Land change1.IntroductionHuman transformation of the Earth’s land surface has environmental and socioeconomic impacts that extend from specific locales to the entire globe. We knowsurprisingly little about this land change because it is a complex phenomenon causedby individual actors, such as households or firms, within larger environmental andsocial systems (NRC 2001). Models that integrate human decision-making, societalcontext, and ecological relations are central to understanding land change (Brown etal. 2004, Gutman et al. 2004). Nonetheless, relatively few land-change modelsactively consider theories of human behaviour (Irwin and Geoghegan 2001). Thiswork instead has tended to address ‘highly empirical questions using statisticalmethods and has generally avoided abstract theoretical formulations’ (Walker2004: 248). There is a need for a broader conceptualization of individual decisionmaking in land-change models as such and in the cognitive and social sciences moregenerally (Agarwal et al. 2002).*Corresponding author. Email: manson@umn.eduInternationalJournal of GeographicalScienceReprint: Manson, S. M.(2006). Boundedrationality Informationin agent-basedmodels: experiments withISSN 1365-8816 print/ISSN 1362-3087 online # 2006 Taylor & Francisevolutionary programs. InternationalJournalofGeographicInformationScience (20) 9:http://www.tandf.co.uk/journals991-1012.DOI: 10.1080/13658810600830566

992S. M. MansonAgent-based models (also termed multi-agent systems or individual-basedmodels) are increasingly used to represent decision-making in land-change contexts.An agent-based model is a system of semi-autonomous software programs, or‘agents’, that represent the complex behaviour of interacting entities such asindividuals, households, or firms. Agent-based models advance our understandingof decision-making by representing actors, such as individuals or households, asautonomous, heterogeneous, and locally interacting entities. To fully leverage theseadvantages, however, modellers must also invest agents with theoretical andmethodological representations of the decision-making that drives land change(Parker et al. 2003).This research explores the use of evolutionary programming to represent decisionmaking in agent-based models. It uses the SYPR Integrated Assessment (SYPRIA)model, named for the southern Yucatán peninsular region of Mexico, which is hometo deforestation and attendant cultivation that threatens one of the world’s largestremaining subhumid tropical forests. The chief proximate cause of this land changeis household agricultural activity. To represent these households and their decisionmaking, SYPRIA agents make land-use decisions in a simulated landscape,decisions that account for individual, social, and environmental context. Otherstudies describe SYPRIA in general and its capacity for scenario generation(Manson 2000, 2004, 2005, 2006).This paper focuses on linking the theoretical and technical imperatives ofdecision-making research in agent-based models of land change. SYPRIA agents areinvested with a form of evolutionary programming, termed genetic programmingthat serves as a computational analogue to real household decision-making. Thispaper examines how changes in genetic programming parameters affect therepresentation of a specific decision-making theory, bounded rationality, in agents.Section 2 examines how agent decision-making is a form of multicriteria evaluationthat assesses the suitability of land for agriculture. Agents treat their multicriteriaevaluation question as a symbolic regression problem that they solve withevolutionary programming. This approach offers important methodologicaland conceptual advantages when representing decision-making in agent-basedmodels of land change. Section 3 describes how changes in geneticprogramming parameters affect the ability of agents to act in accordance withbounded rationality.2.Modelling agent decision-makingAgents in many agent-based models of land change choose locations in a simulatedlandscape for activities such as clearing forest and planting crops (Gimblett 2002,Janssen 2003, Parker et al. 2003). Agents can treat this decision-making problem asa form of multicriteria evaluation (Collins et al. 2001), where each agent createsstrategies to choose locations in the simulated landscape according to theirsuitability for any given production activity as a function of spatial factors that varyover the landscape (Manson 2005). A key challenge in designing software agentsthat represent real decision-makers is determining the manner in which each agentsolves its multicriteria evaluation problem. When agents use evolutionaryprogramming to solve this problem as a form of symbolic regression, the modellergains a useful computational tool and a means of representing agent decisionmaking as a form of bounded rationality.

Bounded rationality in agent-based models2.1993Multicriteria evaluation as a symbolic regression problemAgents can treat their multicriteria evaluation problem as a symbolic regression.Symbolic regression is empirical modelling that involves inductively building amathematic or computational statement of relationships among random variables.Symbolic regression approximates an ideal function f (x) that reproduces the valueof a dependent variable Y as a function of independent predictor variables X5{X1, , Xn}. These variables are known solely through empirical observations O5{y, x}where y:5(yi)m, a column vector of m observations on Y, and x:5(xi, j)m6n, an m6nmatrix of m observations on the n independent variables X. Any given observationof the m observations in O is denoted oi or {yi, xi}. The value of f (x) at xi, or f (xi), isdenoted f̄i and is related to the ideal value fi through error defined as ei5fi2f̄i.Consider a statistical linear regression, for example, that takes the familiarfunctional formY f ðxÞ f ðx; bÞð1Þ b0 zb1 X1 zb2 X2 z . . . zbn Xn zewhere dependent variable Y is expressed as a linear function of independentvariables X5{X1, , Xn}, coefficients b5{b1, , bn}, and error term e.Generic symbolic regression does not specify a functional form for f (x), butinstead combines primitive functions that range from simple arithmetic operators tospecialized domain-specific functions. Symbolic regression approximates f (x) asfˆ ðxÞXnfˆ ðxÞ&a w ðxÞð2Þi 1 i icomprising n pairings of function w (x) with coefficient a estimated to minimize eover the values of dependent and independent variables given by observationsO5{y, x} (after Ralston and Rabinowitz 2001). The means used to estimate f (x)vary according to the restrictions imposed by its functional form. In the case of thestatistical linear regression, we can employ an approach such as ordinary leastsquares. Other versions of symbolic regression require specialized methods such ascomputational intelligence or machine learning to compose independent variables,coefficients, and primitive functions in a way that minimizes the error of fˆ ðxÞ withrespect to observations O (Russell and Norvig 1995).In the case of SYPRIA, the dependent variable Y maps onto actual land use, andthe independent variables X correspond to social and environmental factors. Thepropensity for real households to clear land for extensive agriculture in the SouthernYucatán is a function of factors related to environmental systems (e.g. water, soil,precipitation) and social drivers (e.g. distance to market, land tenure). As proxies toactual households, each software agent must approximate its own f (x) to establishthe suitability of any given location as a function of spatially varying social andenvironmental factors. Each of the m observations in O gives the value of thedependent and independent variables for a single location in the area surroundingeach household.In SYPRIA, a different agent represents each of the roughly 5000 real householdsin the Southern Yucatán. These agents sample nearby locations in the simulatedlandscape, represented by GIS layers based on real data, to extract values for the

994S. M. Mansondependent and independent variables found in observations O; this formulation issimilar to that found in other agent-based models of land change (cf. Polhill et al.2001, Evans and Kelley 2004, Parker and Meretsky 2004, Brown et al. 2005). For thepurposes of this paper, the dependent variable is land use observed in 1998 asderived from Thematic Mapper remotely sensed imagery. Independent variablesinclude soil type, elevation, slope, aspect, precipitation, surface hydrology, distanceto roads and markets, and socio-economic and demographic factors for the sameyear (Manson 2005). While the general characteristics of these data and the largermodel structure are given here as background, these data and most specifics ofSYPRIA are exogenous to the analysis of the linkages between evolutionaryprogramming and bounded rationality, and as such are not detailed beyond what isnecessary here.2.2Symbolic regression and evolutionary programmingAgent-based models in which agents conduct symbolic regression use approachesincluding statistics, linear programming, rules and heuristics, vector weightingschemes, belief–desire–intention models, and classifier systems (Tesfatsion 2001,Parker et al. 2003). Agents can also solve their respective multicriteria evaluationproblems through evolutionary programming, or the creation of software programsthrough computational means inspired by biological processes, particularlyDarwinian evolution (Eiben and Schoenauer 2002).Genetic programs are trees composed of terminals and functions in a branchingstructure. In terms of the underlying biological metaphor, a tree that constitutes agenetic program is equivalent to that program’s genetic code, and each function andterminal is analogous to a gene. Consider a simple tree that corresponds to thesymbolic regression equation Y5b1X1 b2X2 (figure 1(a)). This equation can berepresented by a tree that possesses two branches, one composed of the terminals b1and X1, and another by b2 and X2. These sub-branches are joined by themultiplication function, 6, implicit in the regression equation Y5b16X1 b26X2.These sub-branches in turn are joined by the addition function, , at the root nodeof the tree. Tree depth is the number of branches spanning the longest descent fromthe root to terminals, while length is the total number of nodes (figure 1(a)).Terminals such as b1 and X1 are, respectively, numerical coefficients and spatialfactors corresponding to the variables in X, while functions can range from simplearithmetic operators to more complex operators. Functions tend to be restricted to asmall set because the genetic programs will evolve sophisticated functionsthemselves, just as simple machine languages can undergird complicated programs(Banzhaf et al. 1998). The tree structure is useful because it can be interpretedvisually while being functionally equivalent to a generic computer program (Koza1992).A genetic programming system manages individual programs, exerting pressure onthe programs to evolve over time to become better solutions to agent symbolicregression problems. Each agent possesses a genetic programming system thatcontrols genetic program population K of size M that evolves over generations G(after Koza 1992). Members of the initial population (g50) comprise randomlychosen terminals and functions. Members of this generation compete for theopportunity to pass on their genetic material to the next generation (g51), which inturn aim to do the same for the next (g52), and so on. Each generation serves as the

Bounded rationality in agent-based modelsFigure 1.995Genetic program (a) and crossover, reproduction, and mutation operators (b).‘children’ of the preceding generation and the ‘parents’ for the next. This evolutionmay involve as few as 10 generations or upwards of several thousand.The genetic programming system chooses parents to create offspring throughselection, whereby fit parents are more likely to have children. SYPRIA employs asteady-state approach, where population size is fixed over time by replacing less fitprograms in the parent generation with the children of fitter parents. The systemselects parents according to their fitness, fk, in approximating function fˆ ðxÞ as givenby minimizing error over observations O. The most common fitness function ingenetic programming is summed error, or the sum of the absolute value of thedifferences between genetic program output and the dependent variable given by O(Koza 1999). Variants calculate the mean error in order to include the effect ofsample size, sum of the squared errors, or the sum of square root of error (examinedbelow and given in table 1). From this follows the sometimes confusing conventionthat a lower fitness score indicates a greater ability to solve the symbolic regressionproblem.Three genetic operators create child genetic programs from fit parents to replaceless fit parents: crossover, reproduction, and mutation (figure 1(b)). Crossovermingles the genes of two parents by creating a single child that carries a mixture ofparental genes. Crossover is the driving force behind evolution because it quicklyculls substandard strategies from the population while creating potentially betterstrategies through the intermingling of well-performing genetic programs.Reproduction is analogous to asexual reproduction or cloning, whereby a geneticprogram in the parent generation makes a copy of itself for insertion into the child

996S. M. MansonTable 1. Genetic program parameter settings.ParameterValue (default)Tested parametersFitnessfunction: error e measured by f (xi), denotedfi, and the ideal value fi given by yi in o5{y, x} P f {f e mPmi 1 i ie i 1 fi {fi n P 2ae mi 1 fi {fi P 0:5e mi 1 fi {fiCreation method: initial profile of programsFull, Variable, Ramped Variable,Ramped Full, Ramped Variable/FullaSelection mechanism: means by which parents are Probabilistic, Ranked, Tournamenta,chosenElitist Tournament, DemeticPopulation size (M): programs in the population10–500 (300a)Generations (G): generations over which programs10–100 (30a)evolveSecondary parameters0.9Crossover probability (Pc): probability that aprogram will be the offspring of two otherprogramsMutation probability (Pm): probability that a0.001program mutates12(Pc Pm)Reproduction probability (Pr): probability that aprogram will be the offspring of one parentprogramInternal crossover points (%): probability that0.7program is crossed or mutated at a functioninstead of a terminalCreation depth: maximum depth of programs when6createdCrossover depth: maximum depth at which pro17grams crossFunction set: functions that can act as tree nodes 246Randomness: probability of choosing an ephemeral0.5constantProgram length: maximum genetic program length5005Tournament size: tournament is held between XindividualsaDefault.generation. Reproduction is useful for maintaining fit strategies across generations.Mutation is a special form of reproduction in which a parent creates an identicalchild but a gene is randomly mutated in the process. Mutation introduces changes ina program by chance and ensures that terminals and functions do not disappearfrom the population due to crowding by more successful terminals and functions inearly generations.2.3Evolutionary programming to represent bounded rationalityThe choice of evolutionary programming as a symbolic regression method stemsfrom the broader challenge of establishing how agents may approximate fˆ ðxÞ in amanner that satisfies theoretical imperatives. One of the most common theories ofdecision-making in the social sciences, perfect rationality, is joined by a growingnumber of alternatives, of which bounded rationality is prominent. Every decisionmaking theory has specific corollaries that govern its attendant computational

Bounded rationality in agent-based models997implementation of symbolic regression. Perfect rationality is expressed in a varietyof ways, particularly via statistical regression models, while bounded rationality issuccessfully implemented with evolutionary programming.Perhaps the most commonly accepted model of decision-making in the socialsciences—and, by extension, land-change models—is rational choice theoryimplemented with statistical regression. One variant of this theory, termed perfectrationality, is particularly widespread because it offers power and analyticaltractability via assumptions of utility maximization, perfect computation, andcomplete information on alternatives (Myers and Papageorgiou 1991). Manymethods for solving symbolic regression problems are in keeping with the corollariesof perfect rationality. Econometric research in particular adapts a variety ofregression methods (e.g. least-squares, hazard, and logistic models) to theunderlying principles of perfect rationality for land use (Bockstael 1996, Nelsonand Geoghegan 2002) and rational choice in general (McFadden 2001). Whenapplied to empirical data, these symbolic regression methods aim to solve inthe aggregate the same multicriteria evaluation problem faced by agentsindividually.The normative tradition of rational choice is increasingly joined by descriptivealternatives (Bell et al. 1988, van den Bergh et al. 2000). Key among these is boundedrationality, introduced by Simon (1997: 267) as ‘rational choice that takes intoaccount the cognitive limitations of the decision-maker—limitations of bothknowledge and computational capacity’. Actors under bounded rationality do notoptimize because they possess limited computational resources and information onwhich to base decisions. Actors instead satisfice—make suboptimal yet acceptabledecisions—with a small cadre of partial strategies based on imperfect information. Akey effect of these limits is that strategies have limited complexity; decision-makersuse simple heuristics or ‘rules of thumb’ (Baumol and Quandt 1964, Slonim 1999).These key precepts of bounded rationality—limits to information and cognition—have been extended by positing that learning from experience is important toexplaining satisficing, creation of heuristics, and limits to information. Actorsimprove strategies through a Darwinian process of learning-by-doing that balancespath-dependent exploration of new strategies by extending current strategies versussimply exploiting existing strategies (Arthur 1993, Gigerenzer et al. 2001).A key advantage claimed for agent-based modelling is the ability to represent keyfeatures of bounded rationality in agents, such as learning over time, bounded accessto information, and limited computational resources (Epstein 1999). Agent-basedmodels of land use implement bounded rationality (sometimes implicitly) throughmechanisms such as limits to knowledge, as when agents use local information orhave a limited search capacity, and limits to computation, particularly throughsymbolic regression approaches such as heuristics or classifier systems (Gimblett2002, Janssen 2003, Parker et al. 2003).Evolutionary programming is a particularly promising means of representingbounded rationality in agent-based models, although the modeller must choosebetween two ways in which to do so: functional and representational (Chattoe1998). Evolutionary programming is functional when it acts in an instrumentalmanner to find optimal or near-optimal solutions to well-specified problems in highdimensional domains characterized by noise and complexity (Kaboudan 2003). Inland-change research, evolutionary programming is most often used in a functionalmanner for the optimal allocation of land because it can quickly navigate

998S. M. Mansoncomplicated search spaces defined by millions of land parcels or raster grid cells (e.g.Balling et al. 1999, Matthews et al. 1999, Xiao et al. 2002).In contrast, the representational use of evolutionary programming draws severalparallels between human decision-making and the genetic programming system(Edmonds 2001, Chen 2003). Representational evolutionary programming grantseach agent a population of strategies; in an agent-based model of land change likeSYPRIA, each program is an alternative multicriteria evaluation strategy of landuse. An agent uses the fittest strategy when it makes a decision at any given time, butit also possesses many other potential strategies to use in the face of changingcircumstances (Dosi and Nelson 1994). Importantly, the fittest strategy for an agentis almost always satisficing instead of optimizing because the representational use ofevolutionary programming imposes computational limits to cognition andinformation.Representational evolutionary programming maps onto many corollaries ofbounded rationality. First, limits to computational ability are imposed by reducingthe number of trees available to an agent and the depth or length that they canachieve (Edmonds and Moss 2001). Second, limits to information are imposed bythe manner in which offspring carry remnants of preceding generations. Theseportions of earlier programs are equivalent to memory, but this memory does notprovide perfect information because it is distorted over time (Chen 2003). Third,evolutionary programming can also represent learning under bounded rationality asthe above-noted balance between exploitation and exploration. Agents use crossoverto blend successful strategies to create better strategies, in essence learning fromexperience (Dawid 1999), while agents use reproduction as the equivalent ofexploiting existing well-performing strategies (Moss and Edmonds 1998,Beckenbach 1999). Fourth, mutation is akin to the errors, experimentation, andaccidents that form the basis for innovation (Chen and Chie 2004). Fifth, crossoverimplements imperfect information and search costs when less fit programs engagein crossover, leading to suboptimal offspring (Brenner 1999). Finally, agentscan engage in imitation, communication, and development of new strategiesby borrowing strategies, although this interpretation requires us toadopt the computational conceit that function follows form (Pingle 1995, Dawid1997).Of course, the linkages between evolutionary programming and boundedrationality are relatively nascent. There is mounting evidence for the correspondence between evolutionary programming, models of bounded rationality, andbounded rationality in real decision-making (in addition to works noted above,for reviews see Chattoe 1998, Edmonds 1998, Chen et al. 2002, Chen and Wang2004). Evolutionary programming also captures aspects of decision-making thatperfect rationality and statistical models do not. Few researchers would claim,however, that either all bounds to rationality or the rationality (as such) of realactors strongly maps onto the artificial cognitive processes of agents equippedwith genetic programming systems. Bounded rationality assumes that realdecision-makers face limits to their cognitive processing but does not assumethat they use genetic programs or that programs fully reconstruct actualstrategies that map onto actual cognitive structures (leaving aside thephilosophical question of where cognition resides). There is an ongoing need toexplore computational analogues, in general, and evolutionary programming, inparticular, to bounded rationality.

Bounded rationality in agent-based models3.999MethodsWe face unanswered questions when linking the conceptual underpinnings ofbounded rationality to the technical implementation of genetic programming. Themodeller must modify the genetic programming system possessed by each agent inorder to fulfil the functional task of solving the multicriteria evaluation problemwhile also satisfying the representational goal of capturing features of boundedrationality. I constructed an experimental frame to test five key geneticprogramming parameters—fitness function, creation mechanism, selection operator,population size, number of generations—and their effects on both the functionalefficiency of programs and their representational fidelity.3.1Experimental frameTo better understand the role of the five system parameters in modelling boundedrationality with genetic programming, I tested variations of one of the fiveparameters of interest across 600 model runs in SYPRIA while holding the otherparameters constant, for a total of 3000 runs. Table 1 details the range of theseparameters and outlines secondary parameters common to genetic programmingapplications (Banzhaf et al. 1998). As noted above in section 2.1, much of theoperation of SYPRIA is irrelevant to the experimental frame, which reports on thegenetic programming systems of SYPRIA agents solving their symbolic regressionproblem.3.1.1 Fitness function. The heart of genetic programming lies in breeding betterstrategies, which in turn requires the selection of fitter parents for reproduction,crossover, and mutation. Parents are selected according to their fitness fk as given byminimizing error over observations O when solving for fˆ ðxÞ. The four mostcommon functions are summed errors, mean error, sum-of-squared errors, and sumof-square-root errors (table 1).3.1.2 Creation operator. There are three ways to modify genetic programs at theircreation, and these can be used to create five kinds of population (figure 2). The firstis to fill a population with full programs, each of which has branches that extendoutward from the root to a uniform maximum depth. The second is to create apopulation of variable trees, the branches of which can vary in depth up to amaximum value. The third is to create a ramped population of either full (rampedfull) or variable (ramped variable) trees by dividing it into subpopulations that varyin their maximum depth. A ramped full population of 10 programs can consist offive subpopulations, for example, in which the members of the first have a depth oftwo, the second a depth of four, and so on to the fifth group having a depth of 10.Finally, a population may also have both full and variable members to create aramped variable/full population. The purpose of creating a ramped or variablepopulation is to grant it greater diversity of program size and complexity, instead offorcing all programs to share a common structure at the outset.3.1.3 Selection operator. The overall fitness of the genetic program populationincreases over time because successful parents are more likely to have offspring thatreplace poorly performing programs. Five functions are commonly used to selectparents: probabilistic, ranked, tournament, elitist tournament, and demetic. Thewidely used probabilistic approach was developed by Holland (1975) for geneticalgorithms, the precursor to genetic programming. The probability of a program

1000S. M. MansonFigure 2.Creation operators.being chosen is proportional to its individual fitness compared with the summedfitness of all other programs in the population. Parent programs are thereforeselected according to their fitness as a proportion of the total fitness, while programsto be replaced in the steady-state population are selected with the inverse operation.Under ranked selection, parents are chosen according to their fitness rank in thepopulation overall, and their offspring replace the worst-ranked individuals. Thetournament method chooses a small group of programs at random from the entirepopulation. Each member in this group is compared with the others, and theprogram with the best fitness is selected for crossover, reproduction, or mutation,and its child replaces that with the worst fitness. The elitist tournament method isidentical to the tournament technique except that the fittest individual in thetournament automatically reproduces. Finally, the demetic grouping method dividesthe population into subpopulations, or demes. Within each deme, parents are

Bounded rationality in agent-based models1001chosen at random, but migratory programs are probabilistically chosen from eachdeme and sent to another deme. Demetic grouping is useful for halting prematureconvergence in the population, since each deme can evolve without pressure save forthe occasional incursion of migratory programs (Langdon 1998).3.1.4 Population size. When each agent has a larger initial population, it has agreater diversity of candidate solutions and an increased likelihood of finding goodsolutions. Population size in applications across th

Bounded rationality in agent-based models: experiments with evolutionary programs S. M. MANSON* Department of Geography, University of Minnesota, 414 SST, 267 19th Ave South, Minneapolis, MN 55455, USA (Received 8 October 2005; in final form 1 May 2006) This paper ex