RMM Vol. 0, Perspectives in Moral Science,
ed. by M. Baurmann & B. Lahno, 2009, 199–206
http://www.rmm-journal.de/
Gary E. Bolton and Axel Ockenfels
Testing and Modeling Fairness Motives*
Abstract:
The advent of laboratory experiments in economics over the last few decades has produced
an enormous literature devoted to describing, testing and modeling economic and social
behavior. Measured by publications and citations, the development of social preference
models to capture decisions motivated by fairness and other social criteria is one of the
success stories in this literature. But with this success, and maybe even because of it,
controversies have arisen about what the models can and cannot do. In this note, we
comment on some of these debates. Our main theme is that descriptive models of behavior
should be judged with respect to their usefulness. This is often neglected, partly because
there are no accepted measures and tests for the usefulness of a model, while standard
procedures to test whether a model is true are readily available. A model that does not
capture a ‘grain of truth’ is unlikely to be useful; however, the relationship is not
monotonic in that a ‘truer’ model is not necessarily a more useful model.
1. Are Fairness Models True?
We organize our discussion around a simple fairness model called ERC (Bolton and
Ockenfels 1998; 2000). The model characterizes how people trade off material self
interest with a preference for fair distribution. More specifically, the model stipulates
that each agent has a ‘motivation function’ such that for a given relative payoff (defined
as one’s own payoff relative to the total payoff allocated in a reference group), an
agent’s choice is consistent with the standard assumption made about preferences for
money; that is, more money is better than less. Alternatively, holding the pecuniary
payoff fixed, an agent’s motivation function is strictly concave in one’s relative payoff,
with a maximum around the allocation at which one’s own share is equal to the average
share. That is, people care about their status in a reference group and, in particular,
may dislike an unfavorable relative position. Otherwise, the model is consistent with
standard game theoretic modeling, including the usual assumptions about rational behavior
and equilibrium play. The model organizes a large and otherwise disparate set of
laboratory observations, many anomalous to models based solely on self interest, such as
equity seeking behavior in bargaining games and reciprocity in social dilemma games. But
the model also implies the very competitive play that economists expect, and that is
observed, in market games (see Bolton and Ockenfels 2000 for details).
* Bolton gratefully acknowledges financial support from the National Science Foundation.
Ockenfels gratefully acknowledges financial support from the Deutsche
Forschungsgemeinschaft.
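The two properties of the motivation function described in this section can be made
concrete with a small Python sketch. The quadratic form, the parameter names `a` and `b`,
their default values, and the convention of assigning the equal share when nothing is paid
out are all assumptions chosen for illustration. The text only requires that the function
increase in own payoff for a fixed relative payoff, and that it be strictly concave in the
relative payoff with a maximum at the equal share.

```python
def relative_payoff(own_payoff, total_payoff, n_players):
    """Own share of the total payoff in the reference group.

    Assumption for this sketch: if nothing is paid out, the share is
    set to the equal share 1/n.
    """
    if total_payoff == 0:
        return 1.0 / n_players
    return own_payoff / total_payoff


def motivation(own_payoff, share, n_players, a=1.0, b=100.0):
    """Hypothetical quadratic motivation function (illustrative only).

    a weights the pecuniary payoff; b weights the squared distance of
    the own share from the equal share 1/n.
    """
    equal_share = 1.0 / n_players
    return a * own_payoff - (b / 2.0) * (share - equal_share) ** 2


if __name__ == "__main__":
    # Fixed relative payoff: more money is better than less.
    print(motivation(6.0, 0.5, 2) > motivation(5.0, 0.5, 2))
    # Fixed pecuniary payoff: shares away from the equal share score lower.
    for share in (0.2, 0.5, 0.8):
        print(share, motivation(5.0, share, 2))
```

Raising `b` relative to `a` in this sketch makes the agent care more about relative
position; setting `b` to zero recovers a purely self interested agent.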
Is the model true? The model basically traces social behavior back to the decision
makers’ concern for relative position. However, human behavior in general and social
behavior in particular is a complex phenomenon. For example, fairness does not only have a
motivational side, but also cognitive, biological (including neurobiological but also
chemical, physical etc.), sociological, adaptational and other roots. It has been shown,
for instance, that intranasal administration of oxytocin causes a substantial increase in
trust among humans, thereby greatly increasing the benefits from social interactions
(Kosfeld et al. 2005). Also, subjects who briefly held a cup of hot (versus iced) coffee
judged a target person as having a ‘warmer’ personality (generous, caring), and subjects
holding a hot (versus cold) therapeutic pad were more likely to choose a gift for a friend
instead of for themselves (Williams and Bargh 2008). There is also a large and convincing
literature showing that humans do not behave fully rationally, but rather follow boundedly
rational heuristics such as that suggested by Simon’s satisficing approach.
ERC focuses on individual motivations and their interactions with strategic decision
making in laboratory games, and neglects any cognitive, biological or sociological roots
of social decision making. For example, it cannot capture the observation that holding a
cup of hot coffee affects social behavior. As a consequence, it cannot be ‘true’ in the
sense of capturing all factors that may be relevant.
2. All Models Are Approximations
A ‘true’ model of social decision making needs to incorporate all relevant motivational,
cognitive, biological, sociological and other factors. This is infeasible and probably
undesirable. We use models as maps.1 You use a different map depending on where you want
to go and how you are going to get there (walk, drive or fly). The most useful maps
communicate the critical information in compact form, and so they omit or even sometimes
actively distort facets of the landscape. We propose to judge and test models using the
same standards we judge maps by: on the basis of how accurately they portray an aspect of
the landscape we want to know about.
Observe, for instance, that a critical ingredient in a subway map’s success is the
simplifying assumptions it makes about the locations of the train stations: Stations on
the London Underground map, for example, are laid out like nodes on a grid, either
vertical, horizontal or at a 45 degree angle to one another. The result is a clear and
economical tool for navigating the underground. You can quickly see that to get from
Russell Square to Great Portland Street, you should change trains at King’s Cross. Yet if
you took this map as a guide for foot travel, you would end up walking from Russell Square
to Great Portland Street in the wrong direction. Setting the Underground and
geographically correct maps side-by-side, it is easy to see the value of this distortion
for subway riders: Restricting station locations to a grid structure makes the map far
more transparent and simpler to use. We think we want our models to be simple for similar
reasons, and the cost for this simplicity, as it is for maps, is a loss of detail and
sometimes even some distortion.
1 The discussion in this section follows Bolton (forthcoming).
Increasingly, models are how findings from the economics lab are communicated to the
larger community of researchers. This is how it should be. All of these models are
approximations of what we have learned. There is, as there should be, vigorous debate
about which model provides the best approximation. But just as with a map, there are
inevitably trade-offs between accuracy and simplicity, and in particular, between breadth
of use and detail. A map of a city university campus tends to show the locations of
laboratories and classrooms in greater detail and accuracy than the non-university
buildings surrounding them. A map of the entire city, however, will generally have less
local detail and accuracy but broader application. So you will use the city map to get
from the airport to the university subway stop but use the university map to find your way
from the subway stop to the economics department.
3. Simplicity Has Value
In a recent paper, Bergh (2008) addresses what he deems a puzzle concerning the “huge
impact” of ERC and related work by Fehr and Schmidt (1999, hereafter FS). Bergh aims to
critically examine “the merits of the models as a theory of fairness and explanations of
human behavior”. His basic premise is that “simply put, a scientific explanation of a
phenomenon needs to provide an answer to the question of why it occurs”. He cites evidence
that conflicts with the models. In the concluding discussion, he returns to ask “why a
theory with rather limited applicability and no deeper explanatory power has become so
widely popular and heavily cited”. One potential explanation he offers is that “the theory
was easy to incorporate in other sub-disciplines, where it could be used to seemingly
explain central questions”. In fact, we would say that is exactly the point: Models like
ERC are successful precisely because they provide a simple and useful map, in this case of
how relative standing can influence decision making. Adding cognitive, biological or other
explanatory factors to the model would probably make it ‘truer’ or ‘deeper’ (if these
terms can be adequately defined), but not necessarily more useful.
To illustrate the point, we consider an example also cited by Bergh (2008), a paper by
Engelmann and Strobel (ES 2004) reporting laboratory tests of fairness models. ES reject
ERC and FS as an explanation for their data in favor of an explanation that involves a
preference for efficiency combined with self interest and maximin (a fairness measure that
makes somewhat different predictions than the measures used by ERC or FS). This is
basically the same combination of preferences proposed by Charness and Rabin (2002).
Engelmann and Strobel go on to argue that a preference for efficiency, beyond what self
interest can explain, may play a more prominent role in the broader set of games to which
ERC and FS are typically applied. So, ES reject ERC and FS by claiming that they focus on
the wrong motives.2
However, we think ES’s paper, and recent papers like it, are indicative of confusion over
what it is that social preference models are trying, and should be trying, to achieve. As
Alvin Roth (2002) puts it, “since we know that approximations aren’t precisely true, it is
easy not to be impressed by evidence that they are not”. ES may have shown that ERC and FS
are wrong in the sense that they do not get every possible lab experiment right. But all
models are approximations. This also holds for ES’s model, and it is easy enough to
identify laboratory games within exactly their own laboratory environment (which we will
not repeat here in detail) in order to show that their model, too, is an approximation.
Consider, for instance, an ES kind of game in which the decision maker has to choose one
of two payoff distributions, A and B, over a total of six subjects. He himself gets paid 8
regardless of his choice, and the other five subjects get 8, 8, 8, 15, 1, respectively,
for alternative A, and 2, 2, 2, 33, 2, respectively, for alternative B (all payoffs in
Euro). Alternative B strictly maximizes both maximin and efficiency, whereas alternative A
is the ERC choice. In an experimental study of this situation, 45 out of 48 subjects (94%)
chose alternative A.3 Does this failure disqualify maximin plus efficiency as a possible
explanation of other games? We don’t think so (although the experiment suggests that
subjects avoid efficiency when it comes with costs to others). Social utility models
cannot, and we would say should not, aim to capture every behavior in all settings.
Rather, the challenge is to identify general principles of economic behavior that are
useful in organizing and predicting decision patterns. Our little experiment rejects
maximin plus efficiency as the ‘true’ explanation but it does not assess the usefulness of
this approach. For this, one needs to analyze a wider, non-degenerate range of
economically salient situations. ERC and FS did exactly this.4
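The comparison of alternatives A and B in the example above can be checked with a short
Python sketch. The payoffs are those stated in the text. The function name `summarize`,
the dictionary keys, and the use of the distance between the decision maker's share and
the equal share as a stand-in for the ERC criterion are assumptions made for illustration,
not the authors' procedure.

```python
from fractions import Fraction

# Payoffs in Euro as given in the text; the decision maker is listed first.
ALLOCATIONS = {
    "A": [8, 8, 8, 8, 15, 1],
    "B": [8, 2, 2, 2, 33, 2],
}


def summarize(payoffs):
    """Return the measures discussed in the text for one allocation."""
    total = sum(payoffs)
    own_share = Fraction(payoffs[0], total)
    equal_share = Fraction(1, len(payoffs))
    return {
        "efficiency": total,        # sum of all payoffs
        "maximin": min(payoffs),    # payoff of the worst-off subject
        "own_share": own_share,     # decision maker's relative payoff
        "gap_to_equal_share": abs(own_share - equal_share),
    }


if __name__ == "__main__":
    for name, payoffs in ALLOCATIONS.items():
        print(name, summarize(payoffs))
```

Under these definitions B has the higher total (49 against 48) and the higher minimum (2
against 1), while only A gives the decision maker exactly the equal share of 1/6. Because
the decision maker's own payoff is 8 in both cases, relative position is the only thing
that separates the two alternatives in this sketch.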
2 See Bolton and Ockenfels 2006 for a detailed reply to Engelmann and Strobel 2004.
3 Analogous to ES we kept the decision maker’s payoff fixed, played this game using the
strategy method, and explicitly informed subjects about the fact that distribution B
yielded a higher maximin payoff and higher efficiency. We did not, however, choose a
three-person game, because this would not have allowed us to keep the decision maker’s
payoff fixed and to increase maximin and efficiency while at the same time diminishing
fairness as measured by ERC or FS.
4 Regarding salience, in the ERC paper we dealt with gaming situations that economists
have been traditionally concerned with, having to do with markets, bargaining, public
goods, and the like; games where the decision makers face meaningful trade-offs. The games
ES examine, on the other hand, involve a single decision maker who chooses one of three
payoff allocations for three people. In 8 of the 11 treatments, the decision maker had no
payoff at stake. Both ERC and FS models admit strictly self interested behavior. This
means that for nearly three quarters of the treatments in ES’s study, every choice
available is consistent with both ERC and FS. The salience critique also applies to the
few games where decision maker stakes do vary with the decision, since the expected value
of the differentials (given that there was only a one third chance a decision maker’s
choice would count for payoff) is never greater than DM2/3 or about US$0.30. Moreover, in
2 of the 3 treatments where selfishness could matter, the selfish choice is taken,
respectively, by just 10% and 23.3% of the subjects. (In the third treatment, the selfish
choice is taken by a majority, but the choice agrees with all the fairness and efficiency
measures considered.) This poor showing for selfishness is very different from what we
observe in most other experiments, including in experiments where fairness and reciprocity
are important. But it has an easy explanation: There is little opportunity in ES’s game to
express self interested behavior. ES also want to convince the reader that people care
about efficiency as well as fairness. To an economist, the way you show you care is by
paying. In the only two treatments where efficiency is distinctive from the choice of a
purely self-interested subject, the expected ‘price’ for efficiency is never greater than
DM1/3 or about US$0.15 (and the efficient choice is also the fair choice as measured by
maximin). There is not a single decision in the ES experiment that implies a subject is
willing to pay any positive amount to increase efficiency.
ES’s response to their rejection of ERC and FS is that we need to develop models that
incorporate yet more motives; a logical response if you think the goal is to explain all
behavior in a single model. But, as we demonstrated above, the question of which
combination of motives should be included is far from obvious, even within the restricted
setting of ES’s experiment. To further illustrate the point, if we go beyond ES’s setting,
their approach becomes basically unmanageable, because the number of motives tends quickly
to pile up. For instance, one important place where self interest, maximin and efficiency
unambiguously fail is the ultimatum game, where all three of these motives imply that no
positive offer should be rejected, contrary to a mountain of data. As ES put it in an
earlier version of their comment, ultimatum game behavior “is only consistent with a model
based on efficiency, maximin preferences, selfishness, competitiveness, and perceived
intentions if the role of inequality aversion is relatively weak compared to intentions
and competitiveness”. Clearly, models based on such a large number of motives as suggested
by ES are unattractively complicated, intractable in more complex situations, and risk
becoming tautological and less robust. Summing up, they are not useful.
The challenge to the researcher is not to test which model is ‘true’, because all models
are approximations and so it is easy not to be impressed by evidence that they are not
precisely true. There is no hope that any tractable model can fully capture the
complexities of human decision making. Even if we focus only on motivations, things
quickly get complex; e.g., Selten (1990, 653) states: “There is no reason to suppose that
human behavior is guided by a few abstract principles. Nobody should be surprised if it
turns out that the motivational system is as complex as the anatomy and physiology of the
human body.” So, at least at this stage of knowledge, the challenge to social decision
making research is to find tractable models that capture important drivers of social
behavior in useful ways.
This view has implications for testing and modeling fairness. Most importantly, there is
value in simplicity. A subway map that includes details about the rolling stock,
signaling, communication, power supply, fare collection, air-conditioning systems, tunnel
corrosion, noise, vibration, floating slabs, and the temperature of the drivers’ coffee is
not useful to travelers. Even if a perfectly true model of the subway could be devised,
one that is identical to the real world subway, it would not be useful to most of us.
Coming back to fairness in laboratory games, ERC is arguably one of the simplest
formulations of the idea that people care about status in social and economic interaction.
So, if there is value in simplicity, tests of models against ERC need to discount more
complex models. For instance, a natural comparison for ES’s maximin plus efficiency is
with explanations of equal parsimony. Just as we can pair efficiency and maximin as ES
did, we can pair the other explanatory variables and compare to see what fits best. If we
do this, a combination of ERC plus maximin, two fairness motives, explains more choices in
6 of ES’s treatments and fewer in none, than does maximin plus efficiency. In some
treatments, the improvement in predictive power is substantial; for example, in treatment
E in ES, accuracy nearly doubles, from 39.7 percent to 76.4 percent. In other words, and
contrary to a central claim by ES, efficiency is not critical to explaining the data. That
said, it is not our intention to push any particular explanation too hard: ERC plus
efficiency also does better than maximin plus efficiency. Combinations with FS do pretty
well. ERC plus FS explains some things other pairs do not.5
A quote from a recent survey by Cooper and Kagel (forthcoming) on other-regarding
preferences dealing with ERC and FS succinctly summarizes our point: “It was clear at the
time that both these papers were written that they had to be ‘wrong’, but as one of my old
teachers used to say ‘wrong in the right way’.”
4. Conclusions: What Fairness Models Can Explain
The fairness models have attracted a good deal of interest and, together with the work of
a number of others, have triggered a new and larger wave of research on the role of
fairness and reciprocity in economic decision making. In our view, the useful insights
behind ERC and FS are three: First, simple measures of fairness can approximate the
fairness behavior of a population of people over a broad set of games. Heretofore, many
economists thought that individual answers to the question of ‘what-is-fair’ were too
diffuse for fairness to be a useful predictor of behavior. Of course, not everyone
measures fairness the same way. The claim is that the measure offered by ERC provides a
good approximation of the fairness driven behavior of people in a broad set of scenarios.
Second, fairness can explain acts of reciprocity. Fairness (sometimes in the guise of
justice, sometimes in the guise of equity) and reciprocity are age old preoccupations of
people everywhere, and these models make explicit the intimate link between them. Third,
certain types of institutions, such as competitive markets, induce people to behave as if
they are completely materially self interested. Hence, it is not that people’s concerns
for fairness are confined to, say, political or legal spheres, rather the institutional
structure shapes the expression of these concerns. For some institutions, a model that
takes material self interest as the sole driver of behavior sacrifices little accuracy.
5 Nor does investigating the various motives separately support ES’s views. In the only
treatments in the experiment in which efficiency (study 3, treatment Ey), respectively
maximin (study 3, treatment R), makes a prediction distinctive from the other non-selfish
motives on the table, the distribution of choices made cannot be distinguished from random
(p = .272 and .684, respectively, Chi-squared test). Directly comparing the fairness
measure proposed by FS and ERC, ES claim that FS performs better (because, according to
ES, FS is more in line with maximin concerns). But in saying this, ES restrict themselves
to study 1, where FS performs better in 3 out of 4 games. Yet, ERC does better in all
other games of the other two studies that yield distinctive predictions, and so overall
explains a larger percentage of all choices than FS in 5 out of 8 cases.
At the same time, ERC and other models are not ‘true’ in the sense of capturing all
relevant factors of social decision making. In fact, the controls in the laboratory allow
one to ‘falsify’ any model yielding testable predictions, just because by definition
approximations are not exactly true. In our view, the most promising challenge is to
better understand the role of procedures in social decision making. Simple models like ERC
have directed interest towards this and related research questions. In Bolton, Brandts and
Ockenfels (1998), we designed the first laboratory study in experimental economics
investigating the role of ‘intentionality’, and in Bolton, Brandts and Ockenfels (2005) we
designed the first laboratory study in experimental economics investigating the role of
‘procedural fairness’ in social behavior. We also believe that more research is needed
regarding the role of reference group and reference point formation in social behavior
(Bolton and Ockenfels 2005). Beyond such motivational aspects of decision making, it is an
exciting endeavor to investigate, as a complement, the cognition, biology and sociology
behind social behavior, and to put the findings to real world tests (Bolton, Greiner and
Ockenfels 2009; Bolton and Ockenfels 2008; Ockenfels 2009). That said, we believe that a
concern for one’s relative position will continue to play a central role in our
understanding of social decision making: a model that turns out to be useful is likely to
capture a grain of truth.
References
Bergh, A. (2008), “A Critical Note on the Theory of Inequity Aversion”, The Journal of
Socio-Economics 37, 1789–1796.
Bolton, G. E. (forthcoming), “Testing Models and Internalizing Context: A Comment on
Vernon Smith’s ‘Theory and Experiment: What Are the Questions?’”, Journal of Economic
Behavior and Organization.
Bolton, G. E., J. Brandts and A. Ockenfels (1998), “Measuring Motivations for the
Reciprocal Responses Observed in a Simple Dilemma Game”, Experimental Economics 1,
207–219.
Bolton, G. E., J. Brandts and A. Ockenfels (2005), “Fair Procedures: Evidence from Games
Involving Lotteries”, Economic Journal 115, 1054–76.
Bolton, G. E., B. Greiner and A. Ockenfels (2009), “Engineering Trust: Reciprocity in the
Production of Reputation Information”, Working paper, University of Cologne.
Bolton, G. E. and A. Ockenfels (1998), “Strategy and Equity: An ERC-Analysis of the
Güth-van Damme Game”, Journal of Mathematical Psychology 42, 215–26.
Bolton, G. E. and A. Ockenfels (2000), “ERC: A Theory of Equity, Reciprocity and
Competition”, American Economic Review 90(1), 166–193.
Bolton, G. E. and A. Ockenfels (2005), “A Stress Test of Fairness Measures in Theories of
Social Utility”, Economic Theory 25(4), 957–82.
Bolton, G. E. and A. Ockenfels (2006), “Measuring Efficiency and Equity Motives: A Comment
on ‘Inequality Aversion, Efficiency, and Maximin Preferences in Simple Distribution
Experiments’”, American Economic Review 96(5), 1906–11.
Bolton, G. E. and A. Ockenfels (2008), “Does Laboratory Mirror Behavior in Real World
Markets? Fair Bargaining and Competitive Bidding on eBay”, Working Paper Series in
Economics, No. 36, University of Cologne.
Charness, G. and M. Rabin (2002), “Understanding Social Preferences with Simple Tests”,
The Quarterly Journal of Economics 117(3), 817–869.
Cooper, D. and J. Kagel (forthcoming), “Other Regarding Preferences: A Selective Survey of
Experimental Results”, in: Handbook of Experimental Economics.
Engelmann, D. and M. Strobel (2004), “Inequality Aversion, Efficiency, and Maximin
Preferences in Simple Distribution Experiments”, American Economic Review 94(4), 857–869.
Fehr, E. and K. M. Schmidt (1999), “A Theory of Fairness, Competition, and Cooperation”,
Quarterly Journal of Economics 114, 817–868.
Kosfeld, M., M. Heinrichs, P. J. Zak, U. Fischbacher and E. Fehr (2005), “Oxytocin
Increases Trust in Humans”, Nature 435, 673–676.
Ockenfels, A. (2009), “Marktdesign und Experimentelle Wirtschaftsforschung”, Perspektiven
der Wirtschaftspolitik 10, 31–53.
Roth, A. E. (2002), Slides Prepared for Al’s “Experimental Economics” class, mimeo.
Selten, R. (1990), “Bounded Rationality”, Journal of Institutional and Theoretical
Economics 146, 649–58.
Williams, L. E. and J. A. Bargh (2008), “Experiencing Physical Warmth Promotes
Interpersonal Warmth”, Science 322(5901), 606–607.