Prison Accountability and Performance Measures, Volokh Emory L. J. 2014

• Feb. 10, 2016 • Locations: United States of America • Topics: Cost of Prison Systems
PRISON ACCOUNTABILITY AND
PERFORMANCE MEASURES
Alexander Volokh*
A few decades of comparative studies of public vs. private performance have failed to give a strong edge to either sector in terms of
quality. That supposed market incentives haven’t delivered spectacular
results is unsurprising, since by and large market incentives haven’t
been allowed to work: outcomes are rarely measured and are even
more rarely made the basis of compensation, and prison providers are
rarely given substantial flexibility to experiment with alternative models.
This Article argues that performance measures should be implemented more widely in evaluating prisons. Implementing performance
measures would advance our knowledge of which sector does a better
job, facilitate a regime of competitive neutrality between the public and
private sectors, promote greater clarity about the goals of prisons, and,
perhaps most importantly, allow the use of performance-based contracts.
Performance measures and performance-based contracts have
their critiques, for instance: (1) the theoretical impossibility of knowing
the proper prices, (2) the ways they would change the composition of
the industry, for instance by reducing public-interestedness or discouraging risk-averse providers, and (3) potentially undesirable strategic
behavior that would result, for instance manipulation in the choice of
goals, distortion of effort away from hard-to-measure dimensions or
away from hard-to-serve inmates, or outright falsification of the numbers. I argue that these concerns are serious but aren’t so serious as to
preclude substantial further experimentation.

*
Assistant Professor, Emory Law School, avolokh@emory.edu. I am grateful to Michael J.
Broyde, Russell C. Gabriel, Leonard Gilroy, Linda Hardyman, Erica J. Hashimoto, Christina Mulligan, Carl Nink, Usha Rodrigues, Sarah M. Shalf, and the participants at the Emory/UGA joint faculty colloquium for their input and assistance. I am also grateful to Kedar Bhatia and Julia Hueckel for
their able research assistance, and to the law librarians at Emory Law School. [*Need to insert
Thrower acknowledgment.]

forthcoming EMORY L.J. (2014)

Electronic copy available at: http://ssrn.com/abstract=2336155

TABLE OF CONTENTS
I. Introduction
II. The Failure of Comparative Effectiveness Studies
A. Which Sector Costs Less?
1. Difficulties in Calculating Costs
2. Competing Cost Estimates
B. Which Sector Provides Higher Quality?
1. Difficulties in Figuring Out Quality
2. Which Sector Leads to Less Recidivism?
C. The Limits of Comparative Effectiveness
III. Why Use Performance Measures?
A. The Puzzle of Prisons?
B. Accountability, Flexibility, and Neutrality
1. To Know What Works
2. To Implement Competitive Neutrality
3. To Express What We Want
C. For Performance-Based Contracting
1. Limited Current Efforts
2. The Range of Possible Contracts
3. The Feasibility of Merit Pay in the Public Sector
D. What Measures to Choose
IV. Concerns and Critiques
A. What Prices to Set
B. Effects on Market Structure
1. Public-Interestedness
2. Risk and Capital Requirements
C. Undesirable Strategic Behavior
1. Manipulating the Goals
2. Distortion Across Dimensions of Performance
3. Distortion Across Types of Inmates
4. Falsifying Performance Measures
V. Conclusion

Draft—Please do not circulate

Here arises a feature of the Circumlocution Office, not
previously mentioned in the present record. When that admirable Department got into trouble, and was, by some infuriated members of Parliament . . . attacked on the merits
. . . as an Institution wholly abominable and Bedlamite;
then the noble or right honourable [member] who represented it in the House, would smite that member and cleave
him asunder, with a statement of the quantity of business
(for the prevention of business) done by the Circumlocution
Office. Then would that noble or right honourable [member] hold in his hand a paper containing a few figures, to
which, with the permission of the House, he would entreat
its attention. . . . Then would the noble or right honourable
[member] perceive, sir, from this little document, which he
thought might carry conviction even to the perversest mind
. . . , that within the short compass of the last financial half-year, this much-maligned Department . . . had written and
received fifteen thousand letters . . . , had written twenty-four thousand minutes . . . , and thirty-two thousand five
hundred and seventeen memoranda . . . . [T]he sheets of
foolscap paper it had devoted to the public service would
pave the footways on both sides of Oxford Street from end
to end, and leave nearly a quarter of a mile to spare for the
park . . . ; while of tape—red tape—it had used enough to
stretch, in graceful festoons, from Hyde Park Corner to the
General Post Office. . . . No one . . . would [then] have the
hardihood to hint that the more the Circumlocution Office
did, the less was done, and that the greatest blessing it
could confer on an unhappy public would be to do nothing.
— Charles Dickens, Little Dorrit1
The results obtained from ENRD’s civil and criminal
cases in fiscal year 2012 alone were outstanding. We secured over $397 million in civil and stipulated penalties,
cost recoveries, natural resource damages, and other civil
1
CHARLES DICKENS, LITTLE DORRIT, bk. 2, ch. 8, at 489–90 (Wordsworth Classics 1996)
[1855–57].

monetary relief, including almost $133 million recovered
for the Superfund. We obtained over $6.9 billion in corrective measures through court orders and settlements, which
will go a long way toward protecting our air, water and
other natural resources. We concluded 47 criminal cases
against 83 defendants, obtaining nearly 21 years in confinement and over $38 million in criminal fines, restitution,
community service funds and special assessments.
— DOJ’s Environment & Natural Resources Division
Annual Report, 20122

I. INTRODUCTION
“Isn’t everything to be said on [private prisons] already in
print?” asks Sharon Dolovich.3 She means the question to be merely rhetorical; and so do I.4 The comparative effectiveness debate, to
the extent it’s relevant5—and I think it is6—has stalled, simply because the empirical literature, exhaustive as it is, is so bad. “The
current weight of the evidence on prison privatization in the United
States is so light that it defies interpretation,” write prison researcher Gerald Gaes and his coauthors.7 (The theory isn’t much
better: the same authors characterize prison performance as a “theoretically bereft domain.”8) To intelligently choose between public
and private provision, we should at least know which sector costs

2
U.S. DEP’T OF JUST., ENV’T & NAT. RES. DIV., ENRD ACCOMPLISHMENTS REPORT, FISCAL
YEAR 2012, at 4–5 (2013).
3
Sharon Dolovich, How Privatization Thinks, in GOVERNMENT BY CONTRACT: OUTSOURCING
AND AMERICAN DEMOCRACY 128, 129 (Jody Freeman & Martha Minow eds., 2009).
4
Not that her perspective is the same as mine, but we both agree that there’s still something
left to say on the subject.
5
Dolovich herself is wary of premature engagement with the comparative effectiveness debate
without having sorted through the necessary normative issues beforehand. See id. at 128–29; Sharon
Dolovich, State Punishment and Private Prisons, 55 DUKE L.J. 437, 447 n.20 (2005).
6
See Alexander Volokh, Privatization and the Elusive Employee-Contractor Distinction, 46
UC DAVIS L. REV. 133 (2012).
7
GERALD G. GAES ET AL., MEASURING PRISON PERFORMANCE: GOVERNMENT PRIVATIZATION
AND ACCOUNTABILITY 184 (2004).
8
Id. at 123.

less, but we don’t; and we should at least know which sector provides higher quality, but we don’t have a great sense of that either.9
This seems puzzling: readers of the voluminous debate on private prisons can be forgiven for thinking that market incentives
should make private prison firms either (1) cut wasteful expenditures and produce innovative services10 or (2) cut corners on essential inmate care and security and lead to a humanitarian disaster.11
Let’s focus on the positive claims for private prisons: if the private
sector is so clearly superior, shouldn’t the difference hit us between the eyes?12
On second thought, this isn’t so puzzling after all. The advantages of market provision are often said to be that, what with
the rigidities and low-incentive structure of government agencies,
private firms have greater incentive and greater flexibility to figure
out how to achieve any desired level of quality. But this assumes
that (1) particular levels of quality are desired or encouraged, and
(2) private firms are given the flexibility to achieve these levels. It
turns out that both of these assumptions are wrong.
Let’s take the quality problem first. Why not tally up the quality at a public prison, do the same at a comparable private prison,
and compare the two quality measures? The trouble here is that—
despite the scores of studies that have been produced purporting to
measure quality differences—good performance measures are rarely used. As I document in Part II, this means that comparative
quality studies are hard to interpret if one wants to know which
sector is better. (This hasn’t prevented both partisans and detractors of private prisons from producing loosely reasoned pieces that
oversell the findings of their favorite studies.)

9
These aren’t the only things we should know. For instance, we can also care about where accountability is greater, which sector might be more likely to push the substantive criminal law in a
more pro-incarceration direction, and the like. See, e.g., Developments in the Law—The Law of
Prisons, 115 HARV. L. REV. 1838, 1868–91 (2002) (my student note); Alexander Volokh, Privatization and the Law and Economics of Political Advocacy, 60 STAN. L. REV. 1197 (2008).
10
See, e.g., GEOFFREY F. SEGAL & ADRIAN T. MOORE, WEIGHING THE WATCHMEN: EVALUATING THE COSTS AND BENEFITS OF OUTSOURCING CORRECTIONAL SERVICES, PART II: REVIEWING
THE LITERATURE ON COST AND QUALITY COMPARISONS (Reason Pub. Pol’y Inst., Pol’y Study No.
290, Jan. 2002); Samuel Jan Brakel & Kimberly Ingersoll Gaylord, Prison Privatization and Public
Policy, in CHANGING THE GUARD: PRIVATE PRISONS AND THE CONTROL OF CRIME 125, 134–43
(Alexander Tabarrok ed., 2003).
11
See, e.g., Dolovich, supra note 5, at 474–80.
12
See, e.g., Philippe C. Schmitter, The “Organizational Development” of International Organizations, 25 INT’L ORG. 917, 932 (1971) (“interocular impact test”).

It doesn’t have to be that way. Criminologists have produced
no shortage of performance measures that are appropriate for evaluating prisons, using variables like in-prison violence, the quality
of prison health care, the degree of crowding, and—which I think
is immensely important—recidivism.13 The most important thing
about a performance measure is that it measure performance, that
is, outcomes. Inputs like money spent, guards hired, or programs
offered are of quite limited value, since the whole point is to see
whether the money spent is worthwhile, whether the guards hired
are necessary, and whether the programs are effective. Outputs like
the number of doctor visits or the number of graduates of rehabilitative programs—like the number of memos written by Dickens’s
Circumlocution Office14 or the number of years of prison resulting
from DOJ prosecutions15—are also of limited value. Doctor visits
might just be make-work; the rehabilitative programs may not actually be rehabilitative. (The Circumlocution Office, whose function is to prevent things from being done,16 has a zero or negative
contribution to performance; and the prosecutions that maximize
prison time aren’t necessarily the same as those that most improve
the environment.) What we care about—prisoner health, decent
conditions, actual rehabilitation—are the outcomes that we should
actually measure, to the extent possible.17
Why should we use performance measures? There are several
reasons, which I canvass in Part III.
First, it’s good just to know whether the public or private sector
has higher quality, for instance in evaluating whether one’s state
should outsource or insource a particular project, or be one of the
19 states that ban private prisons altogether.18 Naturally, many factors determine performance other than the quality of the management and the facilities: for instance, a prison can have better performance numbers because it was sent a better crop of people. But
certainly having performance measures is better than useless.
13
I first (briefly) advocated performance measures for prison accountability in my student note.
See Developments, supra note 9, at 1887–88.
14
See text accompanying supra note 1.
15
See text accompanying supra note 2.
16
DICKENS, supra note 1, bk. 1, ch. 10, at 101–18.
17
See also BERYL A. RADIN, CHALLENGING THE PERFORMANCE MOVEMENT: ACCOUNTABILITY, COMPLEXITY, AND DEMOCRATIC VALUES 15–16 (2006) (defining “input,” “output,” “outcome,”
and other terms).
18
See E. ANN CARSON & WILLIAM J. SABOL, PRISONERS IN 2011, at 32 appx. tbl. 15 (U.S.
Dep’t of Just., Bur. of Just. Stats., Dec. 2012) (listing Arkansas, Delaware, Illinois, Iowa, Maine,
Massachusetts, Michigan, Minnesota, Missouri, Nebraska, Nevada, New Hampshire, New York,
North Dakota, Oregon, Rhode Island, Utah, Washington, and West Virginia as states with no inmates in private prisons in 2011).

Second, it would help to implement a regime of competitive
neutrality, where the public and private sectors could bid against
each other and individual projects could shuffle from one sector to
another. Competitive neutrality might be better than an all-public
or all-private regime, but to implement it properly, the auctions
should be evenhanded, which means that proposed costs and proposed quality targets should be fairly comparable. Performance
measures would allow a winning contractor to commit to deliver a
particular level of performance, and would allow governments to
levy the appropriate contractual fine if this level isn’t achieved.
Third, it would help policymakers express what’s desirable in
prisons. One would think that this had been done already; but prison contracts are written in input and output terms because this is
largely how the industry works and thinks. Performance measures
have been a byproduct of the debate over prison privatization: the
different sides in the debate needed them to argue in favor of or
against privatization; and the development of these measures has in
turn spurred serious thinking about what prisons should accomplish, which has had accountability benefits for the public sector as
well.
Perhaps most importantly, the use of performance measures
would allow the spread of performance-based contracting, where—
instead of levying a fine for not delivering a particular level of performance—the contract fee varies continuously with the level of
performance delivered. Once accountability is tied to actual performance, giving prison providers the flexibility to choose how to
do their job becomes more attractive.
Part IV discusses critiques of using performance measures as
part of a compensation scheme.
One concern is that the true social benefits of various aspects
of performance are unknowable, either in principle or in practice,
so that determining the proper prices will inevitably fail. Where a
service is closely bound up with justice concerns, a focus on efficiency pricing may be inappropriate: it might demean the service
or give insufficient weight to non-efficiency goals.
A second problem is that the use of performance measures will
alter the composition of providers in the industry, in ways that are

perhaps undesirable. One way this might happen is that, in the
presence of monetary incentives, public-interested people may be
less attracted to corrections. A different way performance
measures can alter the composition of the industry is by increasing
risk for providers. Providers can only control inputs, and the connection between inputs and outcomes is highly variable, because it
depends on a great many variables, many of which are beyond the
prison’s control—such as general social conditions or the underlying quality of the inmates. The relationship between any of these
variables and outcomes is not very well known. One might care
about the fairness of rewarding or penalizing providers based on
factors beyond their control, though in an auction system, such
windfalls will be canceled out by competitive bidding. More seriously, the riskiness might bias the set of available providers in favor of the largest and best-capitalized firms, and perhaps discourage experimentation with risky but promising techniques. This
means that the sensitivity of price to outcomes might have to be
limited, which might also limit the incentive effects.
A third problem is that providers may engage in undesirable
strategic behavior. They might manipulate the performance goals
so they reflect goals that are easy to meet. They might focus their
effort on the measurable dimensions of performance and slight the
unmeasurable ones. (For example, what are the true outcomes of
the justice system? Some outcomes, like case backlogs, are measurable, but other important outcomes, like accuracy of adjudication, aren’t—and measuring one runs the risk of distorting the
agency’s effort away from the unmeasured outcomes.19) Similarly,
providers will want to choose the easiest-to-treat populations
(“creaming” or “cherry-picking”), and (given a population) fail to
treat the hardest-to-treat members (“parking”). And, of course, any
system based on particular numbers comes with the risk that someone might try to falsify the numbers.
The good news is that, for prisons, there’s hope that these concerns can be fairly addressed. At the very least, these concerns
don’t seem so serious as to preclude far more experimentation than
19
One might think that the reversal rate is a measure of accuracy of adjudication. But this isn’t
true, because (1) the cases selected for appeal aren’t random (in the absence of some special process
to verify accuracy), and (2) given deferential standards of review, judges can work to insulate their
decisions from appellate review if they’re so inclined—for instance, by making them more intensely
fact-based.

has been happening so far. We actually have access to reasonably
good performance measures that reasonably cover the important
dimensions of prison quality, none of which have to be limited to
efficiency-based measures. These measures should be set by corrections departments, not by contractors. Riskiness can be addressed, at least in part, by only making part of the payment depend on performance. Social Impact Bonds have some promise in
encouraging nonprofit-sector financing; in any event, the prison
market is already highly concentrated, so there is no current vast
population of nonprofits and small companies to lose. Cherry-picking can be addressed by giving contractors no say in what inmates they’re given, and parking can be addressed, at least in part,
by making monetary rewards depend on observable characteristics
of the inmate (if, indeed, it’s a problem at all). Outright falsification of performance measures is a serious problem, which requires
seriously investing in monitoring and ensuring robust disclosure
regimes.
None of these are perfect fixes, but we don’t need perfection;
we just need an improvement over the status quo.
II. THE FAILURE OF COMPARATIVE EFFECTIVENESS STUDIES
It’s somewhat surprising that, for all the ink spilled on private
prisons over the last thirty years, we have precious little good information on what are surely the most important questions: when it
comes to cost or quality, are private prisons better or worse than
public prisons?
It’s safe to say that, so far at least, the political process hasn’t
encouraged rigorous comparative evaluations of public and private
prisons. Some states allow privatization without requiring cost and
quality evaluations at all.20 The 19 states that don’t privatize at all21
might, for all I know, be right to do so, but of course their stance
doesn’t promote comparative evaluation.
When studies are done, they’re usually so inadequate from a
methodological perspective that we can’t reach any firm comparative conclusions. Section A below discusses the problems with cost
20
See Developments, supra note 9, at 1873–74; Alexis M. Durham III, Evaluating Privatized
Correctional Institutions: Obstacles to Effective Assessment, FED. PROBATION, June 1988, at 65, 67.
21
See supra note 18 and accompanying text.

comparison studies, and Section B discusses the problems with
quality comparison studies. Section C takes a broader view and
notes that even well-done comparative effectiveness studies don’t
answer all our questions.
A. Which Sector Costs Less?
1. Difficulties in Calculating Costs
How do we determine whether the private sector costs more or
less than the public sector? Ideally, we could work off of a large
database of public and private prisons and run a regression in
which we controlled for jurisdiction, demographic factors, size,
and the like. In practice, this large database doesn’t exist, and so
the typical study chooses a small set of public and private prisons
that are supposedly comparable.
Unfortunately, this comparability tends to be elusive; the public and private facilities compared often “differ in ways that confound comparison of costs.”22 Sometimes no comparable facilities
exist.23 Even where there are two prisons in the jurisdiction housing inmates of the same sex and security classification, they generally differ in size, age, level of crowding, inmate age mix, inmate
health mix, and facility design.24 In particular, adjusting facilities
to take into account different numbers of inmates is problematic,
since facilities with more inmates, other things equal, benefit from
economies of scale.25
22
DOUGLAS MCDONALD ET AL., PRIVATE PRISONS IN THE UNITED STATES: AN ASSESSMENT OF CURRENT PRACTICE 33 (Abt Assocs. Inc., 1998).
23
Id. at 45 (making this claim about the Arizona facilities compared in CHARLES W. THOMAS,
COMPARING THE COST AND PERFORMANCE OF PUBLIC AND PRIVATE PRISONS IN ARIZONA (1997));
see also SCOTT D. CAMP & GERALD G. GAES, PRIVATE PRISONS IN THE UNITED STATES, 1999: AN
ASSESSMENT OF GROWTH, PERFORMANCE, CUSTODY STANDARDS, AND TRAINING REQUIREMENTS
15 (Fed. Bur. of Prisons 2000).
24
Id. at 34–35; see also Robert B. Levinson, Okeechobee: An Evaluation of Privatization in
Corrections, PRISON J., Oct. 1985, at 77.
25
Gerry Gaes, Cost, Performance Studies Look at Prison Privatization, NIJ JOURNAL, Mar.
2008, at 32, 34; Douglas C. McDonald, The Costs of Operating Public and Private Correctional
Facilities, in PRIVATE PRISONS AND THE PUBLIC INTEREST 86, 101 (Douglas C. McDonald ed.,
1990).

The GAO explained recently that “[i]t is not currently feasible
to conduct a methodologically sound cost comparison of BOP and
private low and minimum security facilities because these facilities
differ in several characteristics and BOP does not collect comparable data to determine the impact of these differences on cost.”26
The data problem mostly comes from the private side: information
collected by BOP from private facilities isn’t necessarily reported
the same way that public data is reported, and the reliability of the
data is uncertain.27 Moreover, “[w]hile private contractors told us
that they maintain some data for their records, these officials said
that the data are not readily available or in a format that would enable a methodologically sound cost comparison at this time.”28
Not only do federal regulations not require that this data be collected,29 but also, and more troublingly, at the time of the GAO
study in 2007, the BOP didn’t believe there was value in developing the data collection methods that would make valid public-private cost comparison methods possible.30
Probably more seriously, public and private prisons have accounting procedures that “make the very identification of comparable costs difficult.”31
First, public systems, unlike private ones, don’t spread the
costs of capital assets over the life of the assets, which overstates
public costs when the assets are acquired and understates them in
all other years.32
Second, various public expenditures, including employee benefits and medical, utilities, legal work, insurance, supplies and
equipment, and various contracted services, are often borne by various other agencies in government, which might understate public
costs by 30–40%.33 One of the often-ignored costs in the public
sector is the cost of borrowing capital.34 Conversely, governments
bear some of the costs of private firms, for instance, in various cases, contract monitoring, inspection and licensing, personnel training, inmate transportation, case management, and emergency response teams.35

26
COST OF PRISONS: BUREAU OF PRISONS NEEDS BETTER DATA TO ASSESS ALTERNATIVES
FOR ACQUIRING LOW AND MINIMUM SECURITY FACILITIES 4 (Gov’t Accountability Off., 2007).
27
Id. at 12–13.
28
Id. at 5, 11–12.
29
Id. at 13.
30
Id. at 7, 19, 30. The BOP’s view seems to have been chiefly based on the fact that it used
private contractors to run facilities for criminal aliens and wasn’t expecting to receive funding to run
its own. Id. at 7, 19, 30. The BOP also believed that the Taft cost study, see text accompanying infra
notes 55–58, was already a sufficient cost study. COST OF PRISONS, supra note 26, at 7, 19, 21, 30.
31
MCDONALD ET AL., supra note 22, at 33; McDonald, supra note 25, at 88–89, 97–100.
32
MCDONALD ET AL., supra note 22, at 35.
33
Id. at 36.
34
See McDonald, supra note 25, at 106.

And third, when public or private prisons incur overhead expenditures, there’s no obvious way of allocating overhead to particular facilities—Gerald Gaes gives a specific numerical example
involving Oklahoma, a high-privatization state, where a difference
in overhead accounting can alter the estimate of the cost of privatization by 7.4%.36
As a bottom-line matter, McDonald says “the uncounted costs
of public operation are probably larger than of private operation”;37
I tend to agree, but it’s hard to say for sure.
2. Competing Cost Estimates
The best way to see the importance of various assumptions is
to look at a handful of cases where different people tried to estimate the same cost. Without committing myself to which way is
correct, I’ll provide three examples: from Texas in 1987, from
Florida in the late 1990s, and from the federal Taft facility in
1999–2002.
a. Texas
In Texas, private prisons were authorized in 1987 with the passage of SB 251,38 which required that private prisons show a 10%
savings to the state compared to public prisons.39 Calculating the
per-diem cost of public incarceration in Texas thus became important, since the maximum contract price for private providers
would be 90% of that cost.
35
MCDONALD ET AL., supra note 22, at 36–37.
36
Gerald G. Gaes, The Current Status of Prison Privatization Research on American Prisons,
at 17–18 (Aug. 2010); id. at 17 (“Other complications arise from the appropriate treatment of property, sales, or income taxes paid by private contractors, as well as profits from inmate phone calls and
commissary accounts.”); see also MCDONALD ET AL., supra note 22, at 37. Private companies are
also loath to divulge their own financial details. See McDonald, supra note 25, at 89; NSW PARLIAMENT, LEGIS. ASSEMBLY, PUB. ACCOUNTS COMM., VALUE FOR MONEY FROM NSW CORRECTIONAL
CENTRES 23 (Rep. No. 13/53 (No. 156), Sept. 2005); FLA. LEGIS., OFF. OF PROG. POL’Y ANAL. &
GOV’T ACCOUNTABILITY, PERFORMANCE AUDIT OF THE GADSDEN CORRECTIONAL INSTITUTION 2
(Rep. No. 95-48, 1996).
37
McDonald, supra note 25, at 100.
38
C. ELAINE CUMMINS, PRIVATE PRISONS IN TEXAS, 1987–2000, at 15 (doctoral thesis, American Univ., Washington, D.C., 2001) (on file with author).
39
Id. at 42; see also TEX. GOV’T CODE § 495.003(c)(4).

The Texas Department of Corrections40 came up with an estimate of $27.62.41 The Legislative Budget Board, however, proposed a number of additions to this cost, to better take into account
the costs of complying with Ruiz v. Estelle,42 building costs, the
state’s cost to provide additional programs that private firms would
be required to provide, and the like.43 All these adjustments raised
the estimated per-diem cost by about 50%—to $41.67.44 In the
end, contracts were awarded within a range of $28.72 to $33.80—
between the two estimates, though closer to the first one.45
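
How far the price ceiling moves with the choice of baseline can be checked with a short calculation. The Python sketch below is only illustrative: it uses the per-diem figures quoted above, and it assumes, for simplicity, that the 10% savings requirement works as a flat multiplicative cap on the public per-diem cost. The names `pct_increase` and `price_cap` are hypothetical helpers, not anything drawn from SB 251 or from the Cummins study.

```python
# Per-diem figures quoted in the text (dollars per inmate per day).
TDC_ESTIMATE = 27.62      # Texas Department of Corrections estimate
LBB_ESTIMATE = 41.67      # Legislative Budget Board adjusted estimate
REQUIRED_SAVINGS = 0.10   # savings requirement, read here as a flat 10% (assumption)


def pct_increase(old, new):
    """Fractional increase from old to new."""
    return (new - old) / old


def price_cap(public_per_diem, savings=REQUIRED_SAVINGS):
    """Highest private per-diem consistent with the required savings (assumed form)."""
    return public_per_diem * (1 - savings)


if __name__ == "__main__":
    print(f"Adjustment to baseline: {pct_increase(TDC_ESTIMATE, LBB_ESTIMATE):.1%}")
    for label, estimate in (("TDC", TDC_ESTIMATE), ("LBB", LBB_ESTIMATE)):
        print(f"{label} baseline ${estimate:.2f} -> cap ${price_cap(estimate):.2f}")
```

Nothing in this calculation settles which baseline is correct; it only shows how much the ceiling shifts when the baseline does.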
b. Florida
In Florida, the Office of Program Policy Analysis and Government Accountability (OPPAGA) compared two private facilities, Bay Correctional Facility and Moore Haven Correctional Facility, with a public facility, Lawtey Correctional Institution.46 After various adjustments, OPPAGA calculated that the per-diem operating cost was $46.08 at Bay and $44.18 at Moore Haven, versus
$45.98 at Lawtey; that is, Bay was 0.2% more expensive and
Moore Haven 3.9% cheaper than the public facility.47
The Florida Department of Corrections had come up with its
own numbers: $45.04 at Bay and $46.32 at Moore Haven, versus
$45.37 at Lawtey:48 Bay was 0.7% cheaper and Moore Haven
2.1% more expensive.
The Corrections Corporation of America, which operated Bay,
submitted comments to the OPPAGA report, disputing its analy-

40
Now absorbed into the Texas Department of Criminal Justice. See TEX. DEP’T OF CRIM.
JUST., AGENCY STRATEGIC PLAN, FISCAL YEARS 2013–17, at 2 (2012).
41
CUMMINS, supra note 38, at 155.
42
503 F. Supp. 1265 (S.D. Tex. 1980).
43
CUMMINS, supra note 38, at 156–57.
44
Id. at 156 tbl.9.
45
Id. at 158. One facility received an extra $7.41 for an “intensive substance abuse treatment
program.” Id.; see also GAES ET AL., supra note 7, at 87–88.
46
STATE OF FLA., OFFICE OF PROGRAM POL’Y ANALYSIS & GOV’T ACCOUNTABILITY, REVIEW
OF BAY CORRECTIONAL FACILITY AND MOORE HAVEN CORRECTIONAL FACILITY, at 9 (Report No.
97-68, 1998) [hereinafter OPPAGA].
47
Id. at 9 exh.4.
48
FLA. DEP’T OF CORR., Budget, in 1996–97 ANNUAL REPORT, available at http://www.dc.
state.fl.us/pub/annual/9697/budget.html. These estimates were analyzed in FLA. DEP’T OF CORR.,
PRIVATIZATION IN THE FLORIDA DEPARTMENT OF CORRECTIONS (1998). See GAES ET AL., supra
note 7, at 191 n.4.

sis.49 It disagreed that Lawtey was comparable,50 and suggested its
own adjustments to OPPAGA’s numbers for all three facilities.
Under CCA’s analysis, Bay cost $45.16 and Moore Haven cost
$46.32, versus $49.30 for Lawtey, which comes out to cost savings
of 8.4% for Bay and 6.0% for Moore Haven.51 (OPPAGA, understandably, disputed CCA’s modifications.52)
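To see how much the Florida conclusion depends on whose adjustments one accepts, the three sets of per-diem figures can be run through the same percent-difference calculation. The sketch below uses only the dollar amounts quoted above; the function and variable names are illustrative, and using Lawtey as the baseline in every case follows the comparisons as described in the text.

```python
# Per-diem operating costs (USD) as quoted in the text for each analysis.
estimates = {
    "OPPAGA": {"Bay": 46.08, "Moore Haven": 44.18, "Lawtey": 45.98},
    "Florida DOC": {"Bay": 45.04, "Moore Haven": 46.32, "Lawtey": 45.37},
    "CCA": {"Bay": 45.16, "Moore Haven": 46.32, "Lawtey": 49.30},
}

def pct_vs_public(private_cost, public_cost):
    """Percent difference relative to the public facility (negative means cheaper)."""
    return 100.0 * (private_cost - public_cost) / public_cost

def compare(costs, public="Lawtey"):
    """Percent difference of each private facility against the public baseline."""
    return {
        name: round(pct_vs_public(cost, costs[public]), 1)
        for name, cost in costs.items()
        if name != public
    }

results = {source: compare(costs) for source, costs in estimates.items()}
for source, diffs in results.items():
    print(source, diffs)
```

The sign of the Bay and Moore Haven differences flips between the OPPAGA and Department of Corrections figures, while CCA's adjustments (mostly through a higher Lawtey figure) make both private facilities look cheaper.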
c. Taft
Perhaps the best example of competing, side-by-side cost studies comes from the evaluation of the federal facility in Taft, California, operated by The GEO Group.
A Bureau of Prisons cost study by Julianne Nelson compared
the costs of Taft in fiscal years 1999 through 2002 to those of three
federal public facilities: Elkton, Forrest City, and Yazoo City.53
The Taft costs ranged from $33.42 to $38.62; the costs of the three
public facilities ranged from $34.84 to $40.71. Taft was cheaper
than all comparison facilities and in all years, by up to $2.42 (about
6.6%)—except in fiscal year 2001, when the Taft facility was more
expensive than the public Elkton facility by $0.25 (about 0.7%).54
Sloppily averaging over all years and all comparison institutions,
the savings was about 2.8%.
A National Institute of Justice study by Douglas McDonald and
Kenneth Carlson55 found much higher cost savings. They calculated Taft costs ranging from $33.25 to $38.37, and public facility
costs ranging from $39.46 to $46.38.56 Private-sector savings
ranged from 9.0% to 18.4%. Again averaging over all years and all
comparison institutions, the savings was about 15.0%: the two cost
49
OPPAGA, supra note 46, at 55–61 (Corrections Corporation of America’s comments, with
OPPAGA’s comments interspersed throughout).
50
Id. at 57.
51
Id. at 61.
52
Id. at 59.
53
JULIANNE NELSON, COMPETITION IN CORRECTIONS: COMPARING PUBLIC AND PRIVATE SECTOR OPERATIONS 10 (CNA Corp., 2005).
54
Id. at 42 fig.5. The study also compared actual GEO costs to hypothetical costs if Taft had
been kept in-house. This comparison gave the edge to the public sector, id. at 25–26, but I don’t
stress this result because it’s based on a comparison with a hypothetical public institution, not on
actual public-sector costs.
55
DOUGLAS C. MCDONALD & KENNETH CARLSON, CONTRACTING FOR IMPRISONMENT IN THE
FEDERAL PRISON SYSTEM: COST AND PERFORMANCE OF THE PRIVATELY OPERATED TAFT CORRECTIONAL INSTITUTION (Abt Assocs. Inc., 2005).
56
Id. at 48 tbl.2.18.

studies differ in their estimates of private-sector savings by a factor
of about five.
Why such a difference? First, the Nelson study (but not the
McDonald and Carlson study) adjusted expenditures to iron out
Taft’s economies of scale from handling about 300 more inmates
each year than the public facilities.57 Second, the studies differed in
what they included in overhead costs, with the Nelson study allocating a far higher overhead rate.58
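The "factor of about five" gap between the two Taft studies follows directly from the averaged savings figures reported above. A minimal check, using only those two averages (the variable names are mine):

```python
# Averaged private-sector savings (percent) reported for the two Taft cost studies.
nelson_avg_savings_pct = 2.8             # Bureau of Prisons study (Nelson)
mcdonald_carlson_avg_savings_pct = 15.0  # National Institute of Justice study

# How many times larger the second estimate is than the first.
ratio = mcdonald_carlson_avg_savings_pct / nelson_avg_savings_pct

# The same gap expressed in percentage points.
gap_points = mcdonald_carlson_avg_savings_pct - nelson_avg_savings_pct

print(f"ratio: {ratio:.1f}x, gap: {gap_points:.1f} percentage points")
```

Because both averages are themselves rough (taken over all years and all comparison institutions), the ratio is best read as an order-of-magnitude indication, not a precise estimate.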
These examples should be enough to give a sense of the complications in cost comparisons; given these difficulties, it’s not surprising that most studies have fallen short.
B. Which Sector Provides Higher Quality?
1. Difficulties in Figuring Out Quality
Moving on to quality comparisons, the picture is similarly
grim. As with cost comparisons, sometimes no comparable facility
exists in the same jurisdiction.59 Some studies solve that problem
by looking at prisons in different jurisdictions, an approach that
has its own problems.60 (If one had a large database with several
prisons in each jurisdiction, one could control for the jurisdiction,
but this approach is of course unavailable when comparing two
prisons, each in its own jurisdiction.) Many studies just don’t control for clearly relevant variables in determining whether a facility
is truly comparable.61
57
Gaes, supra note 25, at 34.
58
Id. at 34–35; Gaes, supra note 36, at 20.
59
MCDONALD ET AL., supra note 22, at 54–55 (discussing Arizona facilities compared in
THOMAS, supra note 23); Gerald G. Gaes et al., The Performance of Privately Operated Prisons: A
Review of Research, printed as Appendix 2 with separate page numbering in MCDONALD ET AL.,
supra note 22, at 12 (same).
60
MCDONALD ET AL., supra note 22, at 55 (discussing CHARLES H. LOGAN, WELL KEPT:
COMPARING QUALITY OF CONFINEMENT IN A PUBLIC AND A PRIVATE PRISON (Nat. Inst. of Just.
1991); Charles H. Logan, Well Kept: Comparing Quality of Confinement in Private and Public
Prisons, 83 J. CRIM. L. & CRIMINOLOGY 577 (1992)).
61
Gaes et al., supra note 59, at 5 (criticizing the use of univariate methods in the comparison of
Kentucky facilities in URBAN INSTITUTE, COMPARISON OF PRIVATELY AND PUBLICLY OPERATED
CORRECTIONAL FACILITIES IN KENTUCKY AND MASSACHUSETTS (1989)); id. at 18 (discussing lack
of information on characteristics of inmate populations in WILLIAM G. ARCHAMBEAULT & DONALD
R. DEIS, JR., COST EFFECTIVENESS COMPARISONS OF PRIVATE VERSUS PUBLIC PRISONS IN LOUISIANA: A COMPREHENSIVE ANALYSIS OF ALLEN, AVOYELLES, AND WINN CORRECTIONAL CENTERS
(1996)); id. at 19 (discussing lack of control for differences in number of inmates at some comparison prisons in ARCHAMBEAULT & DEIS, supra); Scott D. Camp & Gerald G. Gaes, Private Adult
Prisons: What Do We Really Know and Why Don’t We Know More?, in PRIVATIZATION IN CRIMI(continued next page)
58

Often, the comparability problem boils down to differences in
inmate populations; one prison may have a more difficult population than the other, even if they have the same security level. Usually prisons have different populations because of the luck of the
draw,62 but sometimes it’s by design, as happened in Arizona,
when the Department of Corrections made “an effort to refrain
from assigning prisons to [the private prison] if they [had] serious
or chronic medical problems, serious psychiatric problems, or
[were] deemed to be unlikely to benefit from the substance abuse
program that is provided at the facility.”63 It’s actually quite common to not send certain inmates to private prisons; the most common restriction in contracts is on inmates with special medical
needs.64 Not that all prisons must have totally random assignment;
it can be rational to tailor prisoner assignment to, say, the programming available at a prison. But such practices do have “the
unintended effect of undermining cost comparisons.”65 Another
practice that undermines cost comparisons is contractual terms limiting the private contractor’s medical costs.66

NAL JUSTICE:

PAST, PRESENT, AND FUTURE 283, 285–87 (David Shichor & Michael J. Gilbert eds.,
2001) (critiquing ARCHAMBEAULT & DEIS, supra, and THOMAS, supra note 23); GAES ET AL., supra
note 7, at 51–53 (discussing ARCHAMBEAULT & DEIS, supra).
62
Gaes et al., supra note 59, at 4 (discussing comparison of Kentucky facilities in URBAN INSTITUTE, supra note 61, where public sector had more difficult adult population while private sector
had more difficult juvenile population); id. at 9 (discussing TENN. SELECT OVERSIGHT CMTE. ON
CORR., COMPARATIVE EVALUATION OF PRIVATELY-MANAGED CCA PRISON (SOUTH CENTRAL
CORRECTIONAL CENTER) AND STATE-MANAGED PROTOTYPICAL PRISONS (NORTHEAST CORRECTIONAL CENTER, NORTHWEST CORRECTIONAL CENTER) (1995)); id. at 11 (discussing STATE OF
WASH. LEGIS. BUDGET CMTE., DEPARTMENT OF CORRECTIONS PRIVATIZATION FEASIBILITY STUDY
(Report 96-2, 1996)); id. at 20 (criticizing the use of the Angola facility as a comparison facility in
ARCHAMBEAULT & DEIS, supra note 61); id. at 20–21 (discussing that low urinalysis hit rates in
ARCHAMBEAULT & DEIS, supra note 61, could indicate a population less inclined to use drugs, and
low medical risk scores could indicate a population less in need of serious medical care).
63
THOMAS, supra note 23, at 73.
64
CAMP & GAES, supra note 23, at 21–22 (some restrictions in effect in 62.5% of the contracts
surveyed; special medical needs restriction in 50% of contracts; other restrictions include high-publicity inmates and gang members).
65
THOMAS, supra note 23, at 73.
66
See, e.g., Contract Between the State of Tennessee and Corrections Corporation of America,
RFS-329.44-004 [hereinafter Tennessee CCA 2007 contract] ¶ A.4.g.13)(a), at 13, available at
http://www.capitol.tn.gov/joint/committees/fiscal-review/archives/106ga/contracts/RFS%20329.4400408%20Correction%20%28CCA%20-%20amendment%201%29.pdf (“If the inmate is hospitalized, the Contractor shall not be responsible for inpatient-Hospital Costs which exceed $4,000.00 per
inmate per admission.”); id. ¶ A.4.g.13)(b) (“The Contractor shall not be responsible for the cost of
providing anti-retroviral medications therapeutically indicated for the treatment of inmates with
AIDS or HIV infection.”). By its terms, this contract covers services at the South Central Correctional
Center, id. at 1, and runs from 2007 to 2010, id. at 22.

Some performance studies rely on surveys administered to a
non-random sample of inmates67 or potentially biased staff surveys,68 or generally to populations of inmates or staff that aren’t
randomly assigned to public and private prisons.69 Survey data
isn’t useless, but it’s rarely used with the appropriate sensitivity to
its limitations.70 The higher-quality survey-based studies don’t give
the edge to either sector.71
Most damningly, many studies don’t rely on actual performance measures,72 relying instead on facility audits that are largely
process-based.73 Some supposed performance measures don’t necessarily indicate good performance,74 especially when the prisons
are compared based on a “laundry list” of available data items (for
instance, staff satisfaction) whose relevance to good performance
hasn’t been theoretically established.75
Gerald Gaes and his coauthors conclude that most studies are
“fundamentally flawed,” and agree with the GAO’s conclusion that
there is “little information that is widely applicable to various correctional settings.”76
67
Gaes et al., supra note 59, at 6 (discussing DALE K. SECHREST & DAVID SHICHOR, FINAL
REPORT: EXPLORATORY STUDY OF CALIFORNIA’S COMMUNITY CORRECTIONAL FACILITIES (Cal.
Dep’t of Corr., Parole & Comm. Servs. Div., 1994)).
68
Gaes et al., supra note 59, at 24 (discussing staff surveys in LOGAN, supra note 60; Logan,
supra note 60).
69
GAES ET AL., supra note 7, at 74–76 (critiquing Judith Greene, Comparing Private and Public Prison Services and Programs in Minnesota: Findings from Prisoner Interviews, 2 CURR. ISS. IN
CRIM. JUST. 202 (1999); Judith Greene, Lack of Correctional Services, in CAPITALIST PUNISHMENT:
PRISON PRIVATIZATION & HUMAN RIGHTS (Andrew Coyle et al. eds., 2003); LOGAN, supra note 60;
Logan, supra note 60); Scott D. Camp et al., Quality of Prison Operations in the US Federal Sector:
A Comparison with a Private Prison, 4 PUNISH. & SOC. 27, 32–34 (2002). For a general discussion
of methods, see Scott D. Camp et al., Creating Performance Measures from Survey Data: A Practical Discussion, 3 CORR. MGMT. Q. 71 (1999).
70
See RICHARD W. HARDING, PRIVATE PRISONS AND PUBLIC ACCOUNTABILITY 115–19
(1997).
71
Scott D. Camp et al., Using Inmate Survey Data in Assessing Prison Performance: A Case
Study Comparing Private and Public Prisons, 27 CRIM. JUST. REV. 26, 31 (2003); Camp et al.,
Quality of Prison Operations, supra note 69, at 49–50; see also GAES ET AL., supra note 7, at 83.
72
Gaes et al., supra note 59, at 9 (discussing TENN. SELECT OVERSIGHT CMTE. ON CORR., supra note 62).
73
Not that prison audits are useless; Gerald Gaes, in fact, who is a big booster of performance
measurement, discusses how audits could be improved to be made more useful. GAES ET AL., supra
note 7, at 31–37.
74
Gaes et al., supra note 59, at 20 (discussing, in the context of ARCHAMBEAULT & DEIS, supra note 61, how a low count of disciplinary actions could indicate either good or bad performance);
id. at 25–27 (discussing similar difficulties in interpreting items in LOGAN, supra note 60; Logan,
supra note 60).
75
Camp & Gaes, supra note 61, at 286.
76
Gaes et al., supra note 59, at 31 (citing PRIVATE AND PUBLIC PRISONS: STUDIES COMPARING
OPERATIONAL COSTS AND/OR QUALITY OF SERVICE 11 (Gen. Account. Off., 1996)).

I would add that accountability mechanisms vary widely—the
standard U.S. model, the Florida model, and the UK model are different,77 and these in turn differ from the French model78 or the
model proposed for prison privatization in Israel before the Israeli
Supreme Court invalidated the experiment.79 When a prison study
finds some result about comparative quality, that tells us something
about comparative quality within that accountability structure; if a
private prison performed inadequately under one accountability
structure, it might do better under a better one.80
Consider, for instance, the performance evaluations of the private federal Taft facility. As with the cost studies discussed
above,81 we have two competing studies, the National Institute of
Justice one by McDonald and Carlson82 and a Bureau of Prisons
study by Scott Camp and Dawn Daggett83—the companion paper
to Julianne Nelson’s cost paper.84
The Bureau of Prisons has evaluated public prisons by the Key
Indicators/Strategic Support System since 1989.85 Taft, alas, didn’t
use that system, but instead used the system designed in the contract for awarding performance-related bonuses.86 Therefore,
McDonald and Carlson could only compare Taft’s performance
with that of the public comparison prisons on a limited number of
dimensions,87 and many of these dimensions—like accreditation of
the facility, staffing levels, or frequency of seeing a doctor 88—
aren’t even outcomes. Taft had lower assault rates than the average
of its comparison institutions, though they were within the range of
77
See, e.g., David E. Pozen, Managing a Correctional Marketplace: Prison Privatization in
the United States and the United Kingdom, 19 J.L. & POL. 253, 276–81 (2003) (comparing American
and British accountability systems); HARDING, supra note 70, at 158–165 (describing the “basic
model” of accountability, the UK model, and the Florida model, and proposing a new model).
78
See JON VAGG, PRISON SYSTEMS: A COMPARATIVE STUDY OF ACCOUNTABILITY IN ENGLAND, FRANCE, GERMANY, AND THE NETHERLANDS 305–07 (1994).
79
See HCJ 2605/05 Acad. Ctr. of Law & Bus., Human Rights Div. v. Minister of Finance
[2009] (Isr.) ¶ 18, http://elyon1.court.gov.il/files_eng/05/050/026/n39/05026050.n39.htm; Volokh,
supra note 6, at 180–85, 198–99 (discussing this opinion).
80
Gaes, supra note 36, at 30, also calls for more study of different accountability structures.
81
See text accompanying supra notes 53–56.
82
MCDONALD & CARLSON, supra note 55.
83
SCOTT D. CAMP & DAWN M. DAGGETT, EVALUATION OF THE TAFT DEMONSTRATION PROJECT: PERFORMANCE OF A PRIVATE-SECTOR PRISON AND THE BOP (2005).
84
NELSON, supra note 53.
85
MCDONALD & CARLSON, supra note 55, at 119; see also text accompanying infra note 297–
298.
86
Gaes, supra note 25, at 35; text accompanying infra note 168.
87
MCDONALD & CARLSON, supra note 55, at 119.
88
Id. at 143.

observed assault rates.89 No inmates or staff were killed.90 There
were two escapes, which was higher than at public prisons.91 Drug
use was also higher at Taft, as was the frequency of submitting
grievances.92 On this very limited analysis, Taft seems neither
clearly better nor clearly worse than its public counterparts.
The Camp and Daggett study, on the other hand, created performance measures from inmate misconduct data,93 and concluded
not only that Taft “had higher counts than expected for most forms
of misconduct, including all forms of misconduct considered together,”94 but also that Taft “had the largest deviation of observed
from expected values for most of the time period examined.”
Camp and Daggett’s performance assessment was thus more pessimistic than McDonald and Carlson’s.
According to Gerald Gaes, the strongest studies include one
from Tennessee, which shows essentially no difference, one from
Washington, which shows somewhat positive results,95 and three
more recent studies of federal prisons by himself and coauthors,
which found public and private prisons to be equivalent on some
measures, higher on others, and lower on yet others.96
2. Which Sector Leads to Less Recidivism?
Recidivism reduction is really just one dimension of prison
quality, though it’s a particularly relevant one that deserves its own
section.
If we found that inmates at private prisons were less likely to
reoffend than comparable inmates at public prisons, this would be
an important factor in any comparison of public and private pris-

89
Id. at 126, 127 fig.4.2. To focus on the three comparison prisons from the cost analyses,
Elkton’s assault rate was similar to what would have been expected, while Taft, like Forrest City and
Yazoo City, had lower rates than what would have been expected. Yazoo City’s was the lowest.
Gaes, supra note 25, at 36.
90
MCDONALD & CARLSON, supra note 55, at 128.
91
Id. at 128.
92
Id. at 143.
93
CAMP & DAGGETT, supra note 83, at 36.
94
Id. at 59.
95
Gaes et al., supra note 59, at 31.
96
Gaes, supra note 36, at 26 (citing Camp et al., supra note 71; Scott D. Camp et al., The Influence of Prisons on Inmate Misconduct: A Multilevel Investigation, 20 JUST. Q. 501 (2003); Camp
et al., Quality of Prison Operations, supra note 69).

ons. Unfortunately, recidivism comparisons haven’t been very
good either.
A study from the late 1990s by Lonn Lanza-Kaduce and coauthors reported that inmates released from private prisons were less
likely to reoffend than a matched sample of inmates released from
public prisons, and had less serious offenses if they did reoffend.97
But this study has been critiqued on various grounds.98 First, not
all the recidivism measures are significant: while various
reoffense-related rates were found to be significantly lower in the
private sector,99 and while the seriousness of reoffending was
found to be significantly lower in the private sector,100 a time-to-failure analysis found that there was no significant difference in the
“length of time that a release ‘survived’ without an arrest during
the 12-month period.”101 Second, the public inmates seem to not
really have been well matched to the private inmates; they only
seemed so when their descriptive variables were described at a
high level of generality (e.g., custody level vs. “the underlying
continuous score measuring custody level,” whether inmates had
two or more incarcerations vs. the actual number of incarcerations,
etc.).102 Third, the authors seem to have made the questionable decision to assign an inmate to the sector he was released from, even
if he had spent time in several sectors: thus, an inmate who spent
years in public prison and was transferred to private prison shortly
before his release was classified as a private prison releasee.103
Fourth, a private releasee who reoffended could take longer to be
97
Lonn Lanza-Kaduce et al., A Comparative Recidivism Analysis of Releasees from Private
and Public Prisons, 45 CRIME & DELINQ. 28 (1999); see also Lonn Lanza-Kaduce et al., The Devil
in the Details: The Case Against the Case Study of Private Prisons, Criminological Research, and
Conflict of Interest, 46 CRIME & DELINQ. 92, 96–97 (2000).
98
The critiques are discussed in GAES ET AL., supra note 7, at 24–26. Gaes et al. argue, see id.
at 27, that several of the critiques continue to apply to a later paper with a longer follow-up
period, L. Lanza-Kaduce & S. Maggard, The Long-Term Recidivism of Public and Private Prisoners, paper presented at the National Conference of the Bureau of Justice Statistics and Justice Research and Statistics Association, New Orleans, 2001.
99
The difference in rearrest rates is significant at the 1% level and the difference in resentencing rates is significant at the 5% level, but the differences in reincarceration rates and for any indication of recidivism are only significant at the 10% level. Lanza-Kaduce et al., supra note 97, at 36–
37.
100
Id. at 37–38.
101
Id. at 38–41.
102
GAES ET AL., supra note 7, at 25 (citing FLA. DEP’T OF CORR., BUR. OF RES. & DATA
ANALYSIS, PRELIMINARY ASSESSMENT OF A STUDY ENTITLED “A COMPARATIVE RECIDIVISM
ANALYSIS OF RELEASEES FROM PRIVATE AND PUBLIC PRISONS IN FLORIDA” (1998)).
103
Id. at 26.

entered in the system than a public releasee,104 so the truly comparable number of private recidivists may well have been larger than
reported.
A later study by David Farabee and Kevin Knight105 that “corrected for some of these deficiencies”106 found no comparative difference in the reoffense or reincarceration rates of males or juveniles over a three-year post-release period, though women had
lower recidivism in the private sector.107 However, this study may
still suffer from the problem of the attribution of inmates who
spent some time in each sector, as well as possible selection bias to
the extent that private prisons got a different type of inmate than
public prisons did.108
Another study by William Bales and coauthors,109 even more
rigorous,110 likewise found no statistically significant difference
between public-inmate and private-inmate recidivism.111
A more recent study, by Andrew Spivak and Susan Sharp, reported that private prisons were (statistically) significantly worse
in six out of eight models tested.112 But the authors noted that some
skepticism was in order before concluding that public prisons necessarily did better on recidivism. Populations aren’t randomly assigned to public and private prisons: that private prisons engage in
“cream-skimming” is a persistent complaint.113 Recall the case in
104
Id.
105
DAVID FARABEE & KEVIN KNIGHT, A COMPARISON OF PUBLIC AND PRIVATE PRISONS IN
FLORIDA: DURING- AND POST-PRISON PERFORMANCE INDICATORS (Query Research, 2002).
106
GAES ET AL., supra note 7, at 27.
107
FARABEE & KNIGHT, supra note 105, at ii–iii, 20–25.
108
GAES ET AL., supra note 7, at 28.
109
William D. Bales et al., Recidivism of Public and Private State Prison Inmates in Florida, 4
CRIMINOLOGY & PUB. POL’Y 57 (2005).
110
See Gaes, supra note 36, at 9.
111
Bales et al., supra note 109, at 69, 72, 74.
112
Andrew L. Spivak & Susan F. Sharp, Inmate Recidivism as a Measure of Private Prison
Performance, 54 CRIME & DELINQ. 482, 500 tbl.5, 501 (2008).
113
See, e.g., GAES ET AL., supra note 7, at 28; Dolovich, supra note 5, at 505; John J. DiIulio,
Jr., The Duty to Govern: A Critical Perspective on the Private Management of Prisons and Jails, in
PRIVATE PRISONS AND THE PUBLIC INTEREST, supra note 25, at 155, 166–67 (stating that private
firms “engage in correctional creaming when they bid,” meaning that they avoid bidding on facilities
that they expect will “bring negative media attention, legislative inquiries, staff unrest, lawsuits, and
judicial intervention,” “the Atticas and Riker Islands of the country”); Richard A. Oppel Jr., Private
Prisons Found to Offer Little in Savings, N.Y. TIMES, May 18, 2011 (discussing Arizona Department of Corrections study stating that private prisons “often house only relatively healthy inmates”
and quoting State Representative Chad Campbell calling this practice “cherry-picking”); STATE OF
ARIZONA, OFFICE OF THE AUDITOR GENERAL, DEPARTMENT OF CORRECTIONS—PRISON POPULATION GROWTH 20 (Report No. 10-08, 2010) (“[P]rivate prisons do not accept inmates in need of
more serious medical care . . . .”); ARIZONA DEP’T OF CORR., FY 2009 OPERATING PER CAPITA
(continued next page)
105

Arizona, where the Department of Corrections made “an effort to
refrain from assigning prisons to [the private Marana Community
Correctional Facility] if they [had] serious or chronic medical
problems, serious psychiatric problems, or [were] deemed to be
unlikely to benefit from the substance abuse program that [was]
provided at the facility.”114 But the phenomenon can also run the
other way. One of the authors of the recidivism study, Andrew
Spivak, writes that while he was “a case manager at a medium-security
public prison in Oklahoma in 1998, he noted an inclination for case
management staff (himself included) to use transfer requests to
private prisons as a method for removing more troublesome inmates from case loads.”115
Moreover, recidivism data is itself often flawed.116 Recidivism
has to be not only proved (which requires good databases) but also
defined.117 Recidivism isn’t self-defining—it could include arrest;
reconviction; incarceration; or parole violation, suspension, or revocation; and it could give different weights to different offenses
depending on their seriousness.118 Which definition one uses
makes a difference in one’s conclusions about correctional effectiveness,119 as well as affecting the scope of innovation.120 The
choice of how long to monitor obviously matters as well: “[m]ost
severe offences occur in the second and third year after release.”121
Recidivism measures might also vary because of variations in, say,
COST REPORT 2 (2010) (discussing inmates “returned to state prisons due to an increase of their
medical scores that exceeds contractual exclusions”); id. at 4 (same); id. at 10 n.1 (similar); id. at
12–16 (discussing medical, mental health, and other restrictions on inmates that can be sent to particular private prisons). Compare Gaes et al., supra note 59, at 34–35 (stressing that the federal Taft
facility, the subject of the comparative study reported at text accompanying supra notes 53–58 and
supra notes 81–92, will house inmates equivalent to those at the comparison facilities).
114
THOMAS, supra note 23, at 73; see text accompanying supra note 63.
115
Spivak & Sharp, supra note 112, at 503–04.
116
MICHAEL D. MALTZ, RECIDIVISM 58–60 (1984); PUBLIC AND PRIVATE PRISONS, supra note
76, at 30–31 (discussing SECHREST & SHICHOR, supra note 67) (“Sufficient data were not available
to adequately complete the analysis comparing the inmates released from the community correctional facilities to inmates released from other correctional institutions in the state.”); Gaes et al., supra
note 59, at 7 (same).
117
Brakel & Gaylord, supra note 10, at 154.
118
MALTZ, supra note 116, at 62.
119
Id. at 63; see also JAMES DICKER, PAYMENT-BY-OUTCOME IN OFFENDER MANAGEMENT 16
(2020 Public Services Trust at the RSA, Case Study 2, Feb. 2011) (“[N]either reconviction nor reimprisonment rates capture all re-offending behaviour, as only about 45% of offenders who are
reconvicted are incarcerated and it is possible to be recalled to prison for breaching license conditions without being reconvicted.”)
120
DICKER, supra note 119, at 18.
121
Id. at 16–17.

enforcement of parole conditions, independent of the true recidivism of the underlying population.122
The study of the comparative recidivism of the public and private sector could thus use a lot of improvement.123
C. The Limits of Comparative Effectiveness
After having read the foregoing, one should be fairly dismayed
at the state of comparative public-private prison research.124 In
fact, it gets worse. An overarching problem is that most studies
compare either cost or quality, not both. It’s
hard to draw strong conclusions from such studies, even if they’re
state-of-the-art at what they’re examining.125
If we find that a private prison costs less, how do we know that
it didn’t achieve that result by cutting quality? (This is the
standard critique of private prisons.126) If we find that a private
prison costs more, how do we know that it didn’t cost more because of the fancy and expensive programs it implemented?127
(According to Douglas McDonald, this was exactly the problem
with the cost comparison of the Silverdale Detention Center in
Hamilton County, Tennessee.128)

122
MALTZ, supra note 116, at 66–67.
123
See also Gaes, supra note 36, at 9–12 (discussing these studies).
124
Some studies are actually meta-analyses. See Gaes, supra note 36, at 3–6 (discussing meta-analyses and literature reviews). Two recent meta-analyses showed little difference between the
public and private sectors. One, only analyzing costs, found no statistical difference between the
public and private sectors. Travis C. Pratt & Jeff Maahs, Are Private Prisons More Cost-Effective
Than Public Prisons? A Meta-Analysis of Evaluation Research Studies, 45 CRIME & DELINQ.
358, 365 & 366 tbl.2 (1999). Another, looking at both cost and quality, found that the private sector
was both slightly cheaper and slightly worse; but with such small effects, the authors concluded that
“prison privatization provides neither a clear advantage nor disadvantage.” Brad W. Lundahl et al.,
Prison Privatization: A Meta-Analysis of Cost and Quality of Confinement Indicators, 19 RES. ON
SOC. WORK PRACTICE 383, 392 (2009). A third, more a literature review than a meta-analysis, reported that the comparison was “inconclusive,” Dina Perrone & Travis C. Pratt, Comparing the Quality of Confinement and Cost-Effectiveness of Public Versus Private Prisons: What
We Know, Why We Do Not Know More, and Where to Go from Here, 83 PRISON J. 301 (2003);
and in any event there was no formal attempt to control for differences between the public and private prisons compared.
Given that many of the underlying studies are flawed in various ways, it’s not clear how you
do better by aggregating them. When studies done in vastly different ways and subject to different
sources of bias are aggregated in a meta-analysis, the results are “garbage in, garbage out.”
125
See PUBLIC AND PRIVATE PRISONS, supra note 76, at 13.
126
See, e.g., Dolovich, supra note 5, at 474–80.
127
See Developments, supra note 9, at 1875; MCDONALD ET AL., supra note 22, at 34–35.
128
McDonald, supra note 25, at 91.
123

Our goal should be to determine the production function for
public and private prisons; this is the only way we’ll find out
whether privatization moves us to a higher production possibilities
frontier or merely shifts us to a different cost-quality combination
on the existing frontier.129 Realizing this allows us to throw out a
lot of studies from the outset.
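The distinction between moving to a higher frontier and merely moving along the existing one can be made concrete with a simple dominance check on cost-quality pairs. The sketch below is only an illustration: the `Outcome` type, the `compare` function, and every number in the usage lines are hypothetical, and a single pairwise comparison can at most be consistent with a frontier shift; it can't establish the production function itself.

```python
from typing import NamedTuple

class Outcome(NamedTuple):
    cost: float     # per-diem cost in dollars; lower is better
    quality: float  # hypothetical quality index; higher is better

def compare(private: Outcome, public: Outcome) -> str:
    """Classify a private-vs-public comparison on both dimensions at once."""
    if private == public:
        return "identical"
    # No worse on either dimension (and not identical): weak dominance.
    if private.cost <= public.cost and private.quality >= public.quality:
        return "private dominates"
    if private.cost >= public.cost and private.quality <= public.quality:
        return "public dominates"
    # Cheaper but worse, or better but pricier: possibly the same frontier.
    return "trade-off"

# Hypothetical numbers, for illustration only.
print(compare(Outcome(40.0, 0.70), Outcome(45.0, 0.70)))  # cheaper, same quality
print(compare(Outcome(40.0, 0.60), Outcome(45.0, 0.70)))  # cheaper, lower quality
```

A study that reports only cost (or only quality) can't distinguish the first case from the second, which is why such studies tell us so little about which sector is on the better frontier.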
At least people are taking more seriously the need to develop
valid comparisons. Governments need to mandate, by regulation or
by contract, that the information necessary to do valid comparisons
become available, even if collecting this extra data would add to
private facilities’ cost.130 Until we get a better handle on what
works, public and private prisons should be required to live up to
the same standards to facilitate comparisons. Private prisons
should get the same types of inmates as public prisons—neither
better nor worse131—and they should be restricted in whom they
can transfer out.132
Having spent so long bemoaning the paucity of good comparative effectiveness studies, I should note that there’s more to life
than comparative effectiveness. Even ignoring any differences between the public and private sectors, privatization can have systemic effects, altering how the public sector works.133
For one thing, privatization can, for better or worse, change the
public sector as well. Suppose private prisons are better than public
prisons but competitive pressures lead public prisons to improve as
well.134 A comparative study may not be able to find any difference between the two sectors, and yet one can still say that privatization was a success.135 (Indeed, one study does suggest that for
129
Cf. Caroline M. Hoxby, School Choice and School Competition: Evidence from the United
States, 10 SWED. ECON. POL’Y REV. 1, 42–43 (2003) (“If school choice is to be public policy, and
not merely an experiment, then the question we need to answer is whether students’ achievement
would rise if they attended voucher or charter schools that had resources like those available to them
in regular public schools. In other words, we should ask the achievement question, holding resources
constant (as well as holding students’ ability, motivation, and other characteristics constant.”).
130
COST OF PRISONS, supra note 26, at 5, 13–14, 17, 19–20, 30.
131
See text accompanying supra notes 113–115.
132
See STATE OF FLORIDA, OFF. OF PROG. POL’Y ANAL. & GOV’T ACCOUNTABILITY, REVIEW
OF CORRECTIONAL PRIVATIZATION 4 (Rep. No. 95-12, 1995) (recommending restrictions on transfers out of private prisons).
133
Cf. Hoxby, supra note 129, at 19 (“[school] choice can affect productivity through a variety
of long-term, general equilibrium mechanisms that are not immediately available to an administrator,” like bidding up the wages of successful teachers and altering the mix of people who choose
teaching as a career, making parents into more informed consumers by encouraging the spread of
information about schools, altering what curricula are adopted, and the like).
134
Charles W. Thomas, Correctional Privatization in America: An Assessment of Its Historical
Origins, Present Status, and Future Prospects, in CHANGING THE GUARD, supra note 10, at 57, 59.
See also infra Part III.A (privatization can improve accountability of public sector).

prisons, privatization might drive public agencies to be more efficient,136 though the statistical significance of this effect seems
highly sensitive to the precise specification,137 and selection bias is
a confounding issue.138) Similarly, if private prisons really do cost
less and therefore allow for greater increases in capacity, thus relieving overcrowding across the board, that won’t show up in a
comparative study.139 Likewise if best practices migrate from one
sector to another through a process of cross-fertilization:140 Richard Harding calls this “the paradox of cross-fertilization—that regimes progressively become more similar than dissimilar to each
other.”141
Alternatively, what if privatization leads to a race to the bottom? If private prison cost-cutting is harmful, and if public prisons
have to cut costs to stay competitive, we may have lower quality,
including higher recidivism, across the board.142
135
Cf. Hoxby, supra note 129, at 43 (suggesting that concentrating on the effect on student
achievement of private schooling vs. public schooling is wrongheaded in the school choice debate,
because school choice can be a success if, through competition, it leads to improvements in the
public sector, so that there never emerges any difference between public and private school outcomes).
136
James F. Blumstein et al., Do Government Agencies Respond to Market Pressures? Evidence from Private Prisons, 15 VA. J. L. & SOC. POL’Y 446 (2008); see also James F. Blumstein &
Mark A. Cohen, The Interrelationship Between Public and Private Prisons: Does the Existence of
Prisoners Under Private Management Affect the Rate of Growth in Expenditures on Prisoners Under Public Management? (April 2003).
137
See Blumstein et al., supra note 136, at 465 (insignificant effect with two different specifications, significant effect with a third).
138
The authors estimate the effect using a two-stage regression where the first stage represents
the probability of privatizing, but this method doesn’t always take care of selection effects. See Alexander Volokh, Do Faith-Based Prisons Work?, 63 ALA. L. REV. 43, 67–73 (2011). Gaes also
critiques the study, see Gaes, supra note 36, at 12–14. I’ve discussed or critiqued selection bias in
many places. See Volokh, supra note 9, at 1245–47; Alexander Volokh, Choosing Interpretive
Methods: A Positive Theory of Judges and Everyone Else, 83 NYU L. REV. 769, 803–19 (2008);
Alexander Volokh, Privatization, Free-Riding, and Industry-Expanding Lobbying, 30 INT’L REV. L.
& ECON. 62, 68 (2010); Alexander Volokh, The Effect of Privatization on Public and Private Prison
Lobbies, in 3 PRISON PRIVATIZATION: THE MANY FACETS OF A CONTROVERSIAL INDUSTRY 7, 24–
26 (Byron Eugene Price & John Charles Morris eds., 2012).
139
Developments, supra note 9, at 1875.
140
I discuss cross-fertilization at greater length below, see text accompanying infra note 190.
141
Richard Harding, Private Prisons, in 28 CRIME AND JUSTICE: A REVIEW OF RESEARCH 265,
334 (2001). But see Tony Ward, Book Review, 3 THEO. CRIMINOLOGY 125, 126 (1999) (reviewing
HARDING, supra note 70) (conceding that Harding’s cross-fertilization argument is valid but noting
that “[t]here seems to be a ‘heads I win, tails you lose’ quality to [Harding’s cross-fertilization]
argument (if public prisons turn out to be better than private ones, that just proves that competition is
good for them!)”).
142
Gerald G. Gaes, Prison Privatization in Florida: Promise, Premise, and Performance, 4
CRIMIN. & PUB. POL’Y 83, 87 (2005); GAES ET AL., supra note 7, at 108; HARDING, supra note 69, at
138 (noting that reductions in public prisons’ staffing levels in response to competition could be
alternatively characterized as “cross-fertilization” or “industrial blackmail”).

In either of these two cases, good empirical evaluations are
necessary, though detecting such dynamic, system-wide effects
will require before-and-after studies, not comparative snapshots.
Finally, to step back a bit from the privatization debate, regardless of what comparative effectiveness analyses show, both sectors
may fall short of the ideal, so this exercise shouldn’t blind us to the
continuing need to reform the whole system.143 I’ll add that, even
if the public and private sectors are equivalent, one can argue
against privatization on the grounds that—assuming it costs less—
it enables greater expansion of the prison system and therefore may
increase incarceration and hinder the search for alternative penal
policies.144
III. WHY USE PERFORMANCE MEASURES?
A. The Puzzle of Prisons?
The moral so far is that the whole empirical literature on public
and private prisons is highly inconclusive.145 As I noted in the Introduction, this should be somewhat of a puzzle for activists on
both sides who claim that privatization should turn prisons into either humanitarian disaster zones or models of quality and efficiency.146
Of course, that the empirical literature is inconclusive doesn’t
mean the sectors are equivalent; it means that current methods haven’t been good enough to detect the difference. A methodologically deficient literature could hide evidence of either good or bad
quality. But if the differences are great enough, you’d think they
might show through even with bad methods.147
The tentative conclusion I draw from the literature, though, is
that there may be modest quality differences between the sectors,
but not huge; the public sector is better on some dimensions and
worse on others, and there’s no good evidence that either sector
143
Dolovich, supra note 5, at 442.
144
See Volokh, supra note 6, at 142–43 & n.30 (collecting sources making this argument).
145
See Alexander Volokh, The Modest Effect of Minneci v. Pollard on Inmate Litigants, 46
AKRON L. REV. 287, 324 (2013).
146
See supra text accompanying notes 10–11.
147
See supra note 12.

does better at reducing recidivism. And while the private sector is
probably cheaper, it remains to be seen whether the cost savings is
on the order of 15% (respectable) or on the order of 3% (somewhat
negligible).148
But this puzzle largely disappears when we consider the institutional environment of private prisons. In many areas, the private
sector has been good at delivering better results at a lower cost.
This is because private producers are accountable to customers
who care about the quality of the end product, and because they
have the flexibility to change how they do things in response to
problems they may encounter. Neither of these conditions is true
for private prisons—not even slightly, not even as a first approximation.
I’ve noted above that there’s limited evidence of private firm
innovation.149 But this is because private prisons are highly constrained in how they operate. Private prison contracts essentially
“‘governmentalize’ the private sector,”150 reproducing public prison regulations in the private contract. Privatization can come to
resemble an exercise in who can better pretend to be a public prison.151
For instance, back in 1985, Robert Levinson complained of a
contract with the Eckerd Foundation for the management of the
Okeechobee School for Boys in which “[v]irtually every” contract
item:
concerned input activities and pertained to administrative/operational functions. Thus, Eckerd could have been in
total compliance with all contractual provisions even if every released client committed a new offense on the first day
in the community. Moreover, at no point in the contract
148
See supra text accompanying notes 53–56.
149
See, e.g., Camp & Gaes, supra note 61, at 287; Dolovich, supra note 5, at 476; Scott D.
Camp, Editorial Introduction: Private Prisons & Recidivism, 4 CRIMIN. & PUB. POL’Y 55, 55
(2005).
150
Thomas, supra note 134, at 64; see also id. at 82, 116 n.15, 100–02.
151
MCDONALD ET AL., supra note 125, at 49; DOUGLAS MCDONALD & CARL PATTEN JR.,
GOVERNMENTS’ MANAGEMENT OF PRIVATE PRISONS 18 (Abt Assocs. Inc., 2003); Gaes et al., supra
note 59, at 12 (“Generally speaking, the contract [discussed in THOMAS, supra note 23] stipulates
that [the private provider] run the . . . facility in a manner similar to that in which the state would
have operated the prison.”); id. at 17 (“Basically, the State of Arizona has taken the position that a
private contractor should be given the opportunity to demonstrate it can out perform the state in
running an Arizona prison according to Arizona Department of Corrections policy.”); Durham,
supra note 20, at 67; Harding, supra note 141, at 303.

were the criteria for noncompliance stated nor its consequences specified.152
More recently, in Arizona, an auditor general report stated:
The Department requires that private prisons mirror
state-operated facilities, and performs extensive oversight
activities to ensure that its contractors meet its requirements. In order to maintain uniform standards for state and
private prisons, the Department requires contractors to follow Department Orders, Director’s Instructions, Technical
Manuals, Institution Orders, and Post Orders. These requirements extend to specific details, such as following the
same daily menus as state-operated facilities. Contractors
may request waivers from the Department for policies that
are not applicable to private prisons, such as state fiscal
management practices, employee evaluations, and employee benefits.153
The same daily menus! In Tennessee, “it even appears that private sector innovation was deliberately thwarted by making the
private sector provider . . . abide by [state Department of Corrections] policy” in running the facility.154
Subjecting private contractors to public regulations is actually
quite common;155 one exception to this trend is Florida, where public and private prisons are controlled by different agencies,156 and
the agency that regulates private prisons tries to balance “setting
policy and encouraging innovation.”157 More generally, input specification in private-prison contracts is routine, though of course the
level of inputs specified can (and should) be “output-driven” in the
sense that it’s “related to output objectives.”158 For instance, one
152
Levinson, supra note 24, at 87; see also id. at 88.
153
DEBRA K. DAVENPORT, PERFORMANCE AUDIT: ARIZONA DEPARTMENT OF CORRECTIONS,
PRIVATE PRISONS, at 9 (2001); see also Thomas, supra note 134, at 101.
154
Gaes et al., supra note 59, at 10.
155
CAMP & GAES, supra note 23, at vii (“[P]rivate contractors were typically obligated to use
the training standards and policies of the public agencies.”); id. at 28. But see id. at ix (“[T]he private
sector, even when there is no contractual obligation, has adopted the standards and policies of their
public sector counterparts.”); id. at 32.
156
Id. at x, 32–33; see also HARDING, supra note 70, at 161.
157
CAMP & GAES, supra note 23, at x; see also Harding, supra note 141, at 303–04 (similar
situation in Western Australia).
158
HARDING, supra note 70, at 67–68; see also Peter H. Kyle, Contracting for Performance:
Restructuring the Private Prison Market, 54 WM. & MARY L. REV. 2087, 2111 (2013) (“[S]ome
states have started to require the provision of vocational services . . . .”). Harding doesn’t distinguish
between outputs and outcomes, see text accompanying supra note 17, so when he refers to outputs
here, he means something like outcomes. Harding also suggests “intermediate outputs” as a synonym for “output-driven inputs,” HARDING, supra note 69, at 67–68; perhaps this concept is close to
what I refer to as simply “outputs.” See also RADIN, supra note 17, at 15 (defining “output” and
“intermediate outcome” differently).

can find liquidated damages provisions for certain input-based
breaches like not complying with the state’s policies or not filling
certain required positions.159
If inputs and procedures are highly regulated, it’s not surprising
that the evidence for private-sector improvements isn’t overwhelming. The market is a discovery process; one shouldn’t expect
different methods to emerge unless innovation is permitted.
And not only permitted: one shouldn’t expect different methods to emerge unless the incentives favor it. If the premise of privatization is that incentives work, particularly given the greater
flexibility of private industry, micromanaging inputs and failing to
incorporate the full range of desirable outcomes into the contract
price means giving up on much of the possible benefit of privatization.
But the efforts to measure performance in various areas of government, from the Job Training Partnership Act of
1982160 to the Government Performance and Results Act of
1993161—and the limited efforts to make funding contingent on
those performance measures162—have largely passed prisons by.
Outcome measures aren’t totally absent. Contracts do include a
limited range of outcome measures—for instance, limited penalties
for escapes.163 But by and large, outcome-based compensation is
159
Leonard Gilroy, Innovators in Action 2012: Creating a Culture of Competition to Improve
Corrections, REASON FOUND. (May 31, 2012), available at http://reason.org/news/printer/ohiocorrections-competition.
160
Pub. L. No. 103-62, 107 Stat. 285 (codified in scattered sections of 5 U.S.C. and 31 U.S.C.);
see also Matthew S. Schoen, Good Enough for Government Work?: The Government Performance
Results Act of 1993 and Its Impact on Federal Agencies, 32 SETON HALL LEGIS. J. 455, 467 (2008).
161
Pub. L. No. 97-300, § 106(b)(1), 96 Stat. 1322, 1333 (creating 29 U.S.C. § 1516 (repealed
by Workforce Investment Act of 1998, Pub. L. No. 105-220, § 199(b)(2), 112 Stat. 936, 1059))
(permissible performance measures for job-training organizations include “(A) placement in unsubsidized employment, (B) retention in unsubsidized employment, (C) the increase in earnings, including hourly wages, and (D) reduction in the number of individuals and families receiving cash welfare
payments and the amounts of such payments”); Laurence E. Lynn, Jr., Requiring Bureaucracies to
Perform: What Have We Learned from the U.S. Government Performance and Results Act (GPRA)?,
17 POLITIQUES ET MANAGEMENT PUBLIC 1, 3 (1999).
162
See infra Part IV.C.1.
163
See, e.g., Tennessee CCA 2007 contract, ¶ A.4.x.2), at 17 (“In the event of an escape resulting in whole or in part from Contractor’s failure to perform pursuant to the provisions of this Contract, the State may seek damages in a court of competent jurisdiction.”) (note that there’s no provision for paying for escapes not stemming from non-performance; ¶ A.4.x.1, at 17, only requires that
the contractor “exercise its best efforts to prevent escapes”).

rare.164 And to the extent there are outcome-based rewards or penalties, “the amounts involved commonly have little or no correlation with the true magnitude of what independent contractors accomplished or failed to accomplish,” and “the dollar value of the
reward or sanction is often too trivial to encourage superior performance or to deter defective performance.”165 Even developing
outcome measures hasn’t been a high priority.166
In 1998—not that long ago—Douglas McDonald and his coauthors identified two exceptional cases of performance-based compensation: the Bureau of Prisons’ contract with Wackenhut167 for
the operation of the Taft Correctional Institution, “which contain[ed] provisions for an award-fee incentive worth up to 5 percent of paid invoices,” and a District of Columbia contract with
CCA for the Correctional Treatment Facility, “which permit[ted]
financial rewards for meeting targets based on performance indicators.”168
Florida recently would have taken a good step in this direction,
if the bill in question169 hadn’t been defeated. The bill would have
required that private prison contracts make provision for measuring
a number of dimensions of performance (though note that some of
these are output measures): number of batteries, number of major
disciplinary reports, percentage of negative random drug tests,
number of escapes, percentage of inmates in “a facility that provides at least one of the inmate’s primary program needs,” and so
on.170 The number of escapes also showed up in a more specific
way: the contractor would have been required to reimburse the
state for the costs of escapes.171 The Florida bill also listed various required performance measures for work release centers.172 (I discuss various other performance measures below.173)
164
See Pozen, supra note 77, at 282–83; Kenneth L. Avio, The Economics of Prisons, 6 EUR.
J.L. & ECON. 143, 150 (1998); Thomas, supra note 134, at 107 (“[I]f there are contracts that include
product-oriented requirements that go beyond mere evidence of participation, then they are contracts
I have never read.”).
165
Thomas, supra note 134, at 109.
166
See Durham, supra note 20, at 67.
167
“Wackenhut Corrections Corp. changed its name to The GEO Group in November 2003 under the terms of a share purchase agreement with another company.” See Volokh, supra note 9, at
1229 n.131.
168
See MCDONALD ET AL., supra note 125, at 52.
169
Florida Senate, SB 2038, first engrossed version (Feb. 13, 2012), http://flsenate.gov/Session/Bill/2012/2038/BillText/e1/PDF.
170
Id. § 1, at 10–12 (creating FLA. STAT. § 944.7115(8)(f)(1)(a)–(r)).
171
Id. § 1, at 13 (creating FLA. STAT. § 944.7115(11)).

The following sections develop these themes and discuss two
distinct benefits of using performance measures. The first set of
advantages of using performance measures, discussed in Section B,
is a pure accountability advantage: we, as citizens and policymakers, would know how well our prisons are doing, we’d be better
informed in deciding which sector to choose, either systemwide or
on discrete projects, and we could think more clearly about what
prisons should be doing. The second type of advantage, discussed
in Section C, goes more to harnessing incentives to improve the
system over time: incorporating such measures into contracts, and
tying providers’ compensation to how well they do, would give
providers a reason to care about quality and simultaneously let us
grant them greater flexibility. Section D discusses the normative issues involved in choosing the actual measures.
B. Accountability, Neutrality, and Goal Setting
1. To Know What Works
We all want to improve prisons. But forget about that for a
moment. Even before any of these improvements were possible,
performance measures would have the obvious effect of allowing
us to measure performance. This would be a great step forward in
researchers’ ability to conduct quality studies. We would have a
better sense of which sector provides better quality; combine that
with better cost studies that take into account the pitfalls described
above,174 and we’d be better able to decide whether to be, or not
be, one of the 19 states that ban private prisons entirely.175 If we do
ban private prisons entirely, performance measures would help us

172
These were: “(a) The percent of employment of supervised individuals; (b) The illegal substance use by supervised individuals; (c) The victim restitution paid by supervised individuals; (d)
Compliance by supervised individuals with no-contact orders; (e) The number of serious incidents
occurring at the facility; and (f) The number of absconders.” Id. § 1, at 12 (creating FLA. STAT.
§ 944.7115(8)(f)(2)(a)–(f)).
173
See infra Part III.D.
174
See supra Part II.A.
175
See supra note 18 and accompanying text.

determine which public prisons performed badly and where to look
for improvement.176
2. To Implement Competitive Neutrality
Suppose we decide not to ban private prisons entirely. Should
we then contract out the entire prison system? Probably not: someone has to be able to run a facility if the current contractor has fallen down on the job or gone bankrupt,177 and given how concentrated the private prison industry currently is,178 it may not always be
realistic to count on being able to easily bring in a competitor when
this happens.
How much of the system, then, should we privatize? The
standard way to proceed is to choose particular prisons to privatize
and put them up to bid to private firms, or to contract with private
firms to use their own prisons. A more beneficial approach,
though, would be to have a regime of “competitive neutrality,”
where the public and private sector compete on the same projects.179 The best system may be one of mixed public and private
management, where private programs “complement existing public
programs rather than replace them.”180 (Health-care reformers’ advocacy of the “public option” in health insurance was premised on
a similar idea: that public participation can make competition more
fair by disciplining private providers more than they would discipline each other.181)
176
See Marc Holzer & Arie Halachmi, Measurement as a Means of Accountability, 19 INT’L J.
PUB. ADMIN. 1921 (1996) (measurement improves accountability of public sector); Aloysius Bavon,
Innovations in Performance Measurement Systems: A Comparative Perspective, 18 INT’L J. PUB.
ADMIN. 491, 493, 502 (1995) (performance measurement arose as a result of the perceived inefficiency of the public sector).
177
HARDING, supra note 70, at 158 (“The state must in the last resort be able to reclaim private
prisons.”); Michael J. Gilbert, How Much Is Too Much Privatization in Criminal Justice?, in PRIVATIZATION IN CRIMINAL JUSTICE, supra note 61, at 41, at 76–77.
178
Volokh, supra note 9, at 1237–38.
179
WILLIAM D. EGGERS, COMPETITIVE NEUTRALITY: ENSURING A LEVEL PLAYING FIELD IN
MANAGED COMPETITIONS 6 (Reason Found., How-to Guide No. 18, 1998); Gaes, supra note 36, at
24.
180
Patrick Anderson et al., Private Corrections: Feast or Fiasco?, PRISON J., Oct. 1985, at 32,
38.
181
See JACOB S. HACKER, THE CASE FOR PUBLIC PLAN CHOICE IN NATIONAL HEALTH REFORM: KEY TO COST CONTROL AND QUALITY COVERAGE 1–2 (Berkeley Law Ctr. on Health, Econ.,
& Fam. Security, Dec. 16, 2008), available at http://ourfuture.org/report/case-public-plan-choicenational-health-reform; see also WILLIAM A. NISKANEN, JR., BUREAUCRACY AND REPRESENTATIVE
GOVERNMENT 217 (1971) (“In the 1930’s, the primary case for the creation of public power authorities was to provide a “yardstick” with which to evaluate private electric utility monopolies.”).

For instance, Gary Mohr, director of the Ohio Department of
Rehabilitation and Correction, has talked about creating a “culture
of competition” in corrections.182 Ohio has pursued a combination
of outsourcing and insourcing: some public prisons have been sold
or their management has been contracted out to the private sector,
while one private prison has been taken in-house.183 The result,
according to Mohr, is that one can “ratchet[] up the best practices
that can be created from both the public sector and multiple private
vendors.”184
But for this sort of system to work, we have to be able to fairly
compare private-sector and public-sector bids before the fact. The
cross-fertilization that’s supposed to result from competitive neutrality depends on flexibility; otherwise both sectors will try to do
the same thing. But, without performance measures, flexibility undermines the ability to do the comparative analysis of bids that’s
necessary to successfully implement cross-fertilization; the most
straightforward way of making efficiency comparisons without
performance measures is to mandate that the private sector replicate every public-sector procedure, down to the tiniest detail. And
indeed, this is what Mohr did when selling the North Central Correctional Complex facility to the private sector.185
But with performance measures—and with an understanding of
how proposed programs and methods translate into performance—
he would have been able to take different proposals, translate them into
expected performance, and thus have a basis for comparison, even
if the proposals were radically dissimilar.186 (The beliefs about expected performance would then have to be verified by evaluating
the winning contractor’s performance after the fact.)
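To make this comparison concrete, here is a minimal sketch of how dissimilar bids might be put on a common footing once each has been translated into expected performance. Everything in it is an assumption for illustration: the `Bid` class, the measure names (loosely modeled on the kinds of indicators Ohio reports), and especially the dollar weights, which stand in for the hard normative question of how much each outcome is worth.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Bid:
    """A hypothetical bid, already translated into expected performance."""
    name: str
    annual_cost: float            # dollars per year
    expected: Dict[str, float]    # expected value of each performance measure


# Hypothetical dollar value per unit of each measure (negative = harmful).
# These numbers are placeholders, not estimates from any source.
WEIGHTS: Dict[str, float] = {
    "recidivism_rate": -50_000_000.0,   # per 1.0 of rate, i.e. $500k per point
    "assaults_per_100": -200_000.0,     # per assault per 100 inmates
    "ged_completions": 5_000.0,         # per completion
}


def net_value(bid: Bid, weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted value of expected performance, minus the bid's cost.

    Raises KeyError if the bid reports a measure that has no weight.
    """
    benefit = sum(weights[m] * v for m, v in bid.expected.items())
    return benefit - bid.annual_cost


def rank_bids(bids: List[Bid], weights: Dict[str, float] = WEIGHTS) -> List[Bid]:
    """Return the bids ordered from highest to lowest net value."""
    return sorted(bids, key=lambda b: net_value(b, weights), reverse=True)


if __name__ == "__main__":
    replica = Bid("replica of public model", 30_000_000.0,
                  {"recidivism_rate": 0.40, "assaults_per_100": 5.0,
                   "ged_completions": 100.0})
    alternative = Bid("alternative model", 31_000_000.0,
                      {"recidivism_rate": 0.36, "assaults_per_100": 4.0,
                       "ged_completions": 150.0})
    for bid in rank_bids([replica, alternative]):
        print(bid.name, round(net_value(bid)))
```

Under these made-up weights the costlier proposal can still rank first, which is exactly the kind of comparison a lowest-cost-for-identical-inputs competition can't make. The ranking is only as good as the expected-performance figures, which is why they would have to be checked against actual results after the fact.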
In particular, recall the problems involved in figuring out the
public sector’s true costs:187 the same problems can make for unfair competitions if public providers’ bids don’t include the costs
182
Gilroy, supra note 159.
183
Id.
184
Id.
185
Id. (“[I]n the [request for proposals], . . . [w]e replicated the post assignments and the staffing pattern and the policies and the food requirements. We basically said, ‘you must identify a minimum of a 5 percent savings’ from exactly the cost of what it has cost us to operate North Central.”).
186
Ohio actually has performance metrics, which are a combination of output and outcome
measures, covering “everything from violence indicators, to use of force indicators, to program
completion indicators (GED, etc.), to recidivism data.” Id. But they apparently weren’t used in the
way described above.
187
See supra Part II.A.1.

they bear that are paid for by other departments, their different tax
treatment, and the like.188 So it’s not surprising that such a regime
is rare in the United States.189
One of the advantages of competitive neutrality is that—as in
Ohio—prisons can be both outsourced and insourced at different
times, depending on who wins the contract, so particular prisons
can “churn” between the public and private sectors. The result, according to Richard Harding, would be a “process of positive crossfertilization,”190 where best practices migrate from one sector to
another. “The opening up of the private sector,” Harding writes,
“may heighten awareness of how sloppy public accountability has
often been in the past, leading to the creation of innovative mechanisms applicable to both the private and the public sectors.”191 In
fact, Harding argues, systemic improvement has been one of the
best consequences of privatization,192 so narrowly focusing on
which sector is better in a static sense is almost beside the point.193
3. To Express What We Want
Measuring performance would do more than just let us know
which sector is better and promote cross-fertilization by facilitating
a competitive neutrality regime. On an even higher level, it would
encourage governments to better conceptualize what makes for a
good prison—an exercise that’s long overdue.194
Jon Vagg, for instance, argues that, in the UK, private prisons
“were a key factor in persuading the administration that standards
were necessary, if only for the purpose of monitoring contractual

188
See EGGERS, supra note 179, at 1, 8–11.
189
Thomas, supra note 134, at 81; id. at 86 (“I am aware of no example in the United States
that reveals fair competition between public and private providers of correctional services. Until both
of these policy failures are corrected, achieving many of the potential benefits of privatization will
be impossible.”); Harding, supra note 141, at 334 (also rare in Australia and UK).
190
HARDING, supra note 70, at 115; see also id. at 162; Developments, supra note 9, at 1890–
91; Gilroy, supra note 159.
191
HARDING, supra note 70, at 22–23.
192
Richard Harding, Private Prisons, in 28 CRIME AND JUSTICE: A REVIEW OF RESEARCH 265,
272–73, 331–36 (2001).
193
There remains the fear that, instead of system-wide improvement through cross-fertilization,
we’ll get a race to the bottom, as Gaes worries. See text accompanying supra note 142. But good
performance measures help avoid that problem.
194
See DICKER, supra note 119, at 6 (“[P]ayment-by-outcome . . . compels commissioners to
state explicitly the goals of policy.”).

compliance.”195 And that example isn’t just a fluke. Prisons have
been operating for centuries,196 and yet it was the experience of
privatization that spurred the development of performance
measures, as private-prison critics made arguments that privatization harmed quality and private-prison advocates made arguments
to the contrary.197 Now that performance measures exist, one can
use them to evaluate both the private and the public sectors, to the
benefit of both.
C. For Performance-Based Contracting
With performance measures, we can go further than just knowing how good public and private prisons are, implementing competitive neutrality, and formulating the proper goals of the prison
system—important as all that is. We can also incorporate the performance measures into contracts and make compensation contingent on performance, finally giving prison providers strong incentives to deliver high quality.
1. Limited Current Efforts
Performance-based compensation is being implemented in the
United States to a very limited extent. As noted above,198 5% of the
contract price at the Bureau of Prisons’ Taft facility was performance-based. Taft was a demonstration project, which should give
one a sense of how new this enterprise is.199
The UK is now on the forefront of performance-based compensation, which it calls “[p]ayment-by-outcome” or “payment-byresults.”200 The idea was floated in a 2008 Conservative Party
195
VAGG, supra note 78, at 307.
196
See, e.g., RALPH B. PUGH, IMPRISONMENT IN MEDIEVAL ENGLAND (1968); G. GELTNER,
THE MEDIEVAL PRISON: A SOCIAL HISTORY (2008); Edward M. Peters, Prison Before the Prison:
The Ancient and Medieval Worlds, in THE OXFORD HISTORY OF THE PRISON: THE PRACTICE OF
PUNISHMENT IN WESTERN SOCIETY 3 (Norval Morris & David J. Rothman eds., 1995).
197
HARDING, supra note 70, at 22; GAES ET AL., supra note 7, at xi, 153, 180; cf. NISKANEN,
supra note 181, at 217 (“[T]he case for the private supply of some public services is . . . to provide a
yardstick to evaluate the performance of budget-maximizing monopoly bureaus.”).
198
See text accompanying supra notes 167–168.
199
Also, in Kansas, SB14 rewards community corrections agencies for reductions in recidivism
beyond a set target. See CONSERVATIVE PARTY, PRISONS WITH A PURPOSE: OUR SENTENCING AND
REHABILITATION REVOLUTION TO BREAK THE CYCLE OF CRIME 74 (Security Agenda, Policy Green
Paper No. 4, 2008).
200
DICKER, supra note 119, at 6.

Green Paper201 and, once the Conservative Party came into power,
developed in a 2010 Green Paper from the Ministry of Justice.202
Payment-by-results is being introduced in three prisons: two private prisons, Peterborough203 and Doncaster,204 and a public prison, Leeds,205 though the plan is to extend the model to all prisons
by 2015.206 The measure is the 12-month reconviction rate,207
compared to a matched comparison group. At Peterborough, performance-based “[p]ayments start when the reconviction rate of the
intervention group is 7.5% less than that of the matched comparison group, with increasing returns up to a maximum rate of
13%.”208 “The Peterborough pilot is the first in the world where
private investors have assumed financial risk for reducing reoffending.”209 In addition to having access to a range of prison
programs to prevent recidivism, offenders at Doncaster are assigned case managers to support them during their sentence and
after release, offering advice and help on employment, housing,
and benefits issues.210 (Earlier experience with payment-by-results
was “primarily limited to the welfare to work market[,] where success [was] varied and limited.”211)
A parallel program focused on finding jobs for offenders,
called Job Deal, compensates providers based on employment
rates.212 Compensation is 70% fixed and 30% conditional; a third
of the conditional payment is for an output measure, “successfully
201
CONSERVATIVE PARTY, supra note 199, at 49, 72–75.
202
UK MIN. OF JUST., BREAKING THE CYCLE: EFFECTIVE PUNISHMENT, REHABILITATION AND
SENTENCING OF OFFENDERS (2010).
203
Id. at 13.
204
Wesley Johnson, Payment-by-Results Project Bid to Cut Reoffending, INDEP., Oct. 11, 2011,
available at http://www.independent.co.uk/news/uk/crime/paymentbyresults-project-bid-to-cutreoffending-2368793.html; UK Min. of Just., Innovative Rehabilitation—Payment by Results at
Doncaster Prison, Oct. 13, 2011, available at https://www.gov.uk/government/news/innovativerehabilitation-payment-by-results-at-doncaster-prison.
205
Joe Inwood, State-Run Leeds Prison to Be Paid on Results, BBC NEWS, Oct. 27, 2011,
available at http://www.bbc.co.uk/news/uk-england-leeds-15479570. Leeds Prison is also called
Armley. Id.
206
Id.
207
DICKER, supra note 119, at 13 & 30 n.29.
208
Id. at 13. At Doncaster, payments start when the reduction is 5%. UK Min. of Just., supra
note 204.
209
DICKER, supra note 119, at 13.
210
There’s also a 24-hour help line. Johnson, supra note 204; UK Min. of Just., supra note 204.
211
CHRIS NICHOLSON, REHABILITATION WORKS: ENSURING PAYMENT BY RESULTS CUTS
REOFFENDING 5 (CentreForum, 2011); see also id. at 21–24 (discussing the experience with payment by results in the welfare to work context, characterizing the “Pathways to Work” program as
unsuccessful and the “Enterprise Zones” program as reasonably successful).
212
DICKER, supra note 119, at 13.

enrolling offenders” in the program; another third is for “a combination of outputs and processes” such as “helping clients open
bank accounts”; and another third is “for achieving ‘hard outcomes.’”213 Note, though, that even these “hard outcomes” are
softer than they might seem, because they include finding a job but
also include “enrolling in further learning.”214 Some additional
payment-by-results programs have also been proposed by the government or by the Social Market Foundation, focusing either on reoffending rates or on other outcomes or outputs like “drug use cessation or employment.”215
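As a quick check on the Job Deal split described above, here is a minimal sketch in Python. The 70/30 division and the three equal thirds come from the text; the function name and the dictionary keys are hypothetical labels chosen for illustration.

```python
from fractions import Fraction


def job_deal_shares(fixed=Fraction(70, 100), conditional=Fraction(30, 100)):
    """Break a Job Deal-style contract price into its components.

    The conditional part is divided into three equal thirds, as in the text.
    """
    third = conditional / 3
    return {
        "fixed": fixed,
        "enrollment_output": third,       # "successfully enrolling offenders"
        "outputs_and_processes": third,   # e.g. helping clients open bank accounts
        "hard_outcomes": third,           # jobs, but also "enrolling in further learning"
    }


shares = job_deal_shares()
for name, share in shares.items():
    print(f"{name}: {float(share):.0%}")
```

On these figures, only a tenth of the total contract price rides on the “hard outcomes,” which is one way of seeing how limited the outcome-based component is.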
2. The Range of Possible Contracts
a. General Considerations
These examples suggest how performance-based contracts
could be structured. The contract could provide that the contract
price is not just the usual flat per-diem per prisoner,216 but an incentive payment that—as a simple example—could vary (positively) with how many inmates find jobs or (negatively) with how
many inmates are rearrested within two years.217
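The simple incentive payment just described can be sketched as follows. This is only an illustration under assumed numbers: the function name, the parameter names, and every dollar figure are hypothetical and don't come from any actual contract.

```python
def contract_payment(inmate_days, per_diem, inmates_employed, inmates_rearrested,
                     job_bonus, rearrest_penalty):
    """Total payment: flat per-diem base plus a performance adjustment.

    The adjustment rises with the number of inmates who find jobs and
    falls with the number rearrested within the follow-up window.
    """
    base = inmate_days * per_diem
    adjustment = job_bonus * inmates_employed - rearrest_penalty * inmates_rearrested
    return base + adjustment


# Hypothetical figures: 1,000 inmate-days at $30, a $500 bonus per job found,
# and a $400 deduction per rearrest.
payment = contract_payment(
    inmate_days=1000, per_diem=30,
    inmates_employed=40, inmates_rearrested=25,
    job_bonus=500, rearrest_penalty=400,
)
print(payment)  # 30000 + 20000 - 10000
```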
Outcome measurements may not always be available for all
dimensions of quality, so some measurement of inputs may continue to be necessary.218 But as far as possible, the ideal should be to
make compensation contingent not on inputs like guard training, or
213
Id. at 14.
214
Id.
215
Id.
216
Dolovich, supra note 5, at 474; see also Tennessee CCA 2007 contract, supra note 66,
¶ C.3, at 22 (laying out schedule of per-diems).
217
Kenneth L. Avio, On Private Prisons: An Economic Analysis of the Model Contract and
Model Statute for Private Incarceration, 17 NEW ENG. J. ON CRIM. & CIV. CONFIN. 265, 294–95
(1991); Daniel L. Low, Nonprofit Private Prisons: The Next Generation of Prison Management, 29
NEW ENG. J. ON CRIM. & CIV. CONFINEMENT 1, 46 (2003); Gaes, supra note 36, at 23 (citing GAES
ET AL., supra note 7); Kyle, supra note 158, at 2111–12.
218
Durham suggests that “process-oriented monitoring methods” continue to be used: “For instance, a system of frequent accounting of staffing levels can detect shortfalls in staffing that may
lead to a diminution in service provision. . . . If the change in staffing levels is detected relatively
quickly, efforts can be made to either restore institutional staff to initial levels or to alter the evaluation design.” Durham, supra note 20, at 66; see also Sidney A. Shapiro & Rena Steinzor, Capture,
Accountability, and Regulatory Metrics, 86 TEX. L. REV. 1741, 1775, 1779 (2008); DICKER, supra
note 119, at 16 (suggesting intermediate outcomes such as drug misuse, stability of relationships, or
becoming debt-free). Cf. Shapiro & Steinzor, supra, at 1768 (in context of EPA and GPRA); GEN.
ACCOUNTING OFFICE, PERFORMANCE-BASED ORGANIZATIONS: LESSONS FROM THE BRITISH NEXT
STEPS INITIATIVE 7 (1997) (in context of British Next Steps agencies).

even on outputs like the number of GEDs granted or the number of
rehabilitative programs offered or ACA accreditation,219 but primarily on actual outcomes like the extent of unconstitutional conditions or how well prisoners are actually rehabilitated or how
many prisoners get jobs.220
The amount of the bonus can be a flat fee, or it could be more
complicated—in the case of recidivism bonuses, the bonus could
be inmate-specific, depending on “the probability and social cost
of recidivism for each inmate”—or it could even be determined by
competitive bidding.221 It’s often charged that private prisons have
little incentive to invest in rehabilitation, and in fact have an incentive to try to increase recidivism, so that they can get (at least some
of) the same inmates back later; if this is so, the bonuses should be
at least high enough to counteract this incentive, so rehabilitating
inmates is affirmatively attractive to prison firms.222
Though I focus here on monetary rewards and penalties, there
are other possibilities. High performance could, instead of increasing a firm’s compensation in the individual contract, merely confer
a reputational benefit, increasing its probability of winning future
bids.223 One could give out certificates224 or “even simply publiciz[e] league tables of recidivism performance.”225 Or one could
reward good performers by giving them more flexibility in future
contracts.226

219
See MCDONALD ET AL., supra note 125, at 49 (“Correctional administrators . . . reported
that 57 of the contracts in force at the end of 1997 required that facilities achieve ACA accreditation
within a specified time.”).
220
Kyle, supra note 158, at 2112–13.
221
Low, supra note 217, at 46; see also infra Part IV.C.3.
222
Pozen, supra note 77, at 283–84; Avio, supra note 164, at 150; James T. Gentry, Note, The
Panopticon Revisited: The Problem of Monitoring Private Prisons, 96 YALE L.J. 353, 362–63
(1986).
223
DICKER, supra note 119, at 25; CONSERVATIVE PARTY, supra note 199, at 73–74 (describing Avon Park Youth Academy in Florida as “a prison rewarded by results,” even though its only
reward was having its contract renewed, “a decision clearly influenced” by its lower recidivism
results).
224
Burt S. Barnow, The Effects of Performance Standards on State and Local Programs, in
EVALUATING WELFARE AND TRAINING PROGRAMS 277, 286 (Charles F. Manski & Irwin Garfinkel
eds., 1992).
225
Pozen, supra note 77, at 283.
226
Barnow, supra note 224, at 286.

b. Rewards or Penalties
Going back to monetary incentives, one can choose between
penalties for bad performance and rewards for good performance,227 though the difference needn’t be that important.
Consider a “rewards” contract that offers a $1 per diem reward
for each unit of quality on a hypothetical zero-to-ten scale, so the
potential reward is $0 to $10. Suppose Acme Corrections Corp.
expects to achieve a quality level of 5 at a total cost of $35 per diem.228 Then it would be willing to submit a bid of $30 or above for
the project; it would just cover its costs with the $30 payment plus
the $5 reward. (Recall that prison bids are bids on how much money the contractor will get from the government; a $30 per diem
winning bid means that the contractor will be paid $30 per inmateday.) Suppose bidding is competitive, other firms have similar
technology, and Acme is the most efficient firm; then Acme wins
the auction with its $30 bid.229 (A less efficient firm, say one that
would require $36 per diem to achieve quality level 5, wouldn’t
bid below $31, so Acme, as a more efficient firm, would be automatically rewarded up front for its higher quality by having a better chance of winning the auction.230 The bids don’t tell us the true
social cost, the true cost to the government, or the true quality—
that requires waiting for the actual realized level of quality, which
determines the level of the reward—but they do signal which firm
is (or believes that it is) more efficient.231)
Now consider an alternative “penalties” contract that offers a
$1 penalty for each unit of quality below 10 (i.e., 7 units of quality
lead to a $3 penalty). This contract has equivalent incentive effects
to the previous one: a provider will invest in a unit of quality as
long as its cost of doing so is under $1.232 Therefore, these incentives, as before, make Acme expect to achieve the same quality
level of 5, which we have seen carries a total cost of $35 per diem.
227
Thomas, supra note 134, at 108–09.
228
This is taking into account the incentive effects of the $1-per-unit reward. Perhaps earlier,
with fixed-price contracts, Acme only achieved, say, a quality level of 3 at a total cost of $32.
229
I discuss auction-theoretic considerations like the winner’s curse at text accompanying infra
note 249.
230
See Gentry, supra note 222, at 363.
231
See also text accompanying infra notes 249–250.
232
Here, I’m abstracting away from behavioral factors that might make rewards more attractive
than punishments. See BEHAVIORAL LAW AND ECONOMICS (Cass R. Sunstein ed., 2000).

Now Acme is willing to submit a bid of $40 or above for the project; it would just cover its cost with the $40 payment minus the $5
penalty. Again, with the competitive bidding assumptions listed
above, Acme wins the auction with its $40 bid.
So even though the contracts look different, they have essentially identical incentives, and any superficial differences between
them are, roughly speaking, ironed out in the bidding process. The
provider’s degree of risk aversion doesn’t change the result. The
government can offer contracts with penalties, but then it will pay
more to the winning bidder; or it can offer contracts with rewards,
and the winning bidder will be satisfied with less. (One difference
might be in the timing of the payments: if the base price is paid up
front while rewards or penalties are processed some time later, the
first contract is somewhat less valuable than the second because its
payments are more delayed.233)
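The equivalence of the two contracts can be checked with a few lines of Python. This is a minimal sketch using the Acme figures from the text ($1 per unit of quality, a zero-to-ten scale, expected quality of 5, total cost of $35); the function names are hypothetical, and the sketch ignores the timing of payments.

```python
RATE = 1.0        # dollars per unit of quality, per diem (from the text)
MAX_QUALITY = 10  # top of the hypothetical quality scale


def revenue_rewards(bid, quality):
    """Per-diem revenue under the rewards contract: base price plus reward."""
    return bid + RATE * quality


def revenue_penalties(bid, quality):
    """Per-diem revenue under the penalties contract: base price minus penalty."""
    return bid - RATE * (MAX_QUALITY - quality)


def break_even_bid_rewards(quality, cost):
    """Lowest bid that just covers cost once the expected reward is added."""
    return cost - RATE * quality


def break_even_bid_penalties(quality, cost):
    """Lowest bid that just covers cost once the expected penalty is deducted."""
    return cost + RATE * (MAX_QUALITY - quality)


acme_rewards_bid = break_even_bid_rewards(quality=5, cost=35)
acme_penalties_bid = break_even_bid_penalties(quality=5, cost=35)
rival_rewards_bid = break_even_bid_rewards(quality=5, cost=36)

# One more unit of quality is worth the same $1 under either contract.
gain_rewards = revenue_rewards(30, 6) - revenue_rewards(30, 5)
gain_penalties = revenue_penalties(40, 6) - revenue_penalties(40, 5)

print(acme_rewards_bid, acme_penalties_bid, rival_rewards_bid)
print(revenue_rewards(acme_rewards_bid, 5), revenue_penalties(acme_penalties_bid, 5))
```

Acme’s break-even bids differ by $10 across the two contracts, but its revenue at quality level 5 is $35 either way, and so is the marginal payoff from an extra unit of quality.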
c. Controlling for Baselines
In the same way, it probably doesn’t make a huge difference
whether the compensation takes into account the baseline level of
quality.
Controlling for baselines is a huge issue in the literature on performance measures.234 For instance, an early paper on performance
measures, by Gloria Grizzle and coauthors,235 discussed methodological issues regarding what makes for a good performance measure. A large part of the discussion focused on doing the proper
econometric modeling to figure out the causal factors behind a performance measure. Figuring out these causal factors is important at
least for two reasons (beyond merely understanding the process).
One is to have a sense of what input or output measures to use if
233
See text accompanying infra note 384.
234
See, e.g., Kyle, supra note 158, at 2112 (controlling for “age, prior criminal history, and
sex”); id. at 2113 & n.136 (controlling for crime rates); DICKER, supra note 119, at 20 & fig.2 (use
“performance of control groups” or a whole range of control methods); Barnow, supra note 224, at
281 (“[P]erformance management systems [could] measure outcomes relative to [a] standard [that is]
set to take into account what would have occurred in the absence of the program . . . .”); GAES ET
AL., supra note 7, at 159 (citing C.J. Heinrich, Outcomes-Based Performance Management in the
Public Sector: Implications for Government Accountability and Effectiveness, 62 PUB. ADMIN. REV.
712 (2002) (questioning “whether outcome measures in the absence of a control or comparison
group can provide meaningful information” in context of Job Training Partnership Act).
235
GLORIA A. GRIZZLE ET AL., BASIC ISSUES IN CORRECTIONS PERFORMANCE 4 (Nat’l Inst. of
Just., 1982).

the outcome measures aren’t available in a given case.236 Another
is to be able to properly assign credit, so providers who get a bad
(or good) population of inmates aren’t blamed (or praised) for bad
(or good) results.237
Similarly, Gerald Gaes and his coauthors argue that “social scientists should push ultimate outcomes as far as they can be
pushed,”238 but that, in light of the other factors that affect recidivism, “[i]t is also desirable to have more direct measures of intermediate changes to human behavior that precede desistance, and
that may be influenced by criminal justice interventions.”239 They
don’t directly list desirable performance measures—they give an
example of performance measures for the specific element of
“Prison Security Performance,”240 though they stress that one
should do a similar exercise for other elements of prison performance such as health care.241 The main characteristic of their approach is its emphasis on adequately modeling prison performance
in terms of individual-level and institutional-level independent variables, so that one can properly attribute credit where credit is due,
avoid blaming prisons for factors beyond their control like the
characteristics of the inmates, and figure out what inputs are actually important in producing prison performance.242 For instance,
for health care, rather than measure (or in addition to measuring)
the prevalence of a disease in the prison, which indicates the potential for transmission, it would be useful to use the number of cases
in the incoming population as a baseline, and measure the number
of new cases.243
Is all this necessary? Let’s do our numerical example again:
Consider the rewards contract discussed above, with a $1 per diem
reward for every unit of quality on a zero-to-ten scale;244 the winning bidder, who expected to deliver quality level 5 at a cost of
$35, would have won the contract with a bid of $30. Now consider
236
See infra Part III.D.
237
GRIZZLE ET AL., supra note 235, at 91.
238
GAES ET AL., supra note 7, at 7.
239
Id.
240
Id. at 142 tbl.10.1.
241
Id. at 141.
242
Id. at 144 (discussing differences with Logan model); see also id. at 4 (suggesting “develop[ing] an expected rate of crime for a community or an expected rate of misconduct for a prison
based on characteristics of the people and inmates”).
243
Id. at 38.
244
See text accompanying supra notes 228–229.

a rewards contract that controls for the baseline level of quality;
suppose the expected level of quality for this prison is 4, so a quality level of 5 would yield a reward of $1.
The only effect of the baseline adjustment is to reduce reward
payments by $4. A bidder who was willing to bid $30 on the unadjusted contract would be willing to bid $34 on the adjusted contract, to take into account the $4 reduction in the expected reward.
Either way, the payoff is the same to the contractor—and the price
is the same to the government. The government saves $4 on reward
payments but pays it all out again in the base contract price that
emerges from the auction. Jeremy Bentham argued against controlling for baselines two centuries ago:
I would make [the contractor] pay so much for every one
that died, without troubling myself whether any care of his
could have kept the man alive. To be sure he would make
me pay for this in the contract; but as I should receive it
from him afterwards, what it cost me in the long run would
be no great matter. . . .
. . . [Under this system,] you need not doubt of his fondness
of these his adopted children; of whom whosoever may
chance while under his wing to depart this vale of tears,
will be sure to leave one sincere mourner at least . . . .245
To be sure, the bidder has to have a way to figure out that the
expected level of quality is 4. This requires two things. First, the
bidder should have a belief about the proper model to predict the
baseline quality level; different bidders can have competing beliefs
about reality that lead them to different predictions. Second, the
bidder needs enough information about the population of
inmates to plug into its model. Where either of these is
absent, the contractor won’t know how much to bid—this might
lead to excessive payments from the taxpayer’s point of view or
insufficient payments from the contractor’s point of view—but the
incentive effects will remain the same.
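The baseline example can be worked through the same way. The sketch below uses the figures from the text (a $1 reward per unit, expected quality of 5, cost of $35, baseline of 4); the function names are hypothetical, and the “mistaken” bidder who believes the baseline is 3 is an added illustration of what happens when a bidder’s model of the baseline is off.

```python
RATE = 1.0  # dollars of reward per unit of quality, per diem (from the text)


def reward(quality, baseline=0):
    """Reward paid on quality in excess of the baseline."""
    return RATE * (quality - baseline)


def break_even_bid(expected_quality, cost, believed_baseline=0):
    """Lowest bid that covers cost, given the reward the bidder expects."""
    return cost - reward(expected_quality, believed_baseline)


unadjusted_bid = break_even_bid(5, 35)
adjusted_bid = break_even_bid(5, 35, believed_baseline=4)

# What the government pays in total (base price plus reward) in each case.
outlay_unadjusted = unadjusted_bid + reward(5)
outlay_adjusted = adjusted_bid + reward(5, baseline=4)

# Hypothetical: a bidder believes the baseline is 3 when it is really 4.
mistaken_bid = break_even_bid(5, 35, believed_baseline=3)
mistaken_revenue = mistaken_bid + reward(5, baseline=4)

# The payoff from one more unit of quality doesn't depend on the baseline.
marginal_reward = reward(6, baseline=4) - reward(5, baseline=4)

print(unadjusted_bid, adjusted_bid, outlay_unadjusted, outlay_adjusted)
print(mistaken_bid, mistaken_revenue, marginal_reward)
```

The mistaken bidder ends up $1 short of its $35 cost, which is the “insufficient payments” case, yet an extra unit of quality still earns it the same $1.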
So while adjusting for the baseline is relevant for various reasons—it allows one to more accurately assign praise or blame,
rank different facilities,246 and so on—it doesn’t seem absolutely
245

Gentry, supra note 222, at 362 n.52 (quoting 1 J. BENTHAM, PANOPTICON, at 71–73 (Dublin

1791)).
246

See GAES ET AL., supra note 7, at 144 (concern with rank-ordering institutions).

necessary for a compensation scheme to provide the proper incentives for improvement.
Moreover, risk aversion makes a difference here,247 but not in
the way one would expect. Controlling for baselines might even
increase risk, depending on the uncertainty in the calculation of the
baseline.248
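To make the risk point concrete, here is a minimal sketch of the variance of the contractor’s payoff, written as P + R – A – C, where P is the fixed contract price, R the reward, A the baseline adjustment, and C the cost. The function name and the numerical variances are hypothetical. Because A and C enter with minus signs, the covariance terms carry the signs shown in the comment.

```python
def payoff_variance(var_r, var_c, var_a=0.0, cov_ra=0.0, cov_rc=0.0, cov_ac=0.0):
    """Variance of P + R - A - C, treating the contract price P as a constant.

    Var(R - A - C) = var(R) + var(A) + var(C)
                     - 2 cov(R, A) - 2 cov(R, C) + 2 cov(A, C)
    """
    return var_r + var_a + var_c - 2 * cov_ra - 2 * cov_rc + 2 * cov_ac


# Hypothetical variances, chosen only for illustration.
no_adjustment = payoff_variance(var_r=4.0, var_c=1.0)
certain_adjustment = payoff_variance(var_r=4.0, var_c=1.0, var_a=0.0)
noisy_adjustment = payoff_variance(var_r=4.0, var_c=1.0, var_a=1.0)
tracking_adjustment = payoff_variance(var_r=4.0, var_c=1.0, var_a=1.0, cov_ra=1.5)

print(no_adjustment, certain_adjustment, noisy_adjustment, tracking_adjustment)
```

On these assumed numbers, an adjustment known with certainty leaves risk unchanged, an independent but noisy one raises it, and one that moves together with the reward lowers it.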
If the contractor gets too little, there is the concern that it might
not be able to fund the project and might go bankrupt within the
contractual term. But this is the same concern that arises with all
bidding. Whether or not we adjust the payment for the baseline, the
winning bid under a low-bid system will be subject to the “winner’s curse.”249 As a simple example, consider many firms with
identical technology. They each have slightly different models for
predicting how profitable a prison will be, and firms with higher
predictions will submit lower bids. At most one of these models is
correct; everyone else’s model is incorrect to some degree. The
lowest bid will thus come from the bidder who makes the most
wildly incorrect overestimate of his profits. Sophisticated bidders
adjust their bids to take the winner’s curse into account, but the
winning bidder might either be unsophisticated or end up not having adjusted his bid enough. So the threat of contractors who go
bankrupt—or of contractors who bid low and then try and hold the
government up for more money250—is real. But, again, this happens regardless of whether we adjust for baselines.
247
Recall that it didn’t in the reasoning establishing the equivalence of reward and penalty contracts, see supra Part III.C.2.b.
248
Without controlling for baselines, the winning contractor gets a contract price of P and a
performance-based reward R, bears costs of C, and his payoff is P + R – C; the variance of the
payoff is var(R) + var(C) if R and C are independent. Now let’s control for baselines; for simplicity,
assume this just involves subtracting an adjustment A from the reward, where A is determined by the
expected baseline level of performance. The contract price becomes P', and the contractor’s new
payoff is P' + R – A – C. If A has no randomness—everyone knows the government’s formula
and everyone knows the underlying data that the government is plugging into the formula—then
var(A) = 0 and the variance of the new payoff is the same var(R) + var(C). But if the data or the
formula is somewhat uncertain, var(A) is positive, so the variance of the new payoff is var(R) +
var(A) + var(C) if R, A, and C are independent, which is greater.
This doesn’t necessarily have to happen. Suppose, for instance, that R, A, and C aren’t independent, but instead there’s some negative covariance among R, A, and C. Then the randomness of
A might cancel out some of the randomness of R and C, and the adjustment can indeed reduce risk.
The point in the text, though, is that this needn’t be the case, and the adjustment, though often defended as a risk-reducing move for contractors, could end up doing the opposite.
249
See, e.g., PATRICK BOLTON & MATHIAS DEWATRIPONT, CONTRACT THEORY 283–85
(2005).
250
See Robert W. Poole Jr., Privatization, CONCISE ENCYCLOPEDIA OF ECONOMICS (2007),
http://www.econlib.org/library/Enc/Privatization.html; Mary Sigler, Private Prisons, Public Functions, and the Meaning of Punishment, 38 FLA. ST. U. L. REV. 149, 155 (2010); Jody Freeman, The
Private Role in Public Governance, 75 NYU L. REV. 543, 574 (2000). See generally OLIVER HART,
(continued below)
The solution is
instead to require bonds, to rely on a track record of past performance (and restrict complete newcomers to small projects until
they’ve proven themselves), or otherwise to try to weed out financially unsophisticated or untrustworthy parties.
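The winner’s curse in the many-firms example can be illustrated with a small simulation. This is only a sketch under stated assumptions: every firm has the same true cost, each firm’s estimate of that cost is unbiased with normally distributed error, and each firm naively bids its own estimate. The function name and all parameter values are hypothetical.

```python
import random


def simulate_winners_curse(n_firms=10, true_cost=35.0, noise_sd=2.0,
                           n_auctions=2000, seed=0):
    """Return (average winning bid, share of auctions won below true cost).

    Each firm's estimate is true_cost plus independent Gaussian noise, and
    the lowest estimate wins because naive firms bid their own estimates.
    """
    rng = random.Random(seed)
    total_winning_bid = 0.0
    underbids = 0
    for _ in range(n_auctions):
        estimates = [rng.gauss(true_cost, noise_sd) for _ in range(n_firms)]
        winning_bid = min(estimates)
        total_winning_bid += winning_bid
        if winning_bid < true_cost:
            underbids += 1
    return total_winning_bid / n_auctions, underbids / n_auctions


avg_winning_bid, underbid_share = simulate_winners_curse()
print(f"average winning bid: {avg_winning_bid:.2f} (true cost 35.00)")
print(f"share of auctions won below true cost: {underbid_share:.1%}")
```

Even though each firm’s estimate is right on average, the winning bid systematically falls below the true cost, because winning selects for the most optimistic estimate. A sophisticated bidder would shade its bid upward to allow for this.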
d. Discrete vs. Continuous Measures
Note that, in the preceding example, the contract price varied
continuously with the level of quality.251 Another possibility would
have been to use a binary compensation scheme, where the reward
or penalty is contingent on whether one reaches a particular target.
This could look like “Get a fixed reward only if you achieve less
than 50% recidivism.”252
These binary schemes, while easier to implement, are problematic in several ways. Providers who don’t expect to be able to reach
anywhere near the target have little incentive to try to achieve anything at all.253 Providers who do expect to be able to reach the target quite comfortably have little incentive to try to achieve anything additional.254 Providers who may or may not be able to reach
the target are subjected to more risk than they would bear under a
continuous scheme.255 Perhaps a large corporation might act
somewhat risk-neutrally, so risk won’t matter; but smaller firms or
nonprofits may refrain from bidding, or may require more money

FIRMS, CONTRACTS AND FINANCIAL STRUCTURE (1995) (discussing opportunistic behavior in contract relationships).
251
Well, the example as worded involved discrete jumps, but one can easily imagine the prorated version. The “continuous” scheme is also called a “distance travelled” scheme. DICKER, supra
note 119, at 16; see text accompanying infra notes 411–414, 425.
252
See also HARDING, supra note 70, at 68 (“x per cent of participants [in a remedial literacy
class] reaching attainment level y in z months”).
253
DICKER, supra note 119, at 19 (a continuous measure “may incentivise providers to engage
with high-risk offenders who are unlikely to achieve absolute desistance”); HARDING, supra note 70,
at 68.
254
On the other hand, incentives are very large for those who could be just under the cutoff but
could also reach the cutoff; but even then, unless the cutoff is a magical point, it’s probably more
socially optimal to provide continuous incentives.
255
Kyle also notes the following advantage of a sliding scale: it “would reduce the likelihood
that private companies would receive an undeserved windfall—the farther in standard deviations
from the mean the private prison is, the more likely a causal relationship that should be rewarded
exists.” Kyle, supra note 158, at 2112. More accurately, this depends on the likely effect of rehabilitative measures versus the likely magnitude of unobserved factors: it could be that a truly exceptional performance in fact reflects an unusually (and unobservedly) good or rehabilitable crop of inmates.

to take the project, or may be reluctant to try high-expected-value
but risky strategies.256
(Of course, one could also imagine intermediate reward
schemes: for example, the reward could be almost flat for any level
of recidivism above 50% and increase rapidly at or below 50%, for
instance “Get a reward of $0.01 for every percentage-point reduction of recidivism below 100% and down to 50%, and then a reward of $1.00 for every percentage-point reduction beyond
50%.”257 British performance contracts, where payments don’t
start until the decrease in recidivism is 5% or 7.5%, and where
payments are capped once the decrease is high enough, fit this
mold.258 At this point I won’t do anything more than signal the existence of such contracts, though the optimal slope of the compensation scheme is something I’ll return to below when I discuss risk
allocation.259)
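The three kinds of schemes discussed above can be compared side by side. In this sketch, the 50% target and the $0.01 and $1.00 rates come from the examples in the text; the size of the fixed binary reward, the $1-per-point continuous rate, and the function names are hypothetical.

```python
def binary_reward(recidivism_pct, target_pct=50.0, fixed_reward=50.0):
    """Fixed reward only if recidivism comes in under the target."""
    return fixed_reward if recidivism_pct < target_pct else 0.0


def continuous_reward(recidivism_pct, rate=1.0):
    """Same reward for every percentage point of recidivism below 100%."""
    return rate * (100.0 - recidivism_pct)


def kinked_reward(recidivism_pct, kink_pct=50.0, low_rate=0.01, high_rate=1.00):
    """Nearly flat down to the kink, then steep for reductions beyond it."""
    points_above_kink = 100.0 - max(recidivism_pct, kink_pct)
    points_below_kink = max(kink_pct - recidivism_pct, 0.0)
    return low_rate * points_above_kink + high_rate * points_below_kink


for pct in (70, 60, 50, 40):
    print(pct, binary_reward(pct), continuous_reward(pct), round(kinked_reward(pct), 2))
```

A provider that moves from 70% to 60% recidivism gains nothing under the binary scheme, a trivial amount under the kinked one, and the full per-point rate under the continuous one; only below the 50% mark do the kinked and continuous schemes reward improvement at the same rate.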
The same is true of penalties that may occur during the contractual term. Governments can terminate their contracts260—this is
a form of binary scheme—though this is a rare remedy that tends
to be reserved for the most extreme abuses.261 Providing for graduated financial penalties for abuses of different severity is probably
a better solution than merely providing for contract rescission, because draconian penalties are less likely to be used. Not that termination isn’t appropriate in extreme cases—governments should
always retain the ability to take over a prison if a contract is terminated;262 the need to retain a credible threat of termination is one
reason to prefer that governments, not prison firms, own the prisons.263

256
See also infra Part IV.B.2. Some also mention the possibility that the public could see the
continuous measure as being “too lenient.” DICKER, supra note 119, at 20.
257
DICKER, supra note 119, at 24 (“minimum threshold of achievement that providers must attain before payments commence.”); id. at 25 (discussing a “target accelerator,” where increases are
rewarded at an increasing rate).
258
See supra note 208 and accompanying text.
259
See infra Part IV.B.2.
260
See Tennessee CCA 2007 contract, supra note 66, ¶ D.3, at 24 (“The State may terminate
this Contract without cause for any reason.”); id. ¶ D.4, at 24 (“If the Contractor fails to properly
perform its obligations under this Contract in a timely or proper manner, or if the Contractor violates
any terms of this Contract, the State shall have the right to immediately terminate the Contract and
withhold payments in excess of fair compensation for completed services.”).
261
See Developments, supra note 9, at 1883–84; Dolovich, supra note 5, at 495–500.
262
See text accompanying supra note 177.
263
See Levinson, supra note 24, at 90.

3. The Feasibility of Merit Pay in the Public Sector
Note, also, that while I’ve been primarily concentrating on incentives for private firms, there’s no inherent reason why performance-based compensation can’t also be considered for public
prison wardens264—consider the example of Leeds noted
above265—especially if we simultaneously pursue competitive neutrality.266 As John Donahue says, “the fundamental distinction is
between competitive output-based relationships and noncompetitive input-based relationships rather than between profit-seekers
and civil servants per se.”267 Proposals to reward public servants
for high performance aren’t rare,268 and merit-based compensation
in the public sector has increased in recent years,269 but it’s still
hard to find in corrections.270
Researchers differ on how feasible merit pay is in the public
sector;271 I won’t resolve the argument here, except to note that the
Government Performance and Results Act of 1993 has a procedure
by which agencies can make proposals “to waive administrative
procedural requirements and controls, including specification of
personnel staffing levels, limitations on compensation or remuneration, and prohibitions or restrictions on funding transfers . . . in
return for specific individual or organization accountability to
achieve a performance goal.”272 Any such proposal, according to
the statute, must “describe the anticipated effects on performance
264

Rick Hills, Merit Pay for Prison Wardens?, PRAWFSBLAWG, Mar. 3, 2008, http://
prawfsblawg.blogs.com/prawfsblawg/2008/03/tying-the-salar.html.
265
See text accompanying supra note 205.
266
See text accompanying supra notes 179–192.
267
JOHN D. DONAHUE, THE PRIVATIZATION DECISION: PUBLIC ENDS, PRIVATE MEANS 82
(1989) (italics omitted).
268
See NISKANEN, supra note 181, at 201–09; Lynn, supra note 161, at 11; Barnow, supra note
224, at 307–08; cf. also David N. Figlio & Lawrence W. Kenny, Individual Teacher Incentives and
Student Performance, 91 J. PUB. ECON. 901 (2007) (examining effects of teacher merit pay).
269
See Jon D. Michaels, Privatization’s Progeny, 101 GEO. L.J. 1023, 1048–49 & nn.124–25
(2013).
270
Thomas, supra note 134, at 109.
271
Compare Harding, supra note 141, at 304 (“The financial incentive should drive performance in a way that is impossible in the state-funded public sector.”), and MCDONALD & PATTEN,
supra note 151, at xxvii (“When structuring contracts, [governments] have opportunities to create
incentives and mechanisms for accountability that are more difficult to implement in existing public
organizations.”), with GAES ET AL., supra note 7, at 151 (“There is certainly no reason why public
administrators cannot award bonuses to the best performing public prison managers and their employees, while also demoting, firing, or transferring the managers who are substandard.”), and id. at
180.
272
31 U.S.C. § 9703(a).

resulting from greater managerial or organizational flexibility, discretion, and authority, and . . . quantify the expected improvements
in performance resulting from any waiver,”273 “precisely express
the monetary change in compensation or remuneration amounts,
such as bonuses or awards, that shall result from meeting, exceeding, or failing to meet performance goals,”274 and be “endorsed by
the agency that established the requirement.”275 Just reading the
statutory language—and this is a statute that purports to encourage
flexibility—doesn’t exactly give one confidence that public-sector
flexibility is easy to come by, at least in the federal system.
At the very least, though, to the extent performance-based
compensation is a good idea in the private sector, it may well also
be a good idea in the public sector.276 Whether that is feasible
depends on the relevant state or federal law.
D. What Measures to Choose
The earlier discussion of how to define recidivism277 shows
that a lot rides on choosing the outcome measures judiciously. This
applies across the board, not just to recidivism. This section considers two distinct aspects of performance measures. The first is
that wherever outcome measures have been used, output measures
haven’t been abandoned. The second is that what outcomes to
measure—and even whether something counts as an output or outcome measure—is inevitably a value-laden question, which must
be resolved for a performance-based compensation scheme to go
forward. The inevitable incompleteness of outcome measures—
and therefore the need to supplement outcomes with outputs—can
give rise to undesirable strategic behavior, which I discuss in a later section.278

273
Id. § 9703(b).
274
Id. § 9703(c).
275
Id. § 9703(d).
276
Some of the disadvantages of performance-based compensation may apply with different
force in the public than in the private sector. For instance, the concern that market incentives will
discourage public-interested people from entering the industry, see infra Part IV.B.1, seems to not
apply at all to private providers, who are presumably already profit-motivated.
277
See text accompanying supra notes 116–122.
278
See infra Part IV.C.2. This section only covers what measures should rationally be chosen,
not the real-world possibilities for manipulation in the choice of goals. That sort of strategic behavior
is covered infra Part IV.C.1.


Adopting specific outcomes to measure is equivalent to adopting what John DiIulio calls an “operational” goal—“an image of a
desired future state of affairs that can be compared unambiguously
to an actual or existing state of affairs.”279 “‘Improving the quality
of public education in America’ is a nonoperational goal; ‘Increasing the average verbal and math SAT scores of public school students by 20% between the year 1992 and the year 2000’ is an operational goal.”280 Similarly, “[r]eforming criminals” is nonoperational, while “[d]oubling the rate of inmate participation in prison
industry programs” is operational.281 That last goal was output-based, but there’s no reason we can’t, as in the education example,
adopt an outcome-based goal—we could just agree on a convenient if arbitrary measure of how well criminals are reformed, such
as the two-year reconviction rate. Moreover, there’s no reason to
adopt a numerical target as the goal (which would be binary); the
goal might merely be (thinking more continuously) to reduce the
rate as far as possible.282 And there’s no reason to adopt a unique
goal: multiple operational goals can each be implemented as one part of
an overall index that determines compensation.283
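
The difference between a binary numerical target and a continuous goal can be made concrete with a minimal sketch in Python. Everything in it is a hypothetical assumption made for illustration: the release records, the treatment of "two years" as 730 days, the target and baseline rates, the dollar amounts, and the function names. None of it is drawn from an actual contract.

```python
from datetime import date

# Hypothetical release records: (release date, date of first reconviction or None).
RELEASES = [
    (date(2010, 1, 15), date(2011, 6, 1)),   # reconvicted within two years
    (date(2010, 3, 1), None),                # never reconvicted
    (date(2010, 5, 20), date(2013, 1, 10)),  # reconvicted, but after two years
    (date(2010, 7, 4), None),                # never reconvicted
]

WINDOW_DAYS = 730  # assumed operational definition of "two years"


def two_year_reconviction_rate(releases):
    """Share of released inmates reconvicted within WINDOW_DAYS of release."""
    reconvicted = sum(
        1 for released, convicted in releases
        if convicted is not None and (convicted - released).days <= WINDOW_DAYS
    )
    return reconvicted / len(releases)


def binary_bonus(rate, target=0.30, bonus=100_000):
    """All-or-nothing payment: the full bonus if the target is met, else zero."""
    return bonus if rate <= target else 0


def continuous_bonus(rate, baseline=0.40, price_per_point=10_000):
    """Payment for each percentage point below an assumed baseline rate.

    Floored at zero here; a contract with penalties would drop the floor.
    """
    return max(0.0, (baseline - rate) * 100 * price_per_point)


rate = two_year_reconviction_rate(RELEASES)
print(f"two-year reconviction rate: {rate:.2f}")
print(f"binary bonus: {binary_bonus(rate)}")
print(f"continuous bonus: {continuous_bonus(rate):.0f}")
```

Under the binary scheme a provider just above the target earns nothing for further marginal improvement, while under the continuous scheme every additional point of reduction is rewarded. Which operational definition of recidivism to plug in remains the value-laden choice discussed in the text.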
A useful way to explore this question is to examine some existing prison performance measures.
Perhaps one of the oldest formal approaches284 to measuring
prison performance is the Correctional Institutions Environment
Scale285 developed by Rudolf Moos in the late 1960s286 and often

279
John J. DiIulio, Jr., Measuring Performance When There Is No Bottom Line, in PERFORMANCE
MEASURES FOR THE CRIMINAL JUSTICE SYSTEM 142, 144 (John J. DiIulio et al. eds., Bur. of
Just. Stats. 1993).
280
Id.
281
Id.
282
See text accompanying supra notes 251–256.
283
Of course, one should also set the weights to be put on the various measures in the index.
See infra Part IV.A. Cf. also Barnow, supra note 224, at 284 (“Even if the program has a single
objective, it may be advantageous to use several measures as proxies if an ideal measure cannot be
developed.”); GRIZZLE ET AL., supra note 235, at 80.
284
A survey article in 1975 reviewed 231 studies of particular performance measures, but at
that time, in the authors’ opinions, there had apparently never been any comprehensive approach.
(Presumably the Moos approach, if it was considered, was thought to be insufficiently comprehensive or not performance-oriented.) The American Correctional Association had published comprehensive standards in the late 1970s, but they were primarily process-oriented. GRIZZLE ET AL., supra
note 235, at 4 (citing DOUGLAS LIPTON ET AL., THE EFFECTIVENESS OF CORRECTIONAL TREATMENT: A SURVEY OF TREATMENT EVALUATION STUDIES (1975); AM. CORR. ASS’N, MANUAL OF
STANDARDS FOR ADULT CORRECTIONAL INSTITUTIONS (1977)).
285
Michael Montgomery, Performance Measures and Private Prisons, in 3 PRISON PRIVATIZATION, supra note 138, at 187, 193.


used in the 1970s.287 The Moos scale contains several subscales:
“Involvement,” “Support,” “Expressiveness,” “Autonomy,” “Practical Orientation,” “Personal Problem Orientation,” “Order and
Organization,” “Clarity,” and “Staff Control.”288 These elements
generally aren’t true performance measures, and it’s immediately
apparent from their definitions that some are highly impressionistic. The “Involvement” variable measures “how active and energetic residents are . . .”; the “Support” variable measures “the extent to which residents are encouraged to be helpful and supportive
. . .”; and so on, with an emphasis on measuring the extent of supportiveness and encouragement.289 The scale was criticized because it wasn’t clear what the difference between some of the elements was and to what extent they were correlated,290 and even to
what extent they described a real phenomenon.291 Some critics
wrote that “when the CIES is administered and the individual
scores are tallied and averaged, we really have no idea what the
scores on the nine subscales indicate.”292 Ultimately, the scale was
“determined not to possess acceptable validity.”293
A later approach, described in a 1980 report by Martha
Burt, uses five types of measures: “Measures of Security” including the escape rate and escape seriousness, “Measures of Living
and Safety Conditions” such as victimization, overcrowding, and
sanitation, “Measures of Inmate Health” (both physical and mental), “Intermediate Products of Programs and Services” like improvements in basic skills and vocational education completed, and
“Measures of Post-Release Success” including employment success and recidivism.294 Only the fourth category is explicitly labeled “Intermediate Products,”295 but some of the other measures

286
Kevin N. Wright & James Boudouris, An Assessment of the Moos Correctional Institutions
Environment Scale, 19 J. RES. IN CRIME & DELINQ. 255, 255 (1982).
287
Id. (citing sources using the Moos scale in the 1970s).
288
Id. at 257 (quoting RUDOLF H. MOOS, EVALUATING CORRECTIONAL AND COMMUNITY SETTINGS 41 (1975)).
289
Id.
290
Id. at 256; Elaine Selo, Book Review, 4 J. CRIM. JUST. 348, 349 (1976) (reviewing MOOS,
supra note 288).
291
Wright & Boudouris, supra note 286, at 258.
292
Id. at 274.
293
Montgomery, supra note 285, at 193.
294
MARTHA R. BURT, MEASURING PRISON RESULTS: WAYS TO MONITOR AND EVALUATE
CORRECTIONS PERFORMANCE ii (Final Report, Nat’l Inst. of Just., 1980).
295
Id. at 97–105.


are also outputs, not outcomes—see, for instance, the use of hospitalizations and sick days in the measures of inmate health.296
The mixing of output and outcome measures is fairly typical;
John DiIulio criticizes BOP’s Key Indicators/Strategic Support
System297 for also “indiscriminate[ly] mix[ing] . . . process [i.e.,
input or output] and performance [i.e., outcome] measures.”298 But
DiIulio himself has measured prison quality in terms of “order
(rates of individual and collective violence and other forms of misconduct), amenity (availability of clean cells, decent food, etc.),
and service (availability of work opportunities, educational programs, etc.)”:299 note the output measures in the inclusion of the
availability (not the effectiveness) of programming.
The MTC Institute, the research arm of the private prison firm
Management & Training Corp. (MTC), likewise calls for holding
prisons accountable for “outcomes”; but these “outcomes” include
not only assaults, escapes, recidivism, overcrowding, and the like,
but also outputs like “[s]ubstance abuse education/treatment completions” and “[p]roportion of inmates participating in spiritual development program(s).”300
The American Correctional Association’s performance-based
standards for correctional health care301 raise the same issue. Some
of these are true outcomes, like “the rate of positive tuberculin skin
tests”302 or the suicide rate,303 though others are process measures
or expected practices, like whether an offender “is informed about
access to health systems and the grievance procedure.”304 And the
Prison Social Climate Survey, which is based on inmate and staff
surveys, likewise mixes outcomes (such as crowding305 or safety306) with outputs (such as whether the prison is a pleasant place
to work for staff307).

296
Id. at 72.
297
See WILLIAM G. SAYLOR, DEVELOPING A STRATEGIC SUPPORT SYSTEM: MONITORING THE
BUREAU’S PERFORMANCE VIA TRENDS IN KEY INDICATORS (1988).
298
DiIulio, supra note 279, at 150–52.
299
John J. DiIulio, Jr., Recovering the Public Management Variable: Lessons from Schools,
Prisons, and Armies, 49 PUB. ADM. REV. 127, 129 (1989) (citing JOHN J. DIIULIO, JR., GOVERNING
PRISONS: A COMPARATIVE STUDY OF CORRECTIONAL MANAGEMENT (1987)).
300
MTC INST., MEASURING SUCCESS: IMPROVING THE EFFECTIVENESS OF CORRECTIONAL FACILITIES 5 (2006).
301
AM. CORR. ASS’N, PERFORMANCE-BASED STANDARDS FOR CORRECTIONAL HEALTH CARE
IN ADULT CORRECTIONAL INSTITUTIONS (2002). These standards are discussed in GAES ET AL.,
supra note 7, at 37–38.
302
GAES ET AL., supra note 7, at 37.
303
Id. at 38.
304
Id. at 37.
305
Michael W. Ross et al., Measurement of Prison Social Climate: A Comparison of an Inmate
Measure in England and the USA, 10 PUNISH. & SOC. 447, 461 (2008).

It’s clear, then, that outcomes and output measures tend to go
together; no doubt this is because not all outcomes are well measurable. Moreover, the choice of measures, and even the basic question of whether to classify a measure as an output or an outcome,
are inevitably value-laden. We can see this clearly by examining
Charles Logan’s “quality of confinement” index, one of the more
highly regarded prison performance measures.308 Logan’s performance indicators focus on eight broad categories:
1. “Security (‘keep them in’).”
2. “Safety (‘keep them safe’).”
3. “Order (‘keep them in line’).”
4. “Care (‘keep them healthy’).”
5. “Activity (‘keep them busy’).”
6. “Justice (‘do it fairly’).”
7. “Conditions (‘without undue suffering’).”
8. “Management (‘as efficiently as possible’).”309
Each of these categories contains a number of subdimensions:
for instance, the “security” category contains the subdimensions of
security procedures, drug use, significant incidents, community
exposure, freedom of movement, and staffing adequacy.310 The
“safety” category contains safety of inmates, safety of staff, dangerousness of inmates, safety of environment, and (again) staffing
adequacy.311
And, finally, Logan decomposes these subdimensions into specific numerical measures: number of escapes, proportion of staff

306
Id. at 466.
307
WILLIAM G. SAYLOR ET AL., PRISON SOCIAL CLIMATE SURVEY: RELIABILITY AND VALIDITY ANALYSES OF THE WORK ENVIRONMENT CONSTRUCTS 3–8 (1996); see also text accompanying
supra note 85.
308
Charles H. Logan, Criminal Justice Performance Measures for Prisons, in PERFORMANCE
MEASURES FOR THE CRIMINAL JUSTICE SYSTEM, supra note 279, at 19. GAES ET AL., supra note 7,
at xi (calling Logan’s approach “one serious attempt to develop a coherent theoretical and empirical
approach to prison performance measurement”); id. at 5–8 (discussing Logan’s model). Joan Petersilia has also developed performance measures for community corrections. See Joan Petersilia,
Measuring the Performance of Community Corrections, in PERFORMANCE MEASURES FOR THE
CRIMINAL JUSTICE SYSTEM, supra, at 60. But many of these are input measures (“Number and type
of supervision contacts”), output measures (“Number of hours/days performed community service”),
or outcome measures that can be easily gamed (“Number of arrests and technical violation[s] during
supervision”). Id. at 77–78.
309
Logan, supra note 308, at 27–32.
310
Id. at 34.
311
Id.


who have observed staff ignoring inmate misconduct, ratio of resident population to security staff, drug-related incidents, and so
on.312 In all—over all eight dimensions—there are a few hundred
measures.313 Logan used this index to evaluate three women’s
prisons in New Mexico and West Virginia.314
None of Logan’s measures involve how many inmates get rehabilitated. But this is also intentional. First, actual rehabilitation is
out of the direct control of prisons. Logan has a preference for
measuring things that are within prisons’ “direct sphere of influence”;315 what we measure “ought to be achievable and measurable
mostly within the prison itself.”316 Second, including rehabilitation
endorses the rehabilitative model of criminal punishment, and Logan makes it clear that his model is retributive, not rehabilitative.317 Prisons, in his view, shouldn’t “add to (any more than . . .
avoid or . . . compensate for) the pain and suffering inherent in being forcibly separated from civil society[;] . . . coercive confinement carries with it an obligation to meet the basic needs of prisoners
at a reasonable standard of decency.”318
Logan’s concern for focusing on what a prison can control and
his rejection of the rehabilitative goal merge in the following statement: “a prison does not have to justify itself as a tool of rehabilitation or crime control or any other instrumental purpose at which
an army of critics will forever claim it to be a failure.”319 (Of
course “[i]t would be very nice if the prison programs [counted in
the ‘activity’ dimension] had rehabilitative effects,” and perhaps
they do, but whether they do or don’t doesn’t enter into the index.320)
Fair enough. What this illustrates is that you can’t judge particular measures to be desirable unless you have a normative theory
that proclaims certain goals to be desirable, and such a political
discussion is necessary before one can commit oneself to a particular form of performance measures.321 “[W]ithout declared goals,
we cannot hold a jurisdiction accountable, and performance measurement is meaningless.”322

312
Id. at 42–43.
313
Id. at 43–57.
314
LOGAN, supra note 60, at 7–11, 13, 17; Logan, supra note 60, at 577–78, 583 fig.1.
315
Logan, supra note 308, at 24.
316
Id.
317
Id. at 19, 21, 24.
318
Id. at 25.
319
Id. at 26.
320
Id. at 29 n.7.

This normative issue arises wherever performance measurements are used. John DiIulio describes how John Chubb and Terry
Moe “measure school performance strictly in terms of pupils’
achievements on a battery of standardized tests, accepting the
schools’ value as instruments of socialization and civics training as
important but secondary.”323 On the relative value of test scores vs.
socialization, your mileage may vary.
Likewise, for the correctional system, there is a great variety of
available goals;324 prisons should punish, rehabilitate, deter, incapacitate, and reintegrate—all, says John DiIulio, “without violating
the public conscience (humane treatment), jeopardizing the public
law (constitutional rights), emptying the public purse (cost containment), or weakening the tradition of State and local public administration (federalism).”325 So we need to have a political discussion about what the appropriate goals are.
One’s normative theory also affects whether a particular measure is an output or an outcome; this classification,326 which I’ve
been using casually so far as if it were value-neutral, is in fact anything but. If we didn’t care about inmates but only cared about the
outside world, perhaps only recidivism would be relevant. The
quality of living conditions or inmate literacy would merely be
outputs, which we would care about only to the extent that they
affected recidivism; they wouldn’t need to independently enter the
compensation function as long as we already counted recidivism.
But we might independently care about inmates’ living conditions

321
John DiIulio thus seems incorrect when he states that Logan’s work “dispels the worry that
any such measurement scheme is bound to be based exclusively on one or another moral or ideological view of the ‘ends of criminal justice’” and that his measures “encompass and satisfy every major
school of thought about ‘what prisons are for.’” DiIulio, supra note 279, at 152.
322
GAES ET AL., supra note 7, at xii.
323
DiIulio, supra note 279, at 129 (citing John E. Chubb, Why the Current Wave of School Reform Will Fail, PUB. INTEREST, Winter 1988, at 28; JOHN E. CHUBB & TERRY M. MOE, POLITICS,
MARKETS, AND AMERICA’S SCHOOLS (1990)).
324
GAES ET AL., supra note 7, at 10–16 tbl.1.1; see also text accompanying supra note 283.
325
John J. DiIulio, Jr., Rethinking the Criminal Justice System: Toward a New Paradigm, in
PERFORMANCE MEASURES FOR THE CRIMINAL JUSTICE SYSTEM, supra note 279, at 1, 6 (italics
omitted).
326
See text accompanying supra note 17.


for many reasons; if we do, living conditions become an actual
outcome of the system.
Thus, some of Logan’s dimensions, like “activity,” which I’m
inclined to call an output measure,327 might be an outcome measure given Logan’s normative perspective. The same goes for variables like prison employees’ job satisfaction328 (which I consider an
output measure because it’s only instrumentally relevant to prison
quality, but which others who care about labor conditions might
treat differently) or whether inmates have difficulty concentrating329 (which—unlike, say, overcrowding or physical safety330—
many may not consider an appropriate dimension for prison evaluation).
Some of the measures, though, for instance the number of urinalysis tests conducted based on suspicion, are output
measures under any definition, and these have the problem that it’s
ambiguous whether they’re good or bad. Do we want more or fewer urinalysis tests based on suspicion? More tests could mean that
drug use has gone up; or it could mean that prison authorities are
getting more serious about controlling drug use. Even worse, prison authorities’ stringency is something prison authorities themselves can control; this is a serious problem, which I discuss below.331
As a final note, I’ll mention that while it’s vitally important to
have good cost measures that are adequate for comparing public
and private prisons, it’s not necessary to include cost in the private
contractor’s compensation. If we couldn’t measure quality, perhaps
there would be a role for rate-of-return regulation, which might at
least limit some of the private sector’s harmful cost-cutting
tendencies.332 But if we’re going to engage in quality measurement, we might as well enforce quality directly by getting the rewards or penalties “right”;333 let the private firms worry about their
own costs.334

327
See DiIulio, supra note 279, at 152 (distinguishing between certain “process measures” and
certain “performance measures” within Logan’s “security” dimension); see also Gaes, supra note 36,
at 23 (“[J]urisdictions that buy prison services are most concerned about internal performance
measures such as order, health, case management, program services, and safety.”).
328
See text accompanying supra note 307.
329
Ross et al., supra note 305, at 464.
330
See text accompanying supra note 305.
331
See infra Part IV.C.2.
332
Cf. W. KIP VISCUSI ET AL., ECONOMICS OF REGULATION AND ANTITRUST 430–36 (4th ed.
2005) (discussing the theory of traditional rate-of-return regulation, primarily in the context of electric utilities).

IV. CONCERNS AND CRITIQUES
Despite the advantages discussed in the previous section, the
use of performance measures has its pitfalls.
One concern, so obvious as not to merit its own section heading, is the issue of administrative costs. Recidivism-based contracts
require that one track released prisoners adequately; perhaps there
would be substantial startup costs.335 But if performance-based
contracting is beneficial at all, its benefits are probably great
enough that these startup costs are worthwhile.336
This Part focuses on other concerns and critiques. First, there is
the concern that one can’t set the proper prices in a theoretically
defensible way. Second, there’s the concern that performance-based compensation will affect market structure, either by driving
out the public-interested or by driving out the risk-averse. Third,
there’s the concern that performance-based compensation will lead
to undesirable strategic behavior, for instance via manipulation of
the choice of performance goals, by distorting effort across various
dimensions of performance, by distorting effort across various
types of inmate, and by encouraging outright falsification.
A. What Prices to Set
The focus on performance measures might seem grating to
those who criticize the turn toward efficiency analysis and comparative effectiveness and stress moral considerations.337 But one
can support performance measures without endorsing efficiency in

333
See infra Part IV.A.
334
Cf. Shapiro & Steinzor, supra note 218, at 1767 (questioning whether reducing regulatory
cost to the private sector should be a GPRA performance measure for the FDA).
335
See Durham, supra note 20, at 66; id. at 67 (“‘At none of the sites we examined were attempts made by government to evaluate rehabilitative success.’ (quoting COUNCIL OF STATE GOV’TS
& URBAN INST., ISSUES IN CONTRACTING FOR THE PRIVATE OPERATION OF PRISONS AND JAILS 115
(1987))).
336
Cf. Low, supra note 217, at 64. One might also measure a random sample of inmates, see id.
at 46, though this might exacerbate risk issues. See infra Part IV.B.2.
337
Sharon Dolovich critiques “comparative efficiency” analysis and stresses moral considerations, see, e.g., Dolovich, supra note 3; Dolovich, supra note 5, though to my knowledge she hasn’t
opined on performance measures.


any way—in fact, one can support them as a better way of achieving particular moral
goals.
I myself have been critical of a focus on efficiency in the context of regulatory cost-benefit analysis,338 another example of hard-numbers-based accountability. To restate the problems of cost-benefit analysis in the prison context: What’s the social value of
having less recidivism? To ask this in an economic context, we’d
have to know either the maximum amount people would be willing
to pay to reduce crime, or the minimum amount people would accept to acquiesce in an increase in crime. These are in general different amounts, and the choice between them is value-laden.339
Suppose we choose one of these numbers to measure; we may find
that, when surveyed, some people—who reject the very notion of
paying or being paid for reductions or increases in crime—give
answers of zero or infinity for their willingness to pay or accept;
the number we’re seeking may just not exist for these people.340
Some people may have true willingnesses to pay or accept, but
they don’t even know what they are: we only come to know such
numbers because of our experience paying for and consuming
goods and services in the real world, but increases and decreases in
crime generally aren’t traded in markets. So the very act of asking
for the number may bring some number into being, but there’s no
reason to suppose it’s accurate.341 Or, people may know the number, but there’s no incentive for them to truthfully reveal it in surveys.
Even if we use non-survey-based estimation methods—how
much higher are house prices in lower-crime areas? how much do
people pay to avoid crime?—econometric analysis isn’t good
enough to give us the correct number.342 The political process is
also likely to manipulate the numbers.343 Moreover, concerns that
are hard to quantify can be systematically slighted.344
338

See Alexander Volokh, Rationality or Rationalism? The Positive and Normative Flaws of
Cost-Benefit Analysis, 48 HOUS. L. REV. 79 (2011).
339
See id. at 82–83.
340
See id. at 84.
341
See id. at 85–86.
342
See id. at 86–88.
343
See, e.g., Frank Ackerman & Lisa Heinzerling, Pricing the Priceless: Cost-Benefit Analysis
of Environmental Protection, 150 U. PA. L. REV. 1553, 1580 (2002) (regulated industry has incentive to overstate costs).
344
See id. at 1579–80. This gives rise to potentially serious strategic behavior, which I address
in infra Part IV.C.2.


In short, “[w]hile cost-benefit analysis may look like rationality, perhaps it’s merely rationalism.”345 And these are just the problems for people who accept the utilitarian basis of cost-benefit
analysis. The problems for those who reject utilitarianism as a
moral philosophy are even greater.346 Surely corrections policy, of
all things, should be decided with respect to morality and human
values rather than numbers?
These are real problems with cost-benefit analysis, and they
potentially infect performance-based contracting as well. Setting
the incentives in a performance-based contract means either setting
the relative weights of every component of performance,347 or
(equivalently) setting the separate rewards or penalties for every
component of performance.348 Getting the prices “right,” in an efficiency sense, requires knowing the social value of the different
components of performance;349 if that social value doesn’t exist or
can’t be measured, it’s an impossible task.
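
The claim that setting relative weights and setting separate rewards are equivalent can be checked numerically. The following is a minimal sketch in Python; the component names, the prices, and the performance scores are all hypothetical values chosen for illustration, and the function names are my own.

```python
# Hypothetical per-unit rewards p_i (dollars) for three illustrative components.
prices = {"safety": 30_000.0, "health": 20_000.0, "recidivism_reduction": 50_000.0}

# Hypothetical measured performance x_i on each component (arbitrary units).
performance = {"safety": 0.8, "health": 0.6, "recidivism_reduction": 0.3}


def pay_from_prices(prices, performance):
    """Total performance pay as the sum of p_i * x_i."""
    return sum(prices[k] * performance[k] for k in prices)


def pay_from_index(prices, performance):
    """Same pay, written as P * sum(w_i * x_i) with w_i = p_i / P."""
    total_price = sum(prices.values())                          # P
    weights = {k: p / total_price for k, p in prices.items()}   # w_i, summing to 1
    index = sum(weights[k] * performance[k] for k in weights)   # overall index
    return total_price * index


direct = pay_from_prices(prices, performance)
indexed = pay_from_index(prices, performance)
print(direct, indexed)
```

The two figures agree up to floating-point rounding, which is the sense in which choosing weights and an overall price is the same decision as choosing separate rewards. The sketch says nothing about how to pick the numbers themselves, which is the hard part.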
I agree and disagree with this critique.
As to the moral objection: even though moral values have an
extremely important place in criminal law and policy, I have no
essential problem with using economic incentives to improve outcomes in the process. I’ve argued elsewhere that the valid arguments for or against private prisons generally are essentially empirical;350 measuring performance is an essential part of that debate,
even though the choice of outcomes to measure is a value-laden
enterprise;351 and attaching incentives to those performance
measures is eminently justifiable if the result is a morally more just
correctional system.
345
Volokh, supra note 338, at 88.
346
See id. at 88–91.
347
See supra note 283 and accompanying text.
348
These two approaches are identical. Let $x_i$ be the $i$th component of performance and $p_i$ be
the reward for that component. Then the total performance-based component of compensation is
$\sum_i p_i x_i$. Let $P$ be the sum of the prices ($P = \sum_i p_i$). Then the performance-based component of compensation can be expressed as $P \sum_i (p_i/P)\, x_i = P \sum_i w_i x_i$, where $w_i = p_i/P$ is the weight placed on the $i$th component of performance and $P$ is the price attached to the overall performance index $\sum_i w_i x_i$.
349
Not that the price necessarily has to be equal to the social value—paying the price requires
incurring the deadweight losses involved in raising tax money, and making incentives so high-powered might make the contract too risky. See infra Part IV.B.2 for a discussion of optimal risk
allocation. But at least the optimal prices (or at least the relative optimal prices of the different components of performance), from an efficiency perspective, will probably bear some relation to social
value.
350
See generally Volokh, supra note 6.
351
See text accompanying supra notes 321–324.


As to the theoretical incoherence objection, I’m sympathetic.
But the enterprise can still be salvaged if we adopt a humble attitude.352 Rather than trying to achieve incentives that are correct in
some abstract sense,353 we can just try to muddle through and ameliorate the problems of the current system by attaching some
weight to factors that traditionally haven’t been rewarded. None of
this requires buying into the efficiency norm.354 Maybe the weights
will be wrong, but “[t]he basic question . . . is whether the dangers
of providing improper incentives through imperfect models outweigh the benefits of providing program direction and accountability.”355 Is adding this element of imperfect, numbers-based accountability better than not? The remaining sections in this Part
address this question.
B. Effects on Market Structure
This section discusses how performance-based compensation
can change the composition of providers. First, it will attract providers who respond better to market incentives, which might affect
the overall public-interestedness of the industry. Second, because
performance-based compensation is riskier than flat-rate compensation, it will discourage the more risk-averse providers.
1. Public-Interestedness
Todd Henderson and Fred Tung address this concern in the
context of performance-based compensation for regulators. If regulators are currently public-interested, introducing market incentives
might change the culture within the agency. “Once diligence has
been priced, perhaps some regulators will slack.”356
352
Cf. Christopher C. DeMuth & Douglas H. Ginsburg, Rationalism in Regulation, 108 MICH.
L. REV. 877, 885 (2010) (Richard Revesz and Michael Livermore “regard regulatory cost-benefit
analysis as a device for social engineering. . . . Our view of cost-benefit analysis is much more modest. . . . [W]e think that many important political questions . . . cannot be effectively decided by cost-benefit analysis.”).
353
See DiIulio, supra note 279, at 146.
354
Barnow, supra note 224, at 279.
355
Barnow, supra note 224, at 307; see also Henderson & Tung, infra note 356, at 36 (“We
make no attempt to offer firm prescriptions for the optimal ratio [between debt and equity]. The mix
should induce regulators to care about bank profits but not at the expense of risk shifting to creditors.”).
356
M. Todd Henderson & Frederick Tung, Pay for Regulator Performance, 85 S. CAL. L. REV.
1003, 1056–57 (2012).


This form of compensation will also affect the mix of people
who choose to be regulators. “Public service motives might be displaced by financial motivations among new hires . . . . Eventually,
the composition of the regulatory agency could change for the
worse.”357
Henderson and Tung conclude, citing the crowding out literature,358 that this is possible, though not necessary: “public spiritedness and financial reward [might not be] mutually exclusive, up to
a point.”359 Moreover, changing the mix of individuals “could be a
good,” given the failures of the current crop of people.360
The same arguments can be applied to performance-based
compensation for prison providers. I would add that, to the extent
we’re considering performance-based compensation for private
firms rather than public servants,361 we don’t need to worry about
making providers any more mercenary than they already are: if
there’s one thing advocates and opponents of private prisons agree
on, it’s that private prison providers are a profit-oriented bunch.
Not that the profit motive is inconsistent with public-interestedness: public servants “profit” from their employment too
without being accused of thereby necessarily becoming mercenaries;362 moreover, corrections professionals move between the public and private sectors and presumably take their professionalism
with them. Finally, as I discuss further below,363 performance-based compensation, combined with social impact bonds, allows
nonprofits to raise money from private investors, so to this extent,
introducing the profit motive may turn out to be a great boon for
charitable and public-interested providers.

357
Id. at 1057.
358
Id. at 1057 n.182 (citing Ernst Fehr & Armin Falk, Psychological Foundations of Incentives, 46 EUR. ECON. REV. 687, 688 (2002); Uri Gneezy & Aldo Rustichini, A Fine Is a Price, 29 J.
LEGAL STUD. 1, 14 (2000)).
359
Id. at 1057.
360
Id.
361
But see supra Part III.C.3 (discussing possibilities for merit pay for public prison wardens).
362
See Volokh, supra note 6, at 178–85.
363
See infra Part IV.B.2.


2. Risk and Capital Requirements
a. The Risk Is in the Slope
We’ve seen, in the discussion of Charles Logan’s approach
above,364 the concern that performance measures be based on factors that the relevant actor can actually control. Such concerns crop
up frequently;365 James Q. Wilson even says, in the context of police departments, that public order and safety aren’t “‘real’
measures of overall success” because whatever about them is
measurable “can only partially, if at all, be affected by police behavior.”366 When he does favor a “micro-level measure of success”
of whether the neighborhood is becoming safer and more orderly,367 he still limits it to cases where the level of danger and disorder is “amenable . . . to improvement by a given, feasible level of
police and public action.”368 The concern in the literature over controlling for baselines is similarly motivated.369
This seems mistaken: overall public order and safety are
measures of the success of police departments, and (given that
prison programs and conditions affect recidivism to some extent370) lower recidivism is a measure of the success of prisons.371

364
See text accompanying supra notes 327–316.
365
See, e.g., Petersilia, supra note 308, at 66; DICKER, supra note 119, at 17; GRIZZLE ET AL.,
supra note 235, at 48–49.
366
James Q. Wilson, The Problem of Defining Agency Success, in PERFORMANCE MEASURES
FOR THE CRIMINAL JUSTICE SYSTEM, supra note 279, at 156, 159; see also DiIulio, supra note 325,
at 1–2, 13.
367
Id. at 160–62.
368
Id. at 161.
369
See text accompanying supra notes 234–243.
370
See Camp et al., supra note 96; DIIULIO, supra note 365, at 106–45; DiIulio, supra note
325, at 2; M. Keith Chen & Jesse M. Shapiro, Do Harsher Prison Conditions Reduce Recidivism? A
Discontinuity-Based Approach, 9 AM. L. & ECON. REV. 1, 17–21 (2007); Francesco Drago et al.,
Prison Conditions and Recidivism, 13 AM. L. & ECON. REV. 103, 120–25 (2011); Daniel S. Nagin et
al., Imprisonment and Reoffending, 38 CRIME & JUST. 115, 115 (2009); Rafael Di Tella & Ernesto
Schargrodsky, Criminal Recidivism After Prison and Electronic Monitoring 28 (Nat'l Bureau of
Econ. Research, Working Paper No. 15602, 2009, rev. 2010). See also GAES ET AL., supra note 7, at
124 (citing S.D. Bushway et al., An Empirical Framework for Studying Desistance, 39 CRIMINOLOGY 491 (2001); J. Grogger, The Effect of Arrest on the Employment and Earnings of Young Men, 110
Q.J. ECON. 51 (1995); J. Kling, The Effect of Prison Sentence Length on the Subsequent Employment
and Earnings of a Criminal Defendant (Woodrow Wilson Sch. Econ. Disc. Paper 208, 1999); R.J.
SAMPSON & J.H. LAUB, CRIME IN THE MAKING: PATHWAYS AND TURNING POINTS THROUGH LIFE
(1993)); id. at 129 (citing A.R. Piquero et al., Assessing the Impact of Exposure Time and Incapacitation on Longitudinal Trajectories of Criminal Offending, 16 J. ADOL. RES. 54 (2001)); id. at 136
(citing G.G. Gaes & N. Kendig, The Skill Sets and Health Care Needs of Releasing Offenders, paper
presented at the National Policy Conference, From Prison to Home: The Effect of Incarceration and
Reentry on Children, Families, and Communities, Jan. 30–31, 2002).


It’s true that these measures come with a lot of noise attached—
that is, with a lot of omitted variables reflecting the contribution of
other people’s efforts, as well as environmental variables.372 But
that doesn’t mean it’s wrong to use them for purposes of accountability, or even to tie compensation to them.
There are two concerns about using these noisy measures: first,
that the level of the unobserved variables at the beginning of the
contract might establish a high-recidivism baseline, for which the
contractor will have to be compensated very highly, or a low-recidivism baseline, for which the contractor will collect more than
it deserves; and second, that variation in the unobserved variables
might create a lot of risk for the contractor.373
As to the first concern, recall the earlier discussion about
whether to control for baselines.374 Whether or not we adjust the
contract price to take into account the baseline expected level of
performance should have little effect on government expenditures:
a high baseline translates into less quality being attributed to the
contractor and thus to lower payments, and so the contractor will
demand more money at the bidding stage, and vice versa.
The same reasoning addresses the second concern: because
controlling for baselines doesn’t affect the contractor’s payout—it
basically amounts to adding or subtracting a constant, which is
subtracted or added right back at the bidding stage—it also doesn’t
necessarily affect risk.375
What definitely affects risk is not the level of compensation,
but its slope. A contract that compensates the contractor based on
the portion of performance he was able to control isn’t necessarily
less risky than one that doesn’t, but a contract where the per-quality-unit price is lower is less risky. Thus, in the numerical example discussed earlier,376 a contract with a $1 reward per quality
unit (regardless of the fixed component of the contract) is riskier

371
DiIulio, supra note 325, at 5 (“[C]rime rates and recidivism rates are indeed important[,
though not the only,] measures of the system’s performance, which ought to be continually used and
refined.”).
372
Barnow, supra note 224, at 281 (these are “gross outcome measures . . . in the sense that
they do not necessarily reflect gains from the program”).
373
HARDING, supra note 70, at 68 (“[T]he human variables are too volatile for any contractor
to be expected to stand or fall by outputs alone . . . .”); Kyle, supra note 158, at 2112; Lynn, supra
note 161, at 12.
374
See supra Part III.C.2.c.
375
See text accompanying supra note 248.
376
See text accompanying supra notes 228–232, 244.


than a contract with a $0.50 reward per quality unit; an even less
risky contract is one with a $0 reward per quality unit, that is, a
fixed-price contract, which is close to the norm; and the least risky
possible contract is the cost-plus contract typical of rate-of-return
regulation.377 Compensation based on a continuous quality measure is less risky than compensation based on a discrete quality
measure (as long as the provider has some chance of being on either side of the cutoff);378 thus, “$1 for each quality unit” is less
risky than “$5 but only if you get five quality units.”
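
The point that the risk is in the slope can be checked with a small numerical sketch. The Python below is illustrative only: the five equally likely quality outcomes, the $10 fixed fee, and the function names are assumptions, not figures from any actual contract. It measures risk as the standard deviation of the contractor's payout, and it leaves cost-plus contracts out because it models no costs.

```python
from statistics import pstdev

# Hypothetical, equally likely quality outcomes (in quality units).
QUALITY_OUTCOMES = [3, 4, 5, 6, 7]


def linear_payout(quality, fixed, slope):
    """Fixed fee plus a per-quality-unit reward."""
    return fixed + slope * quality


def threshold_payout(quality, fixed, bonus, cutoff):
    """Fixed fee plus a lump-sum bonus paid only at or above the cutoff."""
    return fixed + (bonus if quality >= cutoff else 0)


def linear_risk(fixed, slope):
    """Standard deviation of the payout under a continuous (linear) contract."""
    return pstdev([linear_payout(q, fixed, slope) for q in QUALITY_OUTCOMES])


def threshold_risk(fixed, bonus, cutoff):
    """Standard deviation of the payout under an all-or-nothing bonus."""
    return pstdev(
        [threshold_payout(q, fixed, bonus, cutoff) for q in QUALITY_OUTCOMES]
    )


if __name__ == "__main__":
    for slope in (1.0, 0.5, 0.0):
        print(f"slope ${slope:.2f}: risk = {linear_risk(fixed=10, slope=slope):.3f}")
    print(f"$5 only at five units: risk = {threshold_risk(10, 5, 5):.3f}")
```

Changing `fixed` leaves every risk figure unchanged, and halving `slope` halves the risk; under these assumed outcomes, the all-or-nothing bonus is riskier than the $1-per-unit contract.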
Do we care? Perhaps large corporations like CCA or The GEO
Group, which are publicly traded379 and diversified across many
contracts,380 can handle the risk; and they cover three-quarters of
the industry.381 Smaller, privately held companies like MTC382
may be more sensitive to risk. Various potential entrants, especially nonprofits,383 must be even more sensitive. Adopting high-powered (i.e., high-slope) contracts may scare away the most risk-sensitive potential bidders, leaving the field to a few large corporations. (And it isn’t just a matter of risk: if the fixed part of the contract is paid up front while the reward is paid later, possibly a few
years later once recidivism statistics come in, this might disadvantage small companies or nonprofits with limited access to capital markets.384) This has potential implications for the competitiveness of the industry,385 possibilities for innovation,386 and the political influence that drives changes in criminal law.387

377
See text accompanying supra note 332.
378
See text accompanying supra notes 251–263.
379
See CCA, About CCA, http://www.cca.com/about/ (CCA joined the NYSE in 1994); The
GEO Group, Inc., Historic Milestones, http://www.geogroup.com/history (GEO joined the NYSE in
1996).
380
See CCA, supra note 379 (“CCA houses more than 80,000 inmates in more than 60 facilities . . . . CCA currently partners with all three federal corrections agencies . . . , 16 states, more than
a dozen local municipalities, and Puerto Rico and the U.S. Virgin Islands.”); The GEO Group, Inc.,
Who We Are, http://www.geogroup.com/about_us (“GEO's operations include the management
and/or ownership of 95 correctional, detention and residential treatment facilities encompassing
approximately 72,000 beds.”).
381
See Volokh, supra note 9, at 1237 & n.182 (data from 1999).
382
See Management & Training Corp., Overview, http://www.mtctrains.com/about-mtc/overview
(“Management & Training Corporation (MTC) is a privately-held company”); Volokh,
supra note 9, at 1237 & n.182 (5–8% share for MTC in 1999).
383
For discussions of the possibility of nonprofit prisons, see Low, supra note 217; Richard
Moran, A Third Option: Nonprofit Prisons, N.Y. TIMES, Aug. 23, 1997, at 23. Compare with discussions of the advantages of nonprofit schools: see Byron W. Brown, Why Governments Run Schools,
11 ECON. EDUC. REV. 287, 293–96 (1992); John Morley, Note, For-Profit and Nonprofit Charter
Schools: An Agency Approach, 115 YALE L.J. 1782 (2006). Cf. also Education: Raising the Bar,
ECONOMIST, June 15, 2013, at 30 (discussing risk issues for schools and teachers resulting from
educational accountability schemes).

But the contract doesn’t have to be especially high-stakes.388
The optimal level of risk transfer is probably less than 100%. Rewarding the contractor for increases in quality with a price equal to
the social value of quality gives the contractor great incentives but
also (since the per-unit reward will be high) subjects him to high
risk.389 Flat-fee contracts are relatively low-risk390 but also low-incentive. Some moderate level of risk transfer will optimally balance incentives with risk.391 Thus, the incentive-based portion of
the contract is only 10% of the contract price in the UK’s Doncaster
prison,392 and was only 5% in the federal Bureau of Prisons’ Taft
demonstration project.393 Recall that in Britain’s Job Deal program,
30% of the payment is conditional, and only a third of that is related to “hard outcomes,” and even some of those outcomes are
slightly “soft.”394
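
One textbook way to see why the optimal risk transfer is less than 100% is the linear-contract model with a risk-averse agent. The sketch below isn't a model used in this Article; it assumes a contractor with constant absolute risk aversion `risk_aversion`, a quadratic effort cost with parameter `effort_cost`, and normally distributed noise with variance `noise_variance` in the quality measure. Under those assumptions, the surplus-maximizing per-unit reward equals the social value of quality divided by (1 + risk_aversion × effort_cost × noise_variance). All names and parameter values are hypothetical.

```python
def optimal_slope(social_value, risk_aversion, effort_cost, noise_variance):
    """Per-unit reward that balances incentives against risk.

    Assumed model (hypothetical, for illustration): measured quality is
    effort plus noise, effort costs effort_cost * effort**2 / 2, and the
    contractor's risk premium is risk_aversion * slope**2 * noise_variance / 2.
    """
    if risk_aversion < 0 or noise_variance < 0 or effort_cost <= 0:
        raise ValueError("risk_aversion and noise_variance must be >= 0, effort_cost > 0")
    return social_value / (1 + risk_aversion * effort_cost * noise_variance)


def risk_transfer_share(risk_aversion, effort_cost, noise_variance):
    """Optimal slope expressed as a fraction of the social value of quality."""
    return optimal_slope(1.0, risk_aversion, effort_cost, noise_variance)


if __name__ == "__main__":
    # A risk-neutral contractor bears the full social value of quality ...
    print(risk_transfer_share(risk_aversion=0.0, effort_cost=1.0, noise_variance=4.5))
    # ... while a risk-averse one facing a noisy measure bears much less.
    print(risk_transfer_share(risk_aversion=2.0, effort_cost=1.0, noise_variance=4.5))
```

In this stylized model, full risk transfer is optimal only when the contractor is risk-neutral or the measure is noiseless; more noise or more risk aversion pushes the optimal incentive share down.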
For the cash-flow issue noted above,395 one can also “[c]hange
the timing of payments to providers,” for instance by making “a
payment every six months for each offender who has not been reconvicted.”396

384
NICHOLSON, supra note 211, at 6 (“The working capital requirements of a [payment-by-results] system will cause problems for Small and Medium Size Enterprises and the third sector [i.e.,
nonprofits] in bidding for contracts.”).
385
DICKER, supra note 119, at 24 (high incentives, through high risk, will “reduce the diversity
of the market” by making it less attractive for nonprofits or small companies).
386
Id. at 23. On the relationship between market concentration and innovation, see Richard
Gilbert, Looking for Mr. Schumpeter: Where Are We in the Competition-Innovation Debate, in 6
INNOVATION POLICY AND THE ECONOMY 159 (Adam B. Jaffe et al. eds., 2006) (relationship is inconclusive).
387
See Volokh, supra note 9, at 1213–14 (arguing that the degree of concentration of the industry
can affect the political influence the industry exerts); Volokh, Privatization, Free-Riding, supra note
138, at 64 (same); Volokh, The Effect of Privatization, supra note 138, at 10–11 (same).
388
DICKER, supra note 119, at 6.
389
See supra note 349.
390
Though not zero-risk: recall that the least risky contracts are cost-plus. See text accompanying supra note 377.
391
DICKER, supra note 119, at 23–24; NICHOLSON, supra note 211, at 6–7. See generally BOLTON & DEWATRIPONT, supra note 249, at 13 (“[W]hen both employer and employee are risk averse,
they will optimally share business risk.”).
392
Johnson, supra note 204.
393
See text accompanying supra notes 167–168.
394
See text accompanying supra note 213.
395
See text accompanying supra note 384.
396
DICKER, supra note 119, at 24.


b. Financing Nonprofits: Social Impact Bonds
The need to encourage the nonprofit sector calls for innovative
funding mechanisms. Nonprofit prisons have been suggested397
though never implemented.398 But in light of the widespread concern that private prison firms will cut quality to save money,399 the
nonprofit form seems like an obvious alternative.
Ed Glaeser and Andrei Shleifer discuss the value of nonprofit
status: by weakening the provider’s incentives to maximize profits,
nonprofit status can be a valuable signal of quality when quality
itself is non-verifiable. (Even using performance measures, it’s
reasonable to suppose that some aspects of quality will remain
non-verifiable; the value of nonprofit status depends on how important these remaining non-verifiable components are.400) Moreover, altruistic entrepreneurs will tend to be attracted to the nonprofit form.401
And Timothy Besley and Maitreesh Ghatak show that, when
both a provider and the government can make productive investments in a project, and when the provider is altruistic, then the
provider should own the project if it values it more than the government does.402 Privatization can thus be more beneficial in the
presence of altruistic providers.
But banks or private equity houses are unlikely to finance such
nonprofits, especially when the nonprofits don’t have much of a
track record.403
Social Impact Bonds have been proposed as a funding mechanism for nonprofits. Rather than contracting directly with a provider, the government contracts with a middleman. This middleman, a
“social impact bond-issuing organization,”404 has two functions.

397
See sources cited supra note 383.
398
See Low, supra note 217, at 5 (suggesting creation of nonprofit prisons on “an experimental basis”).
399
See, e.g., Dolovich, supra note 5, at 474–80.
400
See infra Part IV.C.2.
401
Edward L. Glaeser & Andrei Shleifer, Not-for-Profit Entrepreneurs, 81 J. PUB. ECON. 99
(2001).
402
Timothy Besley & Maitreesh Ghatak, Government Versus Private Ownership of Public
Goods, 116 Q.J. ECON. 1343, 1347 (2001).
403
NICHOLSON, supra note 211, at 6–7.
404
JEFFREY B. LIEBMAN, SOCIAL IMPACT BONDS: A PROMISING NEW FINANCING MODEL TO
ACCELERATE SOCIAL INNOVATION AND IMPROVE GOVERNMENT PERFORMANCE 2 (Ctr. for Am.
Prog., 2011).


First, it hires the staff to provide the service. Second, it sells bonds
to investors, particularly philanthropic ones;405 these bonds are essentially claims to a portion of the performance-based compensation. If the service provider fulfills the performance-based goals
and receives its reward from the government, the investors make
money; otherwise they don’t. At the Peterborough prison in the
UK, the government doesn’t pay anything unless recidivism is
7.5% less than in a comparison group,406 and payments are capped
when the difference reaches 13%;407 at Doncaster, payments don’t
start until the difference is 5%.408 The provider’s employees may
well be paid something like a flat wage, so their monetary incentives aren’t great; but the bond-issuing organization and the philanthropic investors (whose money is on the line) are probably better
at monitoring the staff than the government would be. It remains to
be seen, though, whether the philanthropic sector will provide
enough funds for nonprofit prison providers to be a viable alternative to for-profit corporations.409
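
The threshold-and-cap structure of these payments can be written as a simple payout rule. The sketch below is hypothetical: the text gives only the 7.5% threshold and 13% cap for Peterborough and the 5% threshold for Doncaster, so the linear schedule above the threshold, the `rate_per_point` parameter, and all function names are assumptions made purely for illustration.

```python
def outcome_payment(reduction_pct, threshold_pct, cap_pct=None, rate_per_point=1.0):
    """Payment owed for a measured recidivism reduction versus a comparison group.

    Pays nothing below the threshold. At or above it, pays an ASSUMED flat
    rate per point of reduction, with the rewarded reduction capped at
    cap_pct when a cap is given.
    """
    if reduction_pct < threshold_pct:
        return 0.0
    rewarded = reduction_pct if cap_pct is None else min(reduction_pct, cap_pct)
    return rate_per_point * rewarded


def peterborough_payment(reduction_pct, rate_per_point=1.0):
    """Threshold of 7.5% and cap of 13%, as described for Peterborough."""
    return outcome_payment(reduction_pct, threshold_pct=7.5, cap_pct=13.0,
                           rate_per_point=rate_per_point)


def doncaster_payment(reduction_pct, rate_per_point=1.0):
    """Threshold of 5%, as described for Doncaster; no cap is assumed here."""
    return outcome_payment(reduction_pct, threshold_pct=5.0,
                           rate_per_point=rate_per_point)
```

A rule like this jumps at the threshold, so it carries the same cutoff risk as any compensation based on a discrete quality measure.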
C. Undesirable Strategic Behavior
Perhaps the biggest disadvantage of using performance-based
compensation is the strategic behavior it may spawn. This strategic
behavior may come in several flavors. First, there is the possibility
of manipulating the performance goals themselves. Second, effort
may be distorted away from some dimensions and toward others.
Third, effort may be distorted away from some groups of inmates
and toward others. And fourth, performance measures may simply
be falsified.

405

Though social impact bonds in the U.S. have been funded by non-philanthropic types such
as Goldman Sachs. Social Impact Bonds: Being Good Pays, ECONOMIST, Aug. 18, 2012, at 28.
406
LIEBMAN, supra note 404, at 2.
407
See text accompanying supra note 208.
408
See supra note 208.
409
NICHOLSON, supra note 211, at 16, 18; Social Market Found., Big Hurdles to Be Overcome
if Social Impact Bonds to Move from Margins of Public Services, Says Think Tank (July 31, 2013);
http://www.smf.co.uk/media/news/big-hurdles-be-overcome-if-social-impact-bonds-move-marginspublic-services-says-think-tank/; Tom Clougherty, Pioneering Social Impact Bonds in the United
Kingdom, REASON FOUND. (Aug. 13, 2013), http://reason.org/news/show/pioneering-social-impactbonds.


1. Manipulating the Goals
The Government Performance and Results Act of 1993410 is
one example of a recent effort to inject performance measures into
government agencies that hasn’t lived up to the hopes of its supporters.
One of the problems was that setting the performance goals
was left to the agencies that were to be evaluated. Agencies “tr[ied]
to protect themselves by devising euphemistic performance goals
in order to ensure that they [could] ‘pass’ their own grading criteria.”411 The Patent and Trademark Office, faced with rising backlogs, set itself progressively longer targets of “average total pendency” from year to year, rising from 27.7 months in fiscal year
2003 to 29.8 months in 2004, 31.0 months in 2005, and 31.3
months in 2006.412 (John DiIulio had warned of a similar danger:
“that measurement-driven government workers will, so to speak,
‘set up the target in order to facilitate shooting.’”413) A similar
problem was observed in the UK, where “Next Steps agencies,” a
type of performance-based organization, set their own targets,
which often reflected merely an incremental improvement rather
than an assessment of what was possible.414
Why would agencies set goals in such unambitious ways? Perhaps because agencies feared being punished for bad performance
with budget cuts.415 Various politicians have indeed suggested that
agencies’ funding be tied to their performance results,416 and agencies’ performance results have indeed been relevant to the admin-

410
See supra note 160.
411
Shapiro & Steinzor, supra note 218, at 1744; see also id. at 1760 (“[A]gencies compelled to
function in an antiregulatory, even hostile, political atmosphere are predictably reluctant to tell the
truth to power. Instead, their goal has become convincing congressional and White House overseers
that they are performing well despite budgets that are inadequate for effective implementation of
their missions.”).
412
Schoen, supra note 160, at 480.
413
DiIulio, supra note 279, at 154.
414
GEN. ACCOUNTING OFFICE, supra note 218, at 7.
415
Shapiro & Steinzor, supra note 218, at 1744.
416
Schoen, supra note 160, at 464 (citing The Results Act: Are We Getting Results?: Hearings
Before the H. Comm. On Gov’t Reform, 105th Cong. 42 (1997), at 20 (statement of Rep. Dick
Armey, H. Majority Leader)); id. at 465 (citing Seven Years of GPRA: Has the Results Act Provided
Results?: Hearing Before the Subcomm. On Gov’t Mgmt., Info., and Tech. of the H. Comm. on Gov’t
Reform, 105th Cong. 21 (2000) (statement of Rep. Pete Sessions, Chairman, Results Caucus)); id. at
466 (citing OMB, THE PRESIDENT’S MANAGEMENT AGENDA 29 (2002), available at
http://www.whitehouse.gov/omb/budget/fy2002/mgmt.pdf).


istration’s budget proposals,417 so this fear may have been reasonable—though it’s also possible that performance scores have merely
given political cover for cuts to programs that the administration
wanted to defund for other reasons.418 On the other hand, the link
between funding and performance results isn’t that tight,419 so
agencies’ concern to look good may also have been a matter of
good public relations.
The problem here is that agencies were allowed to think up
their own performance goals; that they weren’t required to meet
those goals (and indeed, that often the performance information
simply wasn’t used in decisionmaking420); and that the goals were
binary rather than continuous outcome measures,421 e.g. that the
EPA “will achieve and maintain at least 95 percent of the maximum score on readiness evaluation criteria in each region”422 or
“complete an additional 975 Superfund-lead hazardous substance
removal actions.”423
These problems have easy fixes, though perhaps they weren’t
so easy in the context of the GPRA, where the problem was primarily giving performance incentives to public agencies. Prison
417
EILEEN C. NORCROSS & KYLE MCKENZIE, MERCATUS CENTER, GEORGE MASON UNIV., AN
ANALYSIS OF THE OFFICE OF MANAGEMENT AND BUDGET’S PROGRAM ASSESSMENT RATING TOOL
(PART) FOR FISCAL YEAR 2007, at 22 (May 2006); Eileen Norcross & Joseph Adamson, An Analysis of the Office of Management and Budget’s Program Assessment Rating Tool (PART)
for Fiscal Year 2008, working paper, Mercatus Center, at 25 (2007).
418
John B. Gilmour & David E. Lewis, Does Performance Budgeting Work? An Examination
of the Office of Management & Budget’s PART Scores, 66 PUB. ADMIN. REV. 742, 751 (2006);
Norcross & Adamson, supra note 417, at 29–30.
419
See, e.g., Jerry Ellig, Has GPRA Increased the Availability and Use of Performance Information?, Mercatus Ctr. Working Paper No. 09-03, Mar. 2009, at 5; Teresa Curristine, Reforming the
U.S. Department of Transportation: Challenges and Opportunities of the Government Performance
and Results Act for Federal-State Relations, 32 PUBLIUS 25, 42 (2002).
420
Ellig, supra note 419, at 1 (citing Jerry Brito & Jerry Ellig, Toward a More Perfect Union:
Regulatory Analysis and Performance Management, forthcoming FLA. ST. U. BUS. REV.); id. at 2
(citing GAO, GOVERNMENT PERFORMANCE: LESSONS LEARNED FOR THE NEXT ADMINISTRATION
ON USING PERFORMANCE INFORMATION TO IMPROVE RESULTS (2008)); Schoen, supra note 160, at
466 (citing 10 Years of GPRA—Results, Demonstrated: Hearings Before the Subcomm. On Gov’t
Efficiency and Fin. Mgmt. of the H. Comm. on Gov’t Reform, 108th Cong. 4 (2004) (statement of
Rep. Edolphus Towns, Member, Subcomm. on Gov’t Efficiency and Fin. Mgmt. of the H. Comm.
on Gov’t Reform)).
421
See supra Part III.C.2.d.
422
Shapiro & Steinzor, supra note 218, at 1764 (quoting EPA, 2006–2011 EPA STRATEGIC
PLAN: CHARTING OUR COURSE 67 (2006), available at http://www.epa.gov/ocfo/plan/2006/entire_
report.pdf).
423
Id. at 1765 (quoting 2006–2011 EPA STRATEGIC PLAN, supra note 422, at 67); see also id.
at 1773 (“[A]ttain water quality standards for all pollutants and impairments in more than 2,250
water bodies . . . . [R]emove at least 5,600 . . . specific causes of water body impairment . . . .
[I]mprove water quality conditions in 250 . . . impaired watersheds nationwide . . . .”) (quoting
2006–2011 EPA STRATEGIC PLAN, supra note 422, at 67).


contracts—or merit pay systems for public prison wardens424—
should be set by the Department of Corrections or the relevant contracting authority; goals shouldn’t be set by those who we want to
comply with them. No one should be “required” to meet any performance standard, but compensation should be tied to these
measures; providers’ self-interest should take care of the rest. And
adopting continuous outcome measures, rather than binary goals,
reduces the ability to choose easy goals: one can game “achieve
x% recidivism” by setting an appropriately high level of x, but it’s
harder to game the general effort of reducing recidivism where additional reductions are met with additional rewards.425
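
The difference between the two kinds of goals can be made concrete. In the sketch below, the targets are the PTO pendency targets quoted above, but the constant 31.0-month actual pendency, the reference points, and the $1-per-month price are hypothetical numbers chosen only for illustration, as are the function names.

```python
# Self-set "average total pendency" targets in months, by fiscal year.
PTO_TARGETS = {2003: 27.7, 2004: 29.8, 2005: 31.0, 2006: 31.3}


def meets_binary_goal(actual_months, target_months):
    """Pass/fail grading against a target; lower pendency is better."""
    return actual_months <= target_months


def continuous_reward(actual_months, reference_months, price_per_month):
    """Reward that rises with every month of pendency saved.

    The reference point only shifts the level of the reward; moving it does
    not change the payoff from each additional month of improvement.
    """
    return price_per_month * (reference_months - actual_months)


if __name__ == "__main__":
    actual = 31.0  # hypothetical: performance never changes
    for year, target in PTO_TARGETS.items():
        print(year, "passes" if meets_binary_goal(actual, target) else "fails")
    # The gain from cutting one month is the same whatever the reference point.
    for reference in (27.7, 31.3):
        gain = continuous_reward(30.0, reference, 1.0) - continuous_reward(31.0, reference, 1.0)
        print(reference, round(gain, 6))
```

With unchanged performance, loosening the target is enough to turn a failing grade into a passing one, while the marginal reward under the continuous scheme doesn't depend on where the reference point is set.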
2. Distortion Across Dimensions of Performance
Everyone agrees that, in most areas, performance has multiple
dimensions.426 Each dimension, in a performance-based contract,
will have its price,427 and the relative prices of different dimensions will determine how the contractor will allocate his effort
among them.428
So far, so good, as long as the set of performance measures is
complete. But what if some dimensions of performance are unmeasurable?429 Just as cost-benefit analysis is accused of slighting
the soft factors,430 so might performance measures be biased in
favor of the measurable. The result is that the contractor’s work
effort will be biased in the direction of increasing the measurable
dimensions of performance.431

424
See supra Part III.C.3.
425
See also Barnow, supra note 224, at 287 (discussing “whether the size of the award should
vary with the extent to which standards are exceeded”); id. at 291–92 (“The national standards are
set, based on experience in prior years, so that approximately 75 percent of the nation’s [providers]
will exceed the standards . . . .”).
426
See text accompanying supra notes 283, 325–324.
427
See text accompanying supra note 348.
428
See Bengt Holmstrom & Paul Milgrom, Multitask Principal-Agent Analyses: Incentive Contracts, Asset Ownership, and Job Design, 7 J.L., ECON., & ORG. 24, 25 (special issue 1991) (“In
general, when there are multiple tasks, incentive pay serves not only to allocate risks and to motivate
hard work, it also serves to direct the allocation of the agents’ attention among their various duties.”).
429
See text accompanying supra note 344 (noting retributivism as a possible unmeasurable dimension).
430
See text accompanying supra note 344.
431
GRIZZLE ET AL., supra note 235, at 50–51.


Consider a hypothetical example involving education. Suppose
there are two measures of educational quality: “hard” (e.g.,
knowledge of facts) and “soft” (e.g., citizenship, critical thinking,
socialization). Without hard accountability, it might be hard to give
teachers serious incentives, so they will slack in their overall work
effort, but divide their time between hard and soft education in a
balanced way. With hard accountability, teachers can get much
higher-powered incentives, but these incentives will tend to be
skewed toward the hard measures of education. Thus, the teachers
will provide more overall work effort, but their time will be
skewed toward hard education.432
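
The teacher example can be put in numbers with a stripped-down version of the multitask logic. The sketch below is an assumption-laden illustration, not the Holmstrom and Milgrom model itself: the teacher gets an intrinsic payoff `intrinsic` per unit of effort on each task and a bonus `bonus_hard` per unit of hard education only, and bears the quadratic cost (h² + s²)/2 + k·h·s, where a positive `k` makes the two tasks substitutes and a negative `k` makes them complements. All names and parameter values are hypothetical.

```python
def effort_allocation(intrinsic, bonus_hard, k):
    """Teacher's chosen effort on hard (h) and soft (s) education.

    Maximizes (intrinsic + bonus_hard) * h + intrinsic * s
              - (h**2 + s**2) / 2 - k * h * s,
    whose first-order conditions are
        h + k * s = intrinsic + bonus_hard
        k * h + s = intrinsic.
    """
    if not -1 < k < 1:
        raise ValueError("k must lie strictly between -1 and 1")
    a, b = intrinsic + bonus_hard, intrinsic
    hard = (a - k * b) / (1 - k * k)
    soft = (b - k * a) / (1 - k * k)
    if hard < 0 or soft < 0:
        raise ValueError("corner solution: this sketch covers interior cases only")
    return hard, soft


if __name__ == "__main__":
    for bonus in (0.0, 0.5):
        hard, soft = effort_allocation(intrinsic=1.0, bonus_hard=bonus, k=0.5)
        print(f"bonus={bonus}: hard={hard:.2f}, soft={soft:.2f}, total={hard + soft:.2f}")
```

With k = 0.5, the bonus raises total effort but cuts soft effort; with a negative k, soft effort rises along with hard effort, which is why it matters whether one type of education makes the other easier or harder to provide.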
How serious is this problem? It depends how important it is to
have a balance between hard and soft factors, how hard the soft
factors really are to measure, and how harmful the status quo of
low work effort is.433 It also depends on whether one type of
education makes the other type easier or harder for the teacher; an
excessively high-powered accountability system focusing, say, on
standardized test scores could easily promote a “teaching to the
test” strategy that can be antithetical to critical thinking (at the very
least by taking up class time that could be otherwise used);434 this
isn’t necessarily so, but it may be likely.435 Providing high-powered but skewed accountability may be beneficial in severely
dysfunctional school systems where neither hard nor soft factors
are taught well, but it may be harmful in better school systems.
Analogously, in the prison context, one can imagine two dimensions of quality: humane in-prison conditions and low recidivism after prison. Suppose one of these is harder to measure than
432
See generally Holmstrom & Milgrom, supra note 428, at 25 (“It would be better, . . . critics
argue, to pay a fixed wage without any incentive scheme than to base teachers’ compensation only
on the limited dimensions of student achievement that can be effectively measured.”) (italics omitted); see also Education: Raising the Bar, supra note 383; Peter Smith, On the Unintended Consequences of Publishing Performance Data in the Public Sector, 18 INT’L J. PUB. ADMIN. 277, 284
(1995) (discussing “tunnel vision”).
433
See Holmstrom & Milgrom, supra note 428, at 26 (“[T]he desirability of providing incentives for any one activity decreases with the difficulty of measuring performance in any other activities that make competing demands on the agent’s time and attention.”).
434
This assumes that test scores really are a true outcome measure, even if a partial one. Perhaps this is too charitable, though: it may be better to characterize test scores as proxy measures for a
type of intelligence, and “teaching to the test” as a form of manipulation, as described below. See
text accompanying infra note 446.
435
See Holmstrom & Milgrom, supra note 428, at 25; id. at 32–33 (desirability of incentives
for measurable task depends on whether measurable and unmeasurable tasks are complements or
substitutes in agent’s cost function).


the other. In-prison conditions could be harder to measure if effective monitoring is difficult;436 or perhaps recidivism is harder to
measure if there aren’t good databases of offenders, especially if
released inmates often commit their crimes in other states.437
Whichever one turns out to be less measurable, we can expect effort to be skewed toward the more measurable one.
Would it make a difference if prison policies were skewed toward humane conditions or toward reducing recidivism? If the two
go together—if humane conditions are, on balance, effective at reducing recidivism438—then the inability to monitor both dimensions can be harmless. On the other hand, if bad prison conditions,
on balance, reduce recidivism through a general deterrent effect,439
a focus on recidivism could lead to bad prison conditions—in
which case there’s no guarantee that high-powered accountability
would improve overall quality in the absence of effective in-prison
monitoring. Since the precise determinants of recidivism aren’t
well understood, this shows the importance of properly monitoring
whatever is considered desirable in the prison.440
In the extreme case, where some tasks remain completely unmeasurable and shirking on that task is highly detrimental to overall quality, we should junk the idea of high-powered incentives: the
traditional input-and-output approach may then be optimal.441
If an unmeasurable outcome is represented in the accountability scheme by some inputs or outputs as proxies, the possibilities
for undesirable strategic behavior multiply. The previous examples

436
See infra Part IV.C.4.
437
[*Cite source on problems with recidivism monitoring.]
438
See sources cited supra note 370.
439
See, e.g., Lawrence Katz et al., Prison Conditions, Capital Punishment, and Deterrence, 5
AM. L. & ECON. REV. 318, 331 (2003); Kelly Bedard & Eric Helland, The Location of Women's
Prisons and the Deterrence Effect of “Harder” Time, 24 INT'L REV. L. & ECON. 147, 159–61
(2004); Alexander Volokh, Prison Vouchers, 160 U. PA. L. REV. 779, 843–45 (2012). But see TOM
R. TYLER, WHY PEOPLE OBEY THE LAW 64 (1990) (“The most important normative influence on
compliance with the law is the person's assessment that following the law accords with his or her
sense of right and wrong; a second factor is the person's feeling of obligation to obey the law and
allegiance to legal authorities.”); Paul H. Robinson & John M. Darley, The Role of Deterrence in the
Formulation of Criminal Law Rules: At Its Worst When Doing Its Best, 91 GEO. L.J. 949, 953–56
(2003).
440
See infra Part IV.C.4.
441
See Holmstrom & Milgrom, supra note 428, at 27 (“[I]ncentives for a task can be provided
in two ways: either the task itself can be rewarded or the marginal opportunity cost for the task itself
can be lowered by removing or reducing the incentives on competing tasks. Constraints are substitutes for performance incentives and are extensively used when it is hard to assess the performance
of the agent.”).

involved ignoring the unmeasurable elements and maximizing the
measurable component of performance, rather than maximizing
overall performance. Replacing unmeasurable elements with proxies within the provider’s direct control leads to pursuing the proxies for their own sake—which one can uncharitably call “manipulating” the proxy measures.
For example, consider recidivism rates, which I’ve been treating throughout as a true outcome measure. In reality, no one knows
true recidivism rates; we don’t know that a released inmate has
committed a crime unless we catch him (and, depending on the recidivism measure we’re using, unless we convict him or reincarcerate him442). So in reality, rather than using the unmeasurable
dimension of recidivism, we’re using the measurable proxy of, say,
rearrest rates. If the relationship between rearrest rates and true recidivism is stable, using this proxy can be harmless; but more important still is that the contractor not be able to manipulate the rates
in ways that don’t correspond to true social improvements.
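The gap between the proxy and the true outcome can be illustrated with a stylized calculation. The sketch below assumes, purely for illustration, that the measured rearrest rate equals the true reoffense rate multiplied by the probability that a reoffender is caught; the function name and all numbers are hypothetical and are not drawn from any study cited here.

```python
def measured_rearrest_rate(true_reoffense_rate: float, detection_prob: float) -> float:
    """Rearrest rate observed when only reoffenders who are caught get counted.

    Simplifying assumption (not from the article): observed = true * detection.
    """
    if not (0.0 <= true_reoffense_rate <= 1.0 and 0.0 <= detection_prob <= 1.0):
        raise ValueError("rates and probabilities must lie in [0, 1]")
    return true_reoffense_rate * detection_prob


# Hypothetical baseline: 40% truly reoffend, half of them are caught.
baseline = measured_rearrest_rate(0.40, 0.50)

# A real social improvement: true reoffending falls, detection is unchanged.
real_improvement = measured_rearrest_rate(0.30, 0.50)

# Manipulation: true reoffending is unchanged, but detection is weakened.
manipulated = measured_rearrest_rate(0.40, 0.375)

print(f"baseline:         {baseline:.3f}")
print(f"real improvement: {real_improvement:.3f}")
print(f"manipulated:      {manipulated:.3f}")
```

Under these assumed numbers, the real improvement and the manipulation produce essentially the same measured rate, so a payment tied to the proxy alone could not tell them apart.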
Thus, if in-prison misconduct is penalized, corrections officers
will use their discretion very differently when deciding whether to
write up an offense.443 If urinalysis tests based on suspicion are
rewarded, we can expect more inmates to magically seem suspicious. Perhaps the output (drug tests based on suspicion) seems to
have a straightforward correlation with the outcome (inmate drug
use, if one chooses to consider that an outcome444); but make it a
subject of compensation, and you can’t rely on that correlation anymore. Administrators will start pursuing the output for its own
sake.
Similarly, in the context of community corrections, Joan Petersilia criticizes the use of recidivism rates as an outcome measure: if
the number of arrests increases, is that bad because more people
are committing offenses? Or is it good because probation officers
are better at detecting technical violations and sending released
offenders back to prison?445 If we decided that increased arrest
rates were bad and attached penalties to that variable, we might
442
See text accompanying supra note 118.
443
GAES ET AL., supra note 7, at 51.
444
I prefer to think of drug use as neutral in itself, though one can want to control inmate drug
use instrumentally for the sake of outcomes like violence or rehabilitation.
445
Petersilia, supra note 308, at 66–67; see also GAES ET AL., supra note 7, at 23–24; see also
supra note 74.

find arrest rates plummeting, but merely because probation officers
stopped supervising their charges very closely.
Recidivism may thus be a bad measure for the accountability of
probation officers. But it can be a good measure for the accountability of prisons, provided that prisons leave supervision and rearrest to entirely separate actors. This is a reason to insist on separating prisons from probation officers, to avoid granting contracts to
criminal justice providers that are too integrated, and more generally to prevent prisons from giving any incentives at all, even subtle
ones, to probation officers.446 Similarly, the results of drug testing
can be an acceptable measure, but random testing is better than
testing based on suspicion. In-prison misconduct can be an acceptable measure, but it should be the type of serious misconduct
that’s least likely to be overlooked or characterized as something
else.
We might even have to guard against other kinds of gaming: if
prisons can affect where prisoners are released, for instance by
partnering with post-release job placement programs that have
good contacts in particular areas, they can try to have prisoners released in areas where policing is weaker. For understandable political economy reasons, a state Department of Corrections might
choose to ignore the welfare of people in other states and tie compensation only to an in-state measure of recidivism; then, the prison does better by finding out-of-state jobs for its inmates. A prison
might also try to prevent recidivism by “paying offenders to desist,” but this might be controversial.447
(Of course, even if we only use performance measures to reward providers, providers will inevitably have to translate these
incentives into specific input or output-based incentives to reward
their own staff, at least in part—there are limits to the possibilities
of stock options.448 But presumably then the provider will have
better incentives and better ability to monitor its own staff than the
government has to monitor the provider.)

446
See also Smith, supra note 432, at 286 (discussing “suboptimization” and “measure fixation”).
447
DICKER, supra note 119, at 19.
448
On the use of stock options in private prisons, see Volokh, supra note 6, at 174.

3. Distortion Across Types of Inmates
One common complaint about high-powered outcome-based
incentives is that they’ll lead to two related phenomena: “creaming”—only taking the easiest inmates—and “parking”—not
providing services to the most difficult inmates.449 There’s an easy
way to prevent providers from taking the easiest inmates: insist
that providers take all comers,450 limit opportunities for providers
to transfer inmates they don’t like out of the prison, and have assigning agencies not discriminate either in favor of or against particular providers in assignment.451 There remains, though, the concern that providers will be, for instance, more enthusiastic about
providing rehabilitative services to those who are more likely to benefit from them.
There are two lines of response to this concern. Clearly, paying
the same rate, regardless of how hard the offender is to serve, will
lead to parking;452 one can therefore provide payments that are inmate-specific, where a harder-to-serve inmate’s desistance from
crime is rewarded more generously than an easier-to-serve inmate’s. These payments can be based on the observable characteristics of the inmate; some characteristics might be illegal to consider while others can be better observed by the provider than by
the government, so there will inevitably be some degree of mismatch.453 But a system of non-uniform rewards can generally alleviate parking.
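A small numerical sketch may make the first line of response concrete. It assumes a provider that treats an inmate only when the expected payment exceeds the cost of treatment; the inmate categories, costs, treatment effects, and payment levels are all hypothetical and are chosen only to show the mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inmate:
    label: str
    treatment_cost: float  # hypothetical cost of rehabilitative services
    desist_gain: float     # hypothetical rise in desistance probability from treatment


def expected_net_return(inmate: Inmate, payment_per_desistance: float) -> float:
    """Provider's expected payoff from treating one inmate (illustrative model only)."""
    return inmate.desist_gain * payment_per_desistance - inmate.treatment_cost


easy = Inmate("easier-to-serve", treatment_cost=1_000.0, desist_gain=0.20)
hard = Inmate("harder-to-serve", treatment_cost=1_000.0, desist_gain=0.05)

# Uniform reward: the same payment for every inmate who desists.
uniform = {i.label: expected_net_return(i, 10_000.0) for i in (easy, hard)}

# Inmate-specific reward: a larger payment for the harder-to-serve inmate.
schedule = {"easier-to-serve": 10_000.0, "harder-to-serve": 40_000.0}
specific = {i.label: expected_net_return(i, schedule[i.label]) for i in (easy, hard)}

for name, result in (("uniform", uniform), ("inmate-specific", specific)):
    for label, value in result.items():
        decision = "treat" if value > 0 else "park"
        print(f"{name:16s} {label:16s} net return {value:9.2f} -> {decision}")
```

With these assumed figures, the uniform reward makes treating the harder-to-serve inmate a losing proposition, while the inmate-specific schedule makes it worthwhile. The mismatch problem noted in the text would show up here as the government misjudging `desist_gain` when it sets the schedule.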
The second line of response would question whether parking is
even bad. Suppose some inmates are hard to rehabilitate, so prisons—in the presence of uniform rewards—will tend to spend less
time trying to rehabilitate them. Is this bad? Some nonuniformity
of rewards will be inevitable—presumably a murder by a released
449
DICKER, supra note 119, at 23; see also Inwood, supra note 205; Kyle, supra note 158, at
2112; Barnow, supra note 224, at 287, 297–98, 305–06; Pozen, supra note 77, at 283; RICHARD A.
MCGOWAN, PRIVATIZE THIS?: ASSESSING THE OPPORTUNITIES AND COSTS OF PRIVATIZATION 166
(2011).
450
See Gilroy, supra note 159 (“So literally, you have the private vendor take over the exact
same population, and then use the same metrics you use to assess the public facilities.”); cf. Volokh,
supra note 439, at 806–07.
451
See text accompanying supra notes 131–132.
452
DICKER, supra note 119, at 24; NICHOLSON, supra note 211, at 6–7; David Boyle, The Perils of Obsessive Measurement, RSA: 21ST CENTURY ENLIGHTENMENT, Nov. 1, 2010, available at
http://comment.rsablogs.org.uk/2010/11/01/perils-obsessive-measurement/.
453
Cf. Volokh, supra note 439, at 806–07.

inmate will be penalized more heavily than a minor crime. But
suppose there’s a group of inmates whose recidivism is equally
harmful. Wouldn’t it be socially beneficial for the provider to concentrate its resources on the ones whose crimes can be prevented
most cheaply, so that more inmates can be treated at the same cost?
At least, so an efficiency framework might counsel. If one subscribes to a certain form of equity where everyone should have
some amount of (even ineffective) rehabilitation, one might want
to fall back on the solution I mentioned above: offering higher
payments for the harder-to-treat inmates454 or, if that can’t be done
reliably, mandating some amount of inputs or outputs.
4. Falsifying Performance Measures
Finally, when high-stakes compensation depends on numbers,
there’s an obvious incentive to falsify the numbers themselves.455
Reports of school cheating scandals are commonplace.456 Similarly, in the prison context, private providers plausibly prefer to underreport incidents, at least if they wouldn’t inevitably become
known.457 Failure to report is grounds for contract termination,
which can cut in the other direction, but contract termination is a
strong remedy that’s rarely used.458 Public prisons, on the other
hand, might have an incentive to overreport to get more funds, unless they’re in competition with private facilities.459
Whichever way the incentives cut, the fact that compensation
will inevitably be to some extent based on variables reported by
the provider means that it’s important to seriously invest in monitoring. Currently, monitoring practices vary quite a lot, “from minimal attention from a centrally located contract administrator to a
454
Id. at 25.
455
Boyle, supra note 452; Smith, supra note 432, at 292.
456
See, e.g., Emily Richmond, Did High-Stakes Testing Cause the Atlanta Schools Cheating
Scandal?, THEATLANTIC.COM, Apr. 3, 2013, http://www.theatlantic.com/national/archive/2013/04/did-high-stakes-testing-cause-the-atlanta-schools-cheating-scandal/274619/.
457
See Gaes et al., supra note 59, at 18; Developments, supra note 9, at 1884; JOEL DYER, THE
PERPETUAL PRISONER MACHINE: HOW AMERICA PROFITS FROM CRIME 211, 221 (2000); Low, supra note 217, at 39 (citing JOHN L. CLARK ET AL., REPORT TO THE ATTORNEY GENERAL: INSPECTION AND REVIEW OF THE NORTHEAST OHIO CORRECTIONAL CENTER, at VII.B.2 (1998) (reporting
that CCA’s legal counsel advised administrators against writing reports about incidents because of
concern over legal liability); id. at VIII, XI; HARDING, supra note 70, at 323–24).
458
See text accompanying supra notes 260–261.
459
See Gaes et al., supra note 59, at 18.

combination of a contract administrator and one or more on-site
monitors.”460 The monitors themselves may have responsibility for
more than one facility, which puts them on site at any particular
prison once a quarter, once a week, or daily.461 Instead, contracts
should provide for a full-time, on-site monitor462 with “unlimited
access to the correctional facilities and assigned correctional
units,”463 who isn’t the provider’s employee (even if the contract
might mandate that the provider pay his salary as part of the
deal).464
Because the capture of monitors is an enduring concern,465 other forms of monitoring are worth considering: a public-interest group could
be given inspection rights,466 the surrounding community might be
designated as a third-party beneficiary,467 or the constitutional tort
regime for prisons could be strengthened (rather than weakened,
which is the current trend).468
A strong disclosure regime is also probably a good idea.469
One way of guaranteeing disclosure is to subject private prisons under contract with the federal government to the Freedom of
Information Act,470 perhaps along the lines of the often-proposed
Private Prison Information Act. Private prison firms themselves
aren’t “agencies” for the purposes of FOIA,471 and the Bureau of
Prisons isn’t covered if it hasn’t “created and retained” or doesn’t

460
MCDONALD ET AL., supra note 125, at 50.
461
Id. at 50, 51 tbl. 4.1.
462
Thomas, supra note 134, at 109.
463
Fla. SB 2038, supra note 169, § 1, at 9 (creating FLA. STAT. § 944.7115(8)(d)).
464
Id.; see also Gilroy, supra note 159 (full-time monitor at each private prison in Ohio plus
surprise inspections by Correctional Institution Inspection Committee); Nicole B. Cásarez, Furthering the Accountability Principle in Privatized Federal Corrections: The Need for Access to Private
Prison Records, 28 U. MICH. J.L. REFORM 249, 293 (1995) (citing Ira P. Robbins, The Legal Dimensions of Private Incarceration, 38 AM. U. L. REV. 531, 752 (1989)) (Robbins’s Model Contract
“calls for an employee of the contracting agency to have access to prison facilities and all records
kept by the contractor at all times”); Low, supra note 217 (citing CLARK ET AL., supra note 457, at
R-24, ch. XI).
465
Cásarez, supra note 464, at 295; Dolovich, supra note 5, at 490–95.
466
See Low, supra note 217, at 38.
467
See Freeman, infra note 482, at 1317.
468
See Volokh, supra note 145.
469
See also id. at 293 (American Correctional Association requires that certain records be
maintained “for facility accreditation and the contracting agency”).
470
5 U.S.C. § 552.
471
Cásarez, supra note 464, at 268–79; Forsham v. Harris, 445 U.S. 169 (1980) (whether private firm subject to FOIA depends on whether subject to extensive, day-to-day government control).

actually possess the documents.472 Even after these hurdles, much
qualifying information, like contracts or incident reports, would be
exempt under exemption 4, which protects “trade secrets and
commercial or financial information . . . [that is] privileged or confidential.”473 Exemption 4 could be applied either if “disclosure
could impair the reliability of data,”474 or if “disclosure would
cause substantial competitive injury to the provider.”475 The competitive injury justification could be fairly broad—knowing the
terms of a contract, for instance, can reveal the terms of the winning proposal to the winning firm’s competitors.476 Indeed, FOIA
has been criticized as “a lawful tool of industrial espionage.”477 On
the other hand, says Cásarez, FOIA provides for the disclosure of
“reasonably segregable portion[s]” of documents,478 which “should
include monitoring and reporting requirements.”479 Logan counsels
against “saddl[ing] private prison operators with expensive monitoring requirements ‘far beyond those that exist for government
prisons,’”480 but FOIA applicability would cut in the direction of
establishing parity.
Similar legislative fixes are possible in the states: for instance,
in Florida and Georgia, open records acts “already apply to private
organizations that act on behalf of state agencies.”481 All of this (as
well as any relevant public-law value) could also be imposed on
private contractors by contract; Jody Freeman calls this process
“publicization.”482
Another possibility is to assure access to the prison by the public and the press.483 Bentham, who had smart things to say about
472
Cásarez, supra note 464, at 279–84; Kissinger v. Reporters Comm. for Freedom of the
Press, 445 U.S. 136 (1980) (FOIA requires agency to disclose only documents it has “created and
retained”).
473
Cásarez, supra note 464, at 284–91; 5 U.S.C. § 552(b)(4).
474
Cásarez, supra note 464, at 287 (citing Critical Mass Energy Project v. NRC, 975 F.2d 871,
878 (D.C. Cir. 1992)).
475
Id.
476
Id. at 289; see also text accompanying supra note 36.
477
Id. at 292 (quoting Stephen S. Madsen, Note, Protecting Confidential Business Information
from Federal Agency Disclosure After Chrysler Corp. v. Brown, 80 COLUM. L. REV. 109, 113
(1980)).
478
5 U.S.C. § 552(b).
479
Cásarez, supra note 464, at 289.
480
Id. at 260 (quoting CHARLES LOGAN, PRIVATE PRISONS: CONS AND PROS 147 (1990)).
481
Id. at 296 (quoting FLA. STAT. ANN. § 119.011(2); GA. CODE ANN. § 50-18-70(a)).
482
Pronounced [pŭb'lĭ-kĭ-zā'shən]. Jody Freeman, Extending Public Law Norms Through Privatization, 116 HARV. L. REV. 1285, 1285 (2003).
483
Id. at 299 (citing Robbins, supra note 464, at 752–53 (Model Contract § 6(B))).

the bidding process two centuries ago,484 also argued for “essentially unrestricted public access”485 to (private) facilities. His prison design:
enables the whole establishment to be inspected almost at a
view; it would be my study to render it a spectacle, as persons of all classes would, in the way of amusement, be curious to partake of, and that not only on Sundays at the time
of Divine service, but on ordinary days at meal times or
times of work, providing therefore a system of inspection,
universal, free, and gratuitous, the most effectual and permanent securities against abuse.486
I don’t want to endorse watching prisoners as a source of
amusement, but the idea of public access does seem to have some
advantages in terms of accountability.
V. CONCLUSION
The failure of the comparative effectiveness studies, therefore,
is completely understandable. Aside from the methodological
problems, it’s quite plausible that the results of prison privatization
have been inconclusive because the changes in prison management
that would lead to better performance are often neither permitted
nor rewarded.
Using performance measures would change this by helping us
do valid comparative studies, enabling the fair public-private competitions that are a hallmark of competitive neutrality, and pushing
policymakers to clearly formulate what we want out of prisons.
Even better, using performance measures directly to drive compensation has the potential to radically alter prison outcomes by rewarding good performance and penalizing bad performance; this
approach clearly applies to private prisons but could possibly
be used for public prison wardens as well.
The critiques are serious, but I don’t believe they undermine
the case for experimentation.
The information necessary to calculate the True Social Values
in an efficiency framework may never be available, but we can approach the exercise with an air of humility, seeking only to improve incentives at the margins, not to achieve optimal social engineering.
484
See text accompanying supra note 245.
485
Durham, supra note 20, at 69.
486
Id. (quoting JEREMY BENTHAM, A BENTHAM READER 200 (Mary Peter Mack ed., 1969)).
The use of market incentives probably won’t alter the public-interestedness of those who work at private prison firms, but it
might alter the mix of people who choose to work in the public
sector; on the other hand, combined with social impact bonds, performance-based compensation can also spur the growth of nonprofit providers. Because small firms and nonprofits are particularly sensitive to risk, the incentives should only be moderately high-powered, to trade off incentives and risk tolerance.
Performance-based compensation will give rise to certain possibly undesirable strategic behavior. If providers can set their own
goals, they’ll be inclined to set them in ways that are easy to meet;
this is why providers shouldn’t set the goals at all, and in any event
compensation should be based on the level of a continuous variable, not a binary goal. If some dimensions of quality are hard to
measure, performance-based compensation will bias providers’
effort toward the more measurable aspects of performance; this
means that some reliance on inputs and outputs will still be necessary, having due regard for the need to avoid choosing measures
that can be easily and undesirably manipulated by providers. Compensation schemes might lead providers to concentrate on treating
certain inmates and neglect others; even if this is bad (which isn’t
clear), the problem can be alleviated by inmate-specific rewards.
Finally, the levels of the measures themselves can be falsified,
which points to the need for serious investments in monitoring and
robust disclosure regimes.
These concerns are real, but the lesson to take from them is that
more experimentation is required to see how much of a real-world
effect they have and to what degree they really vitiate the promise
of performance incentives. The status quo, where the level of experimentation is close to zero, is unlikely to be optimal.
