The Vitality of Mythical Numbers

August 16, 2006  |  Edward Tufte

Max Singer published a classic article on policy-making numbers that over-reach and
exaggerate the scope and urgency of the problems: “The Vitality of Mythical Numbers,”
Public Interest (Spring, 1971), 3-9.

For years in teaching evidence for policy-making at Princeton and Yale, I used this article
as the first item on my reading list. The conclusion that a particular policy number over-reaches
has of course to be earned by some relevant evidence (and not by the generality that policy numbers over-reach). Such evidence can sometimes come
from simple calculations, approximations, and cross-checks, as Singer illustrates in his
paper.

Journalists pick up, quote, and repeat scary numbers in faux-trend stories.

Edward Tufte

Here is Singer’s classic article:

THE VITALITY OF MYTHICAL NUMBERS

by Max Singer

It is generally assumed that heroin addicts in New York City steal
some two to five billion dollars worth of property a year, and
commit approximately half of all the property crimes. Such
estimates of addict crime are used by an organization like RAND,
by a political figure like Howard Samuels, and even by the
Attorney General of the United States. The estimate that half the
property crimes are committed by addicts was originally attributed
to a police official and has been used so often that it is now
part of the common wisdom.

The amount of property stolen by addicts is usually estimated
in something like the following manner:

There are 100,000 addicts with an average habit of $30.00 per
day. This means addicts must have some $1.1 billion a year to pay
for their heroin (100,000 x 365 x $30.00). Because the addict
must sell the property he steals to a fence for only about a
quarter of its value, or less, addicts must steal some $4 to $5
billion a year to pay for their heroin.
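The demand-side arithmetic in the paragraph above can be sketched in a few lines. This is only an illustration using the figures quoted in the text; the function and variable names are hypothetical, and the quarter-of-value fence rate is the article's own rough assumption.

```python
# Demand-side estimate: start from the assumed number of addicts
# and work out how much they would have to steal.
# All figures are the ones quoted in the text; names are illustrative.
ADDICTS = 100_000
HABIT_PER_DAY = 30.00    # dollars per addict per day
FENCE_FRACTION = 0.25    # a fence pays about a quarter of the goods' value


def heroin_spending(addicts=ADDICTS, habit=HABIT_PER_DAY, days=365):
    """Annual dollars needed to pay for heroin."""
    return addicts * habit * days


def required_theft(spending, fence_fraction=FENCE_FRACTION):
    """Value of goods that must be stolen to raise `spending` through a fence."""
    return spending / fence_fraction


spending = heroin_spending()
theft = required_theft(spending)
print(f"Heroin spending: ${spending / 1e9:.2f} billion a year")
print(f"Implied theft:   ${theft / 1e9:.2f} billion a year")
```

The result lands inside the "$4 to $5 billion" range, which shows how directly that range follows from the two starting assumptions.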

These calculations can be made with more or less
sophistication. One can allow for the fact that the kind of
addicts who make their living illegally typically spend upwards of
a quarter of their time in jail, which would reduce the amount of
crime by a quarter. (_The New York Times_ recently reported on
the death of William “Donkey” Reilly. A 74-year-old ex-addict who
had been addicted for 54 years, he had spent 30 of those years in
prison.) Some of what the addict steals is cash, none of which
has to go to a fence. A large part of the cost of heroin is paid
for by dealing in the heroin business, rather than stealing from
society, and another large part by prostitution, including male
addicts living off prostitutes. But no matter how carefully you
slice it, if one tries to estimate the value of property stolen by
addicts by assuming that there are 100,000 addicts and estimating
what is the minimum amount they would have to steal to support
themselves and their habits (after making generous estimates for
legal income), one comes up with a number in the neighborhood of
$1 billion a year for New York City.

But what happens if you approach the question from the other
side? Suppose we ask, “How much property is stolen–by addicts or
anyone else?” Addict theft must be less than total theft. What
is the value of property stolen in New York City in any year?
Somewhat surprisingly to me when I first asked, this turned out to
be a difficult question to answer, even approximately. No one had
any estimates that they had even the faintest confidence in, and
the question doesn’t seem to have been much asked. The amount of
officially reported theft in New York City is approximately $300
million a year, of which about $100 million is the value of
automobile theft (a crime that is rarely committed by addicts).
But it is clear that there is a very large volume of crime that
is not reported; for example, shoplifting is not normally reported
to the police. (Much property loss to thieves is not reported to
insurance companies either, and the insurance industry had no good
estimate for total theft.)

It turns out, however, that if one is only asking a question
like, “Is it possible that addicts stole $1 billion worth of
property in New York City last year?” it is relatively simple to
estimate the amount of property stolen. It is clear that the two
biggest components of addict theft are shoplifting and burglary.
What _could_ the value of property shoplifted by addicts be? All
retail sales in New York City are on the order of $15 billion a
year. This includes automobiles, carpets, diamond rings, and
other items not usually available to shoplifters. A reasonable
number for inventory loss to retail establishments is 2%. This
number includes management embezzlers, stealing by clerks,
shipping departments, truckers, etc. (Department stores,
particularly, have reported a large increase in shoplifting in
recent years, but they are among the most vulnerable of retail
establishments and not important enough to bring the overall rate
much above 2%.) It is generally agreed that substantially more
than half of the property missing from retail establishments is
taken by employees, the remainder being lost to outside
shoplifters. But let us credit shoplifters with stealing 1% of
all the property sold at retail in New York City–this would be
about $150 million a year.
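As a rough check on the shoplifting ceiling, the sketch below applies a few candidate loss rates to the $15 billion retail figure. The 1% rate is the generous one used in the text; the other rates are hypothetical values added only to show how the ceiling moves.

```python
# Supply-side ceiling on shoplifting: a share of all retail sales.
# The $15 billion and 1% figures come from the text; the other
# rates are illustrative alternatives, not estimates from the article.
RETAIL_SALES = 15e9  # dollars per year, New York City


def shoplifting_ceiling(retail_sales, shoplift_rate):
    """Upper bound on goods lost to outside shoplifters."""
    return retail_sales * shoplift_rate


for rate in (0.005, 0.01, 0.02):
    loss = shoplifting_ceiling(RETAIL_SALES, rate)
    print(f"rate {rate:.1%}: ${loss / 1e6:,.0f} million a year")
```

Even the 2% row, which would attribute every dollar of inventory loss to shoplifters, stays well under half a billion dollars.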

What about burglary? There are something like two and one-half
million households in New York City. Suppose that on the
average one out of five of them is robbed or burglarized every
year. This takes into account that in some areas burglary is even
more commonplace, and that some households are burglarized more
than once a year. This would mean 500,000 burglaries a year. The
average value of property taken in a burglary might be on the
order of $200. In some burglaries, of course, much larger amounts
of property are taken, but these higher value burglaries are much
rarer, and often are committed by non-addict professional thieves.
If we use the number of $200 x 500,000 burglaries, we get $100
million of property stolen from people’s homes in a year in New
York City.

Obviously, none of these estimated values is either sacred or substantiated. You can make your own estimate. The estimates here have the character that it would be very surprising if they were wrong by a factor of 10, and not very important for the conclusion if they were wrong by a factor of two. (This is a good position for an estimator to be in.)

Obviously not all addict theft is property taken from stores or from people’s homes. One of the most feared types of addict crime is property taken from the persons of New Yorkers in muggings and other forms of robbery. We can estimate this, too. Suppose that on the average, one person in 10 has property taken from his person by muggers or robbers each year. That would be 800,000 such robberies, and if the average one produced $100 (which it is very unlikely to do), $8 million a year would be taken in this form of theft.

So we can see that if we credit addicts with _all_ of the shoplifting, _all_ of the theft from homes, and _all_ of the theft from persons, total property stolen by addicts in a year in New York City amounts to some $300 million. You can throw in all the “fudge factors” you want, add all the other miscellaneous crimes that addicts commit, but no matter what you do, it is difficult to find a basis for estimating that addicts steal over half a billion dollars per year, and a quarter billion looks like a better estimate, although perhaps on the high side. After all, there must be some thieves who are not addicts.
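The three supply-side components can be added up in a short sketch. The inputs are the figures given in the text, except for the population of 8 million, which is only implied by "one person in 10" giving 800,000 robberies. Note that 800,000 robberies at $100 each works out to $80 million, as a commenter below points out, and that figure is what makes the three components sum to roughly $300 million. Function names are hypothetical.

```python
# Supply-side ceiling: credit addicts with ALL shoplifting,
# burglary, and robbery, then see how far the total can stretch.
def burglary_loss(households=2_500_000, victim_share=0.2, avg_loss=200):
    return households * victim_share * avg_loss


def robbery_loss(population=8_000_000, victim_share=0.1, avg_loss=100):
    # population is implied by the text, not stated in it
    return population * victim_share * avg_loss


components = {
    "shoplifting": 150e6,  # 1% of $15 billion in retail sales
    "burglary": burglary_loss(),
    "robbery": robbery_loss(),
}
total = sum(components.values())

for name, value in components.items():
    print(f"{name:12s} ${value / 1e6:6.0f} million")
print(f"{'total':12s} ${total / 1e6:6.0f} million")

# Robustness check: even if every estimate is low by a factor of two,
# the total stays far below the $2 to $5 billion usually claimed.
print(f"doubled      ${2 * total / 1e6:6.0f} million")
```

The doubling step mirrors the point made earlier about estimates that can be off by a factor of two without changing the conclusion.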

Thus, I believe we have shown that whereas it is widely assumed that addicts steal from $2 billion to $5 billion a year in New York City, the actual number is _ten_ times smaller, and that this can be demonstrated by five minutes of thought.[1] So what? A quarter billion dollars’ worth of property is still a lot of property. It exceeds the amount of money spent annually on addict rehabilitation and other programs to prevent and control addiction. Furthermore, the value of the property stolen by addicts is a small part of the total cost to society of addict theft. A much larger cost is paid in fear, changed neighborhood atmosphere, the cost of precautions, and other echoing and re-echoing reactions to theft and its danger.

One point in this exercise in estimating the value of
property stolen by addicts is to shed some light on people’s
attitudes toward numbers. People feel that there is a lot of
addict crime, and that $2 billion is a large number, so they are
inclined to believe that there is $2 billion worth of addict
theft. But $250 million is a large number, too, and if our sense
of perspective were not distorted by daily consciousness of
federal expenditures, most people would be quite content to accept
$250 million a year as a lot of theft.

Along the same lines, this exercise is another reminder that
even responsible officials, responsible newspapers, and
responsible research groups pick up and pass on as gospel numbers
that have no real basis in fact. We are reminded by this
experience that because an estimate has been used widely by a
variety of people who should know what they are talking about, one
cannot assume that the estimate is even approximately correct.

But there is a much more important implication of the fact
that there cannot be nearly so much addict theft as people
believe. This implication is that there probably cannot be as
many addicts as people believe. Most of the money paid for heroin
bought at retail comes from stealing, and most addicts buy at
retail. Therefore, the number of addicts is basically–although
imprecisely–limited by the amount of theft. (The estimate
developed in a Hudson Institute study was that close to half of
the volume of heroin consumed is used by people in the heroin
distribution system who do not buy at retail, and do not pay with
stolen property but with their “services” in the distribution
system.[2]) But while the people in the business (at lower levels)
consume close to half the heroin, they are only some one-sixth or
one-seventh of the total number of addicts. They are the ones who
can afford big habits.

The most popular, informal estimate of addicts in New York
City is 100,000-plus (usually with an emphasis on the “plus”).
The federal register in Washington lists some 30,000 addicts in
New York City, and the New York City Department of Health’s
register of addicts’ names lists some 70,000. While all the
people on those lists are not still active addicts–many of them
are dead or in prison–most people believe that there are many
addicts who are not on any list. It is common to regard the
estimate of 100,000 addicts in New York City as a very
conservative one. Dr. Judianne Densen-Gerber was widely quoted in
1970 for her estimate that there would be over 100,000 teenage
addicts by the end of the summer. And there are obviously many
addicts of 20 years of age and more.[3]

In discussing the number of addicts in this article, we will
be talking about the kind of person one thinks of when the term
“addict” is used.[4] A better term might be “street addict.” This
is a person who normally uses heroin every day. He is the kind of
person who looks and acts like the normal picture of an addict.
We exclude here the people in the medical profession who are
frequent users of heroin or other opiates, or are addicted to
them, students who use heroin occasionally, wealthy people who are
addicted but do not need to steal and do not frequent the normal
addict hangouts, etc. When we are addressing the “addict
problem,” it is much less important that we include these cases;
while they are undoubtedly problems in varying degrees, they are a
very different type of problem than that posed by the typical
street addict.

The amount of property stolen by addicts suggests that the
number of New York City street addicts may be more like 70,000
than 100,000, and almost certainly cannot be anything like the
200,000 number that is sometimes used. Several other simple ways
of estimating the number of street addicts lead to a similar
conclusion.

Experience with the addict population has led observers to
estimate that the average street addict spends a quarter to a
third of his time in prison. (Some students of the subject, such
as Edward Preble and John J. Casey, Jr., believe the average to be
over 40%.) This would imply that at any one time, one-quarter to
one-third of the addict population is in prison, and that the
total addict population can be estimated by multiplying the number
of addicts who are in prison by three or four. Of course the
number of addicts who are in prison is not a known quantity (and,
in fact, as we have indicated above, not even a very precise
concept). However, one can make reasonable estimates of the
number of addicts in prison (and for this purpose we can include
the addicts in various involuntary treatment centers). This
number is approximately 14,000-17,000, which is quite compatible
with an estimate of 70,000 total New York City street addicts.
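The prison-based estimate can be laid out as a small grid. This sketch uses only the numbers in the paragraph above: 14,000 to 17,000 addicts in custody, and a multiplier of three or four depending on whether a third or a quarter of the addict population is locked up at any one time. The function name is hypothetical.

```python
# Scale the number of addicts in custody up to a total population.
# If one-quarter are in prison, multiply by 4; if one-third, by 3.
def total_addicts(in_prison, multiplier):
    return in_prison * multiplier


estimates = [
    total_addicts(in_prison, multiplier)
    for in_prison in (14_000, 17_000)
    for multiplier in (3, 4)
]

for value in sorted(estimates):
    print(f"{value:,} street addicts")
print(f"range: {min(estimates):,} to {max(estimates):,}")
```

Every combination comes out at or below roughly 70,000, and none comes anywhere near 200,000.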

Another way of estimating the total number of street addicts
in New York City is to use the demographic information that is
available about the addict population. For example, we can be
reasonably certain that some 25% of the street addict population
in New York City is Puerto Rican, and some 50% are blacks. We
know that approximately five out of six street addicts are male,
and that 50% of the street addicts are between the ages of 16 and
25. This would mean that 20% of the total number of addicts are
black males between the ages of 16 and 25. If there were 70,000
addicts, this would mean that 14,000 blacks between the ages of 16
and 25 are addicts. But altogether there are only about 140,000
blacks between the ages of 16 and 25 in the city–perhaps half of
them living in poverty areas. This means that if there are 70,000
addicts in the city, one in 10 black youths are addicts, and if
there are 100,000 addicts, nearly one in six are, and if there are
200,000 addicts, one in three. You can decide for yourself which
of these degrees of penetration of the young black male group is
most believable, but it is rather clear that the number of 200,000
addicts is implausible. Similarly, the total of 70,000 street
addicts would imply 7,000 young Puerto Rican males are addicted,
and the total number of Puerto Rican boys between the ages of 17
and 25 in New York City is about 70,000.
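The demographic cross-check can be reproduced with exact fractions. The shares (50% black, five in six male, 50% aged 16 to 25) and the 140,000 population figure are taken from the text; treating the three shares as independent is the article's implicit assumption, and the names below are hypothetical. The unrounded share is 5/24, slightly above the 20% used in the text, so the ratios differ a little from the rounded ones quoted.

```python
from fractions import Fraction

# Share of all street addicts who are young black males,
# assuming the three demographic shares are independent.
share = Fraction(1, 2) * Fraction(5, 6) * Fraction(1, 2)

YOUNG_BLACK_POPULATION = 140_000  # blacks aged 16 to 25, from the text


def one_in(total_addicts):
    """Return N such that one in N of the group would be an addict."""
    addicted = total_addicts * share
    return YOUNG_BLACK_POPULATION / addicted


print(f"share of addicts: {float(share):.1%}")
for total in (70_000, 100_000, 200_000):
    addicted = total * share
    print(f"{total:>7,} addicts -> {float(addicted):>8,.0f} in the group, "
          f"about one in {float(one_in(total)):.1f}")
```

Checking a claimed total against a known population in this way is what makes the 200,000 figure look implausible.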

None of the above calculations is meant in any way to
downplay the importance of the problem of heroin addiction.
Heroin is a terrible curse. When you think of the individual
tragedy involved, 70,000 is an awfully large number of addicts.
And if you have to work for a living, $250 million is an awful lot
of money to have stolen from the citizens of the city to be
transferred through the hands of addicts and fences into the
pockets of those who import and distribute heroin, and those who
take bribes or perform other services for the heroin industry.

The main point of this article may well be to illustrate how
far one can go in bounding a problem by taking numbers seriously,
seeing what they imply, checking various implications against each
other and against general knowledge (such as the number of persons
or households in the city). Small efforts in this direction can
go a long way to help ordinary people and responsible officials to
cope with experts of various kinds.

Notes

[1] Mythical numbers may be more mythical and have more vitality
in the area of crime than in most areas. In the early 1950s the
Kefauver Committee published a $20 billion estimate for the annual
“take” of gambling in the United States. The figure actually was
“picked from a hat.” One staff member said: “We had no real idea
of the money spent. The California Crime Commission said $12
billion. Virgil Petersen of Chicago said $30 billion. We picked
$20 billion as the balance of the two.”

An unusual example of a mythical number that had a vigorous
life–the assertion that 28 Black Panthers had been murdered by
police–is given a careful biography by Edward Jay Epstein in the
February 13, 1971, _New Yorker_. (It turned out that there were
19 Panthers killed, ten of them by the police, and eight of these
in situations where it seems likely that the Panthers took the
initiative.)

[2] A parallel datum was developed in a later study by St. Luke’s
Hospital of 81 addicts–average age 34. More than one-half of the
heroin consumed by these addicts, over a year, had been paid for
by the sale of heroin. Incidentally, these 81 addicts had stolen
an average of $9,000 worth of property in the previous year.

[3] Among other recent estimators we may note a Marxist, Sol
Yurick, who gives us “500,000 junkies” (_Monthly Review_, December
1970), and William R. Corson, who contends, in the December 1970
_Penthouse_, that “today at least 2,500,000 black Americans are
hooked on heroin.”

[4] There is an interesting anomaly about the word “addict.” Most
people, if pressed for a definition of an “addict,” would say he
is a person who regularly takes heroin (or some such drug) and
who, if he fails to get his regular dose of heroin, will have
unpleasant or painful withdrawal symptoms. But this definition
would not apply to a large part of what is generally recognized as
the “addict population.” In fact, it would not apply to most
certified addicts. An addict who has been detoxified or who has
been imprisoned and kept away from drugs for a week or so would
not fit the normal definition of “addict.” He no longer has any
physical symptoms resulting from not taking heroin. “Donkey”
Reilly would certainly fulfill most people’s ideas of an addict,
but for 30 of the 54 years he was an “addict” he was in prison,
and he was certainly not actively addicted to heroin during most
of the time he spent in prison, which was more than half of his
“addict” career (although a certain amount of drugs are available
in prison).

Reprinted with permission from The Public Interest, no. 23,
Spring 1971, pp. 3-9. Copyright (c) 1971 by National Affairs,
Inc.

Comments
  • ET says:

    A recent article in Slate by Jack Shafer reports the GAO’s Singer-style analysis:
    http://www.slate.com/id/2147876

  • Rod says:

    Thoughtful article.

    Because my daughter is an addict, I realized when I read the article that a major source of the money with which addicts buy drugs was not commented on. Many of the addicts I have met get most of their money for drugs by panhandling.

    One of the reasons so many mythical numbers involve crime statistics is that the decision to report a crime and the decision of how to classify it are highly discretionary. For example, child abuse was much less frequently reported fifty years ago. Have we had an avalanche of abuse because more is reported now, or are people more sensitized and aware of their rights, or are social workers (and health care professionals) trained to look for it more readily, or are some kids, after being made aware of the consequences of making the accusations, using the charge to accomplish their own agendas?

    Probably a little of several of the above.

  • Karl Hartkopf says:

    Good timing for the Singer piece; a new study does something similar with Traumatic Stress Disorder in Vietnam Vets –
    http://www.npr.org/templates/story/story.php?storyId=5665198&ft=1&f=1007

    Oh yes, and ET got himself on NPR, as well, for Beautiful Evidence;
    http://www.npr.org/templates/story/story.php?storyId=5673332&ft=1&f=1007

    K

  • Athel Cornish-Bowden says:

    It would be nice to have a similarly penetrating analysis of the huge estimates of the losses that software and music publishers claim they suffer as a consequence of illegal copying. Illegal copying is not a good thing, of course, but I suspect that the main people who suffer from it are the people who pay the prices demanded by publishers, and not the publishers themselves. I cannot be sure of it (and that is why I would like to see an analysis done by someone who knows), but I think it is quite plausible to suppose that publishers don’t suffer at all, because the circulation of illegal copies generates more than enough publicity to produce at least as many extra sales as there would be with no illegal copying.

    I’ve been suspicious of software prices ever since I made the move from mainframes to small computers about twenty years ago. (Before that it never occurred to me to ask how the software like operating systems and compilers that were built into university mainframe computers was paid for. If I thought about it at all I probably thought the price of the software was included in the price of the machine.) At that time I thought that given time and energy I could probably write a word-processor as satisfactory as those then on the market, but that I wouldn’t know where to begin if I wanted to write a compiler. Having learnt a bit more since then about Polish notation I realize that writing a compiler might be marginally less difficult than I thought, but still more difficult than writing a word-processor. Yet 20 years ago one could buy a perfectly serviceable compiler for a PC (Turbo Pascal, for example) for a far lower price than one needed to pay for a word-processor. As far as I can see the only explanation is that the prices have nothing to do with profit margins or with the amount of investment that was in the development, but everything to do with what the market will stand.

    Re-reading this, I fear that in the second paragraph I have drifted away from the topic, but I hope it will stand up as an illustration of why the analysis asked in the first paragraph would be worth having.

  • Athel Cornish-Bowden says:

    Another case is to be found in vol. I (p. 210 in 3rd edn., 1969) of Kendall and Stuart’s classic work The Advanced Theory of Statistics, where they list 24 forecasts of potato yields in England and Wales in 1929-1936. Of the 24 forecasts, one was spot on, but all the others underestimated the actual results. As they commented, the “table exhibits very clearly … the chronic pessimism of crop forecasts”, and they go on to say that “one of the commoner misunderstandings … is based on the supposition that, though individuals may make mistakes, their errors will cancel out in the aggregate”. My impression is that “the chronic pessimism of crop forecasts” is alive and well in 2006, at least in France. Every year we have dire predictions of the catastrophic yields that farmers are going to have, and every year they survive to complain again the following year.

    Incidentally, although Kendall and Stuart’s book calls itself “advanced”, it is a lot more readable than many books that claim to be elementary. At least, volume I is. I never bought volumes II and III, but my recollection from library copies is that they were much heavier going.

  • John Galada says:

    On the new Johns Hopkins University study estimating 650,000 Iraqi deaths, far more
    than the official estimate of 30,000: does the new study over-reach? Or does the official
    estimate under-reach?

    Mortality after the 2003 invasion of Iraq: a cross-sectional cluster sample survey

    by Prof. Gilbert Burnham MD

    http://fairuse.100webcustomers.com/sf/LancetStudy06.htm

    Iraqi Death Count: 650,000? How to read the latest study on mortality in Iraq.

    by Daniel Engber

    http://www.slate.com/id/2151418

  • Edward Tufte says:

    Looks like “identity theft” via lost hard drives doesn’t happen. See the excellent debunking
    story by Fred H. Cate in the Washington Post.

  • Edward Tufte says:

    Two interesting estimates, each probably 2 to 3 orders of magnitude on the high side:

    Cash value of pot crops is highlighted in report, by Eric Bailey,
    Los Angeles Times

    Litvinenko’s killers used polonium worth $10m to give massive overdose,
    by Daniel McGrory and Tony Halpin, The Times (UK)

    The stoned story in the Los Angeles Times contains the inevitable comment that the
    prankish estimate is “conservative.” Time to read Singer’s article (above) again.

  • Edward Tufte says:

    Here’s the general principle of “conservative estimates:” If an estimate is described as conservative, it’s not. “Conservative” is often a cheerleading, self-congratulatory, and mushy word that replaces
    quantitative estimates of error.

    Recall the Columbia shuttle slide: “Review of Test Data Indicates Conservatism for Tile
    Penetration” in our thread PowerPoint does rocket science–and better techniques for
    technical reports.

  • Edward Tufte says:

    For good, practical advice on effective analytical thinking, see here

  • George V. Reilly says:

    I’m surprised that no-one has noted that 800,000 robberies at an average of $100 per robbery yields $80 million, not $8 million. Not that it invalidates the conclusions.

  • Doug says:

    Great stuff. I will be using the Singer paper in a class I teach on policy analysis. I also use writings on how to measure hunger and debates over how many “defensive gun uses” there are per year. Both are useful for discussions about definitions (e.g., who says your use of your gun truly was “defensive” as opposed to completely unnecessary).

    The kind of reasoning Singer does is what many people call Fermi Problems. They can be fun to go over in class, too. E.g., how many piano tuners work in Chicago.

  • Matt R says:

    Enrico Fermi not only estimated the number of piano tuners in Chicago, one of his most
    famous estimates was the one he made during the first atom bomb test on 16 July, 1945.
    There was an important question in the minds of the bomb makers on the yield of this new
    class of weapon. During the test Fermi estimated that it was about 10 kilotons.

    Fermi didn’t guess – as the shockwave from the explosion hit Fermi he threw a handful of
    paper scraps into the air and watched how far they moved. Using this data and some
    assumptions he made his estimate. It was surprisingly accurate. Not only to the correct
    order of magnitude, but within a very respectable factor of 2. The actual yield was 19
    kilotons.

    Fermi used the “piano tuner” approach to train his students to be able to conceptualise
    and evaluate “order of magnitude” estimates.

    For a very recent Fermi estimate – of the energy released (and volume and mass of sand
    ejected) during the eruption of the Puyehue-Cordon Caulle volcano in Chile on 4 June see
    here = http://arxiv.org/abs/1109.1165.

    Fermi Problem: Power developed at the eruption of the Puyehue-Cordon Caulle volcanic
    system in June 2011

    By Hernan Asorey & Arturo Lopez Davalos

    Abstract of the paper reads;

    On June 4 2011 the Puyehue-Cordon Caulle volcanic system produced a pyroclastic subplinian
    eruption reaching level 3 in the volcanic explosivity index. The first stage of the
    eruption released sand and ashes that affected small towns and cities in the surrounding
    areas, including San Carlos de Bariloche, in Argentina, one of the largest cities in the
    North Patagonian Andean region. By treating the eruption as a Fermi problem, we estimated
    the volume and mass of sand ejected as well as the energy and power released during the
    eruptive phase. We then put the results in context by comparing the obtained values with
    everyday quantities, like the load of a cargo truck or the electric power produced in
    Argentina. These calculations have been done as a pedagogic exercise, and after evaluation
    of the hypothesis was done in the classroom, the calculations have been performed by the
    students. These are students of the first physics course at the Physics and Chemistry
    Teacher Programs of the Universidad Nacional de Rio Negro

    Best wishes

    Matt
