One of the prevailing orthodoxies of this forum - one to which I whole-heartedly subscribe - is that pie charts are bad and that the only thing worse than one pie chart is lots of them.
However, I offer the attached.
The original is 4cm by 6cm, and within that space I find it difficult to think how else I could represent the information. Location is given by the centre of each circle, and each circle is proportionate to the size of the total of which the segments are part. Obviously there are disadvantages: bar charts for each location would allow us to see whether the red of St. Albans is greater or smaller in absolute terms than the red of Luton, but I think that falls within the normal range of compromises we make in design. The designer has chosen to tell us about proportions and not absolutes. Using a bar chart instead of a circle we should need a separate symbol for the location of the town and perhaps an arrow pointing to it.
-- Martin Ternouth (email)
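A note on the sizing described above: the post does not say whether it is the radius or the area of each circle that is proportionate to the total. Assuming it is the area (the usual choice for proportional symbols), the radius has to grow with the square root of the total. A minimal Python sketch, with hypothetical town names and totals that are not taken from the map:

```python
import math

def radius_for_total(total, max_total, max_radius):
    """Return a radius so that circle AREA is proportional to `total`.

    The largest total gets `max_radius`; every other circle is scaled
    by the square root of its share of that largest total.
    """
    if total < 0 or max_total <= 0:
        raise ValueError("totals must be non-negative and max_total positive")
    return max_radius * math.sqrt(total / max_total)

# Hypothetical totals, for illustration only.
totals = {"Town A": 400, "Town B": 100, "Town C": 25}
max_total = max(totals.values())

radii = {name: radius_for_total(t, max_total, 10.0) for name, t in totals.items()}
for name, r in radii.items():
    print(f"{name}: radius {r:.2f}")
```

Scaling the radius directly would make a town with four times the total look sixteen times as large in area, which is one reason the square root matters here.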
You gave me a flashback to a college course in cartography with that graphic! I remember drawing something sinisterly like this having to do with registered voters in Wisconsin or some such. In India ink on onion skin. Iew.
How about small multiples retaining the colors for the categories? Or small comparison graphs like in Tufte's medical information display paper? Perhaps small counting figures like the letter value displays espoused by Tukey in "Exploratory Data Analysis" (I am unable to insert images into my messages--how does one do that?--or else I'd give you an example).
We all know that pie charts are bad but what about flags (ie, rectangles)? Are they more or less misleading? Like the little t-shirt diagrams in Envisioning Information but the internal areas are proportional to the values. Just a thought.
-- Max M Houck
Showing data for spatially located nouns is a difficult problem. Already both dimensions of paper are used for the underlying map; then 2-dimensional circles represent 1-dimensional numbers; and finally the dreaded pie chart.
Worth a try is a table, with the nouns ordered by something important (the circle area variable?). Or perhaps a table-graph with ordered nouns and some little bars. Then have a map near the table.
The other problem in most data of this sort is their log-normal distribution (many small values, a few big values) over the geographic units (many small cities, a few big cities).
Some of the values in this map look wild: for example, Hitchin's little red slice labelled "1,500%".
-- Edward Tufte
You could present the table of data as a sidebar to the map, with the order of towns linked to their vertical location on the map. So, in the part of the map shown, the order would be - Arlesey, Bletchley, Hitchin, Letchworth, Buzzard and so on.
In this way, a reader could easily locate the data if they know the town, and the town if they have the data. An alphabetical list would leave a reader unfamiliar with the locations searching the entire map to find the town. You could then leave the circles and their size to represent whatever absolute you require. Overlapping circles would be less of a problem as the circles now only represent one data value.
-- Andrew Nicholls (email)
What is supposed to be the first pie chart ever appears in William Playfair's Statistical Breviary (1801). It is reproduced and discussed on pages 44-45 of The Visual Display of Quantitative Information:
-- Edward Tufte
As only the red segments of some towns are labelled with quantities, could some of the information be removed (for the purposes of the diagram) thus freeing up space for some other representation? The full set of data could be tabulated elsewhere.
Could a horizontal, scaled and sectioned bar under (or over) each town name do the job? That way, using a ruled edge, direct comparison of sizes between towns could be made, whilst the relative proportions of the data, within a town, would still be visible. There may be scope for the numerical labels to sit within the bar, too, though this may make them too large for the area.
Both Hitchin and Hertford have odd numbers labelling their red segments. With the textured background (and it may be down to my monitor), I can't make up my mind whether they are in the thousands - 1,500% and 1,030% (which seems to go against the relative sizes) - or decimalised to too many places - 1.500% and 1.030%. I suspect the former.
Without wishing to digress from the thread, as both my town of residence and place of work are on this map, I would be interested in what data is being shown.
-- Adam (email)
Thank you, that's interesting. That explains Hertford and Hitchin's percentages, then. Nowadays, in places like Hertford at least, there would be a segment for wood, though I would guess that it would be quite small.
-- Adam (email)
Right orthogonal with cast shadows???
I'm feeling like an illiterate- how *do* I post an image on this board?
-- Ziska (email)
I'm really interested in how information is communicated visually- so I'm
wondering- is this just as awful (or perhaps worse) than the pies? On the plus
side it would be simple to put in some paths- rivers, road, etc. which would
engage the viewer- *but* it still doesn't give you that nice side by side
comparison which allows you to compare the amounts easily.
I'm not an academic or a cartographer- I just measure elevators.... so please
bear with me...
-- Ziska Childs (email)
ET's books, especially The Visual Display of Quantitative Information, explain why Ziska's graph (however earnest) fails to help readers understand the data. Ziska's graph emphasises decoration rather than numbers: good graphs, that is, graphs that communicate with ease, maximise data and minimise ink. The best graphs aren't a puzzle for the reader, but make a clear statement.
In the last couple of years I have led a one-day course on how to design readable tables and graphs. I have done a lot of thinking on what makes a good graph, by which I mean graphs readers can understand quickly and might recall later. Here are some points:
1. During the discovery stage of your work you can use any style or type of graph you wish. Design only becomes important as soon as you want to convey information to someone else. At that point you have to create graphs that communicate ideas to others.
2. Graphs communicate most easily when they have a specific message, for instance, "coffee production up!". They lose impact and are less successful at conveying ideas when their point is vague - for example, "The number of students in public high schools, 1993-2003."
3. Graphs are powerful when you use the title to reinforce your specific message: "The number of students in public high schools has fallen by a third in ten years". Such transparent messages will be understood and remembered by readers. The graph itself acts as evidence for your assertion. Certainly if you don't tell readers what the graph is saying, some will never know.
4. Finally, after years of hard thinking I have concluded that graphs are like jokes: if you have to explain them they have failed.
-- Sally Bigwood (email)
Sorry if I offended.....and I do agree wholeheartedly with all of the above.
However this is a recurring problem- stats *and* geography. Simply put- the
more data you have- the less time you have to analyze that data- the more
people of different cultures who need to respond to the analysis- the more
trouble you are in.....That joke had better be a Henny Youngman one liner.
Now as to those feeble little attempts...just trying to respond to Adam's verbal
description. (And to the statement "using a bar chart instead of a circle we
should need a separate symbol for the location of the town and perhaps an
arrow pointing to it".) In fact the principle behind this is decidedly "not" a graph
as you so wisely noted. It is a 17th century map. That map's purpose was to
show power and ownership- not percentages.
A sense of place is often more crucial than the statistics themselves - since
without the identification of the viewer with a familiar place- they will not listen
to the statistics. The "Hey that's my neighborhood" response should not be
underestimated- we saw it here in this thread....
Beauty as another method of engaging your audience should not be
underestimated. By beautiful I do not mean "ornate".
Perhaps the only joke with a good punch line in this case is a pie? The pies
do seem to win both the statistical and the geographic battle....
Isn't Martin right? Shouldn't we be able to do better?
-- Ziska Childs (email)
I must disagree with many of the comments about this graph,
including some made by my sibling, Sally Bigwood. ET said it all
in his January 21 post, but I will (foolishly?) try to explicate.
As far as I can see, the map is unnecessary for the information.
Given the topic (changes in energy sources, 1954-1958) it is
unlikely that readers will be unfamiliar with the geography. It is
further unlikely that anyone, familiar or unfamiliar with the
location, needs to see that Arlesey is northwest of Letchworth to
appreciate and use the data. Geography might matter if we saw
a trend, such as increase in coal use as one moved west. More
interesting is what trends do exist: Do smaller cities rely more on
gasoline? Did the cities with large oil increases erect new
buildings and homes in the four years, changing energy
Once the map is removed, we're looking at pies. Pies are not evil
because of prevailing orthodoxy; the prevailing orthodoxy is a
reaction to the inescapable distortions of pies. The human eye
simply fails to make accurate estimation of the relative slices (for
a perceptual psychologist's opinion, look at the work of Steven
Kosslyn). Multiple pies compound the inaccuracies.
As ET suggests, a table or table-graph is probably the best
bet. You will need to decide if you group the four energy sources
together by town or group towns according to energy sources.
Avoid component bars and columns; like pies, they are difficult to
interpret and frequently distort the data. A separate bar/column
on increase in oil consumption would highlight this feature, but
won't work because of the extreme range (130 to 1,500). A table
A total illustration presenting different data comparisons (small
comparison graphs; a table; the map if necessary) would
resemble several of the NY Times graphics.
Now my disagreements with some of the arguments put forth:
--"the normal range of compromises we make in design" The
essential feature lies in accurate depiction of the data. Ergo, get
rid of the map and pies. Make your compromises in these
deletions, rather than in distortions.
--"engage the viewer:" See Tufte's work. Select engaging data if
you want to engage. No amount of irrelevant ink (rivers, roads)
will create interest or beauty.
--"specific message" "explanatory titles:" Yes, one of the
problems with the original here, is that I'm unsure of the graph's
purpose (hence our digressions about geography). A good title
would tell us what that purpose is. And small multiples present
several specific messages. But explanatory titles and the joke
analogy are too simplistic for many data displays.
Sophisticated information can have many dimensions (see
discussion of tree diagrams). Part of the power of Minard's map
is that it gives size of army, date, geography, and temperature. In
addition to the overall purpose and immediate impact (Look! The
army was this big and then fell to this small!), the map demands
examination. The more we examine the more we learn
(somewhat like reading an essay rather than an abstract?). As
we appreciate the interplay of contributing factors, the horror
becomes more acute than a number could convey.
As for "the statistical and the geographic battle," each set of data
is different and the purpose of each graph is different. Maps are
invaluable when appropriate; pies have some (limited) use. But
neither seems justified by Martin's information.
-- Melissa Spore (email)
I agree with Melissa's point that there would be no need of pies if there was no map - but the map is the reference base. In the atlas from which this comes there are several hundred maps of the whole UK (and thus several hundred maps of this area) that cover everything from rainfall, grassland (distinguishing between permanent and temporary), telephone traffic, and the manufacture of cardboard boxes.
Most of these do not correlate with the information on energy use, but some do - or could do. Transport links for the delivery of coal (including navigable rivers), local gas works (this was the 1950s and North Sea natural gas had not yet replaced town gas), underlying local coal measures - not identified by town but by area - all these and others are something for which the location in flatland adds to the comprehension.
The atlas uses a variety of graphical techniques, including tables and bar charts.
To show the regional differences in size between agricultural holdings it uses this, which I find effective - even without a key - in showing that the landscape of large estates stops quite abruptly.
This is the distribution of oak woodland.
Two curiosities here: at the scale of the map each spot represents the area the spot covers (100 acres), and the concentration at the bottom is the remnant of the vast Andreswald oak forest that covered South-eastern England in the Dark Ages.
And finally this, which shows the proportion of people speaking Welsh - (brown and purple high - blue through red/yellow/green decreasing).
Here the map location is important as it suggests that the seaside resorts and their hinterlands along the north coast have been "colonised" by erstwhile day-trippers from the English-speaking conurbations of Liverpool and Manchester to the east. And is there any significance that Beaumaris and Conway (not seaside resorts) are the sites of castles built and manned by the English to keep the Welsh down?
I have included these different examples to illustrate that the cartographic team behind the atlas gave thought to the choices of representation. In starting this thread I wanted to examine the statement "only one thing worse than a pie chart and that is lots of them". What the responses have brought out is the reasoning: "pie charts are bad because . . ." which is very helpful.
However, in examining and understanding why they are bad, we must also place them in the context of the whole set of data in the atlas, and in the space available on the page. My view is that they are defensible, and I have faith that pie charts were chosen by the cartographers because they were the best solution possible given all the other constraints: the decision was not careless or uninformed.
There is indeed one specific instance where I will use pie charts in preference to any other form of presentation: when I wish to draw attention to an unusual or unexpected proportion. I was retained by a large client to analyse the sale of their 300 financial services products so that they could see which ones were most profitable - they each cost more-or-less the same to produce. One product accounted for 78% and the rest were apparently nowhere - a result that was entirely unexpected. I think the mass of red in the pie chart is more effective than the height of a bar with a width one three-hundredth of the x-axis.
I also worked on a large project in a government department that was producing national medical statistics based on every individual patient episode in the country - all the data. A wealth of fascinating issues: I have viewed medical statistics with a very cynical eye ever since. How, for example, do you publish meaningful epidemiological figures where in certain areas some 80% of the diagnoses recorded are impossible (ie internally inconsistent with the rest of the dataset - such as a 76-year-old man giving birth to twins), or where over 50% of the diagnoses on admission are simply not filled in at all? The redoubtable Data Cleansing turned up in her black knitted shawl at every meeting and displayed the same ruthlessness as her nephews in the Ethnic department in eliminating large unwanted data populations and replacing them with a more respectable middle class. And in my researches I came across statistics that were not only unpublished, but politically unpublishable.
-- Martin Ternouth (email)
Perhaps the best, if not most data-rich, pie chart I've seen:
-- Stephen Hampshire (email)
More on Pie Charts
Most of the examples above discuss the value of combining Pie Graphs with maps. Rather than using Pie Charts in the geographic space, I am currently exploring the use of Pie Charts in the data space.
In a visualization project, still under development, called AgileGraph (www.agilegraph.com), I am considering the use of Pie Charts in a Scatter Plot to show the breakdown of aggregated data elements.
The diagram below shows a set of pre-owned SUVs by mileage and price. The plotted symbols are color coded by the vehicle's actual color. This shows three dimensions of the data in one diagram.
However, there are inevitable overlaps in this data set, and these are aggregated into larger symbols, designated with their total. Pie Charts are used to visually show the percentage of vehicle colors represented in that aggregate.
"Location" is still given by the center of each circle, meaning there are 5 vehicles for sale at or near $15K, each with roughly 80K miles. However, unlike the original example, each circle is not "proportionate to the size of the total of which the segments are part." This is an enhancement I am considering for the future.
I believe that this is an intuitive way to utilize Pie Graphs to quickly communicate three different dimensions of the data in a single diagram, while reducing visual clutter.
If you're looking for an SUV for under $15K, and with less than 20K miles, you can easily understand your available choices regarding vehicle color.
...pretend for a moment that none of the vehicles have two-tone paint. :)
-- Jeff Carpenter (email)
Via reddit.com today, I found this interesting gem, which examines the use of segmented square blocks to replace pie charts. I had thought that the only thing worse than a pie chart was more pie charts (or possibly, three-dimensional pie charts), but I think I have been mistaken.
At least pie wedges can be compared, unidimensionally, on the basis of degrees or radians.
-- Scott Zetlan (email)
One example where a pie chart makes some sense
I have loathed pie charts since reading The Visual Display of Quantitative Information. The main reason is difficulty in quantifying the “sectors” swept by a typical pie chart compared to simply reading the same data in a table.
The pie chart could be useful in the context of a clock, as this .pdf illustrates. The example shows a typical hour of the Michael Reagan Show for “pitching” to potential advertisers. I am not a fan of this show; I just offer the example. Please disregard the “chart-junky” appearance, needless coding, lack of scale and problematic typography. The hyperlink is a truncation of the actual file name:
***Editor: Please insert graphic image here, if desired***
The graphic allows us to dissect a typical hour of a typical talk-show. There are four program segments per hour, with two “floating” breaks (at the show’s discretion) and two “hard” breaks (fixed time) per hour. Program “content” totals 44:20 minutes per hour. Local and national advertising (“avails” in Radio-Speak) total 8 minutes and 6 minutes, respectively. News and other fluff round out the hour.
I don't recall ever seeing a pie chart-as-clock before.
-- Jon Gross (email)
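The arithmetic behind the clock reading above can be checked with a short Python sketch. The durations are the ones quoted in the post; treating "news and other fluff" as whatever is left of the hour is an assumption, as are the segment labels and the helper name `to_seconds`:

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

HOUR = 3600  # seconds in one hour, i.e. one full turn of the clock face

# Durations quoted in the post.
segments = {
    "program content": to_seconds("44:20"),
    "local advertising": to_seconds("8:00"),
    "national advertising": to_seconds("6:00"),
}
# Assumption: news and other material fill the remainder of the hour.
segments["news and other"] = HOUR - sum(segments.values())

# On a clock face, one second sweeps 360 / 3600 = 0.1 degrees.
angles = {name: secs * 360 / HOUR for name, secs in segments.items()}

for name, secs in segments.items():
    print(f"{name}: {secs // 60}:{secs % 60:02d} -> {angles[name]:.0f} degrees")
```

On these figures the remainder comes to 1:40, and the four wedges sweep 266, 48, 36 and 10 degrees. Because the minute hand already gives the pie a familiar scale, the angles can be read as times, which an ordinary pie chart does not allow.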
Natural disasters kill Americans
I would disaggregate the data in the map J. Jenks posted, and plot instead one small dot
for each incidence, in an appropriate color. I would expect to see features of the
landscape become visible: a wandering line of blue dots should show the Mississippi
River's flood plains, and clusters of brown dots should show cities along California's
Instead of using the full range of gray for SMR, I'd tone it down to minimal effective
steps from white, to make plotted, colored dots more consistently visible. It's probably
still worth coloring in the entire region, if the SMR data is averaged over the region.
I'd move the colors for Tornado and Other to the dark end of gray, or select additional
basic colors, such as red and black. I'd generally reshuffle colors so that the events
which are most common (severe weather, winter weather, other) have cool, pastel colors,
since there are many data points, and infrequent or strictly regional events
(geophysical, lightning, tornado) have stronger colors to draw attention.
-- Jason Catena (email)
A pie-pie chart
-- Edward Tufte
An old-school gamer's pie chart
-- Irwin Anolik (email)
Hey, I know the Pac Man post is a full-on joke, but I can't resist mentioning that its legend is itself misleading. The "Percentage of chart that looks like" Pac Man is in fact 100%; the outline of Pac Man cannot be said to exist without the missing wedge. While I can't put my finger on it right now, this seems to have a companion mistake in serious visualizations. Perhaps it is a rough parallel to when an "Others" wedge is > 33% but isn't even labeled because it is counter to the intended "influence" of the chart... maybe something else, but it seems like a classic "truthy chart, falsy legend" kind of thing. Again, I know it is a joke, and no offense, but it's missing something (unless that too was part of the joke I'm not getting).
-- Quito (email)
If the only thing worse than a pie chart is lots of them, then I think I've discovered that the only thing worse than lots of them, is half of one. See attached.
This example comes from a performance management review. The key result areas are weighted out of 100%, and a score is calculated in the table at right. The example total given is 93. The writer is then instructed to place the score total in the chart. Where the score lands will decide whether the employee has failed to meet expectations (FM) through to Far Exceeding Expectations (FE).
Note how a score in the table (with an apparent maximum of 100), when transferred to the pie, becomes a percentage, but not out of 100!!
How does one calculate a lie factor in a pie chart when the proportions are seemingly equal but the numerical data is not mathematically sound? Is it possible to calculate, or is calculating the lie factor restricted to other types of graphical representation?
-- Liam (email)
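On the question of calculating a lie factor: The Visual Display of Quantitative Information defines it as the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is a relative change. A minimal Python sketch follows. The function names are my own, and the numbers are those of the book's fuel-economy example, not measurements of the chart above, which are not given in this thread:

```python
def effect_size(first, second):
    """Relative change from `first` to `second`."""
    if first == 0:
        raise ValueError("the first value must be non-zero")
    return (second - first) / first

def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return (effect_size(graphic_first, graphic_second)
            / effect_size(data_first, data_second))

# Fuel-economy example: lines 0.6 and 5.3 inches long stand for
# 18 and 27.5 miles per gallon.
factor = lie_factor(0.6, 5.3, 18.0, 27.5)
print(f"lie factor: {factor:.1f}")
```

Nothing in the definition restricts it to bars or lines. For a pie or half-pie one would first have to decide what counts as the graphic measure (wedge angle seems the natural candidate, though that is an assumption) and then compare the change in that measure with the change in the underlying numbers.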
To digress a little further on an eight-year-old thread - I think you are both right. The yellow area of the chart does look like PacMan, AND the "percentage of chart that looks like PacMan" is 100%.
Consider this: The unlabeled space does not actually appear to be part of the pie chart, since it is not bounded by the edge of the pie. If we assume this to be the case, then the pie is simply not round. Now we can say that 100% of the pie indeed looks like PacMan, since the pie itself is PacMan-shaped, and the yellow area occupies 100% of the pie.
Further evidence of the difficulty in communicating via pie charts.
-- Thomas (email)