Pie Charts
One of the prevailing orthodoxies of this forum – one to which I whole-heartedly subscribe – is that pie charts are bad and that the only thing worse than one pie chart is lots of them.
However, I offer the attached.
The original is 4cm by 6cm, and within that space I find it difficult to think how else I could represent the information. Location is given by the centre of each circle, and each circle is sized in proportion to the total of which the segments are part. Obviously there are disadvantages: bar charts for each location would allow us to see whether the red of St. Albans is greater or smaller in absolute terms than the red of Luton, but I think that falls within the normal range of compromises we make in design. The designer has chosen to tell us about proportions and not absolutes. With a bar chart instead of a circle, we would need a separate symbol for the location of the town, and perhaps an arrow pointing to it.
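For anyone redrawing a map like this, here is a minimal sketch of the sizing rule, assuming it is the circle's area (not its radius) that is meant to be proportional to the total. The town names and totals are made up for illustration; they are not read off the atlas.

```python
import math


def radius_for_total(total, reference_total, reference_radius):
    """Return the radius that keeps circle AREA proportional to `total`.

    Area grows with the square of the radius, so the radius must grow
    with the square root of the total.
    """
    if total < 0 or reference_total <= 0:
        raise ValueError("total must be >= 0 and reference_total must be > 0")
    return reference_radius * math.sqrt(total / reference_total)


# Hypothetical totals, purely illustrative.
totals = {"Town A": 100.0, "Town B": 400.0}
for name, total in totals.items():
    # A town with total 100 is drawn with a radius of 2 mm (an assumption).
    r = radius_for_total(total, reference_total=100.0, reference_radius=2.0)
    print(f"{name}: total={total:.0f}, radius={r:.1f} mm")
```

A town with four times the total gets only twice the radius. Scaling the radius directly would make it look sixteen times as large.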
Any views?
You gave me a flashback to a college course in cartography with that graphic! I remember drawing something sinisterly like this having to do with registered voters in Wisconsin or some such. In India ink on onion skin. Ew.
How about small multiples retaining the colors for the categories? Or small comparison graphs like in Tufte’s medical information display paper?
We all know that pie charts are bad but what about flags (ie, rectangles)? Are they more or less misleading? Like the little t-shirt diagrams in Envisioning Information but the internal areas are proportional to the values. Just a thought.
Showing data for spatially located nouns is a difficult problem. Already both dimensions of paper are used for the underlying map; then 2-dimensional circles represent 1-dimensional numbers; and finally the dreaded pie chart.
Worth a try is a table, with the nouns ordered by something important (the circle area variable?). Or perhaps a table-graph with ordered nouns and some little bars. Then have a map near the table.
The other problem in most data of this sort is their log-normal distribution (many small values, a few big values) over the geographic units (many small cities, a few big cities).
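A small sketch of why that skew matters for display, using invented values that span several orders of magnitude (they are not taken from the map). It compares where each value lands on a linear scale and on a log scale, both normalised to run from 0 to 1.

```python
import math


def linear_and_log_positions(values):
    """Return (linear, logged) positions in [0, 1] for positive values."""
    lo, hi = min(values), max(values)
    if lo <= 0 or lo == hi:
        raise ValueError("values must be positive and not all equal")
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    linear = [(v - lo) / (hi - lo) for v in values]
    logged = [(math.log10(v) - log_lo) / (log_hi - log_lo) for v in values]
    return linear, logged


# Hypothetical sizes: many small units, one big one.
sizes = [1_000, 10_000, 100_000, 1_000_000]
linear, logged = linear_and_log_positions(sizes)
for size, lin, lg in zip(sizes, linear, logged):
    print(f"{size:>9,}  linear={lin:.3f}  log={lg:.3f}")
```

On the linear scale the three smaller values are crowded into the bottom tenth of the range; on the log scale they are evenly spaced. Whether a log scale suits a given audience is a separate question.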
Some of the values in this map look wild: for example, Hitchin’s little red slice labelled “1,500%”.
You could present the table of data as a sidebar to the map, with the order of towns linked to their vertical location on the map. So, in the part of the map shown, the order would be – Arlesey, Bletchley, Hitchin, Letchworth, Buzzard and so on.
In this way, a reader could easily locate the data if they know the town, and the town if they have the data. An alphabetical list would leave a reader unfamiliar with the locations searching the entire map to find the town. You could then leave the circles and their size to represent whatever absolute you require. Overlapping circles would be less of a problem as the circles now only represent one data value.
What is supposed to be the first pie chart ever appears in William Playfair’s Statistical Breviary (1801). It is reproduced and discussed on pages 44-45 of The Visual Display of Quantitative Information:
As only the red segments of some towns are labelled with quantities, could some of the information be removed (for the purposes of the diagram), thus freeing up space for some other representation? The full set of data could be tabulated elsewhere.
Could a horizontal, scaled and sectioned bar under (or over) each town name do the job? That way, using a ruled edge, direct comparison of sizes between towns could be made, whilst the relative proportions of the data within a town would still be visible. There may be scope for the numerical labels to sit within the bar, too, though this may make the bars too large for the area.
Both Hitchin and Hertford have odd numbers labelling their red segments. It may be the textured background, or it may be my monitor, but I can’t make up my mind whether they are in the thousands (1,500% and 1,030%, which seems to go against the relative sizes) or decimals carried to too many places (1.500% and 1.030%). I suspect the former.
Here are some points on what makes a good graph:
1. During the discovery stage of your work you can use any style or type of graph you wish. Design only becomes important as soon as you want to convey information to someone else. At that point you have to create graphs that communicate ideas to others.
2. Graphs communicate most easily when they have a specific message — for instance, “coffee production up!” They lose impact and are less successful at conveying ideas when their point is vague — for example, “The number of students in public high schools, 1993-2003.”
3. Graphs are powerful when you use the title to reinforce your specific message — “The number of students in public high schools has fallen by a third in ten years.” Such transparent messages will be understood and remembered by readers. The graph itself acts as evidence for your assertion. Certainly if you don’t tell readers what the graph is saying, some will never know.
4. Finally, after years of hard thinking I have concluded that graphs are like jokes: if you have to explain them they have failed.
I must disagree with many of the comments about this graph,
including some made by my sibling, Sally Bigwood. ET said it all
in his January 21 post, but I will (foolishly?) try to explicate.
As far as I can see, the map is unnecessary for the information.
Given the topic (changes in energy sources, 1954-1958) it is
unlikely that readers will be unfamiliar with the geography. It is
further unlikely that anyone, familiar or unfamiliar with the
location, needs to see that Arlesey is northwest of Letchworth to
appreciate and use the data. Geography might matter if we saw
a trend, such as increase in coal use as one moved west. More
interesting is what trends do exist: Do smaller cities rely more on
gasoline? Did the cities with large oil increases erect new
buildings and homes in the four years, changing energy
sources?
Once the map is removed, we’re looking at pies. Pies are not evil
because of prevailing orthodoxy; the prevailing orthodoxy is a
reaction to the inescapable distortions of pies. The human eye
simply fails to make accurate estimation of the relative slices (for
a perceptual psychologist’s opinion, look at the work of Steven
Kosslyn). Multiple pies compound the inaccuracies.
As ET suggests, a table or table-graph is probably the best
bet. You will need to decide whether to group the four energy sources
together by town or to group towns according to energy source.
Avoid component bars and columns; like pies, they are difficult to
interpret and frequently distort the data. A separate bar/column
on increase in oil consumption would highlight this feature, but
won’t work because of the extreme range (130 to 1,500). A table
becomes necessary.
A total illustration presenting different data comparisons (small
comparison graphs; a table; the map if necessary) would
resemble several of the NYTimes graphics.
Now my disagreements with some of the arguments put forth
here:
I agree with Melissa’s point that there would be no need of pies if there were no map – but the map is the reference base. In the atlas from which this comes there are several hundred maps of the whole UK (and thus several hundred maps of this area) that cover everything from rainfall, grassland (distinguishing between permanent and temporary) and telephone traffic to the manufacture of cardboard boxes.
Most of these do not correlate with the information on energy use, but some do – or could do. Transport links for the delivery of coal (including navigable rivers), local gas works (this was the 1950s and North Sea natural gas had not yet replaced town gas), underlying local coal measures – not identified by town but by area – all these and others are something for which the location in flatland adds to the comprehension.
The atlas uses a variety of graphical techniques, including tables and bar charts.
[Image: bar charts]
To show the regional differences in size between agricultural holdings it uses this, which I find effective – even without a key – in showing that the landscape of large estates stops quite abruptly.
This is the distribution of oak woodland.
Two curiosities here: at the scale of the map each spot represents the area the spot covers (100 acres), and the concentration at the bottom is the remnant of the vast Andreswald oak forest that covered South-eastern England in the Dark Ages.
And finally this, which shows the proportion of people speaking Welsh – (brown and purple high – blue through red/yellow/green decreasing).
Here the map location is important as it suggests that the seaside resorts and their hinterlands along the north coast have been “colonised” by erstwhile day-trippers from the English-speaking conurbations of Liverpool and Manchester to the east. And is there any significance that Beaumaris and Conway (not seaside resorts) are the sites of castles built and manned by the English to keep the Welsh down?
I have included these different examples to illustrate that the cartographic team behind the atlas gave thought to the choices of representation. In starting this thread I wanted to examine the statement “only one thing worse than a pie chart and that is lots of them”. What the responses have brought out is the reasoning: “pie charts are bad because . . .” which is very helpful.
However, in examining and understanding why they are bad, we must also place them in the context of the whole set of data in the atlas, and in the space available on the page. My view is that they are defensible, and I have faith that pie charts were chosen by the cartographers because they were the best solution possible given all the other constraints: the decision was not careless or uninformed.
There is indeed one specific instance where I will use pie charts in preference to any other form of presentation: when I wish to draw attention to an unusual or unexpected proportion. I was retained by a large client to analyse the sale of their 300 financial services products so that they could see which ones were most profitable – they each cost more-or-less the same to produce. One product accounted for 78% and the rest were apparently nowhere – a result that was entirely unexpected. I think the mass of red in the pie chart is more effective than the height of a bar with a width one three-hundredth of the x-axis.
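To put rough numbers on that comparison, here is a back-of-envelope sketch. The 78% share and the 300 products come from the account above; the 150 mm axis width is purely an assumption for illustration.

```python
def wedge_angle_degrees(share):
    """Angle swept by a pie wedge holding `share` (0..1) of the whole."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return share * 360.0


def bar_width(axis_width, n_bars):
    """Width available to each bar if n_bars share the axis equally, no gaps."""
    if n_bars <= 0:
        raise ValueError("n_bars must be positive")
    return axis_width / n_bars


# 78% of sales in one product, out of 300 products (figures from the post).
print(f"wedge angle: {wedge_angle_degrees(0.78):.1f} degrees")

# Assumed axis width of 150 mm; not a figure from the post.
print(f"bar width:   {bar_width(150.0, 300):.2f} mm")
```

Under those assumptions the dominant product sweeps about 281 of the pie's 360 degrees, while its bar would be half a millimetre wide.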
I also worked on a large project in a government department that was producing national medical statistics based on every individual patient episode in the country – all the data. A wealth of fascinating issues: I have viewed medical statistics with a very cynical eye ever since. How, for example, do you publish meaningful epidemiological figures where in certain areas some 80% of the diagnoses recorded are impossible (ie internally inconsistent with the rest of the dataset – such as a 76-year-old man giving birth to twins), or where over 50% of the diagnoses on admission are simply not filled in at all? The redoubtable Data Cleansing turned up in her black knitted shawl at every meeting and displayed the same ruthlessness as her nephews in the Ethnic department in eliminating large unwanted data populations and replacing them with a more respectable middle class. And in my researches I came across statistics that were not only unpublished, but politically unpublishable.
Via reddit.com today, I found this interesting gem, which examines the use of segmented square blocks to replace pie charts. I had thought that the only thing worse than a pie chart was more pie charts (or possibly, three-dimensional pie charts), but I think I have been mistaken.
At least pie wedges can be compared, unidimensionally, on the basis of degrees or radians.
Direct comparison of pie charts vs. tables in the “Coalition of the Willing” from the Washington Post, from the piece “Boots on the Ground in Iraq.”
I have loathed pie charts since reading The Visual Display of Quantitative Information. The main reason is difficulty in quantifying the “sectors” swept by a typical pie chart compared to simply reading the same data in a table.
The pie chart could be useful in the context of a clock, as this .pdf illustrates. The example shows a typical hour of the Michael Reagan Show, laid out for “pitching” to potential advertisers. I am not a fan of this show; I just offer the example. Please disregard the “chart-junky” appearance, needless coding, lack of scale and problematic typography.
The graphic allows us to dissect a typical hour of a typical talk-show. There are four program segments per hour, with two “floating” breaks (at the show’s discretion) and two “hard” breaks (fixed time) per hour. Program “content” totals 44:20 minutes per hour. Local and national advertising (“avails” in Radio-Speak) total 8 minutes and 6 minutes, respectively. News and other fluff round out the hour.
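The arithmetic behind the clock can be checked in a few lines. The durations are the ones quoted above; the segment names and helper functions are my own, and the sketch treats the hour as a 360-degree clock face, so each minute sweeps 6 degrees.

```python
HOUR_SECONDS = 60 * 60

# Durations quoted in the post, converted to seconds.
segments = {
    "program content": 44 * 60 + 20,
    "local advertising": 8 * 60,
    "national advertising": 6 * 60,
}


def remainder_seconds(parts, total=HOUR_SECONDS):
    """Seconds of the hour not accounted for by the listed parts."""
    return total - sum(parts.values())


def clock_angle(seconds):
    """Degrees swept on a one-hour clock face by a span of `seconds`."""
    return seconds * 360 / HOUR_SECONDS


def format_mmss(seconds):
    """Format a span of seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


for name, secs in segments.items():
    print(f"{name:22s} {format_mmss(secs):>6s}  {clock_angle(secs):6.1f} degrees")

left_over = remainder_seconds(segments)
print(f"{'news and other':22s} {format_mmss(left_over):>6s}  {clock_angle(left_over):6.1f} degrees")
```

On those figures, content and advertising take up 58:20, which leaves 1:40 for news and everything else.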
I don’t recall ever seeing a pie chart-as-clock before.
A wonderful piece of chartjunk from none other than Steve Jobs during the Macworld 2008 keynote, covered by Engadget:
You have to love the 21.2% versus the 19.5% slices of the pie.
The same keynote also featured this bizarre choice of patterns:
There is hope that the general population will soon grow to dislike 3D pie charts.
The graphic below accompanies a prominent story online about the types of natural disasters that kill Americans. It is an important topic, but the assortment of pies seems to obfuscate rather than clarify the story. My thought was that E.T. and followers would have useful ideas on how to improve the presentation and eliminate the pies.
I would disaggregate the data in the map J. Jenks posted, and plot instead one small dot
for each incident, in an appropriate color. I would expect to see features of the
landscape become visible: a wandering line of blue dots should show the Mississippi
River’s flood plains, and clusters of brown dots should show cities along California’s
fault lines.
Instead of using the full range of gray for SMR, I’d tone it down to minimal effective
steps from white, to make plotted, colored dots more consistently visible. It’s probably
still worth coloring in the entire region, if the SMR data is averaged over the region.
I’d move the colors for Tornado and Other to the dark end of gray, or select additional
basic colors, such as red and black. I’d generally reshuffle colors so that the events
which are most common (severe weather, winter weather, other) have cool, pastel colors,
since there are many data points, and infrequent or strictly regional events
(geophysical, lightning, tornado) have stronger colors to draw attention.
If the only thing worse than a pie chart is lots of them, then I think I’ve discovered that the only thing worse than lots of them is half of one.
This example comes from a performance management review. The key result areas are weighted out of 100%, and a score is calculated in the table at right. The example total given is 93. The writer is then instructed to place the score total in the chart. Where the score lands will decide whether the employee has failed to meet expectations (FM) through to Far Exceeding Expectations (FE).
Note how a score in the table (with an apparent maximum of 100), when transferred to the pie, becomes a percentage, but not out of 100!
How does one calculate a lie factor in a pie chart when the proportions are seemingly equal but the numerical data is not mathematically sound? Is it possible to calculate, or is calculating the lie factor restricted to other types of graphical representation?
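For reference, the lie factor in The Visual Display of Quantitative Information is the size of the effect shown in the graphic divided by the size of the effect in the data, where the size of an effect is (second value - first value) / first value. Nothing in that definition is tied to one kind of chart. A minimal sketch, assuming wedge angle is taken as the drawn magnitude in a pie; the numbers are invented and are not taken from the review form.

```python
def effect_size(first, second):
    """Relative change from `first` to `second`."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return (second - first) / first


def lie_factor(data_first, data_second, shown_first, shown_second):
    """Effect shown in the graphic divided by effect present in the data."""
    data_effect = effect_size(data_first, data_second)
    if data_effect == 0:
        raise ValueError("lie factor is undefined when the data do not change")
    return effect_size(shown_first, shown_second) / data_effect


# Hypothetical: the data rise from 50 to 75 (a 50% increase), but the
# wedges drawn for them sweep 60 and 120 degrees (a 100% increase).
print(lie_factor(50, 75, 60, 120))
```

With those made-up numbers the graphic doubles the effect in the data. When the wedges are equal but the numbers are not, the shown effect is zero and the ratio comes out as zero, which suggests the lie factor may not be the most telling measure for that particular form.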