In Defence of the Humble Pie Chart
It seems to have become a truism in data visualization that pie charts are to be avoided. A Google search for “avoid pie charts” returns 1.4 million results. At the top of the list are articles with titles ranging from the extreme (Pie Charts are the Worst) to the fanatical (Death to Pie Charts). This mantra has been repeated so often that it has turned into gospel. For example, I came across an article describing a set of wonderful-sounding visualization tools being created at the Swiss newspaper NZZ. The authors mention that “editors want pie charts real bad”, but that they refuse to build them and don’t feel the need to explain why (I tweeted the author and didn’t get a response).
This seems to be a situation that is repeated over and over. Lay people like pie charts and ask for them, while supposed data visualization experts deride them and refuse. Any time a situation like this develops, one should be extremely suspicious of the experts. It’s possible that all these people clamouring for pie charts are misguided and will only do themselves visualization harm. However, it’s also possible that they know something the dogmatic experts haven’t taken into account. With that in mind I set off to examine the issue more closely and draw my own conclusions. Little did I know what I was getting myself into, or how long this issue has been contested…
Ways to Show Percentages
Before we dig in, we need to cover some basics. Pie charts are a way of showing percentages of a whole. For our purposes we will consider pie charts and doughnut charts as equivalent. Other ways of showing the same information are bar charts, stacked bar charts (sometimes called divided bar charts) and tables. Bar charts align a series of bars along a common axis allowing you to easily see their relative length. Stacked bar charts stack those bars on top of each other. Intuitively we can see that if we curve a stacked bar chart so that its end touches its beginning, then we will have created a doughnut chart.
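To make the curving intuition concrete, here is a minimal Python sketch (mine, with made-up numbers). It stacks values end to end as fractions of the whole, then maps those same fractions onto 360 degrees. The function names and the sample data are assumptions for illustration only.

```python
from itertools import accumulate


def stacked_segments(values):
    """Stack values end to end; return (start, end) as fractions of the whole."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must sum to a positive total")
    ends = [running / total for running in accumulate(values)]
    starts = [0.0] + ends[:-1]
    return list(zip(starts, ends))


def to_angles(segments):
    """Curve the stacked bar into a ring: each fraction becomes degrees."""
    return [(start * 360.0, end * 360.0) for start, end in segments]


shares = [50, 25, 15, 10]            # hypothetical category values
segments = stacked_segments(shares)  # positions along a stacked bar
angles = to_angles(segments)         # the same positions around a doughnut

for (s, e), (a0, a1) in zip(segments, angles):
    print(f"bar {s:.2f} to {e:.2f}  ->  ring {a0:.0f} to {a1:.0f} degrees")
```

The only difference between the two representations is the final multiplication: the end of the last segment lands at 360 degrees, which is where the bar’s end meets its beginning.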
And that’s all we’re arguing about. Ways to show percentages. It hardly seems worthy of all the vitriol, yet here we are.
History & Literature Review
The pie chart controversy is old. Dan Kopf provides a great overview of the issue and traces it all the way back to Willard Brinton’s Graphic Methods arguing against them in 1914. By 1926 the feud was well underway, as seen in an article by Eells [1] arguing for their use against a backdrop of general criticism. This is rather impressive given that they were only first published in William Playfair’s The Statistical Breviary in 1801. Since then not much has changed.
The modern take on this issue usually traces its roots to Cleveland and McGill’s seminal work on graphical perception [2], which included a list of ways of encoding information ranked from most to least accurate.
Cleveland & McGill’s ranking of 10 elementary perceptual tasks:
1. Position along a common scale
2. Positions along non-aligned scales
3. Length, direction, angle
4. Area
5. Volume, curvature
6. Shading, colour saturation
In the paper, Cleveland & McGill investigated each of these for accuracy in a number of tasks. Overall they found pie charts and stacked bar charts to be roughly equivalent in performance, with neither being particularly accurate. Instead they recommended the use of a bar chart or dot chart to encode the same information.
Interestingly I’ve found that in more modern representations of their ranking, length is sometimes placed higher than angle, while still citing the Cleveland & McGill paper. Alberto Cairo’s wonderful book The Truthful Art is one example, Nathan Yau’s Flowing Data website is another. I’m not sure why this is the case, but it subtly favours one side of the pie-bar argument in a sneaky way.
In my quest to understand this controversy I ended up reading quite a few research papers on the topic. In general they ran empirical studies looking at subtly different user tasks and found mostly conflicting results. A chronological summary of the papers is available here. Here are my notes on what the various papers had to say about each task:
Estimating the percentage of the whole:
– Bar charts were better at estimating direct magnitude [3]
– Pie charts were the most accurate and as fast as stacked bars. Bar charts were the least accurate, and their speed decreased with the number of bars. [4]
– Pies were as good as tables [8]
Comparison of categories; identify the largest/smallest:
– Pie charts were the slowest and least accurate [5]
– Tables were the most accurate, while bar and pie charts were roughly equally less accurate. [6]
– Pies were as good as tables for judging which of two categories was larger [8]
Difference of Percentage:
– Tables were the most accurate, while bar and pie charts were roughly equally less accurate. [6]
– Pies were as good as tables [8]
– Pies were better for comparisons of combinations of proportions [3]
– Pie charts were superior when mental summation of slices was required [5]
– Tables were better than graphs [8]
What did I learn from all this research? Are we any closer to being able to answer the question of whether pie charts are good or bad? My main takeaway is that the results seem to be highly dependent on context and task. Different representations seemed to favour different tasks, but given the disagreement between papers it’s hard to know how to proceed. Perhaps I could try re-reading them to distinguish the subtle differences between the various study tasks, but even then it isn’t clear how useful the results would be.
Taking a step back, I’m even more dubious of the applicability of these papers. For example, testing the accuracy of estimating proportion using pie slices or bars makes sense on the surface, but often we only need to get the gist of the distribution of values from the visuals. In these situations we have a few heuristics for estimating pie chart values. For example, 50% is an obvious straight line; 25% is an almost as obvious right angle; we’re used to judging twelfths from reading analogue watches; other values can be estimated by comparing them to these familiar shapes. For precision we can add a percentage label to the pie slice or read the bar height on the axis. Here both would work better than a table that only offers precision without that initial gestalt understanding.
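The arithmetic behind these landmarks is simple: a slice’s angle is its share of 360 degrees, and each hour on a clock face spans 30 degrees. As a rough sketch (the function name and the sample values are my own assumptions), here is how a percentage snaps to the nearest “hour”:

```python
from fractions import Fraction


def nearest_clock_landmark(percent):
    """Snap a percentage to the nearest twelfth of a circle.

    Returns (share, degrees, hours): the share as a fraction of the whole,
    the slice angle in degrees, and how many clock hours the slice spans.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    hours = round(percent / 100 * 12)              # 12 hours on the dial
    return Fraction(hours, 12), hours * 30, hours  # each hour spans 30 degrees


for pct in (50, 25, 33):  # hypothetical values
    share, degrees, hours = nearest_clock_landmark(pct)
    print(f"{pct}% is about {share} of the pie: {degrees} degrees, {hours} hours")
```

A half comes out as the straight line (180 degrees, six hours), a quarter as the right angle (90 degrees, three hours), and 33% lands on four hours, or roughly a third.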
Similarly, it is difficult if not impossible to know up front what tasks people will perform with your visualization. If the correct choice of representation depends on whether they will estimate proportions of the whole, sum percentages or estimate differences, then the design task seems hopeless. Instead the best we can do is choose a representation that gives users a general idea of what is going on in their data along with the means to delve in for more details.
All of this is to say that none of the research is likely to convince any of the pie chart naysayers of its virtues, nor the pie enthusiasts of the superiority of bars. So if the research won’t help us, what else can we say about this issue?
The Big Picture
Overall I was somewhat disappointed in the research. Given that the issue has existed for over a hundred years, I expected greater nuance in the situations examined. For example, none of the studies looked at what types of data distributions are most appropriate for each representation. Since this doesn’t seem to exist, here are some initial thoughts along those lines.
Data visualizations are understood in stages proceeding from the general to the specific:
1. Noticing the visualization
2. Determining the subject of the data
3. Absorbing a general gist understanding of the data
4. Delving more deeply looking for insights, accuracy, questions and answers
Noticing is an often underappreciated aspect of data visualization design. Aesthetics matter, and more beautiful visualizations will attract users’ attention and encourage them to look deeper. Everyone is entitled to their opinion, but in general I’d give the nod to pies here. Their round forms will stand out better in most rectilinear layouts. Similarly, their lack of axes, ticks and gridlines can make for a less fussy visual impact.
Once they’ve noticed the visualization, the next step is to figure out what it’s about. Here again the pie chart has a natural advantage. As many others have noted, pies implicitly indicate that the data represents proportions of a whole. Bars on the other hand could represent any kind of quantitative values. Bar chart axes extend to infinity; pies present a finite whole to be subdivided.
Then with a subject in mind, the user will try to get a general sense of what the data is saying. How many categories are there? Do one or two categories dominate or are the values fairly evenly distributed? Here I feel that depending on the data, either pie or bar charts (or both) may work well. For situations with a small number of categories I would favour pie charts, and for more categories bars are likely better. Evidence looking at this topic would be helpful.
Finally, with a general understanding of the data, the user may go looking for greater precision or more specific answers. Now we’re into the kinds of tasks considered by all the research, even though we didn’t find firm conclusions. Bar chart proponents will point out the greater precision of measuring position on a common baseline. On the other hand, simply labelling a pie slice with the percent value gives precision equal to a table.
One additional benefit of pie charts is that they lend themselves to some interesting applications. Because they are self-contained and don’t rely on external axes for interpretation, they can be sized and laid out in ways that are difficult to do effectively with bar charts.
Pies can be sized to indicate the overall value they represent. Bars, regular or stacked, can be sized too, but it confuses the ability to compare position or length to determine percentages. One limitation of sizing pies is that they can typically only represent positive values, whereas bars can easily show negative numbers (colour might help in this case, but another representation may be best).
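One practical detail when sizing pies: if the pie’s area is meant to carry the overall value, the radius has to grow with the square root of that value, otherwise large totals look far bigger than they are. The sketch below assumes area is the intended encoding; the function name, the maximum radius and the totals are hypothetical.

```python
import math


def pie_radius(total, max_total, max_radius=50.0):
    """Radius such that the pie's AREA, not its radius, is proportional to total."""
    if total < 0 or max_total <= 0:
        # Sized pies cannot show negative totals, as noted above.
        raise ValueError("need a non-negative total and a positive maximum")
    return max_radius * math.sqrt(total / max_total)


totals = {"A": 400, "B": 100, "C": 25}  # hypothetical overall values per pie
largest = max(totals.values())
radii = {name: pie_radius(value, largest) for name, value in totals.items()}

for name, radius in radii.items():
    print(f"pie {name}: total {totals[name]}, radius {radius}")
```

With these numbers, a pie holding a quarter of the largest total gets half the radius, and therefore a quarter of the area.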
Sized pie charts separate the angle and area properties of the chart, making gestalt perception of the values more complex. When comparing large slivers versus small obtuse angles we have to read the shape, which is unlikely to be preattentive. Still, as seen in this example from the Wall Street Journal, sized pies can work to great effect.
Because pies are self-contained shapes, they can be placed anywhere without affecting position judgements that rely on a shared baseline. This example from Bloomberg Sports shows how a grid of pies can tell a story that would be hard to convey with other representations. Notice how it jumps out that the pitcher relies on the fastball when the count is 3-0.
Sized and Positioned Pies
These qualities of pie charts weren’t lost on their early users. In 1858 Charles Joseph Minard used sized pie charts on a map to show where in France the cattle consumed in Paris arrived from. This example shows how using a natural representation of proportion enables greater versatility in the use of other dimensions (size and position) to encode additional variables.
Given all that we’ve learned (or not), I thought it would be useful to sum things up with some simple heuristics on when to use the various types of charts. These are my opinions, so please let me know your thoughts.
Pies are great for quickly understanding that the data presented is about proportions of a whole rather than quantities. They are best when the accuracy of visually estimated values isn’t crucial or can be conveyed with additional info (e.g. labels). They work best for a small number of categories, especially when you want to encode additional information through position or size encodings.
To use pies most effectively there are a few tips worth keeping in mind:
- Avoid 3D because it distorts our perception of the pie values.
- Small slivers are hard to distinguish, so consider combining them into an ‘other’ slice.
- Order the slices consistently. The usual convention is to order slices from greatest to smallest clockwise from 12 o’clock. If you have multiple pie charts, it’s likely better to keep the ordering consistent.
- Use colour to distinguish the categories.
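The ‘other’ slice and the ordering tips are easy to apply before the data ever reaches a charting library. Here is a minimal sketch; the 5% threshold, the function name and the sample data are my own assumptions rather than an established rule.

```python
def prepare_slices(data, min_share=0.05, other_label="Other"):
    """Group small slivers into one slice and order the rest largest first.

    data maps category label -> value. Returns a list of (label, value)
    pairs, largest to smallest, with the combined 'other' slice last.
    """
    total = sum(data.values())
    if total <= 0:
        raise ValueError("values must sum to a positive total")

    kept, other = {}, 0
    for label, value in data.items():
        if value / total < min_share:
            other += value          # too small to read as its own slice
        else:
            kept[label] = value

    slices = sorted(kept.items(), key=lambda item: item[1], reverse=True)
    if other > 0:
        slices.append((other_label, other))
    return slices


data = {"A": 62, "B": 20, "C": 4, "D": 11, "E": 3}  # hypothetical values
slices = prepare_slices(data)
print(slices)
```

Drawing the result clockwise from 12 o’clock gives the conventional layout. For a set of pies that will be compared with each other, it is probably better to fix one category order up front and reuse it, rather than sorting each pie separately.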
Bars & Dots
Bar charts and dot plots, on the other hand, are great in the situations where pies struggle. They excel when there are a large number of categories and the smaller ones matter. In these cases bar charts provide the space to give every category equal attention and precision. They might require a slightly slower closer inspection to understand, but reward that attention with greater accuracy. This can come in especially handy when visually comparing similar values.
Dots have a couple of advantages over bars: 1) they don’t require a zero-baseline and thus can better highlight differences in similar values; 2) they can encode multiple values per row/column, which saves space and can be easier to read than clustered bars. On the other hand, dots don’t also encode their values with length & area, so they don’t provide a good gist view of the data and require closer inspection to understand.
Stacked bars are great when you have a sequence of percentage breakdowns. The bars will all have an equal height of 100%, which helps indicate that proportions rather than quantities are being represented, and the categories will line up, allowing (still tricky) length comparisons from one bar to another. Pie or bar charts would take up more space and make it harder to see the relationships between them. In the limit this becomes an area line chart, but even just a few separated stacked bar charts can be effective. Stacked bars are especially good when the categories have a natural ordering that can be encoded in the order of the stack. Note that any of the middle segments will be more difficult to judge than the top and bottom ones, so if possible add interaction to filter to see only those values dropped to a common baseline.
Note the absolute value labels above the bars. Depending on how important this information is, sized pie charts might have been a better choice.
Pies are great; bars are great. Like all visualization techniques the trick is knowing when and how to use them to maximum effect. Hopefully this post has given you some ideas along those lines. In the end we only have a limited number of visual dimensions available to convey information to our users and every situation is different. Do what works and ignore the dogma on all sides.
Except this. Never this. Please don’t make a scatter plot of stacked pie bars.
If you really want to combine pies and bars, the internet has you covered.
1. Eells, Walter Crosby. “The relative merits of circles and bars for representing component parts.” Journal of the American Statistical Association 21.154 (1926): 119-132.
2. Cleveland, William S., and Robert McGill. “Graphical perception: Theory, experimentation, and application to the development of graphical methods.” Journal of the American Statistical Association 79.387 (1984): 531-554.
3. Spence, Ian, and Stephan Lewandowsky. “Displaying proportions and percentages.” Applied Cognitive Psychology 5.1 (1991): 61-77.
4. Hollands, J. G., and Ian Spence. “Judging proportion with graphs: The summation model.” Applied Cognitive Psychology 12.2 (1998): 173-190.
5. Ancker, J. S., Y. Senathirajah, R. Kukafka, and J. B. Starren. “Design features of graphs in health risk communication: A systematic review.” Journal of the American Medical Informatics Association 13.6 (2006): 608-618.
6. Schonlau, Matthias, and Ellen Peters. “1RAND Corporation, 4570 Fifth Avenue, Suite 600, Pittsburgh, PA 15213; email.” (2008).
7. Heer, Jeffrey, and Michael Bostock. “Crowdsourcing graphical perception: Using Mechanical Turk to assess visualization design.” CHI 2010, April 10–15, 2010, Atlanta, Georgia, USA.
8. Schonlau, Matthias, and Ellen Peters. “Comprehension of graphs and tables depends on the task: Empirical evidence from two web-based studies.” Statistics, Politics and Policy 3.2 (2012): Article 5.