Excel’s statistical graphics

March 21, 2009  |  Edward Tufte
18 Comment(s)

From a Professor of Physics:

http://scienceblogs.com/principles/2009/03/why_does_excel_suck_so_much.php#more

Topics: E.T.
Comments
  • dermot says:

    The comments in the linked post are spot on. I love Excel as a business tool, and have been using it for nearly 20 years. However, the chart defaults are dreadful (which means most business charts I see are equally dreadful), and the upgrade in Excel 2007 was a terrible disappointment, because Microsoft went for bling instead of clarity. Their design managers have obviously never read your books!

  • David Magda says:

    Excel isn’t the only Office application that is ‘sub-standard’. A lot of people take issue with Word’s typography:

    http://robgoodlatte.com/2007/07/24/3-examples-of-bad-microsoft-word-typography/
    http://oestrem.com/thingstwice/2007/05/latex-vs-word-vs-writer/

    http://nitens.org/taraborelli/latex

    Microsoft has a lot of resources, with many very smart and passionate people working for them, and yet for some
    reason many of its products seem to be sub-par (to put it lightly).

  • Fred Mast says:

    I have “fond” memories of spending too much time on the formatting of graphs and reports for the various lab courses
    I took in undergrad. Rather than studying the data and thinking about the significance of the experiment I had
    conducted that day, my time was consumed with removing chartjunk from the Excel-generated graphs. All of my lab
    manuals had an example of what was expected (something similar to the “improved” graph on the scienceblog site) but
    no instructions on how to get there. Looking back over those lab reports I can see that I would frequently receive
    deductions for incorrect format rather than wrongful conclusions. Instead of revealing and engaging the scientific
    principles of the experiment, my lab reports taught me to conform to bureaucracy.
    Maybe ET’s “don’t do lowest common denominator design” can be expanded to “don’t force your students to use lowest
    common denominator software.”

  • Vassilis Golfinopoulos says:

    Actually, there is at least one program manager for Excel that has read Edward Tufte’s books, as stated in this relevant post (second sentence):

    http://blogs.msdn.com/excel/archive/2009/02/24/how-to-create-a-professional-chart-using-excel-2007.aspx

    It is a pity that Excel has such poor charting capabilities. Because of its widespread use it could educate people by example on the principles of good chart design.

    However, the solution in stated elsewhere in the Ask ET forum: use proper tools (i.e. statistical packages) for the job at hand.

  • Niels Olson says:

    OpenOffice is touted as the replacement for Microsoft Office. Unfortunately, when one goes looking around the internet
    for reviews of OpenOffice, one tends to run across some naysayers. The backstory is that Sun has been a relatively poor
    custodian of OpenOffice. So why do geeks still like OpenOffice? Are they just less informed wanna-be geeks? Turns out
    many Linux distros and the NeoOffice
    project for Mac
    derive their source code from the Go-OO.org fork of
    OpenOffice
    , whose politics the hackers have found more acceptable.

    That said, people still use the generic term OpenOffice, so I’ll stick with that. I’ve found OpenOffice is generally
    better than Excel by default, and the depth of howto available on the Internet is much better.¿¿Want to construct
    regression models and the functions aren’t built in? Somebody has a step-by-step tutorial for it. A classic is linear
    regression, which can be rather frustrating in Excel (I found it easier to work out a real-world sink-or-save-the-ship
    problem on my HP 48G calculator, circa 1992). I started using OpenOffice specifically because the community provides
    more help for users trying to do more complex things.

  • Derek Cotter says:

    Niels, how does OpenOffice compare with MSOffice as a graph application? I’m ready to believe that the spreadsheet, as a spreadsheet (i.e. an arrangement of cells containing formulae) is comparable.

  • Will Oswald says:

    Microsoft have now launched an official blog for the Excel 2010 product. While details are still sparse, there is a contact page available. While I am unfamiliar with the Microsoft development process, would it perhaps be worthwhile for contributors to this forum to lobby for improvements to the Excel graph displays so that the many workarounds and fixes that I and others have had to develop become part of the core charting capability instead?

  • Derek Cotter says:

    Will, I feel a little bit burned by the experience of the development of Excel 2007’s rewritten graph application. The blog that was set up for the public discussion of ideas was hosted by someone who swore he’d read and absorbed Tufte’s work, but the final result was insultingly at odds with the boasts.

    For example, Microsoft yet again failed to include the labelling of scatter points with the contents of a range that is neither the category, nor the value, nor the series. This is a function that Lotus 123 had in the 1980s, for goodness’ sake, and yet in the 2010s we will still be using third party VBA apps if we want to do it with a Microsoft product.

    Excel 2007’s “new” graph application contained no new statistical features, such as box and whisker graphs, but many new art effects, such as metallic look surfaces. It was very clear that the call for feedback had been nothing but PR, with no intention of responding to it with action.

  • Mike Davis says:

    It was interesting in a different thread to read Edward Tufte’s endorsement for the Origin graphing/statistics package. So while this thread seems to be complaining about Excel’s graphing capabilities I thought I’d offer another alternative that I’ve been using.

    The package is GNUPLOT and is available for free on a variety of platforms. It seems mainly to solve the Excel problem listed above because it’s a lot harder to use than Excel is, specifically you can’t accidentally choose an option that looks right when in fact it is wrong (the professor’s lament about his students graphing current and magnetic field strength).

    But despite its complications there are a lot of demonstration files supplied and I find it much easier to do useful work fast by quickly scanning through the demos to find one that’s close to what I want to do and then to examine the plot commands used in that demo.

    Here’s a collection of plots made using gnuplot.

  • amb says:

    For graphs of Excel data, an alternative I’m using more and more is MATPLOTLIB [
    http://matplotlib.sourceforge.net/ ]
    See [ http://matplotlib.sourceforge.net/gallery.html ] for their gallery. This Python
    library is open source and easy to use with a basic understanding of Python. It is well
    supported, and provides good documentation. Please note that MATPLOTLIB ships with an Excel
    toolkit to easily read/write Excel files [
    http://matplotlib.sourceforge.net/users/toolkits.html ]

  • Matt R says:

    I’m surprised that no-one has yet mentioned R (http://cran.r-project.org/). This is rapidly becoming the de facto statistics package for professional data analysts in academia and Industry and there are now some great books on how to use it for statistics and for professional graphic output e.g. (http://tinyurl.com/nrfvr3).

    For those who have all their data already in Excel there is this (http://www.statconn.com/).

  • Matt R says:

    To follow on from my last item on the R language, The New York Times recently highlighted the spread of R as the package of choice for academics and commercial users. And a related NYT blog item here.

    One quote is interesting – “Intel Capital has placed the number of R users at 1 million”.

  • Gabriel Hanna says:

    For quick plotting, and curve fitting, I use gnuplot. For plots that go in papers I use Grace. Both are free.

    I don’t like Origin, too much overhead and too expensive. But it does make nice plots without a lot of effort.

  • Mike Walter says:

    Sparklines will be available in Excel 10. http://blogs.msdn.com/excel/archive/tags/Sparklines/default.aspx
    The last entry in the link references Dr. Tufte.

  • Tony Laidig says:

    I’d like to add to the mention of the R language the package ggplot2, which by default follows from the advice of Tufte. It also includes color scales from the ColorBrewer packages, which make it easy to write an application to generate booklets of graphs. I’ve been using it all summer and the graphs are well appreciated and far more pleasing to the eye than those of Excel or the default R graphs.

  • Max M. Houck says:

    Has anyone used Tableau to any great extent? I have started working with it and found it to
    be a wonderful tool for exploration of data but it does have some peculiarities in plotting.

    http://www.tableausoftware.com

  • David Gilding says:

    For those with Apple Mac, DataGraph by David Adelstien (http://www.visualdatatools.com/DataGraph/) is well worth a look – and won’t break the bank either (50GBP at present).

    David

  • Nick Chaiyachakorn says:

    Of course, instead of whisking around for a program that caters to your whims, there’s the complete-control end of the
    continuum: JavaScript graphics”/dfdvisualizations” libraries like D3 and Protovis (on the site – look – a recreation of Minard’s graphic!)
    give you complete control and flexibility, but obviously at the cost of having to learn an equally flexible control language
    (although the API’s actually quite nice and functional and data-oriented, if you care about that stuff.) Given attention to
    detail, it sure as hell can produce purdy results, though – although I figure because one has to start from scratch with
    every graphic there’s a lot nudging you towards minimalism.

Contribute

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.