fbpx

Cleaning up Excel’s poshlust graphics

March 8, 2007  |  Edward Tufte
31 Comment(s)

Several good contributions (from another thread) on workarounds for Excel’s
chartjunky
graphics are posted below. Years ago several of my students hacked Excel to do
sparklines;
the hack was such that Excel never knew it was doing good graphics.

poshlust = Russian for pretentious banality

Topics: E.T.
Comments
  • Fraser Moffatt says:

    Not sure if anyone in the forum has come across this one yet

    Clean Charts from Juice Analytics: a free Excel Add-in for automatically reducing chartjunk
    in Excel charts. Project offered under the “MIT License”.

    Quote from JuiceAnalytics blog: “Fixing Excel charts…Or, why cast stones when you can
    pick up a hammer?”

    I’ve tried it – easy to use and install. Pretty effective – saves at least a hundred keystrokes
    and/or mouseclicks in making charts less junky.

    http://www.juiceanalytics.com/weblog/?p=161

    Disclaimer – I have nothing whatsoever to do with Juice Analytics or any subsidiaries aside
    from having downloaded and used the aforementioned too.

  • Stephen Hampshire says:

    With reference to Fraser Moffatt’s (very useful) pointer to the Juice Analytics add-in, I find the
    easiest way to save mouse-clicks tidying up Excel’s chartjunk is to set up a range of custom
    chart types.

    A bit of time and thought getting the templates right in the first place saves you days in the
    long run.

    More info here:
    http://office.microsoft.com/en-us/excel/HP051996281033.aspx

  • Steve Sprague says:

    Microsoft claims the latest version of Excel, 2007, has a much improved chart engine. I haven’t had a chance to look into it in great detail, but it does offer “millions” of colors, not just the 56 of all prior versions, something that would have been a great help to me in a project I just finished. Does anyone else have experience with Excel 2007? Did they really come up with something truly better or just slide in a few cosmetic touchups?

  • Derek Cotter says:

    I’m very pleased to see this topic. One of the first Ask E.T.topics I saw was Steve Sprague’s request for advice about Graphing Software, in which the advice about Excel was “avoid Excel” and no more. I was disappointed by this, not because I think Excel is particularly good, but because it has such a vast user base that advice on how to work with it should surely benefit a great many of the inquirers who come to this site. This topic could be exactly what I hoped for.

    Fraser has already referred to the Juice Analytics blog‘s Excel chart cleaner. I would also mention their worksheet of custom charts inspired by Will Oswald on this site (in turn inspired by the ideas in The Visual Display of Quantitative Information:

    http://www.juiceanalytics.com/weblog/?p=240

    I’m almost sorry to say that, after fruitless attempts to contact Will, I finally learned how to reproduce the range scale graph type from tVDoQI on my own, quite some time before the Gemigniani brothers at Juice came out with their version. The key is to throw away the limited Excel scale facility, and create your own using a data series, formatted to your tastes. Both Tushar Mehta and Jon Peltier have tutorials to get you started, and I find it’s the single most powerful technique for turning Excel’s limited repertoire of tricks into something less constrained.

  • Derek Cotter says:

    Steve, would you believe I didn’t notice your name just above when I wrote my comment? 🙂

    I’m sorry to say (at second hand, from Excel chart guru Jon Peltier; I haven’t seen it myself) that the “improved chart engine” apparently amounts to no more than the addition of new Microsoft overlaid imagery such as metallic-looking surfaces and so on.

    I’m very disappointed to hear it. Of all the genuine improvements they could have made, (some outstanding since the very early 90s, like a proper facility to label data points using an arbitrary range of cells) they appear to have made exactly zero, but to have catered again to their constituency of business managers who wish to have their graphics appear to look “professional” as if they have been produced by an expensive art department, instead of professional as if they had been produced by an expensive statistics department, which I think would be better.

  • Alan Charbonneau says:

    I greatly respect Jon Peltier and he has helped me out with several Excel graphs, most of which were attempts to produce “Tufte-style” graphs. But I have to disagree with him about Excel 2007 graphs (assuming this second hand info is accurate), there is greater ease-of-use to some of the basic graphics in Excel 2007 and not simply the ability to add more chartjunk.

    The basic graphs produced do not have the dark gray background used in earlier Excel versions, nor do the bars on bar charts have a border around them by default as the earlier versions did. Basically, they still look like Excel graphs, but much cleaner versions.

    Next, it is much easier to customize gridlines than it was in older versions. The gridlines are somewhat less bothersome to begin with than they used to be, probably because the graph has a cleaner look. But when you select the gridlines, you can change the level of transparency as well as the color and the dash style, giving you pretty decent control of them. Similarly, the axes are easier to control.

    So, out of the box, it is true that a default Excel graph will not be of publication quality using the principles outlined by professor Tufte. However, they are easier to get into an acceptable state, requiring less work than earlier versions and requiring less guru-level knowledge. Excel could always be hacked to an acceptable level and Ed Tufte has stated that his students were able to do so. But some hacking is well beyond the knowledge of most Excel users. With this new version, you don’t have to be quite as much of a geek to figure out how to get the formatting right.

    There is at least one combination graph that I hacked out in earlier versions that I have yet to figure out how to duplicate in Excel 2007, but for the most part this is a nice improvement.

    As far as I can tell, some of the other hoped for improvements have not occurred, like the poster who mentioned he’d like to link labels from arbitrary cells to the graphs. From that point-of-view, Excel 2007 can be a bit of a disappointment. It’s new ribbon interface and better contextual menus do make it a better product, but greater functionality could still have been added to its graphics.

    p.s. Names to Google for advanced Excel skills include: Jon Peltier, Andy Pope, Tushar Mehta, and John Walkenbach. Each, in turn, has links to other experts.

  • Tchad says:

    Imagine having the graphic and typographic precision of Illustrator in Excel.

    Perhaps what we really need to do, is to talk with Adobe about adding a few
    Tufte chart templates in the new version of Illustrator. They started a graphing tool
    but didn’t really
    finish the job – the variables don’t work and data cannot
    be sourced from a database or spreadsheet, not to mention the
    the pathetic
    treatment of the axis configuration.

  • Niels Olson says:

    Or talk to Sun, the company behind OpenOffice. On a similar note, OpenOffice is the only major office suite that works on Windows, Mac (as NeoOffice), and comes preinstalled on most Linux distros. Or, since OpenOffice is open source, Adobe could be encouraged to fork it.

    I’m all for helping people, but on a 20 year horizon, publishing Excel hacks for free strikes me as enabling Microsoft to continue its bloatware development program and encouraging everyone else in the world to work harder instead of smarter.

  • Tchad says:

    I would rather see a proper tool inside Illustrator
    or perhaps Google Spreadsheets.

    I love Excel when bringing data together, and I hate
    it whenever I try to produce something that I want to
    look at or see.

    My ideal product would be something that contains the
    following elements:

    1) Computational horsepower of Mathematica or SPSS
    2) Typographic finesse of Illustrator
    3) Data connectivity of Business Objects
    4) Spreadsheet simplicity of Excel

    The graphical output of Mathematica is excellent but the
    learning curve is too steep to be widely adopted.

    Sometimes I wonder if anyone could put together a graphing
    package that would make me happy. Maybe, I just want a
    macro that would build boxes or draw functions in Illustrator

    that are to scale.

  • Gerd Schlottig says:

    What about combining your two ideas, Tchad and Niels? Thinking away from a “product for each” idea you have while working on the “desk” — what about a different document format as next step for an open office, why spreadsheet, presentation, writing, calculating, graphing, drawing, type setting? Why copying the overcome idea also in an open version that aims standard level?

    Some tentative beginnings exist. Despite all the desperation while using Powerpoint, it allows to bring some elements together like photos, text, spreadsheet, drawing — but it does not work.
    There are dedicated open versions complementing the proprietary tools, like Inkscape (vector,SVG), Gimp (photos), Scribus (desktop publishing), Scilab (numerical computation) or the mentioned OpenOffice. Could Sun have an open ear to combine the reached into a better interface, without interruptions to search hierarchies for a simple calculator? Or is it something for a community that realized the many open source projects?

    Could anyone give hints of existing projects on this? Maybe another thread though?

  • Ken Burnside says:

    I have found the following to be handy in making Excel charts human-readable.

    1) Use the tools referenced earlier in this thread.
    2) Export the chart in Excel, and take a screen capture of it.
    3) Load the screen capture in Illustrator.
    4) Use Illustrator “live trace” to convert it to vector elements.
    5) Commence cleanup in Illustrator, including fixing typography, using a sensible color space model, etc.

    Handy notes in doing this: Illustrator uses PDF as an internal format. Its atomic unit of scale is 1/72″ (the Adobe Point, as opposed to the Printer’s Point, which is a tiny tiny bit smaller.) The more you grasp this, and work with it, the happier you will be.

    Rumors abound that Illustrator 13 (for Adobe’s CS3 suite) will be able to import an Excel chart format directly, without having to do the “rasterize, then vectorize” stunts here. I will believe it when I see it.

    Now, if only they’d fix Excel imports in InDesign tables. (I will admit, that for what I do, my definition of “fixed” is likely something they never considered.)

  • Niels Olson says:

    The Economist provides a spreadsheet about Iraqi casualties among military aged men. This tabular data might make thought-provoking material for a redesign project among both science and design students.

    The basic, fundamental, iterative elements in any of my Excel projects, both tabular and graphical:

    • Get the analysis and basic layout down using whatever colors and fonts and ruling I want. It is not necessary to worry about design. Design is not important if my numbers are wrong. Over time, I started following design principles at this step because it’s just easier and even improves my own thinking (I need all the help I can get).
    • At some point toward the end of laying things out, I select the entire worksheet and make the background white.
    • I like to see how far I can get with layout and ruling. Many projects stop here. I’ve found it also brings out weaknesses in the layout.
    • Choose a few combinations of signage and body fonts and see what they look like. IF you’ve seen a font you like, fonts.com has a useful Search by sight function.
    • Consider adding color. What does it add? What function does it serve? What are the tradeoffs, the costs of color? Does the organization have a style sheet with specified colors and fonts? Visit Kuler or ColorBlender to consider color schemes.

    A couple of hard-won tips for Excel:

    • You get 56 colors. Period. Thankfully, few designs call for 56 colors and you can pick whichever 56 (R,G,B) colors you want by clicking Tools – Options – Color – pick a color – Modify – Custom*. Kuler and ColorBlender provide the numbers for you. If you find a color you like on the web, Kuler and ColorBlender will convert it to (R,G,B) coordinates for you.
    • At 100% print scale 72 points = 96 pixels = 1 inch. More about that here.

    *Not quite as easy as hitting F6 in Illustrator, I know. In OpenOffice the clicks are the same (Tools – Options – Colors), but the interface is a little better once you get there.

  • Niels Olson says:

    Mac users have previously been presented with several roadblocks to using OpenOffice, not least of which was having to install X11 windowing and then bear with its Windows-like absurdities. OpenOffice is now available for Mac users as NeoOffice, which uses OS X’s normal windowing. Just download the dmg and click it.

  • Zach Heaton says:

    Regrettably, the Clean Charts add-in calls the Win32 API, and thus is unusable with any of the Macintosh versions of Excel.

    Instead, I’ve generally had very good results using Inkscape in conjunction with a plotting package that produces SVG output, such as R, gnuplot, or Graphviz. Since SVG is Inkscape’s native format, the plots generated by any of these software packages can be loaded and annotated without any format conversions. Thus, the data flow is Data File → R/gnuplot/Grapviz → SVG → Inkscape → PNG/PDF/PS, with the only losses occurring in the final conversion.

    (In theory, this process could be streamlined further by interfacing gnuplot or R directly into Inkscape using a Python extension. (Data File → Inkscape → PNG/PDF/PS) However, this would require a substantial effort for a minimal return.)

  • kd says:

    You know, if graphing with Excel is this much trouble to do properly, I’m still inclined to cut my losses and figure out the relatively arcane R syntax to get quality graphs going as a first resort.

  • John F. Opie says:

    Hi –

    I’m a Senior Economist tasked, besides my forecasting work, with increasing production efficiency for a number of products we sell (we’re in the credit risk rating business, among other things). A typical product (industrial models for, say, Austria, covering 150+ industrial sectors (NACE rev 1.1), which contain, all told, around 4200 charts, 600 tables and assorted texts) must be producable within a reasonable period of time, which really means absolutely less than 1 day, preferably less than 4 hours for the pure production process. That’s also on a single machine, not parallel. We do have products that have as many as 10000 charts and 2000 tables, so you get an idea of the kind of huge volume work we do: these are produced either monthly or quarterly (and we don’t kill off forests, but deliver these on-line or on a CD, all in PDF format).

    We do it all in Excel once the forecast and ancillary work is completed externally. The key is not to use ANY of Excel’s default formatting, but rather to predefine the charts using templates, into which data is imported via our database tools. That’s the ONLY way we can get the charts done in any sensible way. We tried formatting via VBA code, and while it is an option, the moment you get away from more than 1-2 chart types, the maintainability of the code becomes very problematic, and in many cases a non-programmer has to maintain the code, i.e. forget it.

    Now, I’ve been a long-time admirer of Dr. Tufte, at least 15 years by now, and try to be as clean as possible with as little chartjunk as I can, but ultimately our chart designs are done by committee with all *that* implies.

    That said, we really haven’t found all that much that we can’t do with Excel’s charting engine, but dread the day that we have to convert to Office 2007, since the chart engine is the chart engine that you find in .NET from MSFT. Why do we dread that day? Because it means having to rediscover how everything is done, how we will have to hack everything, and how to get around what the MSFT programmers think is the way that charts should be done. Now, that’s just work: there is a real problem with using the charts in Excel 2007 that hasn’t been addressed. How do you get them out of Excel? We can’t embed 4200+ charts: the files would be huge, unwieldy, and Excel internal garbage collection is so pathetic that you can’t do more than around 50 charts in a workbook before you run into serious resource problems. What we have done to date is to export them as WMF/EMF files, but with Excel 2007 we have a real problem.

    The real problem is that due to security considerations (WMF overflow exploit) MSFT will no longer provide Excel 2007 users with the ability to export WMF or EMF files, but only GIF, JPEG and PNG. You can imagine what this means for image quality: we create chart archives with WMF pictures of our charts, which are then put where we want them in Excel workbooks. Five years ago, when we decided to use WMF/EMF, MSFT recommended to us that we use these formats, as they are eminently scalable and will always be part of the Windows world, as Windows uses the WMF/EMF format internally to generate an image of, for instance, an embedded object in Word/Excel/PowerPoint/whatever, showing the content of that embedded object instead of an icon of that embedded object.

    This means that we got big, big problems with using Excel starting with the Office 2007 version, since we use that format exclusively (and believe me, JPG, GIF and PNG do NOT make the grade in any shape, fashion or form.

    John

  • Niels Olson says:

    John,

    I’ve done a couple of redesigns on very similar products. A few dozen templates, plus graph templates, delivered and in production now for a few of years. My sense is that you’ve got enough in house ability and investment that I would recommend you switch to Linux and use the more advanced packages in ET’s thread Graphing Software Packages. You will have access to a much larger community and libraries upon libraries of software. If you want a soft introduction, start with OpenOffice, which works in Windows and native to Linux, developed by Sun. Most of the code is open source so you can customize things if you need to. You will also be free of Microsoft tyranny.

    Download the Debian netinst cd, put it in a machine you don’t care about, make sure it’s connected to the network, and turn it on. You could put in a machine you care about, but it will ask a scary question at the beginning, how do you want to reformat your hard drive, so fewer antiacids are required if you’re not concerned about the machine. 🙂 Debian is very stable, mature, widely used on servers, and this will provide you with the exceptionally stable and advanced Gnome desktop to start with. See this cladogram for Debian’s lineage. You can get additional packages using the Synaptic package manager, which comes with Debian. Many of the graphing software packages mentioned in the other thread are available.

    If you’re the John F. Opie in Germany, the very productive Nicholas Bissantz may have some ideas for you.

    Most of this stuff is available for Mac also, but the support forums lean toward Linux.

    As an aside, your “one machine” is somewhat concerning. If this data is really the source of someone’s livelihood, I’d recommend a RAID-backed filesystem and subscribing to a nightly off-site backup. Debian will do this stuff for you too (additional hardware required for RAID, but you can set that up as you go).

  • Niels Olson says:

    I should’ve tacked this onto my last post: Regardless of your short-term decision, I’d recommend getting Debian, or some flavor of Linux, now so you can play around with it for six months to a year before migrating mission critical stuff, if you can afford that much time. Microsoft is not going to become more transparent. I would think you should remain competitive in the market with Office 2003 in the mean time.

  • Tchad says:

    We have been playing around with VBA to get what we
    want out of Excel Graphics and have recently started
    trying to create layers in Excel.

    So far we just programme what we want and hit a button
    that will generate all the information and stick it
    in the right place – another button removes it.

    Perhaps someone with more open source skills than I
    would be able to place layers in open office.
    The easiest way would seem to be a way of adjusting the opactity
    of the worksheets so that one could peer through the
    ones on top to the data below. I would imagine a small
    button on the tab that would toggle the worksheet’s visibility.

    It would be lovely if we could make it as easy to use
    as Illustrator or Photoshop.

    Fancy making a high end spreadsheet application Adobe?

  • John Walker says:

    G’day,

    At the risk of pushing this thread even further away from its theme of cleaning up excel graphics, I’d like to add my two cents worth to the discussion on alternative tools (should be moved to graphing software).

    The merging of all the specialised applications for typesetting, drawing and graphing into one application has been achieved for Python (various libraries). The downside is that the interface is that of a language due to the complexity of the tasks undertaken and the need for a simple interface.

    The real difficulty is in designing an graphical interface that provides the tools appropriate to the task. A menu interface eventually becomes limited by too great a number of options (How many too many?). The reality may be that a single application of this type is too difficult to implement with a GUI due to an excessive number of menu elements to provide for all the tasks. The best way forward may be to create a file format capable of storing all the different types of data (say a code file) and to have a master application that then selects from sub-applications depending on context. Office Suites such as MS Office and Open Office are the first and just one type of answer to the problem. The next generation may use better applications as their base, such as Gimp, Scribus etc. but it will have to start with a common file format and will need to have, underneath any eventual GUI, a language based interface.

  • John F. Opie says:

    Hi –

    Niels, yep. I’m the one and only. 🙂

    The problem that I have here is that the decision of go/no go is not in my hands, but rather in the hands of IT, who has to listen to what management says. Yes, it’s potentially a decade of Dilbert waiting to happen, but such is life.

    Second, this is something that is delivered to customers as a turn-key solution for their own in-house needs, and I doubt that I need to explain how dead-on-arrival a recommendation would be to install a Linux machine at pretty much any given corporation: if it isn’t MSFT, it doesn’t stand a chance of the proverbial snowball of user acceptance.

    But I’ve been able to work out worse problems before, so I am confident that this, too, will work out. It’s just gonna be painful.

  • John F. Opie says:

    Hi again –

    Oh, and the “one machine” is mirrored at three different locations, one of which is off-site entirely, and are synchronized at hourly intervals, as well as off-site in the morning and in the evening (in case work is done at the off-site location).

    Paranoid?

    Nope. I think I’m the only one in the company who could continue to work if the building burned down and the off-site backups of the central system turned out to be corrupt: I can’t duplicate the entire infrastructure, but I can meet my quarterly deadlines and have the customers pay my salary. 🙂

  • Niels Olson says:

    Microsoft defaults continue to victimize great minds: here are some slides of graphs, from the OLPC project.

  • Tchad says:

    In defense of the OLPC project, I think these charts are probably
    about the OLPC project rather than from the OLPC project.

    Negroponte has good taste and a lot of class; the OLPC uses good
    materials and good design to create an inexpensive computer that is
    quite impressive.

    Have a look:



    http://www.laptop.org/en/laptop/

  • Niels Olson says:

    Yes, authorship is difficult to assign. The slides are from this article, interviewing OLPC’s CTO, Ivan Krstić. Here’s another slideshow he presented. He’s 21 and handling one of the largest IT security challenges on the planet and he has to deliver 10 million laptops in the next 6 months. I can’t fault him for spending minimal prep time on these presentations. The point is the software’s values failed him from the get go.

  • Sherman Dorn says:

    To John F. Opie:

    I’ve discovered that Microsoft’s Document Image Writer (should be installed as a printer option) provides decent TIFF images, with one caveat: B/W only. On the other hand, when I’m trying to produce small multiples of 46 figures (period estimates of educational attainment in the Americas from the 1940s forward), it’ll do and works far better than the default JPG output.

  • John F. Opie says:

    Hi –

    Sherman, thanks for that tip. Unfortunately, TIFF B&W just doesn’t hack it for us (color charts), and we’re still collectively scratching our heads on this one.

    With my luck, they’ll install Office 2007 when I’m off on vacation and then blame me for not having brought up the issue when it was installed…

    John

  • Edward Tufte says:

    Here’s our new Poshlust Millennium Award:

     image001

  • Tchad says:

    It looks like someone at Adobe may be listening to the needs
    expressed earlier on this thread and others. According to Wired:


    “When considering what your spreadsheets and documents might look like on the always-on desktop of the future, don’t leave Adobe out of the picture.

    The software maker famous for Flash and Photoshop is poised to take the plunge into the lucrative world of office applications. It may sound far-fetched at first, but the stage is set for Adobe to flex its muscle in the office-app arena. The company already has a strong presence in business software with its Acrobat suite of products and interest in its new platform for web-enabled applications that run on the desktop is rising quickly.

    According to Adobe group manager for platform evangelism, Mike Downey, it wouldn’t be outlandish to predict the company throws its hat into the ring soon.”

    FULL ARTICLE
    http://www.wired.com/software/coolapps/news/2007/08/adobe_officedocs

  • Marlow Macht says:

    I would like to call attention to another attempt by Microsoft to make clear visual communication more difficult. The
    2008 version of Excel no longer allows you to specify X axis labels for your data. At the same time, Microsoft has added
    approximately 40 different “styles” and an additional 30 “effects.”

    A workaround is here:
    http://www.mackb.com/Uwe/Forum.aspx/excel/4535/Cannot-add-data-as-labels-for-the-horizontal-category-
    axis

  • Will Scarbrough says:

    Edward Tufte would you consider co-producing an Excel plugin with someone else? In my world (analytics and data for a startup) Excel is an amazing tool, as much for the formula capabilities as for the common knowledge most employees have. Within this world, a plugin replacing the standard Excel charts with a few best practice chart types would make a world of difference in communication. This does not need to be high end graphics, nor comprehensive. Instead, helping information workers display business data clearly every day would improve everyone’s understanding.

    Example like the Juice Analytics charts are nice but everyday users are not going to copy those. Instant and easy to use is critical for everyday use.

    Thank you.

Contribute

Leave a Reply