Describing and tracking stimulus projects totaling $787,000,000,000 on the internet: any ideas?

From Recovery.gov, a federal government website tracking the stimulus projects:

"RECOVERY ACCOUNTABILITY AND TRANSPARENCY BOARD

The Recovery Accountability and Transparency Board was created by the American Recovery and Reinvestment Act to coordinate and conduct oversight of funds distributed under this law in order to prevent fraud, waste and abuse. The Board includes a Chairman, Earl E. Devaney, appointed by the President, and ten Inspectors General specified by the Act. The Board has a series of functions and powers to assist it in the mission of providing oversight and promoting transparency regarding expenditure of funds at all levels of government. Quarterly and annual reports on the use of Recovery Act funds and any oversight matters will be issued as part of the Board's work. The Board may also make recommendations to agencies on measures to avoid problems and prevent fraud, waste and abuse. To address issues quickly, the Board may send flash reports to the President and Congress on potential management and funding problems that require immediate attention. The Board is also charged under the Act with establishing and maintaining a user friendly website, Recovery.gov, to foster greater accountability and transparency in the use of covered funds.

The job of the Recovery Accountability and Transparency Board is to make sure that Recovery.gov fulfills its mandate -- to help citizens track the spending of funds allocated by the American Recovery and Reinvestment Act.

The Board consists of Inspectors General from about ten major cabinet agencies -- including the Departments of Justice, Treasury, and Commerce."



This is an extraordinary and noble effort at improving the accountability and transparency of government funding.

I met with the Chair of the Recovery Accountability and Transparency Board and the director of Recovery.gov a few days ago in Washington, DC to discuss their work. These discussions may continue, and I would appreciate any ideas about this project and its website. (I'm already aware of the pie charts on their front page!)

With regard to my own accountability and transparency, I expect to provide some advice to the Recovery Accountability and Transparency Board. My on-going policy for worthy government work (e.g., Recovery Accountability and Transparency Board, the Centers for Disease Control and Prevention, NASA, etc.) is that the work is pro bono. Any changes to this policy will be posted in this thread.

Thanks, ET

-- Edward Tufte


scorecard.org

The current leading metaphor is scorecard.org, suggested
by Philip Greenspun.

-- Edward Tufte


Response to Describing and tracking projects totaling $787,000,000,000 on the internet: any ideas?

Projects are reported to management as a summary of tasks: the smallest quantities of activity that can be quantified in an operational plan as so-many person-hour-or-days of work over a period of days or weeks. All project management software operates at this level. However, projects succeed or fail at levels below the task: individual snippets and elements that are too detailed to be defined and planned and whose quality is determined by the individual professional competence of individual team members.

A crude analogy would be with a game of football: any discipline; American; Aussie Rules; Soccer; Rugby Union or Rugby League. You cannot plan for specific players to be in specific places on the pitch at 28 minutes past the hour. An experienced manager will know which team has the best of the play by observing continually, and this cannot be defined by studying photographic snapshots taken every five minutes. Evaluation of the play is a continuous process.

There is a further problem. Any attempt to set targets below the level of the task will in due course lead to the targets becoming more important than responsive operational fluidity. A study of ten thousand soccer games may reveal a positive correlation between a winning score and the number of throw-ins. However, instructing the team to gain throw-ins, and monitoring success against a throw-in target, will divert the team from the real business of playing.

Monitoring project progress - even small projects - cannot be done directly: it is an observation of Brownian Motion. Patterns of molecular activity can be predicted from the effect on the dust of which the tasks are composed. The problems of monitoring projects that spend $200,000 a day are - thanks to the computational ability of modern machines - different only in degree from a programme such as ET has described that spends ten-thousand times as much.

A summary outline of an incipient response to the original posting could be introduced in a week-long seminar. However, the following would be an example of a minimum dataset as a summation of all the individual projects. If anyone should cavil at the level of detailed data collection involved they should be reminded that all this data must be collected for each project individually in order to pay staff and contractors and consultants' invoices: so it is just a matter of computational aggregation, which is what modern technology does effortlessly.

Daily - hours worked; hours planned: weekly - hours charged; cost planned; length of issues/risks/requirements logs: tasks reporting - hours/days over/under: reporting deadlines met/not met (by hours/days): checkpoint attendance (absences): scope reductions: interviews/meetings planned and cancelled/postponed.
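The "computational aggregation" described above might be sketched as follows. This is only an illustration under assumed inputs: the project names, record layout, and figures are invented, and only two of the listed data streams (hours planned, hours worked) are shown.

```python
from collections import defaultdict

# Hypothetical daily records: (project, day, hours_planned, hours_worked).
# Names and numbers are invented for illustration only.
records = [
    ("bridge-repair", "2009-04-06", 80.0, 72.0),
    ("bridge-repair", "2009-04-07", 80.0, 64.0),
    ("broadband-grant", "2009-04-06", 40.0, 41.5),
    ("broadband-grant", "2009-04-07", 40.0, 39.0),
]


def summarise(records):
    """Sum planned and worked hours per project and report the shortfall."""
    totals = defaultdict(lambda: {"planned": 0.0, "worked": 0.0})
    for project, _day, planned, worked in records:
        totals[project]["planned"] += planned
        totals[project]["worked"] += worked
    # A positive shortfall means fewer hours were worked than planned.
    return {
        project: {**t, "shortfall": t["planned"] - t["worked"]}
        for project, t in totals.items()
    }


summary = summarise(records)
for project, t in sorted(summary.items()):
    print(
        f"{project:16s} planned {t['planned']:6.1f}  "
        f"worked {t['worked']:6.1f}  shortfall {t['shortfall']:+6.1f}"
    )
```

The same pattern would extend to the other streams (costs, log lengths, missed deadlines) by adding fields to each record; the point is that the roll-up is mechanical once the per-project data exists.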

You need All the Data. You must report All the Data. The huge variety of streams of data and the vast number of data items implied here can be analysed and presented by the techniques exemplified on this website. What is needed are indicators that can be used to inform a pattern recognition of how each project is progressing: formal management reports written earnestly by project managers in hotel rooms at midnight will almost invariably end up summarising good news. The bad news for the project comes later when all the blood has finally drained away from the head and it suddenly collapses dead on the floor. The ability to spot the incipient signs of early blood loss before they can be detected by crude management instruments is the most important attribute of the successful project manager.

-- Martin Ternouth (email)


1. Spreadsheet author should be instructed to enter only actual activities when populating cells below the headings 'Short bulleted list of the major actions taken to date' and 'Short bulleted list of the major planned actions'.

Example:

On page http://www.recovery.gov/?q=content/weekly-report&agency_code=13&startdate=2009-04-03&noofreports=1&status=1&report_id=147 (Department of Commerce), under the heading 'Major Actions Taken to Date', the bullet points 'Department wide', 'NOAA' and 'NTIA' are listed as activities. This appears to be because in file http://www.recovery.gov/sites/default/files/weeklyreport_WR20090403DOC.xls, in the 'Major Activities' worksheet, 'Department wide', 'NOAA' and 'NTIA' have been entered as underlined sub-headings in column B 'Short bulleted list of the major actions taken to date'.

Sub-headings are not listed as bullet points on pages of other agencies that I viewed.

2. Uniformity should be maintained between the Report Date and Source file name in the report history of each agency.

Example:

On page http://www.recovery.gov/?q=content/weekly-report&agency_code=27&startdate=2009-04-07&noofreports=1&status=1&report_id=163 (Federal Communications Commission), in the 6 pairs of Report Dates and Source Files, the Report Date is identical to the date substring in the xls file name for 2 pairs, while those same dates are different for the other 4 pairs.
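A consistency check of this kind could be automated. The sketch below assumes that every source file name embeds its date as `WR` followed by `YYYYMMDD`, a pattern inferred only from the single Commerce file name quoted earlier in this thread; the second row of the sample history is made up to show a mismatch, and the helper names are hypothetical.

```python
import re
from datetime import date, datetime

# Assumed pattern: "WR" followed by an 8-digit date, as in
# weeklyreport_WR20090403DOC.xls. Other files may not follow it.
FILENAME_DATE = re.compile(r"WR(\d{8})")


def filename_date(source_file):
    """Return the date embedded in a source file name, or None if absent."""
    match = FILENAME_DATE.search(source_file)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def mismatches(history):
    """List (report_date, source_file) pairs whose dates disagree."""
    return [(rd, sf) for rd, sf in history if filename_date(sf) != rd]


# Sample report history; the second row is an invented mismatch.
history = [
    (date(2009, 4, 3), "weeklyreport_WR20090403DOC.xls"),
    (date(2009, 4, 7), "weeklyreport_WR20090403DOC.xls"),
]

for report_date, source_file in mismatches(history):
    print(f"Report Date {report_date} does not match {source_file}")
```

Run against each agency's report history, a check like this would flag the inconsistent pairs without anyone having to compare them by eye.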

[ET comment added April 14, 2009: These excellent points, my thanks to Kindly Contributor Dan MacKenzie. His comments are now in the hands of those in charge of Recovery.gov.]

-- Dan MacKenzie (email)


Here is one person's start on this. It could provide a framework for visualization and database development: Bruce Phillip's interactive stimulus watch dashboard.

-- Michael W Cristiani (email)


Bullet lists are admin debris that steal content space

Here's my follow-up to Dan MacKenzie's comments (above).

Also it should not be called "short bulleted list of the major planned actions." Users can see it is a list!
Just say "Major planned actions." The principle is to always describe the content, not the (rather obvious) display method. Also there's space for more precise and more descriptive column headers when "short bulleted list of the" is removed. The original column header had 8 words, 5 of which were admin debris. With the suggested change, all 8 words can be devoted to what's contained in the column.

By the way, bullets burn up content space and are rarely needed. If a line spills over, just indent the spillover line a bit. Airplane pilot checklists (which are very serious lists) don't use bullets; such checklists use the second-line indent method.

More generally, thoughtful typography can eliminate all such administrative debris: bullet lists, grids in tables, numbered points, and other content-free foolings around. We're going for 100% content, with the proper reading of the format enforced by typography and layout, rather than by format instruction manuals. Users have come to the website to learn about the stimulus projects, not to read format instruction manuals.

For design of all sorts of lists, see our thread Lists: theory and practice

-- Edward Tufte




WEEKLY UPDATES

Since it appears that the bullets are going to stay,
it may be worth trying to tone them down a bit.
They could be made smaller or perhaps they could
change colour (light grey like the title bar?) so
that they have less of a visual impact.

It may also be worth moving them to the left so that
they are aligned with the title bar. The space after
the bullet already provides an indent and this extra
space may help keep the descriptions on one line.

On a slightly different note, are the bullets in any
sort of order or priority? Descending $$$s?
complexity? time? other? Should they be?

Would it be useful to provide a hyperlink from a
planned action to the update when it actually
occurs? What about vice-versa? If a Major Action is
listed should it link back to a planned action?
What would it say if there is a link? What
would it say if there is not a link?

Is it worth adding a three letter indicator for the Month
in the listings to ease comprehension?

2009 - 04 - 03
2009 APR 03

Should there be a subtle indication that the Planned
Actions or Major Actions Taken have appeared on a previous
weekly update? Perhaps they should become a lighter
shade of grey in subsequent weeks. Maybe a sparkline
could be used to show its age or better yet the $$$.

The sparkline could also be used on this page:
http://www.recovery.gov/?q=content/agency-weekly-reports


Note - The apostrophes appear to have been changed to periods on the Recovery.gov site.
see page: http://www.recovery.gov/?q=content/agency-weekly-reports

-- Tchad (email)



It seems that the apostrophes have been cleared up now and
a new set of charts can be found on the Recovery.gov
web site; they appear to be Fusion Charts.

It is a step in the right direction but it may be worth
referencing the Sparkline Thread...information trumps animation



The current Recovery.gov charts do not really provide any interagency reference points
and the full agency list is still sorted alphabetically (without any sort of detail)
http://www.recovery.gov/?q=content/agency-weekly-reports

Other thoughts and questions
Review the detail of the action lists: it is interesting to see how the detail correlates to the size of the allocation.

It may also be worth making the external links in the action lists clickable
(see the Department of Energy link below)
http://www.recovery.gov/?q=content/agency-summary&agency_code=89&startdate=2009-04-10&noofreports=1&status=1&report_id=172

Here is a link for those who prefer to click:
http://www.recovery.gov/?q=content/agency-summary&agency_code=89&startdate=2009-04-10&noofreports=1&status=1&report_id=172

How do visitors to the Recovery.gov web site interpret the success metrics shown in the Recovery.gov data?
What is the most important? Is it the speed of the spend? Is it financial or time savings? Is it saving or creating jobs? Is success just the documentation process?

-- Tchad (email)


The bullet lists do steal content space, but in any case a list is necessary. Ideally, I would like to see an "expected" value next to each line item and a measure of current progress. Sparklines would do very well at displaying this. There should be an "expected" value sparkline with a running sparkline over it next to each line item showing the performance of each at a regular cadence. This would allow the viewer to see if each item is on-track or off-track and a rough order of magnitude for either.
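As a rough text-only sketch of the expected-versus-actual idea, the snippet below draws both series on one shared scale using Unicode block characters. The numbers are invented, the function name is hypothetical, and a real sparkline would be drawn as a fine line rather than as blocks.

```python
# Eight block heights, lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values, lo, hi):
    """Map each value to a block character on a fixed lo..hi scale."""
    span = (hi - lo) or 1  # avoid dividing by zero on a flat scale
    chars = []
    for v in values:
        idx = int((v - lo) / span * (len(BARS) - 1) + 0.5)
        chars.append(BARS[max(0, min(idx, len(BARS) - 1))])
    return "".join(chars)


# Invented figures: cumulative expected spend per period, and actual
# spend for the periods reported so far.
expected = [10, 20, 30, 40, 50, 60, 70, 80]
actual = [8, 15, 22, 30, 41, 55]

# Both lines share one scale so they can be compared directly.
lo, hi = 0, max(expected + actual)
print("expected", sparkline(expected, lo, hi))
print("actual  ", sparkline(actual, lo, hi))

# Gap per reported period: negative means behind the expected value.
gaps = [a - e for a, e in zip(actual, expected)]
print("gap     ", gaps)
```

Using one scale for both series matters here: scaling each line to its own range would hide exactly the shortfall the comparison is meant to reveal.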

This does away with one of the problems of "goals" in financial forecasts. Typically the goal is a binary measure out at some distant timepoint (for example, increase revenue to $4.1 billion by 31 Dec). This inevitably leads to the dreaded "hockeystick" of performance which all magically gets crammed into the last three months of the year. If we really care about the performance, we'll look at it in comparison with an "expected" value by sparklines at regular intervals.

-- B.L. (email)



Is there any chance of running a special Masterclass Session on the topic of
facilitating the understanding of very large data sets? Perhaps this could be
run in conjunction with (the day after) one of the one-day workshops...

From Recovery.gov


Announcement (Sp)

Recovery Dialogue: IT Solutions
For one week beginning April 27th, The Recovery Accountability and Transparency Board and the Office of Management and Budget in partnership with the National Academy of Public Administration, will host a national online dialogue to engage leading information technology (IT) vendors, thinkers, and consumers in answering a key question: What ideas, tools, and approaches can make Recovery.gov a place where all citizens can transparently monitor the expenditure and use of recovery funds?

Participants from across the IT community will be able to recommend, discuss, and vote on the best ideas, tools, and approaches. Your ideas can directly impact how Recovery.gov operates and ensure that our economic recovery is the most transparent and accountable in history. Mark your calendars and check back for the web link and additional information.

Will there be an Ask ET entry?

-- Tchad (email)


http://www.recovery.gov/sites/default/files/weeklyline.jpg

1. On the horizontal axis, the 11-day interval from Feb. 17 to Feb. 28 is the same length as the other intervals, which are 7 days.

2. On the vertical axis, the $30.5 B point for 'Available' is below the $30 B line.
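The first point can be checked with a little arithmetic: tick positions on a time axis should be proportional to elapsed days. In the sketch below only Feb. 17 and Feb. 28 come from the chart as described; the two later dates are assumed for illustration (each 7 days apart), and the function name is hypothetical.

```python
from datetime import date


def x_positions(dates, width=100.0):
    """Place each date along an axis in proportion to elapsed days.

    Expects at least two dates in ascending order.
    """
    start, end = dates[0], dates[-1]
    total_days = (end - start).days
    return [width * (d - start).days / total_days for d in dates]


# Feb. 17 and Feb. 28 are from the chart; the later dates are assumed.
report_dates = [
    date(2009, 2, 17),
    date(2009, 2, 28),
    date(2009, 3, 7),
    date(2009, 3, 14),
]

positions = x_positions(report_dates)
gaps_in_days = [
    (b - a).days for a, b in zip(report_dates, report_dates[1:])
]

for d, x in zip(report_dates, positions):
    print(f"{d}  x = {x:5.1f}")
print("interval lengths in days:", gaps_in_days)
```

With these dates the first interval takes up a visibly larger share of the axis than each 7-day interval, which is what an evenly spaced axis conceals.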

-- Dan MacKenzie (email)


"Major Actions Taken to Date" include:

# On Tuesday, Secretary Chu participated in a keynote luncheon Q&A at Newsweek's Third Annual Global Environment Leadership Conference; for excerpts of his remarks see: http://www.newsweek.com/id/193488
# On Tuesday, Secretary Chu delivered the keynote address at the 2009 Energy Information Administration Annual Energy Conference

These are major actions?

-- Rita Risser (email)


Came across this in an entirely unrelated thread (I think), but it's an example of an analytic design contest where one of the datasets was the stimulus money:

http://www.tableausoftware.com/blog/announcing-winners-tableau-viz-challenge

-- Niels Olson (email)


This is a great effort. One aspect that deserves special attention, however, is the importance of providing the public with better access to the raw data that agencies use to produce the more digestible summaries and analyses that I expect will be the focus of their communications. Recovery.gov's Frequently Asked Questions currently provides the following note:

Q: Is the spending data on recovery.gov available in a format (like XML) that developers can use to create mashups and gadgets?
A: Not at this time. But, as new systems are developed to capture the allocations and expenditures under the Act, we plan to make that data available in exportable form.

I hope that this answer doesn't reflect a vision of structured data as something that is useful only "to create mashups and gadgets", but rather is just an attempt to describe their plans in terms they hope will appeal to the Web 2.0 set. Raw, importable data is needed first and foremost to allow others to reformulate, revisualize, and verify the data underlying any higher-level summaries and their conclusions. Without this data, the public is unnecessarily captive to the editorial and design choices of the government officials responsible for communicating the information. And while I've spoken with some of these officials who have the public's best interests at heart, they all have a conflict of interest in that they have enormous pressure not to make their employer look bad.

Of course, the exact structure of structured data is important too. For example, the list of 30 agencies in the Financial and Activity Reports lets you drill down to see weekly reports on outlays, in Excel files. This is great. However, the Excel files are one-per-week with no expenditure data shared between the files. It would be much more helpful if each agency gave you a time series to which it added each week (with additional columns). Even better would be if recovery.gov then put those all together into one uber-sheet.
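To make the "uber-sheet" idea concrete, here is a minimal sketch that folds separate weekly reports into one table with a row per agency and a column per week. The input is a plain dictionary standing in for already-parsed weekly files; the agencies' figures are invented, and reading the actual Excel files is left out.

```python
import csv
import io

# Stand-in for parsed weekly files: week -> {agency: outlays}.
# All figures are invented for illustration.
weekly_reports = {
    "2009-04-03": {"Commerce": 1.0, "Energy": 2.5},
    "2009-04-10": {"Commerce": 1.5, "Energy": 4.0, "FCC": 0.2},
}


def to_time_series(weekly_reports):
    """Return (weeks, {agency: [value per week]}), None where unreported."""
    weeks = sorted(weekly_reports)
    agencies = sorted({a for report in weekly_reports.values() for a in report})
    series = {
        agency: [weekly_reports[week].get(agency) for week in weeks]
        for agency in agencies
    }
    return weeks, series


weeks, series = to_time_series(weekly_reports)

# Write the combined sheet as CSV: one row per agency, one column per week.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["agency"] + weeks)
for agency, values in series.items():
    writer.writerow([agency] + ["" if v is None else v for v in values])
print(buffer.getvalue())
```

Leaving a blank cell where an agency did not report keeps missing data distinct from a reported zero, which matters for anyone trying to verify the summaries.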

I'm not clear about what "new systems" are being "developed to capture the allocations and expenditures under the Act" from their answer above. Do you have insight into what these new systems are?

Finally, are there other substantive discussions of these issues online? I've seen this thread, but little else.

-- Peter Couvares (email)