In all my books, one of the key arguments revolves around the routinely spectacular resolution of the human eye-brain system, which then in turn leads to the idea that our displays of evidence should be worthy of the human eye-brain system. This is, for example, the conclusion of sparkline analysis in Beautiful Evidence, where the idea is to make our data graphics at least operate at the resolution of good typography (say 2400 dpi).
Here is a link to a press-release summary account of an article in Current Biology (July 2006) by Judith McLean and Michael A. Freed, from the University of Pennsylvania School of Medicine, and Ronen Segev and Michael J. Berry III, from Princeton University. The research suggests that the human retina transmits data to the brain at the rate of 10 million bits per second, which is close to an Ethernet connection!
Looking around the world is easier than analyzing evidence displays, and there may also be within-brain impediments to handling vast amounts of abstract data, but at least the narrow-band choke point for information resolution should not be the display itself.
The average PP slide contains 40 words, which take less than 10 seconds to read. Call that 1000 bits per second, which comes to 1/10,000 of the routine human retina-brain data capacity.
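The arithmetic behind this comparison can be sketched as follows. The figure of 250 bits per word is not stated in the post; it is simply what "1000 bits per second" implies for 40 words read in 10 seconds, and is labeled as an assumption in the code.

```python
# Rough check of the slide-versus-retina comparison.
WORDS_PER_SLIDE = 40
SECONDS_TO_READ = 10
BITS_PER_WORD = 250                    # assumption: implied by "call that 1000 bits per second"
RETINA_BITS_PER_SECOND = 10_000_000    # rate quoted from the Current Biology article

slide_bits_per_second = WORDS_PER_SLIDE * BITS_PER_WORD / SECONDS_TO_READ
fraction_of_retina = slide_bits_per_second / RETINA_BITS_PER_SECOND

print(f"slide rate: {slide_bits_per_second:.0f} bits/s")
print(f"fraction of retinal rate: 1/{1 / fraction_of_retina:,.0f}")
```

Changing BITS_PER_WORD shifts the ratio proportionally, so the conclusion (several orders of magnitude below the retinal rate) holds for any plausible encoding of a word.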
Also most of our evidence displays are in flatland, which is easier than 3D perceptual tasks. On the other hand, many serious data displays are not in the familiar 4D space/time coordinate system that our eye-brain knows so well.
Memory problems can be partly handled by high-resolution displays, so that key comparisons are made adjacent in space within the common eyespan. Spatial adjacency greatly reduces the memory problems associated with making comparisons of small amounts of information stacked in time (PP slides, for example).
-- Edward Tufte
While PowerPoint is surely a horrid way to transmit information, I'm not sure we can inject very abstract information into people at ethernet rates. 40 words in 10 seconds doesn't translate to 1000 bits per second transmitted over the optic nerve, which connects the retina to the banks of the calcarine sulcus in the occipital lobe, via the optic chiasm and the lateral geniculate nucleus. At a minimum the data being transmitted would require an analysis of the typography's geometry (edge detection being a basic function of the retina), the amount of the visual field taken up by the display, the location of the display's image on the retina relative to the fovea, and the rates of change in the display and surrounding motion (the speaker, other audience members, etc).
Your guesstimate of 40 words in 10 seconds leads to a 240 word-per-minute reading speed. Like normal readers, braille readers can read at 200 to 400 words per minute. Is there any evidence that a person with an acquired partial nerve blindness also acquires an impaired ability to reason spatially? My classmates at Tulane Med found they preferred listening to the lecture audio I recorded (see the (audio) links for Spring 2006) at one-and-a-half speed, which also pushes close to 200 words per minute. Most people found twice-speed to be uncomfortably fast. This range of 200 to 400 words per minute may be a more accurate definition of the rate at which the human mind can receive and abstract information in word form, and this is likely driven by communication between Broca's area and Wernicke's area via the arcuate tract. Keep in mind, reading is a highly abstract function. Babes learn to speak as they learn to interpret what they hear and form their vocabulary. Indeed, Kuhl et al. from the University of Washington published findings last week in NeuroReport that activity in Broca's area and Wernicke's area becomes synchronized during the first year (learnt via NPR). Toddlers then commit their alphabet to memory, learn to form words in their mind by matching what they see with their mental images of the letters, often by saying them out loud and blending the sounds together, and concomitantly start to memorize common word forms, like their names, as a sort of super-alphabet. That's the left side.
Mountain bike racing in the woods is probably a good speed test for the right cerebral hemisphere's ability to interpret incoming visual data. The entire scene is certainly changing much more quickly, and this is likely recruiting as much of the optic nerve's "bandwidth" as possible. I wouldn't be at all surprised if the bit rate exceeded 10^6 bits per second, but the degree of abstraction is much lower. "Go there" and "Don't hit the tree" is about all that's necessary. Indeed, one of the most valuable things a racer can do to improve race day lap times is to pre-ride the course the day before. Pre-riding is not nearly such a big deal in road racing.
Where do visual evidence presentations fall in this? I for one have never been in a race and bothered to compare the catenary of a treacherous vine with the theoretical hyperbolic secant, nor is speech much of an issue after the start. Back in civil society, however, any 2D graphic is instantaneously captured in the mind's eye. A projection of 3D more slowly, but still very quickly. There comes a challenge though in asking people to find out what the axis labels are and interpret what they mean. How much of any science student's time is spent memorizing what variables the Greek and Roman characters represent in their particular field? The pursuit of common and standard placement in your work (horizontal words, hang the Y label off the top of the axis like a flag, etc) certainly goes to this challenge, but the fact remains that there is no one overwhelmingly accepted architecture for visual evidence displays, even for the simple graph. Any given reader faces an analytic graphic as a rather loose jumble of sticks and glyphs and bears an unfortunate burden of comparing them to a rather loose model, then reasoning about them. Prose faces people with a lesser burden, that of literacy, and brings with it an overwhelmingly accepted architecture: word, clause, sentence, paragraph. The analytic graphic carries an inherently higher burden. The graph, with all its degrees of abstraction (words, equations, the data, trend lines, etc) represents an extraordinary challenge that demands recalling linguistic and non-linguistic memories, comparing them to the scene falling on the retina, and reasoning about both the stuff before you and the difference between that stuff and the perfect stuff of memory.
This requires not only comparing abstractions like words and drawings, which requires communication across the hemispheres via the corpus callosum, but also extraordinarily complex relays with the frontal cortex; in some cases the mere interpretation of an analytic graphic reaches as far as free will, which Francis Crick and at least one of his colleagues at Scripps believed to be located in the anterior cingulate gyrus, Brodmann area 24.
None of this is an apology for PowerPoint, only a minor point about which bit rates can be compared.
-- Niels Olson (email)
Is there any statistic or theory or number that illuminates how people absorb information visually vs. through words? I.e., is information retention via visuals X% more effective than through words/sentences/bullet points? I know it will depend upon the visual and the words; I'm just looking for an estimate or hypothesis. Any help is appreciated.
-- peter shier (email)
Early research showed that the eye is an 80% receptor to the brain vs other sensors. German researchers created a visual discussion method to allow both visual criteria to be applied together with structured written verbal expression to support the spoken word. The method allows many persons to provide input simultaneously and the thinking structure is managed by a discussion butler. The problem with PP and many other displays is that they work to the adage 'the medium is the message' rather than allowing relationships to steer the matter. Interaction is one way. Participants sit in a darkened room that hinders visual relationships with other viewers (possibly on purpose if autocratic presentations are acceptable). One of the charms of ET's work is his use of 'context' to provide visual variety of interpretation. Who else would put sheep with rocks and metal objects to assist new insights? The question then is, how well do we employ our visual senses and enmesh them with wisdom of interpretation?
-- Roger Daventry (email)
In response to Peter Shier's question, there have been a number of experiments that try to demonstrate that retention of information provided graphically is greater than that of information provided through text alone. (One example is Butler and Mautz, "Multimedia Presentations and Learning," Issues in Accounting, Fall 1996). The problem with this and other similar research from a visual display perspective is that they tend to lump all non-textual display under the label of "multimedia," and information density is never a variable that is measured and reported. Also, the actual visuals used are almost never included in the journal articles; the few that are, if they are indicative of the rest, are worrisome: they tend to include the worst kind of PowerPoint "Phfluff", and so the experimental results are questionable.
-- Andrew Abela (email)
Ok, I'll say it: A picture is worth a thousand words.
-- Steve Sprague (email)
Lungarella M, Sporns O (2006) Mapping Information Flow in Sensorimotor Networks. PLoS Comput Biol 2(10): e144
(3) The morphology of an embodied system can have significant effects on its information processing capacity. We tested the hypothesis that sensor morphology (here, the arrangement of photoreceptors in a simulated retina) influences the flow of information in a sensorimotor system.
The last point in the previous paragraph supports the notion of a quantitative link between the morphology of the retina and a computational principle of "optimal flow of information." Given a fixed number of photosensitive elements, their space-variant arrangement maximizes the information gathered, even more so in a system engaged in a sensorimotor interaction, e.g., foveation behavior. If the photoreceptors were uniformly distributed in the retina, those in the periphery would be underutilized; also, fewer photoreceptors would be in the fovea, yielding (on average) lower spatial resolution, and resulting in less accurate estimates of object locations. Such non-uniformity at the receptor level is mirrored by non-uniformity at the cortical level in a topology-preserving fashion, that is, nearby parts of the sensory world are processed in nearby locations in the cortex. There has been some work on deriving such topology-preserving maps through the principles of uniform cortical information density and entropy maximization. We argue here that in a sensorimotor system, the rate of information transfer is maximized at the receptor stage if the probability distribution of target objects on the retina is adapted to the local photoreceptor density (a morphological property), and that this can be achieved through appropriate system-environment interaction, e.g., foveation, saccades, or adequate hand movements. A further implication of our findings relates to the possible role of early visual processing for the learning of causal relationships between stimuli. It has been shown, for instance, that the receptive fields of retinal ganglion cells produce efficient (predictive) coding of the average visual scene [17,46]. We propose that such coding also depends on the local arrangement of the receptors and on the spatial frequencies encountered during the organism's lifetime.
In conclusion, our results highlight the fundamental importance of embodied interactions and body morphology in biological information processing, supporting a conceptual view of cognition that is based on the interplay between physical and information processes.
One implication from studies like this: In normal reading the eye is moving rapidly over text, which creates motion-like pattern-matching activities in the retina, an example of the eye processing information before sending it to the brain. The large text of PowerPoint bullet points in the small space of a slide may inhibit the mind's natural uptake of the information because the few words on the screen are held in a space that the fovea can handle without eye or body movements like "foveation, saccades, or adequate hand movements."
As an aside, Firefox 2.0's spell checker doesn't recognize sensorimotor, photoreceptor, foveation, saccades, or the HTML tag blockquote, but it knows both Ps in PowerPoint are supposed to be capitalized.
-- Niels Olson (email)
Does this question fit here? Do you think about people with eye-brain connections that have been injured or are completely non-functional, but who still have huge cognitive hunger to get at and give out complex information? Have you or others explored the problem of providing beautiful evidence by means other than through the eye? I attended your workshop recently to further my design thinking, but found myself instead thinking over and over again about applications and implications of your principles in terms of my son with physical disability and cortical visual impairment. The difficulty with visual memory is one thing; try relying on purely auditory memory to categorize, retrieve and make sense of complex information. Now that's a task. I pictured scatter plots as jazz-like compositions; sparklines as played on a theremin (the creepy horror movie instrument by which one can vary pitch and volume by moving one's hands) or synthesizer. Sparklines, overlaid for comparison, with varied tonal quality or instrumentation perhaps, or trends revealed via the complexity and beauty of Bobby McFerrin's "Voicestra" vocal orchestra. Surely there are plenty of scientists, students, CEOs with macular degeneration or blindness that need to and desire to grasp rich and complex evidence by ear or by other means just as beautifully. The task of translating visual evidence to a "hi-res" auditory (ear-brain) experience that could be understood as readily and as beautifully as we would wish is a problem that begs for solutions, don't you think?
-- Ann McDonald-Cacho (email)
Here's the best and only answer I have for your difficult question: have your son watch, perhaps over and over, the Music Animation Machine videos. You will enjoy them also. See these links:
Discussion of the Music Animation Machine is at our thread
-- Edward Tufte
How much the eye tells the brain?
Here's the original article, available as a PDF:
Kristin Koch, Judith McLean, Ronen Segev, Michael A. Freed, Michael J. Berry, Vijay Balasubramanian, Peter Sterling, "How Much the Eye Tells the Brain," Current Biology 16 (July 25, 2006), 1428-1434.
PDF file here
-- Edward Tufte
Adrian Perrig and Dawn Song wrote a paper in 2000, Hash Visualization: a New Technique to improve Real-World Security. Their thinking was that the human eye-brain system is much more adapted to remembering structured images than strings of random characters, the usual representation of an encryption key, so images generated from keys might be sufficient for rapid visual confirmation that a key is trustworthy. Nothing beats confirming the alphanumeric key character for character, but it is a time-consuming task and many users completely bypass this step, creating substantial opportunity for man-in-the-middle attackers. The images would be an intermediate good-enough check for the typical person using online banking.
There are other ways to quickly assess a public key. A checksum of the key, called the fingerprint, is the most common; and some people will even truncate that down to just examining the first and last characters or the first 8 characters, something like that.
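A minimal sketch of the fingerprint idea, and of the truncated comparison people fall back on. The choice of SHA-256, the colon-separated formatting, and the placeholder key bytes are assumptions for illustration; real tools pick their own hash and display format.

```python
import hashlib

def fingerprint(public_key: bytes) -> str:
    """Return a hex checksum of the key as colon-separated byte pairs."""
    digest = hashlib.sha256(public_key).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))

def short_check(fp: str, n: int = 8) -> str:
    """The truncated comparison: only the first n hex characters."""
    return fp.replace(":", "")[:n]

key = b"example public key bytes"   # hypothetical placeholder, not a real key
fp = fingerprint(key)
print(fp)
print(short_check(fp))
```

The shorter the prefix a person actually compares, the easier it becomes for an attacker to produce a different key whose visible characters match, which is the gap the image proposal tries to close.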
Perrig and Song specifically propose generating images with the Random Art algorithm: the binary public key is used as the seed for a randomly created function, and the image is generated by sampling the function. If you want a 100x100 image, you sample the function 10,000 times.
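The general shape of that procedure can be sketched in a few lines. This is not the Random Art algorithm itself, nor what OpenSSH implements: the function grammar (sine, product, average), the tree depth, the SHA-256 seeding step, and the greyscale output are all assumptions chosen to keep the sketch short. It only shows the two steps described above: seed a randomly built function from the key, then sample it on a grid.

```python
import hashlib
import math
import random

def build_function(rng, depth):
    """Recursively build a random function f(x, y) with values in [-1, 1]."""
    if depth == 0:
        return rng.choice([lambda x, y: x, lambda x, y: y])
    kind = rng.choice(["sin", "mul", "avg"])
    a = build_function(rng, depth - 1)
    if kind == "sin":
        k = rng.uniform(1.0, 4.0)
        return lambda x, y: math.sin(k * math.pi * a(x, y))
    b = build_function(rng, depth - 1)
    if kind == "mul":
        return lambda x, y: a(x, y) * b(x, y)
    return lambda x, y: (a(x, y) + b(x, y)) / 2.0

def key_image(key: bytes, size: int = 100, depth: int = 5):
    """Sample the key-seeded function on a size x size grid; grey levels 0..255."""
    seed = int.from_bytes(hashlib.sha256(key).digest(), "big")
    f = build_function(random.Random(seed), depth)
    coords = [2.0 * i / (size - 1) - 1.0 for i in range(size)]
    return [[round(127.5 * (f(x, y) + 1.0)) for x in coords] for y in coords]

image = key_image(b"example public key bytes")   # hypothetical key bytes
print(len(image) * len(image[0]), "samples")     # 100 x 100 grid
```

Because the function is fixed entirely by the key, the same key always yields the same picture, and the grid size only controls how finely that one function is sampled.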
So, these images are generated from far fewer than 10 million bits of data: the largest public keys are around 65 thousand bits. But the images could conceivably be inflated to 10 million bits by simply sampling the function 10 million times, which would create a roughly 3000 x 3000 pixel image at one bit per sample, smaller than the output of a Nikon D2. An image carrying 10 million bits can be even smaller if you add color as a dimension.
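A quick check of the image dimensions. The 24 bits per sample used for the color case is an assumption; the post does not say how much color depth it has in mind.

```python
import math

BIT_BUDGET = 10_000_000   # the per-second retinal rate quoted earlier in the thread

def square_side(bits: int, bits_per_pixel: int) -> int:
    """Side length of the largest square image that fits within the bit budget."""
    return math.isqrt(bits // bits_per_pixel)

print(square_side(BIT_BUDGET, 1))    # one bit per sample
print(square_side(BIT_BUDGET, 24))   # assumption: 24-bit color per sample
```

At one bit per sample the side comes to 3162 pixels, which is the "roughly 3000 x 3000" figure; with 24-bit color the same budget fits in a square about 645 pixels on a side.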
It is the structure of the function that the eye-brain system must evaluate. It seems to me the question is how much structure the eye-brain system can parse, given a reference image and a challenge image and the implicit assumption that this should take about a second.
Some readers may realize their banks are already using images as an independent source of confirmation, but those are typically strongly metaphorical photographs of lions, houses, etc. This is rather different.
I found out about this 8-year-old paper because OpenSSH 5.1 was released today and it includes image generation as a non-default option for system administrators to start using in the wild. Since the code is open-source, if admins find it effective, then people may start to change how they authenticate over the internet in the next few years. It may even lower the comfort barrier to acceptance of the current holy grail of Internet authentication, OpenID.
-- Niels Olson (email)
A visitor to this forum, Angela Morelli, asked me by email why we understand numbers and graphs differently. My response became a more thorough version of what is posted above.
People interpret numbers and graphs differently because they are handled differently in the brain. Numbers are generally handled by the verbal linguistic system and graphs are handled by both the non-verbal linguistic system and the limbic system. The bit rate of the visual system is about 10 million bits/second (see the first post in this thread). The rate of reading, listening, braille, typing, maxes out at around 150-400 words per minute. To understand how this works, and provide a foundation for further reading, a *very* brief review of the relevant neuroscience seems in order.
Visual processing begins in the retina with some very simple edge definition. Further edge definition occurs where the neurons of the optic nerve enter the brain at the lateral geniculate nucleus. The neurons synapse and new neurons run in the optic radiations from the LGN to the banks of the calcarine sulcus, where further edge definition and integration occurs. The calcarine sulcus is at the very posterior part of the cerebrum and represents the first time the visual information enters the cerebral grey matter. From there, the final elements of subconscious analysis and pattern recognition occur in the lingual gyrus and cuneus, which sort of wrap around the banks of the calcarine sulcus like concentric rings (heavily folded, of course). Everything up to and including this point is essentially image processing.
Conscious recognition starts to occur in the inferotemporal region, Brodmann's areas 37 and 7a. Lesions to these areas lead to what are called agnosias. Oliver Sacks' The Man Who Mistook His Wife for a Hat has a good example of an agnosia. In fact, most of the back half of the cerebrum (abaft your ears) that isn't involved in basic visual perception is involved in this kind of unimodal association. The other major exception is the angular gyrus and Wernicke's area.
Once the basic conversion to symbolic information occurs, information is routed based on type. Basic numeracy is handled by Brodmann's area 39 in the angular gyrus of the parietal lobe, just slightly above Brodmann's area 37 where the numbers were recognized as numbers. The syntactic region of the brain, Wernicke's area, exists very close by, and can be considered to involve the angular gyrus. A stroke to Wernicke's area leads to receptive aphasia. The patient has access to their entire vocabulary and will speak words clearly, but can't understand and can't compose syntactically correct thoughts. This is commonly described as a "word salad". Wernicke's area is where mathematical training or computer science training trains new syntactic structures. A physicist, in some ways, can think thoughts a non-physicist can't. If Noam Chomsky's syntactic structures exist, they are mainly constructed in Wernicke's area. Now, Wernicke's area is still in the back half of the brain. It is evolutionarily older than Homo sapiens. And, indeed, apes and even dogs, rats, cats, and insects can construct syntactically different sequences of sounds.
Where we really start to diverge from other species is in Broca's area, which structurally lies in the relatively new prefrontal cortex and consists of the Pars triangularis and Pars opercularis. Functionally, Broca's area contains the dictionary. A person with a stroke in Broca's area can, with great difficulty, construct sentences, but they have profound word-finding problems which get worse with stress. And they're always stressed out because they can't find the word! Between Wernicke's area and Broca's area is the arcuate tract, a superhighway of axons committed to carrying information between the neurons of Wernicke's area and Broca's area. Damage to the arcuate tract results in a person who can understand and can speak, but can't hear what you say and then formulate a reply. In higher math, it is the left-sided verbal linguistic system that is involved in equations. In basic number recognition, it is simply the angular gyrus that is involved. This whole system is essentially verbal and exists on the left side. A non-verbal, musical, spatial, temporal, inflection-oriented corollary system exists on the right side.
Someone with a right-sided stroke can communicate, but may have inappropriate responses because they can't understand, interpret, or compose the "how you say it". They also have difficulties interpreting space and, perhaps, time, as they tend to have attention disorders. Interestingly, mathematicians are commonly interested in music and often find spatial expressions of their mathematics particularly appealing, suggesting a high degree of integration between their verbal and non-verbal language centers. And, similar to the syntactic function of Wernicke's area, a musician can compose syntactic structures a non-musician may have a hard time understanding.
As an aside, I am inclined to wonder if general intelligence is an emergent property of our neurons in a way that is similar to how a Turing complete programming language can emerge from Church numerals and lambda calculus. One could think of a neuron as an atom, and two neurons connected by a synapse as a list.
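For readers unfamiliar with the reference, Church numerals encode numbers purely as functions: the numeral n means "apply f to x, n times." The sketch below uses Python lambdas as a stand-in for lambda calculus terms, and shows arithmetic emerging from nothing but function application. It illustrates the computing side of the analogy only and makes no claim about neurons.

```python
# Church numerals: a number n is the function that applies f to x n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))

def to_int(n):
    """Convert a Church numeral to a Python int by counting applications."""
    return n(lambda k: k + 1)(0)

one = succ(zero)
two = succ(one)
three = add(one)(two)
print(to_int(mul(two)(three)))   # 2 * 3
```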
-- Niels Olson (email)
This reference made me think of how I could make a petabyte more understandable. In digital data terms a petabyte is a lot of data: 1 PB = 1,000,000,000,000,000 B = 10^15 bytes. Assuming a byte is 8 bits, a petabyte is 8 x 10^15 bits. According to this paper, Google processes more than 20 petabytes of data per day using its MapReduce program. According to Kevin Kelly of the New York Times (this reference), "the entire works of humankind, from the beginning of recorded history, in all languages" would amount to 50 petabytes of data. These figures are all difficult to understand because they are abstract. So I tried to find a way of understanding what a petabyte is in terms of an individual human being. From the paper you refer to here we can estimate that the human retina communicates with the brain at a rate of 10 million bits per second, or 10^7 bits per second. This sounds pretty impressive. How long does it take a human eye-brain system to move a petabyte of data (assuming that you could keep your eyes permanently open so that you are getting your full 10 million bits per second)? By my calculations a year is 3.15 x 10^7 seconds. This means a total amount of data per year from retina to brain of 3.15 x 10^14 bits. Dividing 8 x 10^15 by 3.15 x 10^14 we get about 25 years. This is a long time to keep your eyes open! If we take a normal human life to be the biblical standard of Psalms 90: The days of our years are threescore years and ten, then a normal human creates about 2.8 petabytes in their life. We could also define a brand new unit, the PetaBlife, with a symbol ℘, which is the number of standard human lifetimes required for a human retina to make a petabyte of data. Matt Reed
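The calculation is easy to re-run. The sketch below takes the rate as 10 million, i.e. 10^7, bits per second and uses a 365.25-day year; both choices are assumptions of the sketch. The results scale inversely with the rate, so a rate of 10^6 bits per second would give a time per petabyte ten times longer and a lifetime total ten times smaller.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # about 3.15e7 seconds
RETINA_BITS_PER_SECOND = 10_000_000        # 10 million, i.e. 10**7
PETABYTE_BITS = 8 * 10**15                 # 10**15 bytes at 8 bits per byte
LIFETIME_YEARS = 70                        # threescore years and ten

bits_per_year = RETINA_BITS_PER_SECOND * SECONDS_PER_YEAR
years_per_petabyte = PETABYTE_BITS / bits_per_year
petabytes_per_lifetime = LIFETIME_YEARS / years_per_petabyte

print(f"bits per year:          {bits_per_year:.3g}")
print(f"years per petabyte:     {years_per_petabyte:.1f}")
print(f"petabytes per lifetime: {petabytes_per_lifetime:.2f}")
```

Under these assumptions one petabyte takes roughly 25 years of continuous viewing, and a 70-year lifetime comes to a little under 3 petabytes.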
-- Matt R (email)
Yesterday's NYT had an article about processing visual information and risk analysis within extreme conditions: www.nytimes.com/2009/07/28/health/research/28brain.htm. J.D. McCubbin
-- J. D. McCubbin (email)
34 gigabytes per day per person?
-- Edward Tufte
I have been reading up on the evolution of eyes and vision. I stumbled across the work of Prof Russell Fernald, who is at Stanford University (http://www.stanford.edu/group/fernaldlab). In a paper by him published in Current Opinion in Neurobiology 10(4): 444-50 in 2000, the following statement made a big impression on me:
"Light has probably been the most profound selective force to act during biological evolution. The 10^15 sunrises and sunsets that have taken place since life began have led to the evolution of eyes which use light for vision and for other purposes including navigation and timing."
-- Matt R (email)
Dear Professor Tufte,
I eagerly look forward to your analysis of Apple's new iPhone 4, particularly the Retina Display they are branding. I can't help but wonder who at Apple has been following this discussion thread, which you initiated several years ago?
Some people who have apparently held and used an iPhone 4 are making comments like these:
The resolution of the "retina display" is as impressive as Apple boasts. Text renders like high quality print.
It's mentioned briefly in Apple's promotional video about the design of the iPhone 4, but they're using a new production process that effectively fuses the LCD and touchscreen -- there is no longer any air between the two. One result of this is that the iPhone 4 should be impervious to this dust-under-the-glass issue. More importantly, though, is that it looks better. The effect is that the pixels appear to be painted on the surface of the phone; instead of looking at pixels under glass, it's like looking at pixels on glass. Combined with the incredibly high pixel density, the overall effect is like "live print".
What might text and sparklines look like on a "Retina Display"? I can't wait for your own hands-on review of iPhone 4 and also the iPad. Thank you! -Eddie
-- Eddie (email)
There is a growing debate about the resolution of the new iPhone and how it compares to the eye.
Here are some highlights:
Raymond Soneira, on Wired:
1. The resolution of the retina is in angular measure - the accepted value is 50 Cycles Per Degree. A cycle is a line pair, which is two pixels, so the angular resolution of the eye is 0.6 arc minutes per pixel.
2. So, if you hold an iPhone at the typical 12 inches from your eyes it would need to be 477 pixels per inch to be a retina limited display. At 8 inches it would need to be 716 ppi. You have to hold it out 18 inches before the requirement falls to 318 ppi. The iPhone 4 resolution is 326 ppi.
Phil Plait, on Discover:
Let me make this clear: if you have perfect eyesight, then at one foot away the iPhone 4's pixels are resolved. The picture will look pixellated. If you have average eyesight, the picture will look just fine.
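Soneira's pixel-density figures follow from simple trigonometry, sketched here. The only input is the 0.6 arc minutes per pixel he quotes; the function name and the small-angle geometry (one pixel subtending that angle at the viewing distance) are this sketch's own framing.

```python
import math

ARCMIN_PER_PIXEL = 0.6   # from 50 cycles per degree, two pixels per cycle

def required_ppi(distance_inches: float,
                 arcmin_per_pixel: float = ARCMIN_PER_PIXEL) -> float:
    """Pixels per inch at which one pixel subtends the given angle at this distance."""
    angle = math.radians(arcmin_per_pixel / 60.0)
    pixel_size = distance_inches * math.tan(angle)
    return 1.0 / pixel_size

for d in (8, 12, 18):
    print(f"{d:2d} in: {required_ppi(d):.0f} ppi")
```

This reproduces the 716, 477, and 318 ppi figures. Substituting a coarser angle, such as the roughly 1 arc minute often associated with average eyesight, lowers the requirement at 12 inches to below the iPhone 4's 326 ppi, which is consistent with Plait's distinction between perfect and average eyesight.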
-- Tchad (email)
Jeff Hawkins's brilliant new model of the neocortex
Jeff Hawkins presents a coherent, derived-from-first-principles model of neocortical thinking that is just stunning.
Jeff Hawkins: Advances in Modeling Neocortex and Its Impact on Machine Intelligence
-- Niels Olson (email)
Alfred Lukyanovich Yarbus (1914-1986) was a Russian psychologist who made a number of seminal studies of eye movements. Many of his most interesting results were published in a book, translated into English and published in New York in 1967 as Eye Movements and Vision. This book is now out of print but you can find PDF copies to download.
I first saw some of Yarbus' data about 13 years ago as scratchy black and white scans from the book.
One of the most compelling of Yarbus' experiments was an eye-tracking study in which he asked subjects to look at a reproduction of a Russian oil painting, An Unexpected Visitor, painted by Ilya Repin in 1884.
Yarbus asked the subjects to look at the same picture in a number of different ways, including:
1. examine the painting freely.
2. estimate the material circumstances of the family.
3. assess the ages of the characters.
4. determine the activities of the family prior to the visitor's arrival.
5. remember the characters' clothes.
6. surmise how long the visitor had been away from the family.
What is brilliant is that the eye-tracking traces recorded by Yarbus showed that the subjects visually interrogate the picture in a completely different way depending on what they want to get from it.
Cabinet Magazine (Issue 30 The Underground Summer 2008) has a piece by Sasha Archibald called Ways of Seeing that takes the original eye-tracking traces from Yarbus' book and superposes them on a colour reproduction of the painting.
This is the first time I have seen this done. The originals in the book by Yarbus are disembodied eye-tracking traces laid out near to, but not overlaying, the reproduction of the Repin painting. These new overlays by Archibald are worth comparing. Here is (left) the original image (middle) free examination and (right) what the subject did when asked to estimate the material circumstances of the family.
-- Matt R (email)
Example of industrial supplier using retinal tracking
A major industrial supplier asked me to participate in a study of their web interface.
After an interview, they sat me before a monitor and handed me about six different objects to find on their website. One object was a small plastic pipe fitting, and I remember a couple of fasteners. A tiny web cam atop the monitor tracked my eye movements as I negotiated the site and found the products. They were testing frames, as I recall.
Based on my compensation, this testing is expensive, but the quality of their website shows.
-- Jon Gross (email)