Retina communicates to brain at 10 million bits per second: Implications for evidence displays?
In all my books, one of the key arguments revolves around the routinely spectacular resolution of the human eye-brain system, which then in turn leads to the idea that our displays of evidence should be worthy of the human eye-brain system. This is, for example, the conclusion of sparkline analysis in Beautiful Evidence, where the idea is to make our data graphics at least operate at the resolution of good typography (say 2400 dpi). Here is a link to a press-release summary account of an article in Current Biology (July 2006) by Judith McLean and Michael A. Freed, from the University of Pennsylvania School of Medicine, and Ronen Segev and Michael J. Berry III, from Princeton University. The research suggests that the human retina transmits data to the brain at the rate of 10 million bits per second, which is close to an Ethernet connection! Looking around the world is easier than analyzing evidence displays, and there may also be within-brain impediments to handling vast amounts of abstract data, but at least the narrow-band choke point for information resolution should not be the display itself. The average PP slide contains 40 words, which take less than 10 seconds to read. Call that 1000 bits per second, which comes to 1/10,000 of the routine human retina-brain data capacity. Also most of our evidence displays are in flatland, which is easier than 3D perceptual tasks. On the other hand, many serious data displays are not in the familiar 4D space/time coordinate system that our eye-brain knows so well. Memory problems can be partly handled by high-resolution displays, so that key comparisons are made adjacent in space within the common eyespan. Spatial adjacency greatly reduces the memory problems associated with making comparisons of small amounts of information stacked in time (PP slides, for example). -- Edward Tufte, July 26, 2006 |
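The comparison between slide reading and retinal bandwidth in the post above can be checked with a few lines of arithmetic. This is only a sketch in Python: the constants are the round figures quoted in the post, the variable names are made up for illustration, and the bits-per-word value is simply what those figures imply, not a measured quantity.

```python
# Back-of-envelope check of the slide-versus-retina comparison.
# All constants are the round figures quoted in the post above.
RETINA_BITS_PER_SECOND = 10_000_000  # reported retina-to-brain rate
WORDS_PER_SLIDE = 40                 # average PP slide
SECONDS_PER_SLIDE = 10               # upper bound on reading time
SLIDE_BITS_PER_SECOND = 1_000        # the post's rounded estimate

# Reading rate implied by 40 words in 10 seconds.
words_per_second = WORDS_PER_SLIDE / SECONDS_PER_SLIDE

# Bits per word that the 1000 bits/s estimate assumes (an implied value).
implied_bits_per_word = SLIDE_BITS_PER_SECOND / words_per_second

# Share of the retinal bit rate that a slide actually uses.
fraction_of_retina = SLIDE_BITS_PER_SECOND / RETINA_BITS_PER_SECOND

print(f"reading rate: {words_per_second:.0f} words per second")
print(f"implied information content: {implied_bits_per_word:.0f} bits per word")
print(f"slide uses 1/{1 / fraction_of_retina:,.0f} of the retinal bit rate")
```

Changing the assumed bits per word shifts the ratio proportionally, so the conclusion is an order-of-magnitude statement rather than a precise one.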
|
While PowerPoint is surely a horrid way to transmit information, I'm not sure we can inject very abstract information into people at Ethernet rates. 40 words in 10 seconds doesn't translate to 1000 bits per second transmitted over the optic nerve, which connects the retina to the banks of the calcarine sulcus in the occipital lobe, via the optic chiasm and the lateral geniculate nucleus. At a minimum the data being transmitted would require an analysis of the typography's geometry (edge detection being a basic function of the retina), the amount of the visual field taken up by the display, the location of the display's image on the retina relative to the fovea, and the rates of change in the display and surrounding motion (the speaker, other audience members, etc). Your guesstimate of 40 words in 10 seconds leads to a 240 word-per-minute reading speed. Like normal readers, braille readers can read at 200 to 400 words per minute. Is there any evidence that a person with an acquired partial nerve blindness also acquires an impaired ability to reason spatially? My classmates at Tulane Med found they preferred listening to the lecture audio I recorded (see the (audio) links for Spring 2006) at one-and-a-half speed, which also pushes close to 200 words per minute. Most people found twice-speed to be uncomfortably fast. This 200, 240, 400 word-per-minute rate may be a more accurate definition of the rate at which the human mind can receive and abstract information in word form, and this is likely driven by communication between Broca's area and Wernicke's area via the arcuate tract. Keep in mind, reading is a highly abstract function. Babes learn to speak as they learn to interpret what they hear and form their vocabulary. Indeed, Kuhl et al from the University of Washington published findings last week in NeuroReport showing that activity in Broca's area and Wernicke's area becomes synchronized during the first year (learnt via NPR). 
Toddlers then commit their alphabet to memory, learn to form words in their mind by matching what they see with their mental images of the letters, often by saying them out loud and blending the sounds together, and concomitantly start to memorize common word forms, like their names, as a sort of super-alphabet. That's the left side. Mountain bike racing in the woods is probably a good speed test for the right cerebral hemisphere's ability to interpret incoming visual data. The entire scene is certainly changing much more quickly, and this is likely recruiting as much of the optic nerve's "bandwidth" as possible. I wouldn't be at all surprised if the bit rate exceeded 10^6 bits per second, but the degree of abstraction is much lower. "Go there" and "Don't hit the tree" is about all that's necessary. Indeed, one of the most valuable things a racer can do to improve race day lap times is to pre-ride the course the day before. Pre-riding is not nearly such a big deal in road racing. Where do visual evidence presentations fall in this? I for one have never been in a race and bothered to compare the catenary of a treacherous vine with the theoretical hyperbolic secant, nor is speech much of an issue after the start. Back in civil society, however, any 2D graphic is instantaneously captured in the mind's eye. A projection of 3D is captured more slowly, but still very quickly. The challenge comes in asking people to find out what the axis labels are and interpret what they mean. How much of any science student's time is spent memorizing what variables the Greek and Roman characters represent in their particular field? The pursuit of common and standard placement in your work (horizontal words, hang the Y label off the top of the axis like a flag, etc) certainly goes to this challenge, but the fact remains that there is no one overwhelmingly accepted architecture for visual evidence displays, even for the simple graph. 
Any given reader faces an analytic graphic as a rather loose jumble of sticks and glyphs and bears an unfortunate burden of comparing them to a rather loose model, then reasoning about them. Prose faces people with a lesser burden, that of literacy, and brings with it an overwhelmingly accepted architecture — word, clause, sentence, paragraph. The analytic graphic carries an inherently higher burden. The graph, with all its degrees of abstraction (words, equations, the data, trend lines, etc) represents an extraordinary challenge that demands recalling linguistic and non-linguistic memories, comparing them to the scene falling on the retina, and reasoning about both the stuff before you and the difference between that stuff and the perfect stuff of memory. This requires not only comparing abstractions like words and drawings, which requires communication across the hemispheres via the corpus callosum, but also extraordinarily complex relays with the frontal cortex; in some cases the mere interpretation of an analytic graphic reaches as far as free will, which Francis Crick and at least one of his colleagues at Scripps believed to be located in the anterior cingulate gyrus, Brodmann area 24. None of this is an apology for PowerPoint, only a minor point about which bit rates can be compared. -- Niels Olson (email), July 27, 2006 |
|
Is there any statistic or theory or number that illuminates how people absorb information visually vs. through words? For example, is information retention via visuals X% more effective than through words/sentences/bullet points? I know it will depend upon the visual and the words; I'm just looking for an estimate or hypothesis. Any help is appreciated. -- peter shier (email), October 6, 2006 |
|
Early research showed that the eye is an 80% receptor to the brain vs other sensors. German researchers created a visual discussion method that allows visual criteria to be applied together with structured written verbal expression to support the spoken word. The method allows many persons to provide input simultaneously and the thinking structure is managed by a discussion butler. The problem with PP and many other displays is that they work to the adage 'the medium is the message' rather than allowing relationships to steer the matter. Interaction is one way. Participants sit in a darkened room that hinders visual relationships with other viewers (possibly on purpose if autocratic presentations are acceptable). One of the charms of ET's work is his use of 'context' to provide visual variety of interpretation. Who else would put sheep with rocks and metal objects to assist new insights? The question then is, how well do we employ our visual senses and enmesh them with wisdom of interpretation? -- Roger Daventry (email), October 6, 2006 |
|
In response to Peter Shier's question, there have been a number of experiments that try to demonstrate that retention of information provided graphically is greater than that of information provided through text alone. (One example is Butler and Mautz, "Multimedia Presentations and Learning," Issues in Accounting, Fall 1996). The problem with this and other similar research from a visual display perspective is that it tends to lump all non-textual display under the label of "multimedia," and information density is never a variable that is measured and reported. Also, the actual visuals used are almost never included in the journal articles; the few that are, if they are indicative of the rest, are worrisome: they tend to include the worst kind of PowerPoint "Phfluff", and so the experimental results are questionable. -- Andrew Abela (email), October 7, 2006 |
|
Ok, I'll say it: A picture is worth a thousand words. -- Steve Sprague (email), October 7, 2006 |
|
Lungarella M, Sporns O (2006) Mapping Information Flow in Sensorimotor Networks. PLoS Comput Biol 2(10): e144 (3) The morphology of an embodied system can have significant effects on its information processing capacity. We tested the hypothesis that sensor morphology (here, the arrangement of photoreceptors in a simulated retina) influences the flow of information in a sensorimotor system. One implication from studies like this: In normal reading the eye is moving rapidly over text, which creates motion-like pattern-matching activities in the retina, an example of the eye processing information before sending it to the brain. The large text of PowerPoint bullet points in the small space of a slide may inhibit the mind's natural uptake of the information because the few words on the screen are held in a space that the fovea can handle without eye or body movements like "foveation, saccades, or adequate hand movements". As an aside, Firefox 2.0's spell checker doesn't recognize sensorimotor, photoreceptor, foveation, saccades, or the HTML tag blockquote, but it knows both Ps in PowerPoint are supposed to be capitalized. -- Niels Olson (email), October 30, 2006 |
|
Does this question fit here? Do you think about people with eye-brain connections that have been injured or are completely non-functional, but who still have huge cognitive hunger to get at and give out complex information? Have you or others explored the problem of providing beautiful evidence by means other than through the eye? I attended your workshop recently to further my design thinking, but found myself instead thinking over and over again about applications and implications of your principles in terms of my son with physical disability and cortical visual impairment. The difficulty with visual memory is one thing; try relying on purely auditory memory to categorize, retrieve and make sense of complex information. Now that's a task. I pictured scatter plots as jazz-like compositions; sparklines as played on a theremin (the creepy horror movie instrument by which one can vary pitch and volume by moving one's hands) or synthesizer. Sparklines, overlaid for comparison, with varied tonal quality or instrumentation perhaps, or trends revealed via the complexity and beauty of Bobby McFerrin's "Voicestra" vocal orchestra. Surely there are plenty of scientists, students, CEOs with macular degeneration or blindness who need to and desire to grasp rich and complex evidence by ear or by other means just as beautifully. The task of translating visual evidence to a "hi-res" auditory (ear-brain) experience that could be understood as readily and as beautifully as we would wish is a problem that begs for solutions, don't you think? -- Ann McDonald-Cacho (email), December 28, 2006 |
|
Here's the best and only answer I have for your difficult question: have your son watch, perhaps over and over, the Music Animation Machine videos. You will enjoy them also. See these links: http://www.well.com/user/smalin/mam.html Discussion of the Music Animation Machine is at our thread http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00005y&topic_id=1 -- Edward Tufte, December 28, 2006 |
|
How much the eye tells the brain?
Here's the original article, available as a PDF: Kristin Koch, Judith McLean, Ronen Segev, Michael A. Freed, Michael J. Berry, Vijay Balasubramanian, Peter Sterling, "How Much the Eye Tells the Brain," Current Biology 16 (July 25, 2006), 1428-1434. PDF file here -- Edward Tufte, March 21, 2008 |
|
Adrian Perrig and Dawn Song wrote a paper in 2000, Hash Visualization: a New Technique to improve Real-World Security. Their thinking was that the human eye-brain system is much more adapted to remembering structured images than strings of random characters, the usual representation of an encryption key, so images generated from keys might be sufficient for rapid visual confirmation that a key is trustworthy. Nothing beats confirming the alphanumeric key character for character, but it is a time-consuming task and many users completely bypass this step, creating substantial opportunity for man-in-the-middle attackers. The images would be an intermediate good-enough check for the typical person using online banking. There are other ways to quickly assess a public key. A checksum of the key, called the fingerprint, is the most common; and some people will even truncate that down to just examining the first and last characters or the first 8 characters, something like that. Perrig and Song specifically propose generating images with the Random Art algorithm: the binary public key is used as the seed for a randomly created function, and the image is generated by sampling the function. If you want a 100x100 image, you sample the function 10,000 times. So these images are generated from less than 10 million bits of data (the largest public keys are around 65 thousand bits), but the images could conceivably be inflated to 10 million bits by simply sampling the function 10 million times, which would create a roughly 3000 x 3000 pixel image, smaller than the output of a Nikon D2. An image carrying 10 million bits can be even smaller if you add color as a dimension. It is the structure of the function that the eye-brain system must evaluate. It seems to me the question is how much structure the eye-brain system can parse, given a reference image and a challenge image and the implicit assumption that this should take about a second. 
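The seed-then-sample idea described above can be sketched in a few lines of Python. This is a toy illustration only and makes no claim to reproduce the Random Art algorithm from the paper: the expression grammar (sine, product, average), the use of SHA-256 to turn the key into a seed, and the names build_expression and key_image are all assumptions made for the sketch.

```python
import hashlib
import math
import random


def build_expression(rng, depth):
    """Build a random function f(x, y) whose values stay within [-1, 1]."""
    if depth == 0:
        leaf = rng.choice(("x", "y", "const"))
        if leaf == "x":
            return lambda x, y: x
        if leaf == "y":
            return lambda x, y: y
        constant = rng.uniform(-1.0, 1.0)
        return lambda x, y: constant
    op = rng.choice(("sin", "mul", "avg"))
    if op == "sin":
        inner = build_expression(rng, depth - 1)
        freq = rng.uniform(1.0, 4.0)
        return lambda x, y: math.sin(freq * math.pi * inner(x, y))
    left = build_expression(rng, depth - 1)
    right = build_expression(rng, depth - 1)
    if op == "mul":
        return lambda x, y: left(x, y) * right(x, y)
    return lambda x, y: (left(x, y) + right(x, y)) / 2.0


def key_image(key_bytes, size=100, depth=6):
    """Return a size x size grid of (r, g, b) pixels derived from key_bytes.

    size must be at least 2. The key seeds the random number generator, so
    the same key always yields the same three channel functions and image.
    """
    seed = int.from_bytes(hashlib.sha256(key_bytes).digest(), "big")
    rng = random.Random(seed)
    channels = [build_expression(rng, depth) for _ in range(3)]
    image = []
    for row in range(size):
        y = 2.0 * row / (size - 1) - 1.0  # map row index to [-1, 1]
        line = []
        for col in range(size):
            x = 2.0 * col / (size - 1) - 1.0  # map column index to [-1, 1]
            # Rescale each channel from [-1, 1] to an integer in 0..255.
            line.append(tuple(int(round((f(x, y) + 1.0) * 127.5)) for f in channels))
        image.append(line)
    return image


# Side length of the largest square image with at most 10 million samples.
side_for_ten_million_samples = math.isqrt(10_000_000)

if __name__ == "__main__":
    image = key_image(b"hypothetical public key bytes")
    print(len(image) * len(image[0]), "samples for a 100x100 image")
    print("10 million samples fit a square of side", side_for_ten_million_samples)
```

Seeding from a hash of the key means that changing any bit of the key changes the seed and, in general, the sampled functions, while a matching key regenerates the same image at any resolution.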
Some readers may realize their banks are already using images as an independent source of confirmation, but those are typically strongly metaphorical photographs of lions, houses, etc. This is rather different. I found out about this 8-year-old paper because OpenSSH 5.1 was released today and it includes image generation as a non-default option for system administrators to start using in the wild. Since the code is open-source, if admins find it effective, then people may start to change how they authenticate over the internet in the next few years. It may even lower the comfort barrier to acceptance of the current holy grail of Internet authentication, OpenID. -- Niels Olson (email), July 23, 2008 |
|
A visitor to this forum, Angela Morelli, asked me by email why we understand numbers and graphs differently. My response became a more thorough version of what is posted above. People interpret numbers and graphs differently because they are handled differently in the brain. Numbers are generally handled by the verbal linguistic system and graphs are handled by both the non-verbal linguistic system and the limbic system. The bit rate of the visual system is about 10 million bits/second (see the first post in this thread). The rate of reading, listening, braille, typing, maxes out at around 150-400 words per minute. To understand how this works, and provide a foundation for further reading, a *very* brief review of the relevant neuroscience seems in order. Visual processing begins in the retina with some very simple edge definition. Further edge definition occurs where the neurons of the optic nerve enter the brain at the lateral geniculate nucleus. The neurons synapse and new neurons run in the optic radiations from the LGN to the banks of the calcarine sulcus, where further edge definition and integration occurs. The calcarine sulcus is at the very posterior part of the cerebrum and represents the first time the visual information enters the cerebral grey matter. From there, the final elements of subconscious analysis and pattern recognition occur in the lingual gyrus and cuneus, which sort of wrap around the banks of the calcarine sulcus like concentric rings (heavily folded, of course). Everything up to and including this point is essentially image processing. Conscious recognition starts to occur in the inferotemporal region, Brodmann's areas 37 and 7a. Lesions to these areas lead to what are called agnosias. Oliver Sacks' The Man Who Mistook His Wife for a Hat has a good example of an agnosia. In fact, most of the back half of the cerebrum (abaft your ears) that isn't involved in basic visual perception is involved in this kind of unimodal association. 
The other major exception is the angular gyrus and Wernicke's area. Once the basic conversion to symbolic information occurs, information is routed based on type. Basic numeracy is handled by Brodmann's area 39 in the angular gyrus of the parietal lobe, just slightly above Brodmann's area 37 where the numbers were recognized as numbers. The syntactic region of the brain, Wernicke's area, exists very close by, and can be considered to involve the angular gyrus. A stroke to Wernicke's area leads to receptive aphasia. The patient has access to their entire vocabulary and will speak words clearly, but can't understand and can't compose syntactically correct thoughts. This is commonly described as a "word salad". Wernicke's area is where mathematical training or computer science training trains new syntactic structures. A physicist, in some ways, can think thoughts a non-physicist can't. If Noam Chomsky's syntactic structures exist, they are mainly constructed in Wernicke's area. Now, Wernicke's area is still in the back half of the brain. It is evolutionarily older than Homo sapiens. And, indeed, apes and even dogs, rats, cats, and insects can construct syntactically different sequences of sounds. Where we really start to diverge from other species is in Broca's area, which structurally lies in the relatively new prefrontal cortex and consists of the Pars triangularis and Pars opercularis. Functionally, Broca's area contains the dictionary. A person with a stroke in Broca's area can, with great difficulty, construct sentences, but they have profound word-finding problems which get worse with stress. And they're always stressed out because they can't find the word! Between Wernicke's area and Broca's area is the arcuate tract, a superhighway of axons committed to carrying information between the neurons of Wernicke's area and Broca's area. 
Damage to the arcuate tract results in a person who can understand and can speak, but can't hear what you say and then formulate a reply. In higher math, it is the left-sided verbal linguistic system that is involved in equations. In basic number recognition, it is simply the angular gyrus that is involved. This whole system is essentially verbal and exists on the left side. A non-verbal, musical, spatial, temporal, inflection-oriented corollary system exists on the right side.
-- Niels Olson (email), November 8, 2008 |
|
This reference made me think of how I could make a Petabyte more understandable. In digital data terms a petabyte is a lot of data. 1 PB = 1,000,000,000,000,000 B = 10^15 bytes. Assuming a byte is 8 bits then a petabyte is 8 x 10^15 bits. According to this paper, Google processes more than 20 Petabytes of data per day using its MapReduce program. According to Kevin Kelly of the New York Times, this reference, "the entire works of humankind, from the beginning of recorded history, in all languages" would amount to 50 petabytes of data. These are all difficult to understand as they are abstract. So I tried to find a way of understanding what a Petabyte is in terms of an individual human being. From the paper you refer to here we can estimate that the human retina communicates with the brain at a rate of 10 million bits per second or 10^7 bits per second. This sounds pretty impressive. How long does it take a human eye-brain system to move a petabyte of data (assuming that you could keep your eyes permanently open so that you are getting your full 10 million bits per second)? By my calculations a year is 3.15 x 10^7 seconds. This means a total amount of data per year from retina to brain of 3.15 x 10^14 bits. Dividing 8 x 10^15 by 3.15 x 10^14 we get 25.4 years. This is a long time to keep your eyes open! If we take a normal human life to be the biblical standard of Psalms 90: The days of our years are threescore years and ten, then a normal human creates about 2.76 petabytes in their life. We could also define a brand new unit, the PetaBlife, with a symbol ℘ which is the number of standard human lifetimes required for a human retina to make a PetaByte of data. Matt Reed -- Matt R (email), July 18, 2009 |
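The petabyte arithmetic above is very sensitive to the exponent used for the retinal bit rate, so it may help to run it for both 10^6 and 10^7 bits per second (10 million is 10^7). The Python sketch below is an illustration, not part of the original post; the function names are hypothetical, and the year length and 70-year lifetime are the round figures used in the post.

```python
# Years of continuous viewing needed to move one petabyte from retina to
# brain, and petabytes moved in a 70-year lifetime, for two candidate rates.
PETABYTE_BITS = 8 * 10**15   # 10^15 bytes at 8 bits per byte
SECONDS_PER_YEAR = 3.15e7    # round figure used in the post
LIFETIME_YEARS = 70          # threescore years and ten


def years_per_petabyte(bits_per_second):
    """Years of eyes-open viewing needed to transmit one petabyte."""
    return PETABYTE_BITS / (bits_per_second * SECONDS_PER_YEAR)


def petabytes_per_lifetime(bits_per_second):
    """Petabytes transmitted over LIFETIME_YEARS of continuous viewing."""
    return LIFETIME_YEARS / years_per_petabyte(bits_per_second)


for rate in (10**6, 10**7):
    print(
        f"{rate:>10,} bits/s: "
        f"{years_per_petabyte(rate):6.1f} years per petabyte, "
        f"{petabytes_per_lifetime(rate):.2f} petabytes per lifetime"
    )
```

A tenfold change in the assumed rate moves the answer between roughly two and a half centuries and roughly a quarter of a century per petabyte, which is the difference between a fraction of a petabyte and a few petabytes per lifetime.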
|
Yesterday's NYT had an article about processing visual information and risk analysis within extreme conditions: www.nytimes.com/2009/07/28/health/research/28brain.htm. J.D. McCubbin -- J. D. McCubbin (email), July 29, 2009 |
|
34 gigabytes per day per person? http://bits.blogs.nytimes.com/2009/12/09/the-american-diet-34-gigabytes-a-day/ -- Edward Tufte, December 11, 2009 |
|
Dear ET, I have been reading up on the evolution of eyes and vision. I stumbled across the work of Prof Russell Fernald who is at Stanford University (http://www.stanford.edu/group/fernaldlab). From a paper by him published in Current Opinion in Neurobiology 10(4): 444-50 in 2000 the following profound statement made a big impression on me; "Light has probably been the most profound selective force to act during biological evolution. The 10^15 sunrises and sunsets that have taken place since life began have led to the evolution of eyes which use light for vision and for other purposes including navigation and timing." Best wishes Matt
-- Matt R (email), June 6, 2010 |
|
Dear Professor Tufte,
I eagerly look forward to your analysis of Apple's new iPhone 4, particularly the Retina Display they are branding. I can't help but wonder who at Apple has been following this discussion thread, which you initiated several years ago.
The resolution of the "retina display" is as impressive as Apple boasts. Text renders like high-quality print. It's mentioned only briefly in Apple's promotional video about the design of the iPhone 4, but they're using a new production process that effectively fuses the LCD and touchscreen -- there is no longer any air between the two. One result of this is that the iPhone 4 should be impervious to the dust-under-the-glass issue. More important, though, is that it looks better. The effect is that the pixels appear to be painted on the surface of the phone; instead of looking at pixels under glass, it is like looking at pixels on glass. Combined with the incredibly high pixel density, the overall effect is like "live print". What might text and sparklines look like on a "Retina Display"? I can't wait for your own hands-on review of iPhone 4 and also the iPad. Thank you! -Eddie -- Eddie (email), June 8, 2010 |
|
-- Tchad (email), June 14, 2010 |
|