Computing Lie Factor by Dividing Percentages
In the book “The Visual Display of Quantitative Information”, the term “Lie Factor” is defined on page 57, and an example computation is given. Basically, the actual data varies from 18 to 27.5, but graphically it varies from 0.6 to 5.3. So the actual change is a factor of 1.53 (27.5/18), the graphical one is a factor of 8.83 (5.3/0.6), and the resulting lie factor is 8.83/1.53 = 5.78.
The computation in the book gives a lie factor of 14.8, which is incorrect.
Simpler example: if something changes by a factor of 2, and the graphic shows that it changed by a factor of 4, then the lie factor is 2. However, if we divide percentages, we get 300% / 100% = 3.
Therefore, lie factors reported in “Visual Display…” are exaggerated. If original data changes by a factor of a, and the graphics data changes by a factor of b, then the lie factor is b/a, but ET’s factor is (b-1)/(a-1).
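To make the two definitions concrete, here is a minimal Python sketch that applies both to the fuel economy numbers from p. 57. The function names are my own labels, not anything from the book, and the code only restates the b/a and (b-1)/(a-1) formulas above.

```python
def ratio_lie_factor(data_start, data_end, graphic_start, graphic_end):
    """Lie factor as a ratio of growth factors: b / a."""
    a = data_end / data_start            # growth factor in the data
    b = graphic_end / graphic_start      # growth factor in the graphic
    return b / a


def percent_lie_factor(data_start, data_end, graphic_start, graphic_end):
    """Lie factor as a ratio of percent changes: (b - 1) / (a - 1)."""
    a = data_end / data_start
    b = graphic_end / graphic_start
    return (b - 1) / (a - 1)


# Fuel economy example: data 18 -> 27.5, graphic 0.6 -> 5.3.
print(round(ratio_lie_factor(18, 27.5, 0.6, 5.3), 2))    # 5.78
print(round(percent_lie_factor(18, 27.5, 0.6, 5.3), 2))  # 14.84

# Simpler example: data doubles, graphic quadruples.
print(ratio_lie_factor(1, 2, 1, 4))    # 2.0
print(percent_lie_factor(1, 2, 1, 4))  # 3.0
```

The unrounded percent-based value is 14.84; the book's 14.8 comes from dividing the rounded percentages, 783 / 53.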
Example: the real data shows 1, 1.01; the graphic shows 1, 2.
The lie factor is not 100, but 1.98. One could say “well, but the growth here is 1%, there it’s 100%, so it’s exaggerated by a factor of 100!” But this logic is incorrect. The following example illustrates it:
Suppose the original data is 1, 2, 3. The graphic shows 1, 8, 12.
If we scale the effect shown in the graphic down by a factor of 4, we get the correct growth. So the lie factor is 4. But if we divide percentages, the lie factor is either 7 (= 700% / 100%) or 5.5 (= 1100% / 200%), depending on which pair of numbers you use to compute it. If the data went up to 100 and the graphic to 400, the Tufte lie factor would give 4.03 (= 399/99). That is, only in the limit would it converge to the right number.
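The dependence on the chosen pair can be checked with a short Python sketch. It measures every later point against the first one; the helper name `factors_against_baseline` is hypothetical and exists only for this illustration.

```python
def factors_against_baseline(data, graphic):
    """For each point after the first, return (ratio-based, percent-based) lie factors.

    Growth is measured relative to the first element of each series.
    """
    results = []
    for d, g in zip(data[1:], graphic[1:]):
        a = d / data[0]        # growth factor in the data
        b = g / graphic[0]     # growth factor in the graphic
        results.append((b / a, (b - 1) / (a - 1)))
    return results


# Data 1, 2, 3 drawn as 1, 8, 12.
print(factors_against_baseline([1, 2, 3], [1, 8, 12]))
# [(4.0, 7.0), (4.0, 5.5)]

# Data rising to 100 drawn as rising to 400: the percent-based
# value approaches 4 but has not reached it.
ratio, percent = factors_against_baseline([1, 100], [1, 400])[0]
print(ratio, round(percent, 2))  # 4.0 4.03
```

The ratio-based value stays at 4 for every pair, while the percent-based value changes with the pair chosen.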
Regards,
Alexei Lebedev
I hope I am not up against the famous Russian mathematician, Alexei Lebedev, in making this response, but I side with ET in his calculation of the Lie Factor. I think it all comes down to how you define the term “size of effect” that appears in both the numerator and denominator of the equation for Lie Factor, i.e. LF = (size of effect shown in graphic / size of effect in data).
Dr Tufte defines size of effect as the % change, which he calculates correctly in the fuel economy example, where the size of the graphic increases from 0.6 to 5.3, giving a size of effect of the graphic of 783%. The size of the data increases from 18 to 27.5, giving a size of effect of the data of 53%. The LF then is 783 / 53 = 14.8, as Dr Tufte calculates.
BTW, in Alexei’s example of a change in data from 1.00 to 1.01 and a change in graphics from 1.00 to 2.00, the LF is 100 (= 100% / 1%) using this “size of effect” definition, which I think is key to the calculation of LF.
Alexei’s calculations could be correct if “size of effect” were defined differently (as he does in his calculations). But since ET invented the LF, he gets to define its terms as well.
Regards, Jim Heimer
Generally, the way we calculate percent change (divide the change by the starting value) is defective and subject to manipulation by those who want to make a point, just as some do with graphics. A much more rational way to compute percent change is to divide the change by the arithmetic-geometric mean (AGM) of the initial and final values.
The AGM was studied by the great German mathematician Carl Friedrich Gauss. It is easily computed by a simple recursive procedure on any calculator with a square root button, or directly on any calculator with a natural log button. (See http://megspace.com/science/sfe/i_ot_pch.html for this teaching.) Note that this idea extends to all calculations of percent change, and it is most embarrassing to those who disseminate financial statistics.
Example cited by Lebedev from Tufte’s p. 57:
effect shown in graphic = 100*ln(5.3/0.6) = 217.9%
effect shown in data = 100*ln(27.5/18) = 42.38%
lie factor = 100*ln(217.9/42.38) = 163.7%
If there had been no lying, the lie factor as computed above would be zero. Note that the lie factor as calculated here can be negative, as when a presenter wants to convey a false impression of stability.
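The three lines of arithmetic above can be reproduced with a small Python sketch. It follows the natural-log formulas exactly as written in this post and does not run an AGM iteration; the function names are my own.

```python
import math


def log_percent_change(start, end):
    """Percent change expressed as 100 * ln(end / start)."""
    return 100 * math.log(end / start)


def log_lie_factor(data_start, data_end, graphic_start, graphic_end):
    """Lie factor as 100 * ln(graphic effect / data effect).

    Zero means the graphic shows the same effect as the data; a negative
    value means the graphic understates the effect. The data itself must
    change (data_start != data_end) for the ratio to be defined.
    """
    graphic_effect = log_percent_change(graphic_start, graphic_end)
    data_effect = log_percent_change(data_start, data_end)
    return 100 * math.log(graphic_effect / data_effect)


print(round(log_percent_change(0.6, 5.3), 1))        # 217.9
print(round(log_percent_change(18, 27.5), 2))        # 42.38
print(round(log_lie_factor(18, 27.5, 0.6, 5.3), 1))  # 163.7

# An honest graphic (same growth as the data) gives zero.
print(log_lie_factor(18, 27.5, 18, 27.5))            # 0.0
```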
Richard Schwartz
Seth Godin comments on the decline in sales of Ford SUVs in “What happened to Ford”. He mentions that “This chart is just part of the problem,” although I think he misses why that is.
A very popular display showing the decline in banking sector market capitalisation since mid-2007 has been circulating in financial markets for the past couple of days, with many people shocked by the changes. It becomes clear why once the display is seen: the market capitalisation for each bank is shown using a circle, but the relative size of the banks is accurately represented only by the diameter of each circle, not by its area. This is a classic example of what VDQI describes as one-dimensional data represented by two-dimensional objects. Unfortunately, I was told that out of everyone who saw this representation, no one else had realised that the data were misrepresented. Needless to say, the corrected version was less impressive (although still rather depressing).
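To show how much a diameter-scaled circle exaggerates a change, here is a minimal Python sketch. The market capitalisation figures (100 falling to 25) are made up for illustration and are not taken from the actual chart; the function names are likewise hypothetical.

```python
import math


def circle_area(diameter):
    """Area of a circle with the given diameter."""
    return math.pi * (diameter / 2) ** 2


def diameter_for_area_encoding(value):
    """Diameter that makes the circle's area equal to the value."""
    return 2 * math.sqrt(value / math.pi)


# Hypothetical figures: a bank's market cap falls from 100 to 25.
before, after = 100.0, 25.0

# Diameter proportional to value: the area shrinks with the square.
misleading = circle_area(after) / circle_area(before)

# Area proportional to value: the area shrinks in step with the data.
corrected = (circle_area(diameter_for_area_encoding(after))
             / circle_area(diameter_for_area_encoding(before)))

print(round(misleading, 4))  # 0.0625
print(round(corrected, 4))   # 0.25
```

With these made-up numbers the data falls to a quarter of its starting value, but the diameter-scaled circle shrinks to a sixteenth of its starting area.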