odds ratios, graphing positive and negative associations together
January 11, 2004 | Marina Counter
9 Comment(s)
How do you graph, on one chart, the results of discrete choice logistic regression in which there are positive and negative associations (odds ratios above and below one) for different categories of different variables? Excel, SAS, SPSS and SUDAAN don’t seem to offer anything.
I have been doing it by creating a bar chart with the origin at 1, transforming the odds ratios and relabeling the grid so that the magnitudes display correctly, i.e. 0.2, 0.33, 1.0, 3.0, 5.0 appear symmetrical about 1.
Any suggestions?
Sounds like a job for any competent econometrics or biometrics package.
Plotting the odds ratio on a log scale is a nice way to retain the symmetry of ratios above and below 1, and can be accomplished in any of those packages.
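To illustrate why the log scale keeps things symmetrical, here is a minimal Python sketch (standard library only). The odds ratios are the example values from the question; the tick labels are just one hypothetical choice for relabeling the axis.

```python
import math

# Example odds ratios from the question above (illustrative values only).
odds_ratios = [0.2, 0.33, 1.0, 3.0, 5.0]

# On a log scale an odds ratio of 1 maps to 0, and reciprocal ratios
# (such as 0.2 and 5.0) map to bars of equal length in opposite directions.
bar_heights = [math.log(r) for r in odds_ratios]

# Hypothetical tick labels: positions are in log units, but the labels
# shown to the reader are the original odds ratios.
tick_labels = [0.2, 0.5, 1, 2, 5]
tick_positions = [math.log(t) for t in tick_labels]

for ratio, height in zip(odds_ratios, bar_heights):
    print(f"OR = {ratio:>4}: bar height = {height:+.3f}")
```

Drawing the bars at `bar_heights` and labeling the axis at `tick_positions` with `tick_labels` gives the same picture as setting a log-scaled axis directly in a plotting package.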
SAS can do this: Plot 95% confidence bounds vertically, with the point estimate, as a HiLo plot. Use a log-scale vertical axis, and include at a minimum a horizontal reference line at y=1. If you wish, add additional reference lines for clinically significant (as opposed to statistically significant) odds ratios.
Let me fix the above post:
Compute the 95% confidence bounds for the natural log of your odds ratio.
Use a hilo plot in SAS.
Use a vertical log scaled axis (be sure you specify log base e).
Include a reference line (horizontal) for y=0, which corresponds to log(1), equal risk for both classes of exposure.
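For the confidence bounds themselves, here is a minimal Python sketch, assuming you already have a logistic regression coefficient and its standard error from your package. The function name `odds_ratio_ci` and the numbers are hypothetical. The interval is symmetric around the coefficient on the log scale; exponentiating gives bounds on the odds ratio scale, where the reference line sits at 1 rather than 0.

```python
import math

def odds_ratio_ci(b, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound).

    b  : logistic regression coefficient (the log odds ratio)
    se : standard error of b
    z  : normal quantile; 1.96 gives an approximate 95% interval
    """
    log_lower = b - z * se   # bounds are symmetric on the log scale
    log_upper = b + z * se
    return math.exp(b), math.exp(log_lower), math.exp(log_upper)

# Hypothetical inputs: an odds ratio of 1.5 with a standard error of 0.2.
or_, lower, upper = odds_ratio_ci(b=math.log(1.5), se=0.2)
print(f"OR = {or_:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

If the plot shows the log odds ratios themselves on a linear axis, use `b` and the two log-scale bounds directly and put the reference line at 0.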
Does anyone have ideas about how one might graphically show the relationship between a
single continuous predictor variable and the probability of a categorical outcome variable
with 3 levels?
If it had only two levels, a “logical” approach would be the logistic function. However, adding
that 3rd level (done with multinomial logistic regression) has me stumped. Any thoughts?
Thanks. -Erick
Odds ratios…interesting summary numbers.
The odds ratio ranges from 0 to positive infinity, with 1.0 indicating equal odds. The problem I have with interpreting the odds ratio is that the magnitude of odds can give the perception of huge differences in likelihood of the outcome, given the predictor.
I prefer to transform the odds ratio into a probability statement. Remember, in logistic regression we model prob(Y|X) with the logistic function, exp(x) / (1 + exp(x)), which takes on the range 0 to 1. To transform an odds ratio into a probability, simply calculate:
p = proportion in positive category on dependent variable
and
q = 1 – p.
Then, calculate pqb, where p and q are defined above and b is logistic regression coefficient (not odds ratio). Now, pqb is approximately the first derivative of the logistic function evaluated at the mean of the dependent variable.
SO, as an example…
Using an odds ratio, you might state that the odds of being diagnosed with HIV are 1.5 times as high for males as for females (please, these are hypotheticals). Calculating pqb, you could transform the odds ratio to make the alternative statement, approximately equivalent in meaning, that the probability of diagnosis with HIV is .05 higher for males than females. If you are a relative frequentist, you might report that, all things being equal, males have a rate of diagnosis 5 percentage points higher than females. Of course, confidence intervals help us to understand precision of all point estimates and are easily calculated.
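A minimal Python sketch of the pqb calculation, for anyone who wants to try it. The overall proportion `p = 0.15` is a made-up value chosen only for illustration, and the exact difference treats `p` as the baseline group's probability, which is also an assumption.

```python
import math

def logistic(x):
    """Logistic function: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def pqb(p, b):
    """Approximate change in probability: p * (1 - p) * b."""
    return p * (1.0 - p) * b

p = 0.15             # hypothetical proportion in the positive category
b = math.log(1.5)    # coefficient corresponding to an odds ratio of 1.5

approx = pqb(p, b)

# Exact difference for comparison, assuming p is the baseline probability:
# shift the baseline log-odds by b and convert back to a probability.
baseline_log_odds = math.log(p / (1.0 - p))
exact = logistic(baseline_log_odds + b) - p

print(f"pqb approximation: {approx:.3f}")
print(f"exact difference:  {exact:.3f}")
```

The two numbers are close but not identical, since pqb is a slope at a single point rather than the difference between two points on the curve.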
I think probability statements are more easily consumed than odds ratios. Just a personal preference.
Thanks for that feedback.
Actually, I am working with probabilities. (It may have been misleading that I entered the
question under the topic that included the term “odds ratios”, but that topic was the only
hit I got with the search term “logistic regression”.)
Anyway, my predictor variable is continuous, and my outcome variables, call them y-1 and
y-2, range from 0% to 100%, as you’ve suggested. Graphing x against y-1 gives one
sigmoidal curve (typical of the logistic function), while graphing x against y-2 gives
another sigmoid curve.
I suppose I could just have the two sigmoid curves in a single figure. So, at a given value
of x, one could read off that there’s, say, a 30% probability of outcome y-1 and a 45%
probability of y-2 (maybe they’re supposed to add to 100% — I don’t know).
Maybe what bothers me about the above idea is that I’d be plotting each value of x twice,
once against each of the two y’s. Just brainstorming, I wonder about something analogous
to the trilinear (aka triangular) plot, which allows you to plot x vs y vs z, all in a single
point. But I have no idea what this would look like for a logistic function. Maybe it would
be a mess.
Any thoughts on this or other ideas on how else one might present such data graphically?
Thanks.
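On whether the curves should add to 100%: under a multinomial logit model the predicted probabilities for all three outcome levels sum to 1 at every value of x, so one option is to draw all three curves in a single figure. Here is a minimal Python sketch, with made-up intercepts and slopes and a hypothetical function name `multinomial_probs`, that computes the three probabilities over a range of x.

```python
import math

def multinomial_probs(x, coefs):
    """Predicted probabilities under a multinomial logit model.

    coefs : list of (intercept, slope) pairs, one per non-reference outcome.
    Returns [P(reference), P(outcome 1), P(outcome 2), ...].
    """
    # The reference category has a linear predictor fixed at 0.
    scores = [0.0] + [a + b * x for a, b in coefs]
    top = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical coefficients for outcomes y-1 and y-2 versus the reference level.
coefs = [(-1.0, 0.5), (-3.0, 1.0)]

for x in range(0, 9, 2):
    p_ref, p1, p2 = multinomial_probs(x, coefs)
    print(f"x = {x}: reference = {p_ref:.2f}, y-1 = {p1:.2f}, y-2 = {p2:.2f}")
```

Plotting the three columns against x as lines, or as a stacked area chart that always fills to 100%, shows each value of x once while still conveying all three outcomes.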
Would any of the kindly contributors be able to help me learn how to create this plot in Stata, as CJ Alverson has done
for SAS above?
Gratefully,
Marlow Macht
I’ve had luck drawing these sorts of graphs in Stata using a combination of overlaid “twoway” plots. First, create a
dataset where each observation has a point estimate, upper bound, and lower bound. Assign an indicator to each
variable corresponding to your choice of horizontal or vertical scaling (e.g., _n). Then use “rspike” for the 95%
confidence interval and “scatter” for the point estimate.
In Stata syntax, this looks something like: twoway (scatter n estimate) (rspike upper lower n, horizontal), xline(1)