Sunday, May 15, 2011

R Tutorial: Add confidence intervals to dotchart

Recently I was working on a data visualization project.  I wanted to visualize summary statistics by category of the data.  Specifically I wanted to see a simple dispersion of data with confidence intervals for each category of data. 

R is my tool of choice for data visualization.  My audience was a general audience so I didn't want to use boxplots or other density types of visualization methods.  I wanted a simple mean and 95% (~ roughly 2 standard deviations) confidence around the mean.  My method of choice was to use the dotchart function.  Yet that function is limited to showing the data points and not the dispersion of the data.  So I needed to layer in the confidence intervals. 

The great thing about R is that the functions and objects are pretty much layered.  I can create one R object and add to it as I see fit.  This is mainly true with most plotting functions in R.  I knew that I could use the lines function to add lines to an existing plot.  This method worked great for my simplistic plot and adds another tool to my R toolbox.

Here is the example dotchart with confidence intervals R script using the "mtcars" dataset that is provided with any R installation.


### Create data frame with mean and std dev
x <- data.frame(mean=tapply(mtcars$mpg, list(mtcars$cyl), mean), sd=tapply(mtcars$mpg, list(mtcars$cyl), sd) )

###  Add lower and upper levels of confidence intervals
x$LL <- x$mean-2*x$sd
x$UL <- x$mean+2*x$sd

### plot dotchart with confidence intervals

title <- "MPG by Num. of Cylinders with 95% Confidence Intervals"

dotchart(x$mean, col="blue", xlim=c(floor(min(x$LL)/10)*10, ceiling(max(x$UL)/10)*10), main=title )

for (i in 1:nrow(x)){
    lines(x=c(x$LL[i],x$UL[i]), y=c(i,i))
}
grid()






And here is the example of the finished product.

2 comments:

Abhijit said...

I have blogged on something similar using ggplot2 (http://goo.gl/FLRnO), and Matt Shotwell blogged about doing this using lattice (http://goo.gl/JBlyy).

Abhijit

Larry said...

Thanks Abhijit for those additional resources.