R is my tool of choice for data visualization. My audience was a general one, so I didn't want to use boxplots or other density-style visualizations. I wanted a simple mean with a 95% interval around it (roughly 2 standard deviations on either side, assuming the data are approximately normal). My method of choice was the *dotchart* function. Yet that function only shows the data points, not the dispersion of the data, so I needed to layer in the confidence intervals.

The great thing about R is that plots are built up in layers: I can create one plot and add to it as I see fit. This is true of most base plotting functions in R. I knew that I could use the *lines* function to add lines to an existing plot. This method worked great for my simple plot and adds another tool to my R toolbox.
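As a minimal sketch of that layering idea: a high-level call such as `dotchart` starts the plot, and low-level calls such as `abline` or `lines` draw on top of it. The variable name `cyl_means` and the dashed reference line are my own illustration, not part of the script below.

```r
# Mean mpg per cylinder count, from the built-in mtcars data
cyl_means <- tapply(mtcars$mpg, mtcars$cyl, mean)

# High-level call: starts a new plot
dotchart(as.numeric(cyl_means), labels = names(cyl_means),
         xlab = "Mean MPG", ylab = "Cylinders")

# Low-level call: adds a dashed vertical line at the overall mean
# to the plot that already exists
abline(v = mean(mtcars$mpg), lty = 2)
```

Any number of low-level calls can follow, each one drawing over what is already there.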

Here is an example R script that draws a dotchart with confidence intervals, using the "mtcars" dataset that ships with every R installation.

x <- data.frame(mean=tapply(mtcars$mpg, list(mtcars$cyl), mean), sd=tapply(mtcars$mpg, list(mtcars$cyl), sd) )

### Add lower and upper levels of confidence intervals

x$LL <- x$mean-2*x$sd

x$UL <- x$mean+2*x$sd

### plot dotchart with confidence intervals

title <- "MPG by Num. of Cylinders with 95% Confidence Intervals"

dotchart(x$mean, col="blue", xlim=c(floor(min(x$LL)/10)*10, ceiling(max(x$UL)/10)*10), main=title )

for (i in 1:nrow(x)){

lines(x=c(x$LL[i],x$UL[i]), y=c(i,i))

}

grid()
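The loop above can probably be replaced with a single vectorized call. The following is a sketch of one way to do it, assuming `segments` is an acceptable substitute for `lines`; the data frame name `s`, the axis labels, and the filled point symbol are my own choices rather than the original script's. It also passes the row names as labels so that each dot is tagged with its cylinder count.

```r
# Summary statistics per cylinder count (same numbers as the script above)
mpg_by_cyl <- split(mtcars$mpg, mtcars$cyl)
s <- data.frame(mean = sapply(mpg_by_cyl, mean),
                sd   = sapply(mpg_by_cyl, sd))
s$LL <- s$mean - 2 * s$sd
s$UL <- s$mean + 2 * s$sd

# Dotchart with the cylinder counts as row labels
dotchart(s$mean, labels = rownames(s), color = "blue", pch = 19,
         xlim = range(s$LL, s$UL), xlab = "MPG", ylab = "Cylinders",
         main = "Mean MPG by Num. of Cylinders (mean +/- 2 SD)")

# In an ungrouped dotchart the i-th value sits at y = i,
# so one segments() call draws every interval at once
rows <- seq_len(nrow(s))
segments(x0 = s$LL, y0 = rows, x1 = s$UL, y1 = rows)
```

The loop version is arguably easier to read for a handful of groups; the vectorized form mainly pays off when there are many rows.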

And here is the example of the finished product.
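One caveat on terminology: the mean plus or minus 2 standard deviations describes the spread of the individual observations. A confidence interval for the mean itself is usually built from the standard error instead. Here is a hedged sketch of that variant, assuming a t-based interval is appropriate for these small groups; the names `ci`, `se`, and `t_crit` are hypothetical and do not appear in the script above.

```r
# Per-group mean, standard deviation, and sample size
mpg_by_cyl <- split(mtcars$mpg, mtcars$cyl)
ci <- data.frame(mean = sapply(mpg_by_cyl, mean),
                 sd   = sapply(mpg_by_cyl, sd),
                 n    = sapply(mpg_by_cyl, length))

# Standard error of the mean and the two-sided 95% t critical value
ci$se  <- ci$sd / sqrt(ci$n)
t_crit <- qt(0.975, df = ci$n - 1)

# 95% confidence limits for each group mean
ci$LL <- ci$mean - t_crit * ci$se
ci$UL <- ci$mean + t_crit * ci$se
```

These limits are narrower than the 2 SD band, so swapping them into the plot changes what the picture says: how precisely each mean is estimated, rather than how widely the cars in each group vary.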

## 2 comments:

I have blogged on something similar using ggplot2 (http://goo.gl/FLRnO), and Matt Shotwell blogged about doing this using lattice (http://goo.gl/JBlyy).

Abhijit

Thanks Abhijit for those additional resources.
