R is my tool of choice for data visualization. My audience was a general one, so I didn't want to use boxplots or other density-style visualizations. I wanted a simple mean with a 95% interval around it, which I approximate as the mean plus or minus 2 standard deviations. (Strictly speaking, that band covers roughly 95% of the values when the data are approximately normal; it is not a confidence interval for the mean itself, although I call it a confidence interval throughout this post.) My method of choice was the dotchart function, but that function only shows the data points and not the dispersion of the data, so I needed to layer in the confidence intervals myself.
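To make the "roughly 2 standard deviations" shortcut concrete, here is a minimal sketch that computes two intervals for each cylinder group in mtcars: the mean plus or minus 2 SD band used in this post, and a t-based 95% confidence interval for the mean. The t-based interval is a textbook alternative shown only for comparison, not what the plot in this post draws, and the variable names (grp_mean, band, ci and so on) are purely illustrative.

```r
# Per-group summary statistics for mpg by number of cylinders
grp_mean <- tapply(mtcars$mpg, mtcars$cyl, mean)
grp_sd   <- tapply(mtcars$mpg, mtcars$cyl, sd)
grp_n    <- tapply(mtcars$mpg, mtcars$cyl, length)

# Band used in this post: mean +/- 2 standard deviations (spread of the data)
band <- cbind(LL = grp_mean - 2 * grp_sd, UL = grp_mean + 2 * grp_sd)

# Textbook alternative (an assumption, not this post's method):
# t-based 95% confidence interval for the mean, built from the standard error
se     <- grp_sd / sqrt(grp_n)
t_crit <- qt(0.975, df = grp_n - 1)
ci     <- cbind(LL = grp_mean - t_crit * se, UL = grp_mean + t_crit * se)

print(round(band, 2))
print(round(ci, 2))
```

The t-based interval comes out narrower because it describes the uncertainty in the mean, while the 2 SD band describes how spread out the individual cars are.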
The great thing about R is that plots can be built up in layers: I can create a plot and then add to it as I see fit. This holds for most of the base plotting functions. I knew that I could use the lines function to add lines to an existing plot. This approach worked well for my simple plot and adds another tool to my R toolbox.
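As a small illustration of that layering idea, here is a sketch that starts with a plain scatterplot of mtcars and then adds to it with lines and points. The choice of variables and colors is just for demonstration.

```r
# Start with a base plot: one point per car
plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "MPG")

# Layer 1: a horizontal line at the overall mean mpg, drawn with lines()
overall_mean <- mean(mtcars$mpg)
lines(x = range(mtcars$wt), y = c(overall_mean, overall_mean), col = "blue")

# Layer 2: highlight the 4-cylinder cars on top of the existing points
four_cyl <- mtcars$cyl == 4
points(mtcars$wt[four_cyl], mtcars$mpg[four_cyl], pch = 19, col = "red")
```

Each call after plot() draws onto the same open plot, which is exactly what the confidence interval lines rely on.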
Here is an example R script that draws a dotchart with confidence intervals, using the "mtcars" dataset that ships with every R installation.
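### Compute the mean and standard deviation of mpg for each cylinder count
x <- data.frame(mean=tapply(mtcars$mpg, list(mtcars$cyl), mean), sd=tapply(mtcars$mpg, list(mtcars$cyl), sd) )
### Add lower and upper levels of confidence intervals (approximated as mean +/- 2 standard deviations)
x$LL <- x$mean-2*x$sd
x$UL <- x$mean+2*x$sd
### plot dotchart with confidence intervals, labelling each row with its cylinder count
title <- "MPG by Num. of Cylinders with 95% Confidence Intervals"
dotchart(x$mean, labels=rownames(x), col="blue", xlim=c(floor(min(x$LL)/10)*10, ceiling(max(x$UL)/10)*10), main=title )
### Draw one horizontal line per row; an ungrouped dotchart places row i at y = i
for (i in seq_len(nrow(x))){
  lines(x=c(x$LL[i],x$UL[i]), y=c(i,i))
}
grid()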
And here is the example of the finished product.
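The for loop can also be replaced with a single vectorized call. The sketch below is one possible variation, not the script that produced the plot in this post: it rebuilds the same summary table with explicit row names and uses arrows() with flat end caps in place of lines(). The plot title and the helper variable y are illustrative choices.

```r
# Same summary table as in the script above, with explicit row names
x <- data.frame(mean = as.vector(tapply(mtcars$mpg, mtcars$cyl, mean)),
                sd   = as.vector(tapply(mtcars$mpg, mtcars$cyl, sd)),
                row.names = as.character(sort(unique(mtcars$cyl))))
x$LL <- x$mean - 2 * x$sd
x$UL <- x$mean + 2 * x$sd

dotchart(x$mean, labels = rownames(x), col = "blue",
         xlim = c(floor(min(x$LL) / 10) * 10, ceiling(max(x$UL) / 10) * 10),
         main = "MPG by Num. of Cylinders (mean +/- 2 SD)")

# Row i of an ungrouped dotchart sits at y = i, so one vectorized call
# draws every interval; angle = 90 and code = 3 add flat caps at both ends
y <- seq_len(nrow(x))
arrows(x0 = x$LL, y0 = y, x1 = x$UL, y1 = y, angle = 90, code = 3, length = 0.05)
```

The end caps make it a little easier for a general audience to see where each interval stops.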
2 comments:
I have blogged on something similar using ggplot2 (http://goo.gl/FLRnO), and Matt Shotwell blogged about doing this using lattice (http://goo.gl/JBlyy).
Abhijit
Thanks Abhijit for those additional resources.