Monday, May 21, 2012

National Registry of Exonerations charts with R

According to recent news, a public national database of wrongful convictions has just been released.  The public list includes plenty of details, such as age, race, and how each conviction was overturned.  According to the database, it seems that most of the convictions were overturned due to DNA evidence.

I thought it would be interesting to plot summaries of these details using R, the open source statistical computing environment.  The following are the plots from the National Registry of Exonerations database.

Here is the R code used to create the above pie charts.

# National Registry of Exonerations
# pie charts


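library(XML)  # provides readHTMLTable()

# page that holds the registry table
u <- ""

# read every HTML table on the page into a list of data frames
listu <- readHTMLTable(u)

# the seventh table is the registry: row 4 has the column names,
# and the records start at row 24
exondf <- listu[[7]]
data <- exondf[24:nrow(exondf),]
names(data) <- as.character(unlist(exondf[4,]))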

# transform data
data$Age <- droplevels(data$Age)
data$Race <- droplevels(data$Race)
data$State <- droplevels(data$State)
data$Crime <- droplevels(data$Crime)
data$Sentence <- droplevels(data$Sentence)
data$Convicted <- droplevels(data$Convicted)
data$Exonerated <- droplevels(data$Exonerated)

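# convert factors to numbers via as.character();
# as.numeric() alone would return the factor level codes
data$AgeCNV <- as.numeric(as.character(data$Age))
data$ConvictedCNV <- as.numeric(as.character(data$Convicted))
data$ExoneratedCNV <- as.numeric(as.character(data$Exonerated))

# bin ages by decade; bin years between conviction and exoneration in 5-year bands
data$AgeCNV_floor <- floor(data$AgeCNV/10)*10
data$ConfinedYrs <- data$ExoneratedCNV - data$ConvictedCNV
data$ConfinedYrs_floor <- floor(data$ConfinedYrs/5)*5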

# plot pie charts

LABELS <- c("10-19","20-29","30-39","40-49","50-59","60-69","")
pie(table(data$AgeCNV_floor), labels=LABELS, main="Age Exonerated")

pie(table(data$Race), main="Race")

pie(tail(sort(table(data$State)),10), main="Top 10 States")

LABELS <- c("0-4","5-9","10-14","15-19","20-24","25-29","30-34","35+")
pie(table(data$ConfinedYrs_floor), labels=LABELS, main="Years Confined")
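
The conversion and binning steps in the script are easier to see on a small example.  The following is only a sketch: the toy data frame holds made-up values standing in for the scraped table, and bin_labels is a hypothetical helper that is not part of the original script.  It shows why the factors go through as.character() first, and how floor(x/width)*width assigns each value to the lower edge of its bin.

```r
# Toy stand-in for three scraped columns; the values are made up for illustration.
toy <- data.frame(
  Age        = factor(c("17", "23", "35", "41", "29")),
  Convicted  = factor(c("1990", "1985", "2001", "1995", "1979")),
  Exonerated = factor(c("2004", "2010", "2009", "1998", "2011"))
)

# as.numeric() on a factor returns the internal level codes, not the values,
# so the conversion goes through as.character() first.
codes <- as.numeric(toy$Age)
ages  <- as.numeric(as.character(toy$Age))

# floor(x / width) * width maps each value to the lower edge of its bin.
age_bin <- floor(ages / 10) * 10

years <- as.numeric(as.character(toy$Exonerated)) -
         as.numeric(as.character(toy$Convicted))
years_bin <- floor(years / 5) * 5

# Hypothetical helper: build "lo-hi" labels from the bins that actually occur,
# in the same sorted order that table() uses for numeric values.
bin_labels <- function(bins, width) {
  lo <- sort(unique(bins))
  paste(lo, lo + width - 1, sep = "-")
}

print(codes)
print(ages)
print(table(age_bin))
print(bin_labels(age_bin, 10))
print(bin_labels(years_bin, 5))
```

With the real data, a call such as pie(table(data$AgeCNV_floor), labels=bin_labels(data$AgeCNV_floor, 10)) would keep the labels in step with the bins that actually occur, instead of relying on a hard-coded LABELS vector of the right length.  It would not reproduce an open-ended label such as "35+".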