Saturday, December 15, 2012

OpenOpt Suite release 0.43


I'm glad to announce the new OpenOpt release 0.43 (2012-Dec-15):

    * interalg can now solve SNLE (systems of nonlinear equations) in a second mode (parameter dataHandling = "raw"; previously only "sorted" was available)
    * Many other improvements for interalg
    * Some improvements for FuncDesigner kernel
    * FuncDesigner ODE now takes 3 arguments instead of 4 (backward incompatibility!); the time grid is now given as, e.g., {t: np.linspace(0,1,100)}, or simply np.linspace(0,1,100) if the right-hand side of your ODE is time-independent
    * FuncDesigner stochastic add-on can now handle some problems with gradient-based NLP / NSP solvers
    * Many minor improvements and some bugfixes

Visit openopt.org for more details.

Regards, D.


Tuesday, September 25, 2012

Day in the life of a Data Scientist

A great read from the Decomposition blog about a day in the life of a Data Scientist.  I consider myself a Data Scientist by any other name.  The blog article by Sean does a great job of breaking down the essence of making better decisions for the organization you may be involved with.

I've always thought asking good questions is the start of good analysis.  An organization basically doesn't know what it doesn't know.  A good Data Scientist is like a sleuth looking for clues.  In all honesty, that may be the most fun part of being a Data Scientist.

Saturday, September 15, 2012

OpenOpt Suite release 0.42


Hi all,

I'm glad to announce the new OpenOpt Suite release 0.42 (2012-Sept-15), a free, cross-platform software suite written in Python with a primary focus on numerical optimization. Main changes:

*    Some improvements for solver interalg, including handling of categorical variables
*    Some parameters for solver gsubg
*    Faster objective function evaluation for solvers de and pswarm on FuncDesigner models
*    New global (GLP) solver: asa (adaptive simulated annealing)
*    Some new classes for network problems: TSP (traveling salesman problem), STAB (maximum graph stable set), MCP (maximum clique problem)
*    Improvements for FD XOR (and now it can handle many inputs)
*    Solver de has parameter "seed", also, now it works with PyPy
*    Function sign now is available in FuncDesigner
*    FuncDesigner interval analysis (and thus solver interalg) now can handle non-monotone splines of 1st order
*    FuncDesigner now can handle parameter fixedVars as Python dict
*    scipy InterpolatedUnivariateSpline is now used in FuncDesigner interpolator() instead of UnivariateSpline. This creates a backward incompatibility: you can no longer pass the smoothing parameter (s) to interpolator.
*    SpaceFuncs: add Point weight, Disk, Ball and method contains(), bugfix for importing Sphere, some new examples
*    Some improvements (substantial speedup, new parameter interpolate for P()) for our (currently commercial) FuncDesigner Stochastic Programming add-on
*    Some bugfixes
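
To illustrate the interpolator() change listed above, here is a small sketch that uses SciPy directly rather than FuncDesigner. The sample data and the smoothing value s=5.0 are made up for illustration. InterpolatedUnivariateSpline passes through every data point and has no s argument, while UnivariateSpline accepts s and may smooth the data instead of reproducing it.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline, UnivariateSpline

# Made-up sample data (not from OpenOpt): a sine curve with a deterministic wiggle.
x = np.linspace(0.0, 10.0, 25)
y = np.sin(x) + 0.1 * np.cos(7.0 * x)

# Interpolating spline: goes through every (x, y) pair; there is no "s" argument.
exact = InterpolatedUnivariateSpline(x, y, k=3)

# Smoothing spline: "s" limits the sum of squared residuals, so the curve
# is allowed to miss the data points.
smooth = UnivariateSpline(x, y, k=3, s=5.0)

max_err_exact = float(np.max(np.abs(exact(x) - y)))
max_err_smooth = float(np.max(np.abs(smooth(x) - y)))
print("max error at the data points, interpolating:", max_err_exact)
print("max error at the data points, smoothing:    ", max_err_smooth)
```

Code that relied on smoothing through interpolator() would presumably have to smooth the data separately before passing it in.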

On our website ( http://openopt.org ) you can vote for the most needed OpenOpt Suite development direction(s) (the poll has been renewed; previous results are here).

Regards, D.

Monday, September 10, 2012

Upgrade your skill sets with free courses

We are in the midst of the Insight Age.  We have moved beyond capturing data and are now processing information.  Properly processing large amounts of data requires knowledge and skill.  Fortunately there are many ways to develop those skills.

Class Central is a website that provides a complete list of free online courses from some of the most established and prestigious universities in the world.  Websites like these are helping to make the world smaller by providing free and accessible learning resources.

I am a big fan of open courseware.  There are plenty of other places to look for open courses.  The Open Courseware Consortium is a useful resource.  A good metasearch site like OpenCourseWare Finder is valuable as well.

Monday, July 2, 2012

Popularity of R continues

No doubt those who read my blog know that the tools I use for my Industrial Engineering and Operations Research work rely heavily on open source software.  That is why I try to support as many open source projects as I can, such as COIN-OR, GLPK, and OpenOpt.  One tool that I love to use for applied math and statistics is the statistical computing platform R.  So it comes as no surprise that I like to follow how R is growing and how popular it is among programmers.

A recent blog post from RedMonk presented the results of a programming language popularity study.  The study ranked popularity using common online sites such as Stack Overflow and GitHub.  These sites draw in a lot of programmers because of their popularity for Q&A and code review.  I was surprised to see that R ranks highly compared to some very prominent programming languages.


It is also interesting to note that the only other "Data Science" type of programming language I could find was Matlab.  As far as I can tell, SAS, S, SPSS, and Stata are still rather popular, but apparently not among the programming community.

Friday, June 15, 2012

OpenOpt Suite 0.39

Hi all,

I'm glad to inform you about new OpenOpt release 0.39 (quarterly since 2007).

OpenOpt is free (even for commercial purposes), cross-platform software for mathematical modeling and (mainstream) optimization. Our website has reached 259 visitors daily, which is about the same as tomopt.com and ~1/3 of gams.com ( details ).

In the new release:
  • interalg (medium-scale solver with specifiable accuracy abs(f-f*) <= fTol): added categorical variables and general logical constraints, plus many other improvements
  • Some improvements for automatic differentiation
  • DerApproximator and some OpenOpt/FuncDesigner functionality now work with PyPy (Python with dynamic compilation; some problems are now solved several times faster)
  • New solver lsmr for dense/sparse LLSP (linear least squares)
  • Some bugfixes and some other changes
On our website (openopt.org) you can vote for the most needed OpenOpt Suite development direction(s).
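
To give a feel for what a guarantee like abs(f-f*) <= fTol means, here is a toy sketch of interval branch-and-bound on a one-dimensional function. This is not interalg's algorithm or code; the objective, the helper names, and the search interval are all hypothetical, and the interval arithmetic uses plain floats without outward rounding, so the bounds hold only up to rounding error.

```python
import heapq

def imul(a, b):
    """Product of two intervals given as (low, high) tuples."""
    products = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(products), max(products))

def isub(a, b):
    """Difference a - b of two intervals."""
    return (a[0] - b[1], a[1] - b[0])

def f(x):
    """Toy objective (hypothetical): the minimum f* = -1 is at x = 1."""
    return x * x - 2.0 * x

def f_bounds(box):
    """Interval enclosure of f over box (plain floats, no outward rounding)."""
    return isub(imul(box, box), imul((2.0, 2.0), box))

def minimize(lo, hi, ftol):
    """Return (best, lower) with lower <= f* <= best and best - lower <= ftol."""
    best = min(f(lo), f(hi))                   # best function value seen so far
    heap = [(f_bounds((lo, hi))[0], lo, hi)]   # boxes ordered by their lower bound
    while heap:
        lower, a, b = heapq.heappop(heap)      # box with the smallest lower bound
        if best - lower <= ftol:               # gap is certified small enough: stop
            return best, min(lower, best)
        mid = 0.5 * (a + b)
        best = min(best, f(mid))
        for box in ((a, mid), (mid, b)):
            box_lower = f_bounds(box)[0]
            if box_lower <= best:              # otherwise the box cannot beat best
                heapq.heappush(heap, (box_lower, box[0], box[1]))
    return best, best

best, lower = minimize(-1.5, 3.0, 1e-6)
print("best value found:", best)
print("certified lower bound:", lower)
```

The point of the sketch is that the stopping rule compares the best value found with a proven lower bound, so the reported accuracy is a certificate rather than a heuristic.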

Monday, May 21, 2012

National Registry of Exonerations charts with R

According to recent news (dallasnews.com) there is a new release of a public national database for wrongful convictions.  There are plenty of details in the public list including Age, Race, and how the conviction was overturned.  According to the database it seems that most of the convictions were overturned due to DNA evidence.

I thought it would be interesting to plot summaries of the details using the open source statistical computing environment R Project.  The following are the plots from the National Registry of Exonerations database.



Here is the R code used to create the above pie charts.


# National Registry of Exonerations
# pie charts

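library(XML)  # provides readHTMLTable()

u <- "http://www.law.umich.edu/special/exoneration/Pages/detaillist.aspx"

# parse every HTML table on the page into a list of data frames
listu <- readHTMLTable(u)

# the indices below depend on the page layout when this was written:
# table 7 holds the list, row 4 the column names, and the records start at row 24
exondf <- listu[[7]]
data <- exondf[24:nrow(exondf),]
names(data) <- as.character(unlist(exondf[4,]))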

# transform data
data$Age <- droplevels(data$Age)
data$Race <- droplevels(data$Race)
data$State <- droplevels(data$State)
data$Crime <- droplevels(data$Crime)
data$Sentence <- droplevels(data$Sentence)
data$Convicted <- droplevels(data$Convicted)
data$Exonerated <- droplevels(data$Exonerated)

# go through character so the factor values, not the level codes, become numbers
data$AgeCNV <- as.numeric(as.character(data$Age))
data$ConvictedCNV <- as.numeric(as.character(data$Convicted))
data$ExoneratedCNV <- as.numeric(as.character(data$Exonerated))

# bin ages into decades and years confined into 5-year groups
data$AgeCNV_floor <- floor(data$AgeCNV/10)*10
data$ConfinedYrs <- data$ExoneratedCNV - data$ConvictedCNV
data$ConfinedYrs_floor <- floor(data$ConfinedYrs/5)*5

# plot pie charts

LABELS <- c("10-19","20-29","30-39","40-49","50-59","60-69","")
pie(table(data$AgeCNV_floor), labels=LABELS, main="Age Exonerated")

pie(table(data$Race), main="Race")

pie(tail(sort(table(data$State)),10), main="Top 10 States")

LABELS <- c("0-4","5-9","10-14","15-19","20-24","25-29","30-34","35+")
pie(table(data$ConfinedYrs_floor), labels=LABELS, main="Years Confined")
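
The binning step in the code above, floor(x/width)*width, is easy to check on a few numbers. Here is a minimal sketch of the same arithmetic in Python. The ages are made up and not taken from the registry, and floor_bin and bin_label are hypothetical helper names.

```python
import math
from collections import Counter

def floor_bin(value, width):
    """Lower edge of the bin containing value: floor(value / width) * width."""
    return math.floor(value / width) * width

def bin_label(lower_edge, width):
    """Label such as '20-29' for the bin that starts at lower_edge."""
    return "%d-%d" % (lower_edge, lower_edge + width - 1)

# Made-up ages for illustration only (not taken from the registry).
ages = [17, 22, 25, 31, 38, 39, 44, 61]

# Count how many ages fall into each decade.
counts = Counter(floor_bin(age, 10) for age in ages)

# Derive the labels from the bins that actually occur in the data.
labels = {edge: bin_label(edge, 10) for edge in sorted(counts)}

for edge in sorted(counts):
    print(labels[edge], counts[edge])
```

Deriving the labels from the bins that actually occur keeps labels and counts aligned if a bin turns out to be empty, which a hand-written label vector cannot guarantee.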