Saturday, December 15, 2012

OpenOpt Suite release 0.43


I'm glad to inform you about new OpenOpt release 0.43 (2012-Dec-15):

    * interalg can now solve SNLE (systems of nonlinear equations) in a second mode (parameter dataHandling = "raw"; previously only "sorted" was available)
    * Many other improvements for interalg
    * Some improvements for FuncDesigner kernel
    * FuncDesigner ODE now has 3 arguments instead of 4 (backward incompatibility!), e.g. {t: np.linspace(0,1,100)} or simply np.linspace(0,1,100) if the right-hand side of your ODE is time-independent
    * FuncDesigner stochastic add-on can now handle some problems with gradient-based NLP / NSP solvers
    * Many minor improvements and some bugfixes

Visit openopt.org for more details.

Regards, D.


Tuesday, September 25, 2012

Day in the life of a Data Scientist

A great read from the Decomposition blog about a day in the life of a Data Scientist.  I consider myself a Data Scientist by any other name.  The blog article by Sean does a great job of breaking down the essence of making better decisions for the organization you may be involved with.

I've always thought asking good questions is the start of good analysis.  An organization basically doesn't know what it doesn't know.  A good Data Scientist will be like a sleuth looking for clues.  In all honesty, that may be the most fun part of being a Data Scientist.

Saturday, September 15, 2012

OpenOpt Suite release 0.42


Hi all,

I'm glad to inform you about the new OpenOpt Suite release 0.42 (2012-Sept-15), a free Python-written cross-platform software with a primary focus on numerical optimization. Main changes:

*    Some improvements for solver interalg, including handling of categorical variables
*    Some parameters for solver gsubg
*    Speedup objective function for de and pswarm on FuncDesigner models
*    New global (GLP) solver: asa (adaptive simulated annealing)
*    Some new classes for network problems: TSP (traveling salesman problem), STAB (maximum graph stable set), MCP (maximum clique problem)
*    Improvements for FD XOR (and now it can handle many inputs)
*    Solver de now has a parameter "seed" and also works with PyPy
*    Function sign now is available in FuncDesigner
*    FuncDesigner interval analysis (and thus solver interalg) now can handle non-monotone splines of 1st order
*    FuncDesigner now can handle parameter fixedVars as Python dict
*    scipy InterpolatedUnivariateSpline is now used in FuncDesigner interpolator() instead of UnivariateSpline. This creates a backward incompatibility: you can no longer pass the smoothing parameter (s) to interpolator.
*    SpaceFuncs: add Point weight, Disk, Ball and method contains(), bugfix for importing Sphere, some new examples
*    Some improvements (essential speedup, new parameter interpolate for P()) for our (currently commercial) FuncDesigner Stochastic Programming addon
*    Some bugfixes
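
The interpolator() change noted in the list above comes down to a difference between the two scipy classes. The following is a minimal sketch of that difference using scipy directly, not FuncDesigner's interpolator() itself; the sample data and variable names are made up for illustration.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline, UnivariateSpline

# Hypothetical sample data, for illustration only.
x = np.linspace(0.0, 2.0 * np.pi, 15)
y = np.sin(x)

# Interpolating spline: passes through every data point and has no smoothing knob.
exact = InterpolatedUnivariateSpline(x, y, k=3)

# Smoothing spline: s > 0 allows the curve to miss the data points.
smoothed = UnivariateSpline(x, y, k=3, s=0.5)

max_exact_error = float(np.max(np.abs(exact(x) - y)))
max_smoothed_error = float(np.max(np.abs(smoothed(x) - y)))

# InterpolatedUnivariateSpline takes no `s` argument, so passing one raises TypeError.
try:
    InterpolatedUnivariateSpline(x, y, s=0.5)
    accepts_s = True
except TypeError:
    accepts_s = False

print(max_exact_error, max_smoothed_error, accepts_s)
```

If you relied on smoothing before, calling scipy's UnivariateSpline yourself is one possible workaround; whether it fits into a FuncDesigner model is not covered here.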

On our website (http://openopt.org) you can vote for the most needed OpenOpt Suite development direction(s) (the poll has been renewed, previous results are here).

Regards, D.

Monday, September 10, 2012

Upgrade your skill sets with free courses

We are in the midst of the Insight Age.  We have moved beyond capturing data and are now processing information.  Properly processing large amounts of data requires knowledge and skill sets.  Fortunately there are many ways to develop those skills.

Class Central is a website that provides a complete list of free online courses from some of the most established and prestigious universities in the world.  Websites like these are helping to make the world smaller by providing free and accessible learning resources.

I am a big fan of open courseware.  There are plenty of other places to look for open courses.  The Open Courseware Consortium is a useful resource.  A good metasearch site like OpenCourseWare Finder is valuable as well.

Monday, July 2, 2012

Popularity of R continues

No doubt those who read my blog know that the tools I use for my Industrial Engineering and Operations Research work rely heavily on open source software.  That is why I try to support as many open source projects as I can, such as COIN-OR, GLPK, and OpenOpt.  One tool that I love for Applied Math and Statistics is the statistical computing platform R.  So it comes as no surprise that I like to see how R is growing and how popular it is among programmers.

A recent blog post from RedMonk presented the results of a programming language popularity study.  The study ranked popularity using common social media sites such as Stack Overflow and GitHub.  These sites draw in a lot of programmers because of their popularity for Q&A and code review.  I was surprised to see that R ranks highly compared to some very prominent programming languages.


Also interesting to note that the only other "Data Science" type of programming language I could find was Matlab.  As far as I could tell SAS, S, SPSS, Stata are still rather popular but apparently not among the programming community.

Friday, June 15, 2012

OpenOpt Suite 0.39

Hi all,

I'm glad to inform you about new OpenOpt release 0.39 (quarterly since 2007).

OpenOpt is free, even for commercial purposes, cross-platform software for mathematical modeling and (mainstream) optimization. Our website has reached 259 visitors daily, which is the same as tomopt.com and ~1/3 of gams.com (details).

In the new release:
  • interalg (medium-scaled solver with specifiable accuracy abs(f-f*) <= fTol): add categorical variables and general logical constraints, many other improvements
  • Some improvements for automatic differentiation
  • DerApproximator and some OpenOpt/FuncDesigner functionality now work with PyPy (Python with dynamic compilation, some problems are solved several times faster now)
  • New solver lsmr for dense/sparse LLSP (linear least squares)
  • Some bugfixes and some other changes
On our website (openopt.org) you can vote for the most needed OpenOpt Suite development direction(s).

Monday, May 21, 2012

National Registry of Exonerations charts with R

According to recent news (dallasnews.com) there is a new release of a public national database for wrongful convictions.  There are plenty of details in the public list including Age, Race, and how the conviction was overturned.  According to the database it seems that most of the convictions were overturned due to DNA evidence.

I thought it would be interesting to plot summaries of the details using the open source statistical computing environment R Project.  The following are the plots from the National Registry of Exonerations database.



Here is the R code used to create the above pie charts.


# National Registry of Exonerations
# pie charts

library(XML)

u <- "http://www.law.umich.edu/special/exoneration/Pages/detaillist.aspx"

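# read every HTML table on the page into a list of data frames
listu <- readHTMLTable(u)

# the indices below match the page layout at the time of writing:
# table 7 holds the list, row 4 the column names, rows 24+ the records
exondf <- listu[[7]]
data <- exondf[24:nrow(exondf),]
names(data) <- as.character(unlist(exondf[4,]))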

# transform data
data$Age <- droplevels(data$Age)
data$Race <- droplevels(data$Race)
data$State <- droplevels(data$State)
data$Crime <- droplevels(data$Crime)
data$Sentence <- droplevels(data$Sentence)
data$Convicted <- droplevels(data$Convicted)
data$Exonerated <- droplevels(data$Exonerated)

data$AgeCNV <- as.numeric(as.character(data$Age))
data$ConvictedCNV <- as.numeric(as.character(data$Convicted))
data$ExoneratedCNV <- as.numeric(as.character(data$Exonerated))

data$AgeCNV_floor <- floor(data$AgeCNV/10)*10
data$ConfinedYrs <- data$ExoneratedCNV - data$ConvictedCNV
data$ConfinedYrs_floor <- floor(data$ConfinedYrs/5)*5

# plot pie charts

LABELS <- c("10-19","20-29","30-39","40-49","50-59","60-69","")
pie(table(data$AgeCNV_floor), labels=LABELS, main="Age Exonerated")

pie(table(data$Race), main="Race")

pie(tail(sort(table(data$State)),10), main="Top 10 States")

LABELS <- c("0-4","5-9","10-14","15-19","20-24","25-29","30-34","35+")
pie(table(data$ConfinedYrs_floor), labels=LABELS, main="Years Confined")

Wednesday, April 4, 2012

Google Scholar Metrics

Google Scholar, Google's search engine for scholarly journals and publications, has a new way of tracking publication metrics.  Google Scholar Metrics for publications gives an indexed look at the publishers and shows which publishers are cited the most.  Google Scholar Metrics will allow the searcher to find publishers that are well respected in any given field of research.  As an example here are the Top 100 Google Scholar publishers.

So I naturally want to see what Google thinks of some of the fields that I find interesting and useful.

Operations Research
  1.  European Journal of Operational Research
  2.  Computers & Operations Research
  3.  Operations Research
  4.  Journal of the Operational Research Society
  5.  Annals of Operations Research
Machine Learning
  1.  The Journal of Machine Learning Research
  2.  Annual International Conference on Machine Learning
  3.  Machine Learning
  4.  European Conference on Machine learning and knowledge discovery in databases
  5.  International Conference on Machine Learning and Cybernetics
Applied Statistics
  1.  The Annals of Applied Statistics
  2.  Journal of the Royal Statistical Society: Series C (Applied Statistics)
  3.  Journal of Applied Statistics
  4.  QUALITY CONTROL AND APPLIED STATISTICS
  5.  International Journal of Applied Mathematics and Statistics
Management Science
  1.  Management Science
  2.  Pest Management Science (????)
  3.  Health Care Management Science
  4.  Conflict Management and Peace Science (????)
  5.  Computational Management Science



The Management Science category looks like it needs a lot of work.  I didn't know there were so many other forms of Management Science.  I guess that it is too generic of a term and the age old debate continues.

Thursday, March 15, 2012

OpenOpt Suite 0.38

I'm glad to inform you about the new OpenOpt Suite quarterly release 0.38, free (BSD license) Python-written software:

OpenOpt:

interalg can handle discrete variables (see MINLP for examples)
interalg can handle multiobjective problems (MOP)
interalg can handle problems with parameters fixedVars/freeVars
Many interalg improvements and some bugfixes
Add another EIG solver: numpy.linalg.eig
New LLSP solver pymls with box bounds handling
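
For readers who have not used numpy.linalg.eig, named above as the new EIG solver, here is a minimal sketch of calling it directly through NumPy. It does not go through OpenOpt's EIG class, and the matrix is a made-up example.

```python
import numpy as np

# Hypothetical 2x2 symmetric matrix, for illustration only.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns the eigenvalues and a matrix whose columns are the eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

# Column i of `eigenvectors` pairs with eigenvalues[i]: A v = lambda v.
residuals = [
    float(np.max(np.abs(A @ eigenvectors[:, i] - eigenvalues[i] * eigenvectors[:, i])))
    for i in range(len(eigenvalues))
]

print(eigenvalues, residuals)
```

Note that numpy.linalg.eig works on dense arrays and gives no guarantee about the order of the eigenvalues it returns.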

FuncDesigner:

Some improvements for sum()
Add funcs tanh, arctanh, arcsinh, arccosh
Can solve EIG built from derivatives of several functions, obtained by automatic differentiation by FuncDesigner

SpaceFuncs:

Add method point.symmetry(Point|Line|Plane)
Add method LineSegment.middle
Add method Point.rotate(Center, angle)

DerApproximator:

Minor changes

See also: FuturePlans.

Regards, D.

Thursday, February 2, 2012

R graphic used for Facebook IPO

Apparently a graphic of the Facebook social network graph by former Facebook intern Paul Butler is being used for Facebook's IPO.  The social network graphic is featured on Page 7 of the IPO filing.  His graphic was featured on mashable and R-bloggers not too long ago.  The graphic is of Facebook connections between city centers around the world.  Paul used an ingenious method of color transparency and great circle arcs to display the social network graph.

This is just one of the really cool things you can do with R.  R is used not only as a visual medium but also to calculate the great circle paths.  It is really neat to see R in such a high profile setting.  If you want to learn more about R you can read an IEORTools post about R links for beginners on World Statistics Day.  Also there are many books that you can buy on R programming at the IEORTools Online Store.
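
To give a feel for the great circle calculation mentioned above, here is a minimal sketch in Python rather than R. It is not Paul Butler's code. It assumes a spherical Earth, the function name great_circle_points is made up for illustration, and it does not handle antipodal endpoints, where the arc is not unique.

```python
import math


def great_circle_points(lat1, lon1, lat2, lon2, n=50):
    """Return n + 1 (lat, lon) pairs in degrees along the great circle arc (n >= 1)."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))

    # Angular distance between the endpoints (haversine formula).
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2)
    d = 2 * math.asin(math.sqrt(min(1.0, h)))
    if d == 0:
        return [(lat1, lon1)] * (n + 1)  # both endpoints coincide

    points = []
    for i in range(n + 1):
        f = i / n  # fraction of the way from point 1 to point 2
        a = math.sin((1 - f) * d) / math.sin(d)
        b = math.sin(f * d) / math.sin(d)
        # Weighted sum of the two endpoints as 3D unit vectors.
        x = a * math.cos(p1) * math.cos(l1) + b * math.cos(p2) * math.cos(l2)
        y = a * math.cos(p1) * math.sin(l1) + b * math.cos(p2) * math.sin(l2)
        z = a * math.sin(p1) + b * math.sin(p2)
        lat = math.degrees(math.atan2(z, math.hypot(x, y)))
        lon = math.degrees(math.atan2(y, x))
        points.append((lat, lon))
    return points


# Approximate city-center coordinates, for illustration only.
arc = great_circle_points(51.5, -0.1, 40.7, -74.0, n=20)
print(arc[0], arc[len(arc) // 2], arc[-1])
```

Plotting many such arcs with a low-opacity line color is the general idea behind this kind of map: heavily connected routes overlap and appear brighter.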

Thursday, January 12, 2012

Should science be open

Two interesting articles about technology and science appeared this week in some blogs I frequent.  The first is an Op-ed in the New York Times titled Research Bought, Then Paid For and the next is Open Science: why is it so hard?  The two articles offer different takes on the idea that scientific findings should be open to everyone.  Someone who is outside the scientific community might think that statement is silly.  Of course science is open.  No one has a copyright or a monopoly on scientific or mathematical discoveries.  Yet that is not the real issue.  The real issue is the access to those scientific discoveries.  In some cases the scientific discoveries are paid for by public subsidies.

The main focus of those two articles is that science has been hijacked by the publishers.  The articles even go so far as saying the hijacking is a monopoly of sorts.  I think monopoly is too strong of an analogy but the publishers do have a lot of control.  The control is mostly about access to the science.  The publishers own the copyright and can limit access to anyone unless a fee is paid.  A lot of the time these fees are rather high.  Now it looks like with the Research Works Act the access to publicly funded scientific research will be limited as well.  Access to the science is the crux of the debate.

Academics rely on publishing their scientific findings for further funding of their research.  It is part of the academic circle of life.  Publishing begets more funding which begets more publishing and the cycle continues.  I do believe the academic community deserves to get compensated for their research.  I'm not sure how much residual income they get other than peer review notoriety from their published content.  Publishers seem, again, to have a lot of the control.

I am not an academic researcher.  My work is trying to help organizations better themselves by using the learning, skills, and knowledge I have acquired through the years as an Operations Research professional.  I try to keep up to date on the latest research and methods by studying journals, networking with colleagues, and reading articles.  I rely on scientific access quite a bit in staying up to date with the latest findings.  I rely on the academic community so I can improve my knowledge and skills.  Yet it seems very difficult for me to gain access to a lot of good research.  There has to be a common ground for access to the science.  I wish I had a simple solution to this issue but it seems very large and very complicated.  There are a lot of interactions that I am sure I am glossing over.  Yet I am a big fan of the idea of Open Science.

There are some publishers that do understand this problem.  INFORMS seems to get this issue rather well.  They do not charge a lot for their journals.  In fact as part of membership INFORMS allows two free subscriptions to any journal of your choosing.  In addition to that the PubsOnLine Suite is available for $99 which is 12 journals for a whole year.  That is a bargain compared to some other publishers.  So not all publishers are pure evil.  There are some good ones.

Monday, January 2, 2012

IEORTools.com Resources added

I've decided to spruce up my personal website IEORTools.com.  I want to add some additional resources to it along with the book store.  Most of the content will be relevant reference links to Industrial Engineering and Operations Research professionals.

The first thing I did was add a Resources side menu.  The Resources side menu will link to relevant resource sections.  So far I have created the following resources
These links are a collection of resources that I have accumulated over the years.  The links are a great reference and hopefully I can build them up more.  I'm going to be creating more content on ieortools.com site as opposed to the blog because I'm just running out of room.