Thursday, June 30, 2011

How not to do data visualization

I was glancing over Hacker News and came across an article from the Economist's Daily Chart blog.  The daily chart was about nations' debt management.  The chart can be seen here.

Seems innocent enough.  It shows the debt per nation in declining order.  Wait a second.  Why does Ireland have more debt than the USA?  After reading the article more thoroughly, it looks like the figure is a percentage of GDP.  Wait again.  Is the bar graph the percentage of GDP, or is the number in the white box the percentage of GDP?  And how does this relate to debt management?  Apparently the article explains that the chart shows the change in primary balance each nation needs in order to bring its debt to 60% of GDP.  So each bar is a change, measured in % of GDP, needed to get debt to 60% of GDP.  Are we crystal?  I'm not sure I totally understand, but that is my basic understanding.

Data visualization is important in Analytics and Operations Research.  We model real-world applications quite a lot, and oftentimes there is no better way to do this than with a chart or graph.  The real art is conveying the crux of the message to the recipient.  There is an internet meme devoted to the art of bad chart making.  I feel bad using the Economist as an example because, after all, I did finally (I think) come away with the right idea.  But still, notice how there are no data or axis labels across the top of the chart.  Also, the numbers in the white boxes are not given any units.  I'm still not sure if those numbers are a percentage or a debt value.  Sometimes the visual art clutters the real message.  It is important to make sure that the recipient has the right frame of reference and can understand each graphic and label.

Sunday, June 26, 2011

Recommended Machine Learning blogs

I happened upon a thread about recommended Machine Learning blogs on metaoptimize, an OSQA site.  It listed a lot of blogs that I had not seen before, so it really got me interested.

Good Machine Learning blogs

Machine Learning is the scientific process of developing algorithms that let computers evolve their behavior based on empirical data.  For instance, one may develop a decision tree that helps predict a certain behavior from a data set.  The decision tree itself is just a method to predict behavior.  Yet perhaps more data can be acquired and more behaviors observed.  Then the decision tree is computed again from the newer data (perhaps combined with the older).  New behaviors are learned from the newer data, and a new version of the decision tree emerges that accounts for them.  This process becomes algorithmic and continues.
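To make the retraining loop concrete, here is a minimal sketch in plain Python.  It uses a one-level decision tree (a "stump") rather than a full tree, and the data, the labels and the function names `fit_stump` and `predict` are all made up for illustration.  The stump is fitted on an old data set and then simply refitted on the old and new data combined.

```python
from collections import Counter


def fit_stump(samples):
    """Fit a one-level decision tree on (feature, label) pairs.

    Tries a threshold midway between each pair of adjacent distinct
    feature values and keeps the first one with the fewest training errors.
    Returns (threshold, label_if_at_or_below, label_if_above).
    """
    xs = sorted({x for x, _ in samples})
    if len(xs) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for t in ((a + b) / 2 for a, b in zip(xs, xs[1:])):
        left = [y for x, y in samples if x <= t]
        right = [y for x, y in samples if x > t]
        # Each side predicts its majority label.
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    return best[1:]


def predict(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


# Hypothetical data: (number of site visits, what the customer did).
old_data = [(1, "skip"), (2, "skip"), (6, "buy"), (7, "buy")]
new_data = [(3, "buy"), (4, "buy"), (5, "buy")]

old_stump = fit_stump(old_data)             # learned from the old data only
new_stump = fit_stump(old_data + new_data)  # recomputed with the newer data
```

With this made-up data the first stump predicts "skip" for a customer with 3 visits, while the refitted stump predicts "buy", because the newer data moved the split point.  A real system would use a proper decision tree library and would also check the refitted model against held-out data.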

Machine Learning developed out of the field of Artificial Intelligence.  The idea of having computers learn has been around for as long as computers themselves.  Machine Learning is really starting to develop now that computing power has caught up to theory.  It has a lot of uses and may already be used by some of your favorite computer applications.  Some examples include product recommendation systems like those of Amazon or Netflix and search engines like Google or Bing.  Machine Learning is seeing practical use in many places, and it's only just scratching the surface.

Monday, June 20, 2011

Moneyball coming to the big screen

Recently I found out that the book Moneyball by Michael Lewis is being made into a motion picture.  The Moneyball trailer can be viewed online.  In case you have never heard of Michael Lewis, you might have heard of the movie "The Blind Side", for which he also wrote the accompanying book.  The book Moneyball is about the Oakland Athletics and how they used analytics and mathematical know-how to turn around a professional baseball franchise.

The story centers around Billy Beane, who is played by Brad Pitt in the movie.  Billy Beane is a professional ballplayer turned General Manager.  He inherits the top organizational management job for the losing Oakland Athletics.  He is immediately frustrated with the same old losing ways and believes he needs to shake up the system.  He finds out about the curious world of baseball analytics, otherwise known as sabermetrics, and hires a curious crew of young, mathematically gifted folks.

The story is fascinating even if you are not a fan of baseball.  The use of mathematics to help make business decisions is nothing new.  Yet applying analytics to an industry that is deeply rooted in old ways and practices is intriguing.  Changing the ways of the "good ole boy" network requires risk, knowledge, and sometimes good fortune.  This can translate to almost any industry or organization.  I am most definitely looking forward to seeing this movie.

Thursday, June 16, 2011

OpenOpt Suite 0.34

I'm glad to inform you about the new quarterly release 0.34 of our free OOSuite software package (OpenOpt, FuncDesigner, SpaceFuncs, DerApproximator).

Main changes:

* Python 3 compatibility

* Lots of improvements and speedup for interval calculations

* Now interalg can obtain all solutions of a nonlinear equation (example) or of a system of them (example) in the involved box lb_i <= x_i <= ub_i (bounds can be very large), possibly constrained (e.g. sin(x) + cos(y+x) > 0.5 or [sin(i*x) + y/i < i for i in range(100)])

* Many other improvements and speedup for interalg

Regards, D.
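For readers wondering how an interval method can find every solution of an equation inside a box, here is a rough sketch of the general idea in plain Python.  It is not interalg's algorithm and it does not use the OpenOpt API.  The `Interval` class, the function `find_all_roots` and the test equation x^3 - 2x = 0 are my own illustrative assumptions.  The idea: evaluate the function over a whole interval, throw the interval away if the result cannot contain zero, and otherwise split it in half and repeat.

```python
class Interval:
    """Closed interval [lo, hi] with naive (not outward-rounded) arithmetic."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    @staticmethod
    def wrap(value):
        # Treat a plain number as a zero-width interval.
        return value if isinstance(value, Interval) else Interval(value, value)

    def __add__(self, other):
        other = Interval.wrap(other)
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        other = Interval.wrap(other)
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        other = Interval.wrap(other)
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))


def find_all_roots(func, lb, ub, tol=1e-6):
    """Return approximate roots of func in [lb, ub] by interval bisection."""
    stack = [Interval(lb, ub)]
    small_boxes = []
    while stack:
        box = stack.pop()
        image = func(box)  # encloses every value func takes on the box
        if image.lo > 0.0 or image.hi < 0.0:
            continue  # zero is impossible here, discard the box
        if box.hi - box.lo < tol:
            small_boxes.append(box)
            continue
        mid = 0.5 * (box.lo + box.hi)
        stack.append(Interval(box.lo, mid))
        stack.append(Interval(mid, box.hi))
    # Neighbouring surviving boxes belong to the same candidate root.
    small_boxes.sort(key=lambda b: b.lo)
    clusters = []
    for box in small_boxes:
        if clusters and box.lo <= clusters[-1].hi:
            clusters[-1].hi = max(clusters[-1].hi, box.hi)
        else:
            clusters.append(Interval(box.lo, box.hi))
    return [0.5 * (c.lo + c.hi) for c in clusters]


def f(x):
    # Example equation x^3 - 2x = 0, written so it accepts an Interval.
    return x * x * x - x * 2.0


roots = find_all_roots(f, -10.0, 10.0)
```

For this example the search returns three candidates, close to -sqrt(2), 0 and sqrt(2).  Two caveats: a surviving box is only a candidate (the sketch never proves that a root is actually inside it), and because the arithmetic is not outward-rounded the enclosures are not rigorous in floating point.  A production solver handles both of these far more carefully.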

Monday, June 13, 2011

Analytics geeks win NBA championships

The Dallas Mavericks won their franchise's first NBA title.  They won it by beating teams that everyone thought they could not beat.  The Mavericks were able to beat juggernauts like the Los Angeles Lakers, a fast-paced Portland Trail Blazers team, the up-and-coming young superstars of the Oklahoma City Thunder and, of course, the Big Three of the Miami Heat.  As good as the Mavericks were at executing on the basketball court, they were equally good at executing a between-the-ears approach to basketball.  The Mavericks were able to win the game by studying the numbers of professional basketball.  Some of the champions on the Mavericks team may not be able to hit even 10% of their three-pointers, but they sure know how to analyze a winning combination.

The analytics culture starts with Dallas Mavericks owner Mark Cuban.  According to ESPN, when Mark Cuban was looking for a coach, he studied games and found out that Rick Carlisle used the most efficient lineups most frequently.  Hiring Rick Carlisle to coach the Mavericks was a no-brainer for Mark Cuban because the numbers do not lie.  As for Rick Carlisle, he is known for being a very cerebral coach and very handy at crunching NBA statistics as well.

Another known fact about the Dallas Mavericks is that they use an analytics staff to gain a competitive edge.  Most recently they have retained the NBA analytics stat guru Roland Beech.  In the past they had used the services of Wayne Winston, an Operations Research professor, to help analyze their lineups to be more competitive.

Mark Cuban gives the geeks a lot of credit for the Mavericks' win.  From the ESPN article:

I give a lot of credit to Coach Carlisle for putting Roland on the bench and interfacing with him, and making sure we understood exactly what was going on. Knowing what lineups work, what the issues were in terms of play calls and training.

That is a lot of brainpower on the bench in every game.  It is good to see the geeks get their due.  Way to go, Mavericks!  I am looking forward to seeing what the geeks put on the court next season.