Friday, December 31, 2010
Video of Joy of Stats by Hans Rosling
http://www.gapminder.org/videos/the-joy-of-stats/
Hans Rosling's passion for statistics is infectious. His evident joy persuades the viewer to enjoy finding new and invigorating ways to explore data. For me this is not hard to do, as I love data and analysis. Yet for many people mathematics, let alone statistics, is a universe unto itself that they dare not explore. Hans breaks down that barrier with The Joy of Stats. No matter your educational interests or background, I find it very hard to ignore his plea that statistics is not boring and is, dare I say it, sexy.
If you are interested in this video as a tribute to statistics, you would also enjoy Dr. Robert Lewis's essay on Mathematics. Both of these works explain how a world without number analysis is a world not worth living in. There is so much to explore and so little time. I am so happy that I decided to pursue a career in Engineering and Operations Research to help the world one datum at a time.
Tuesday, December 28, 2010
Top IEORTools Blog Articles of 2010
1. Favorite Operations Research books from OR-Exchange
2. R references for handling Big Data
3. IEORTools Tutorial: Learning XML with R
4. My 5 Favorite Operations Research Blogs
5. Where to find good data sets
A lot of the pages had to do with using the statistical computing software R. I'm also a contributor to the R-bloggers website so that has a lot to do with the traffic. I'm excited to see what 2011 will have in store for the OR blogging world. Happy New Year to the Operations Research community.
Wednesday, December 22, 2010
An essay on Mathematics and education
Teaching math to younger generations is definitely a concern. I really like how Dr. Lewis explains that education is not just about the transfer of information but about understanding the underlying principles of specific knowledge. The parables are a very clever device for conveying those principles of math.
I also love how he portrays math as not just a tool for the technologically minded but also for the liberal arts. Dr. Lewis conveys that math is not merely knowing numbers but the process of finding solutions. My own example is that people often ask me how I am so good at math. I usually tell them it's just like learning a language. Once you understand the language and are fluent, you can start applying it in everyday life. Math is a language to learn just as much as a foreign language is. It may take some time to learn but it will take a lifetime to master.
I highly recommend reading this essay. I also recommend saving this essay for our future generations, teachers, educators, family members, and friends. This essay can be used to help bridge understanding that may be missing from our own words.
Thursday, December 16, 2010
New OpenOpt/FuncDesigner quarterly release
New OpenOpt and FuncDesigner quarterly release is out: 0.32.
OpenOpt:
* New class: LCP (and related solver)
* New QP solver: qlcp
* New NLP solver: sqlcp
* New large-scale NSP solver gsubg. It still requires lots of improvements (especially for constraints, whose handling is still very immature and often fails), but since the solver sometimes already works better than ipopt, algencan and the other competitors it was tried against, I decided to include it in the release.
* Now SOCP can handle Ax <= b constraints (and bugfix for handling lb <= x <= ub has been committed)
* Some other fixes and improvements
FuncDesigner:
* Add new functions removeAttachedConstraints, min and max
* Systems of nonlinear equations: possibility to assign personal tolerance for an equation
* Some fixes and improvements (especially for automatic differentiation)
Where to find good data sets
Freebase
An all-things graph database. The website focuses on trends of certain cultural and interest topics.
Amazon Public Data Sets
Amazon is probably considered the cloud computing mecca next to Google. Amazon Web Services offers a lot, including storage of public data sets. They host a huge variety of public data.
Windows Azure Data Marketplace
Surprisingly, Microsoft has an Open Data Protocol data source. This data market offers quite a few interesting data sets.
Yahoo Query Language
YQL is an interesting API that is very similar to SQL. YQL is essentially a language that allows you to grab data from cloud services. This could be very handy for grabbing data quickly and dynamically. YQL can connect to a lot of data sources as well.
Infochimps
Infochimps is a data marketplace and warehouse. They offer to host, sell, and distribute data sets. Some of their data comes at a cost but a lot of it is free as well. This is an interesting startup and it will be very interesting to follow their growth. There is also a new Infochimps R package that uses their API to gather and process Infochimps data.
DBpedia
DBpedia is a Wikipedia for data sets. In fact, the data itself comes from Wikipedia.
Some other sources not from the article include the World Bank open data and the U.S. Census data.
Sunday, December 12, 2010
Shortest path algorithm solved by ants?
There have already been some algorithms developed out of studying ants. One family of methods is the Ant Colony Optimisation (ACO) algorithms. Ants solve the complex problem of finding the shortest path by communicating with other ants in the colony through pheromone trails. Each ant leaves a pheromone trail as a signal to the ants that follow. The strength of the trail acts as an "optimal path" signal, telling other ants the best way to get to the intended destination.
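To make the pheromone idea concrete, here is a minimal sketch of an ACO-style shortest path search in Python. It is only an illustration under my own assumptions, not a reference implementation: the function names, the parameter values (alpha, beta, evaporation, number of ants) and the toy graph are all hypothetical choices made for this example.

```python
import random

def path_length(graph, path):
    """Sum the edge weights along a path given as a list of nodes."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))

def walk(graph, pheromone, source, target, alpha, beta, rng):
    """Let one ant walk from source to target; return None on a dead end."""
    path, visited = [source], {source}
    while path[-1] != target:
        here = path[-1]
        options = [n for n in graph[here] if n not in visited]
        if not options:
            return None
        # Attractiveness of an edge = pheromone^alpha * (1 / weight)^beta.
        weights = [pheromone[(here, n)] ** alpha * (1.0 / graph[here][n]) ** beta
                   for n in options]
        nxt = rng.choices(options, weights=weights)[0]
        path.append(nxt)
        visited.add(nxt)
    return path

def aco_shortest_path(graph, source, target, ants=10, iterations=30,
                      alpha=1.0, beta=2.0, evaporation=0.5, seed=0):
    """Return (best_path, best_length) found by a simple ant colony search."""
    rng = random.Random(seed)
    pheromone = {(a, b): 1.0 for a in graph for b in graph[a]}
    best = None
    for _ in range(iterations):
        paths = [walk(graph, pheromone, source, target, alpha, beta, rng)
                 for _ in range(ants)]
        paths = [p for p in paths if p is not None]
        # Old trails fade a little every round.
        for edge in pheromone:
            pheromone[edge] *= 1.0 - evaporation
        # Each ant deposits pheromone; shorter paths receive more per edge.
        for p in paths:
            length = path_length(graph, p)
            for a, b in zip(p, p[1:]):
                pheromone[(a, b)] += 1.0 / length
            if best is None or length < path_length(graph, best):
                best = p
    return best, (path_length(graph, best) if best else None)

# Toy undirected graph (assumed data): node -> {neighbour: positive edge weight}.
demo_graph = {
    "A": {"B": 1, "C": 2, "D": 5},
    "B": {"A": 1, "C": 1, "D": 1},
    "C": {"A": 2, "B": 1, "D": 4},
    "D": {"A": 5, "B": 1, "C": 4},
}
print(aco_shortest_path(demo_graph, "A", "D"))
```

The search is randomized, so it is a heuristic rather than a guarantee. On a graph this small the colony settles on the shortest route almost immediately, but on larger problems the evaporation rate and the number of ants would need tuning.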
It would be really interesting to find out that the best shortest path algorithm might have been literally under our noses the entire time. This will be an interesting study to follow for the Operations Research community.
Wednesday, December 8, 2010
2 years of blogging with IEOR Tools
An update to the blog is that I'm starting to contribute Amazon content to the site. Amazon has been a valuable resource for linking to books on relevant subject matter. I've thought about adding a website that would be a "store" or compilation of some of the better resources, with Amazon being a partner. I thought I would bring this up with the readers first to see if this would be a valuable addition to this blog. It would be a clearinghouse or aggregator for all the best tools and resources in Operations Research, Industrial Engineering, Analytics and Data Mining. I'm not sure there is anything like it on the internet besides doing searches in Google or Amazon. I hope the site would have a nice layout that makes it easy to find resources.
Since it is the holiday season I would like to send my warmest regards to all those reading. I thank you so much for your readership. I wish you and your family a safe and happy holiday season.
Tuesday, December 7, 2010
Big Data Logistic Regression with R and ODBC
A great alternative to the usual logistic regression routines when working with big data is the biglm package. Biglm fits the same regression model but processes the data in "chunks". This lets R perform its calculations on smaller pieces of the data without needing to hold the whole data set in memory. Biglm can also work not only on imported data frames and text files but also directly over a database connection. This is where the helpful RODBC package comes to our aid.
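To illustrate the chunking idea on its own, here is a rough sketch in plain Python rather than R. It is not biglm's actual implementation and uses none of its API: it simply runs Newton's method for logistic regression while only looking at one chunk of rows at a time. The names fit_logistic_chunked and chunks are hypothetical, chunks is just a stand-in for a database cursor, and the tiny data set is made up for the example.

```python
import math

def solve(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def chunks(rows, chunk_size):
    """Hypothetical stand-in for a database cursor: yield rows a chunk at a time."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]

def fit_logistic_chunked(make_chunks, n_features, iterations=25):
    """Fit logistic regression by Newton's method, one chunk in memory at a time.

    make_chunks() must return a fresh iterator over chunks on every call,
    because each Newton iteration needs a full pass over the data.
    Each chunk is a list of (features, label) pairs with label 0 or 1.
    Returns [intercept, coefficient_1, ..., coefficient_n].
    """
    k = n_features + 1  # +1 for the intercept
    beta = [0.0] * k
    for _ in range(iterations):
        hessian = [[0.0] * k for _ in range(k)]
        score = [0.0] * k
        for chunk in make_chunks():
            for features, label in chunk:
                x = [1.0] + list(features)
                eta = sum(b * v for b, v in zip(beta, x))
                p = 1.0 / (1.0 + math.exp(-eta))
                w = p * (1.0 - p)
                # Only these small running sums persist between chunks.
                for i in range(k):
                    score[i] += x[i] * (label - p)
                    for j in range(k):
                        hessian[i][j] += w * x[i] * x[j]
        step = solve(hessian, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Made-up data: one predictor, binary outcome, deliberately not separable.
demo_rows = [((0,), 0), ((1,), 0), ((2,), 1), ((3,), 0),
             ((4,), 1), ((5,), 0), ((6,), 1), ((7,), 1)]
print(fit_logistic_chunked(lambda: chunks(demo_rows, 3), n_features=1))
```

The point of the sketch is that the fitted coefficients do not depend on the chunk size, since each pass only accumulates sums over the rows. Memory use is then driven by the chunk size and the number of predictors rather than by the number of rows.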
I have been looking all over the R support lists and blogs in hopes of finding a good tutorial on using biglm with RODBC. I was not successful, yet I was able to work out how to do this myself.