Wednesday, February 4, 2009
Open source data mining tools
Data mining is a necessary task for most industrial engineers and operations research analysts. Analysts often have to comb through large amounts of data to extract useful information. There are many tools, often proprietary, for attacking these piles of data. Common examples include Excel and Access. More complex tools include Microsoft SQL Server and Oracle databases.
The open source community is no stranger to developing tools for data mining, and there are free software equivalents available to any analyst. Data can be mined in spreadsheets such as OpenOffice.org Calc or KSpread. They can read Windows file formats such as .xls, which allows quick transfer from other data sources. These office suites are free to download and run on multiple operating systems.
Databases have been developed extensively in the open source community. OpenOffice.org has an application similar to Access called Base. There are many free software database servers available, including MySQL and PostgreSQL. There are also numerous free software SQL clients, including Knoda and GNOME-DB. phpMyAdmin is a web-based MySQL client that can be set up on a web server such as Apache and accessed through a web browser.
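To give a feel for the kind of question a SQL database answers, here is a minimal sketch in Python. It uses the sqlite3 module from the standard library as a stand-in, since it needs no server; the same SELECT with GROUP BY would work much the same way against MySQL or PostgreSQL through their own client libraries. The table name, column names, and sample rows are made up for illustration and are not from any real data set.

```python
import sqlite3

# Hypothetical sample data: (region, units sold). Invented for illustration.
ORDERS = [
    ("north", 120),
    ("south", 75),
    ("north", 60),
    ("east", 200),
    ("south", 25),
]


def summarize_orders(rows):
    """Load rows into an in-memory database and total the units per region."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real database server
    try:
        conn.execute("CREATE TABLE orders (region TEXT, units INTEGER)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(units) AS total "
            "FROM orders GROUP BY region ORDER BY total DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total in summarize_orders(ORDERS):
        print(region, total)
```

The aggregation is done by the database rather than by looping in Python, which is the main reason to reach for a database server once a spreadsheet becomes too slow or too small for the data.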
There are also free software applications developed especially for data mining, including Weka and RapidMiner. These tools can be used as standalone clients that connect to any number of data sources, including proprietary ones.