Popular Tools in 5 points
Octave
- This is an Open Source project.
- Easy to write complex Machine Learning equations.
- Vectorization lets you express matrix operations directly, without writing explicit loops.
- Python-like command-line interface (CLI).
- Limited to small data sets that fit in memory (cannot handle big data).
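
To make the vectorization point concrete, here is a minimal sketch written in Python with NumPy rather than Octave, so that all examples in this section share one language. The toy matrix X and the weights theta are made up for illustration. In Octave the vectorized form would be the matrix product X * theta.

```python
import numpy as np

# Hypothetical toy data: 3 examples, each with a bias term and 2 features.
X = np.array([[1.0, 2.0, 3.0],
              [1.0, 4.0, 5.0],
              [1.0, 6.0, 7.0]])
theta = np.array([0.5, 1.0, -0.5])  # made-up weights


def predict_loop(X, theta):
    """Compute each prediction with explicit loops over rows and features."""
    predictions = []
    for row in X:
        total = 0.0
        for x_j, theta_j in zip(row, theta):
            total += x_j * theta_j
        predictions.append(total)
    return np.array(predictions)


def predict_vectorized(X, theta):
    """Compute all predictions with a single matrix-vector product."""
    return X @ theta


loop_result = predict_loop(X, theta)
vectorized_result = predict_vectorized(X, theta)
print(vectorized_result)  # [1. 2. 3.]
```

Both functions return the same values. The vectorized version is shorter and hands the looping to optimized library code.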
R
- This is an Open Source project.
- R can create graphics to be displayed on the screen or saved to file. It can also prepare models that can be queried and updated.
- R is a tool to use when you need to analyze data, plot data or build a statistical model for data.
- Built around a statistics-centric design for computation.
- This blog post covers almost everything: http://machinelearningmastery.com/what-is-r/
Python
- NumPy and pandas are the two most commonly recommended libraries to get you started with data manipulation and basic statistical calculations.
- The IPython notebook is also gaining popularity nowadays.
- scikit-learn and TensorFlow are machine learning libraries that make model building simpler for everyone.
- Python is the more popular choice among programmers.
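
As a small illustration of the data manipulation and statistics mentioned above, the following sketch groups a table with pandas and computes summary statistics with NumPy. The column names and values are hypothetical and exist only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data set, made up for illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "length": [1.0, 3.0, 10.0, 14.0],
})

# pandas: group rows by a column and aggregate each group.
mean_by_species = df.groupby("species")["length"].mean()

# NumPy: basic statistics on the underlying array.
lengths = df["length"].to_numpy()
overall_mean = np.mean(lengths)
overall_std = np.std(lengths)  # population standard deviation (ddof=0)

print(mean_by_species)
print(overall_mean, overall_std)
```

Note that np.std uses the population formula by default, while pandas' own std method defaults to the sample formula (ddof=1), so the two can differ on the same data.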
Apache Mahout
- Open source, scalable machine learning platform.
- Runs a machine learning algorithm as a series of MapReduce jobs.
- It is built on top of Hadoop.
- It uses batch processing.
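
The idea of running one algorithm as several MapReduce jobs can be sketched in plain Python. This is not Mahout's API. It is a single-process illustration, assuming k-means on one-dimensional points as the example algorithm, with hypothetical function names. Each call to run_job stands in for one batch job that makes a full pass over the data.

```python
from collections import defaultdict


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for each 1-D point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, p


def shuffle_phase(pairs):
    """Shuffle: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, centroids):
    """Reduce: each new centroid is the mean of the points assigned to it."""
    new_centroids = list(centroids)
    for key, values in groups.items():
        new_centroids[key] = sum(values) / len(values)
    return new_centroids


def run_job(points, centroids):
    """One batch job: a full map, shuffle, reduce pass over the data."""
    return reduce_phase(shuffle_phase(map_phase(points, centroids)), centroids)


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]  # made-up data
centroids = [0.0, 5.0]                      # made-up starting centroids
for _ in range(3):  # each iteration stands in for a separate MapReduce job
    centroids = run_job(points, centroids)

print(centroids)  # [2.0, 11.0]
```

On a real cluster each job would read its input from disk and write its output back, which is why iterative algorithms built this way pay a cost for every pass.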
Apache Spark
- Open source Big Data platform.
- MLlib is its machine learning library.
- It processes data in memory, which is why it is faster than Mahout.
- It supports micro-batch processing for streaming data.
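
Micro-batch processing can be illustrated without Spark itself. The sketch below is plain Python, not the Spark API, and the function names are hypothetical. It cuts an incoming stream into small batches and updates a running mean once per batch. As a simplification it batches by record count, whereas Spark Streaming groups records by a time interval.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield consecutive lists of up to batch_size records from a stream."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def running_means(stream, batch_size):
    """Update a running mean once per micro-batch and record each update."""
    count, total, history = 0, 0.0, []
    for batch in micro_batches(stream, batch_size):
        count += len(batch)
        total += sum(batch)
        history.append(total / count)
    return history


readings = [2.0, 4.0, 6.0, 8.0, 10.0]  # made-up sensor readings
history = running_means(readings, batch_size=2)
print(history)  # [3.0, 5.0, 6.0]
```

Results are refreshed after every small batch instead of after one large job over the whole data set, which is the main contrast with the batch model described for Mahout.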