NiCE Wiki
Modeling and Simulation made NiCE!
Brought to you by: amccaskey, jayjaybillings
Clustering_with_Hadoop
Authors:
There is a newer version of this page. You can find it here.
The Map-Reduce paradigm [<http://cacm.acm.org/magazines/2010/1/.../fulltext>] was explored using Hadoop [<http://hadoop.apache.org/>], an open-source implementation of Map-Reduce, because the data is expected to become large-scale quickly.
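To make the paradigm concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the example. It is plain Python, not the Hadoop API; all function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`, `word_mapper`, `count_reducer`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to the list of values for that key."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(records, mapper, reducer):
    """Chain the three phases in a single process (no distribution, no fault tolerance)."""
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


def word_mapper(line):
    """Hypothetical mapper: emit (word, 1) for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def count_reducer(key, values):
    """Hypothetical reducer: sum the counts emitted for one word."""
    return sum(values)


lines = ["map reduce map", "reduce data"]
counts = run_job(lines, word_mapper, count_reducer)
```

In a real Hadoop job the mapper and reducer run on many machines and the framework performs the shuffle; only the division of work into these two user-supplied functions carries over from this sketch.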
As a preliminary investigation, k-means clustering [<http://nlp.stanford.edu/IR-book/html/.../k-means-1.html>], available from Mahout [<http://mahout.apache.org/>], was employed. The results were similar to the ones we published in the paper, "Knowledge Discovery from Nuclear Reactor Simulation Data".
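The way k-means fits the Map-Reduce model can be sketched as follows: the map step assigns each point to its nearest centroid, and the reduce step averages the points assigned to each centroid. This is an illustrative single-process sketch in plain Python, not Mahout's implementation; the function names, the sample points, and the starting centroids are all assumptions made for the example.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def kmeans_map(point, centroids):
    """Map step: emit (cluster index, (point, 1)) for one input point."""
    return nearest(point, centroids), (point, 1)


def kmeans_reduce(values):
    """Reduce step: average all points emitted for one cluster index."""
    dim = len(values[0][0])
    total = [0.0] * dim
    n = 0
    for point, count in values:
        for d in range(dim):
            total[d] += point[d]
        n += count
    return tuple(t / n for t in total)


def kmeans_iteration(points, centroids):
    """One map/shuffle/reduce pass; an empty cluster keeps its old centroid."""
    groups = {}
    for point in points:
        key, value = kmeans_map(point, centroids)
        groups.setdefault(key, []).append(value)
    return [
        kmeans_reduce(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_iter=20):
    """Repeat passes until the centroids stop changing or max_iter is reached."""
    for _ in range(max_iter):
        updated = kmeans_iteration(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Made-up sample data, for illustration only.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
result = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

Each iteration corresponds to one Map-Reduce job, so an iterative algorithm like this one launches a sequence of jobs, with the centroids from one job passed as input to the next.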
Important resources to learn Map-Reduce are:
Book: Chapter 2 in Mining of Massive Datasets by Anand Rajaraman and Jeffrey David Ullman [<http://infolab.stanford.edu/~ullman/m.../ch2.pdf>].
The paper [<http://cacm.acm.org/magazines/2010/1/.../fulltext>] and the wiki article [<http://en.wikipedia.org/wiki/MapReduce>].
The tutorial [<http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html...>]
Here are the steps