You are currently visiting an old archive website with limited functionality. If you are looking for the current Berlin Buzzwords website, please visit https://berlinbuzzwords.de

Urania Berlin, June 6-7, 2011

Composing Mahout clustering jobs

Location: Loft
Date and time: Tue, 2011-06-07 15:50 - 16:20
Speaker: Frank Scholten

Clustering is a popular technique for analyzing and understanding large corpora, and it is a key feature of, for instance, Google News. This talk introduces clustering and how it is implemented in Mahout, then shows step by step how to compose a sequence of Mahout jobs in Java to cluster text. It also shows how to tweak your chain of Mahout jobs and how those changes affect the clustering results. The talk is suitable for people with some experience with Hadoop and perhaps Mahout. Knowledge of clustering is not required.
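
For readers new to clustering, the idea of chaining steps to cluster text can be sketched in plain Java. This is not Mahout code and does not use Hadoop: the class name ClusteringSketch, its methods, the three stages (tokenize, build term-frequency vectors, run k-means) and the sample documents are all assumptions made for illustration. Real Mahout jobs run distributed and offer far more options.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical, single-machine illustration of a text clustering chain.
// None of these names come from Mahout.
class ClusteringSketch {

    // Step 1: split each document into lowercase whitespace-separated tokens.
    static List<String[]> tokenize(List<String> docs) {
        List<String[]> out = new ArrayList<>();
        for (String d : docs) {
            out.add(d.toLowerCase().trim().split("\\s+"));
        }
        return out;
    }

    // Step 2: build term-frequency vectors over a shared, sorted vocabulary.
    static double[][] vectorize(List<String[]> tokenized) {
        TreeSet<String> vocabSet = new TreeSet<>();
        for (String[] tokens : tokenized) {
            vocabSet.addAll(Arrays.asList(tokens));
        }
        List<String> vocab = new ArrayList<>(vocabSet);
        double[][] vectors = new double[tokenized.size()][vocab.size()];
        for (int i = 0; i < tokenized.size(); i++) {
            for (String t : tokenized.get(i)) {
                vectors[i][vocab.indexOf(t)] += 1.0;
            }
        }
        return vectors;
    }

    // Step 3: plain k-means with squared Euclidean distance.
    // Assumption: the first k points seed the centroids, so k must not
    // exceed the number of points. Returns the cluster index per point.
    static int[] kMeans(double[][] points, int k, int maxIterations) {
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assign every point to its nearest centroid.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (distance(points[i], centroids[c]) < distance(points[i], centroids[best])) {
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged
            }
            // Move each centroid to the mean of its assigned points.
            for (int c = 0; c < k; c++) {
                double[] sum = new double[points[0].length];
                int count = 0;
                for (int i = 0; i < points.length; i++) {
                    if (assignment[i] == c) {
                        count++;
                        for (int j = 0; j < sum.length; j++) {
                            sum[j] += points[i][j];
                        }
                    }
                }
                if (count > 0) {
                    for (int j = 0; j < sum.length; j++) {
                        sum[j] /= count;
                    }
                    centroids[c] = sum;
                }
            }
        }
        return assignment;
    }

    static double distance(double[] a, double[] b) {
        double total = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = a[j] - b[j];
            total += diff * diff;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample documents on two topics.
        List<String> docs = Arrays.asList(
                "hadoop cluster job",
                "football match goal",
                "hadoop job scheduler",
                "football goal keeper");
        int[] clusters = kMeans(vectorize(tokenize(docs)), 2, 10);
        System.out.println(Arrays.toString(clusters));
    }
}
```

Changing a stage in this chain, for example the tokenizer, the value of k or the distance function, changes which documents end up together. That is the kind of tweaking the abstract refers to, although the specific knobs here are only illustrative.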