What I have been up to.


It's been a very rough ride so far. I have been busy with my thesis and had to spend the whole of spring break coding. No life, you may say; well, that's the way it is. I also watched quite a few movies in theaters and at home.

I have been writing up the first 3 chapters of my thesis. I have also implemented the Naive Bayes classifier, and once I have tested it completely, I am going to implement the mapreduce version.
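For anyone wondering why Naive Bayes is a natural fit for mapreduce: training is essentially counting, and counts are easy to sum across machines. Below is a minimal single-machine sketch in plain Python. It is not the thesis code. It assumes a multinomial model over words with add-one smoothing, and the names `train` and `predict` are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per class and words per class.

    `docs` is a list of (label, list_of_words) pairs (an assumed input format).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, words in docs:
        class_counts[label] += 1
        for w in words:
            word_counts[label][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab


def predict(model, words):
    """Return the class with the highest log-probability for `words`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        # Add-one smoothing: every vocabulary word gets one extra count.
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in words:
            if w in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Everything `train` computes is a sum over records, so a mapreduce version could in principle emit one count per (label, word) pair in the mapper and add them up in the reducer.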

I have been very happy about my progress in mastering mapreduce. It's almost like learning a new programming language: constantly looking up syntax, etc. Although I had written some mapreduce programs (mostly simple ones similar to the examples that accompany Hadoop), the real lessons came from my implementation of the kmeans algorithm.

One of the main problems I had was figuring out how to read a file. I am one of those people who learn by example, and I had not seen any example that read a file. It took me a lot of digging around to figure out that this can be done through the config method of the mapper. There were also some other useful things I picked up, especially how to write an iterative program.
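For fellow learn-by-example people, here is a rough sketch of both ideas together, written in plain Python rather than against Hadoop's Java API. The mapper loads a side file (the current centroids) once in a configure step, and a driver loop reruns the job until the centroids stop moving. All names here (`KMeansMapper`, `centroids.path`, `run_iteration`, `kmeans`) are hypothetical stand-ins and not Hadoop calls. In the older Hadoop Java API the corresponding hook is `configure(JobConf)`.

```python
import os
import tempfile
from collections import defaultdict


class KMeansMapper:
    """Stand-in for a Hadoop mapper; this is not the real Hadoop API."""

    def configure(self, conf):
        # Runs once per task, before any map() call: read the side file here.
        with open(conf["centroids.path"]) as f:
            self.centroids = [
                tuple(float(x) for x in line.split(","))
                for line in f if line.strip()
            ]

    def map(self, point):
        # Emit (index of the nearest centroid, point).
        dists = [
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in self.centroids
        ]
        return dists.index(min(dists)), point


def reduce_mean(points):
    # New centroid = coordinate-wise mean of the points assigned to it.
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def run_iteration(points, conf):
    """One simulated job: configure, map, group by key, reduce."""
    mapper = KMeansMapper()
    mapper.configure(conf)
    groups = defaultdict(list)  # stands in for the shuffle phase
    for p in points:
        key, value = mapper.map(p)
        groups[key].append(value)
    # Keep the old centroid if a cluster received no points.
    return [
        reduce_mean(groups[i]) if groups[i] else mapper.centroids[i]
        for i in range(len(mapper.centroids))
    ]


def write_centroids(path, centroids):
    with open(path, "w") as f:
        for c in centroids:
            f.write(",".join(repr(x) for x in c) + "\n")


def kmeans(points, initial, max_iters=20, tol=1e-9):
    """Driver loop: each iteration's output becomes the next side file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "centroids.txt")
        conf = {"centroids.path": path}
        centroids = list(initial)
        for _ in range(max_iters):
            write_centroids(path, centroids)
            updated = run_iteration(points, conf)
            # Largest squared distance any centroid moved in this iteration.
            shift = max(
                sum((a - b) ** 2 for a, b in zip(old, new))
                for old, new in zip(centroids, updated)
            )
            centroids = updated
            if shift <= tol:
                break
        return centroids
```

Calling `kmeans([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)])` runs the loop on four toy points. The part that carries over to a real job is the shape: load the side file once per task, not once per record, and let the driver decide when to stop iterating.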
