Posted in Reading | Posted on 19-05-2014
I am happy to say that my article “Introduction To YARN” has been published by IBM developerWorks. Please find the abstract of the article below:
Apache Hadoop is currently one of the most popular tools for big data processing. It has been successfully deployed in production by many companies for several years. Though Hadoop is considered a reliable, scalable, and cost-effective solution, it is constantly being improved by a large community of developers. As a result, the 2.0 version offers several revolutionary features, including YARN, HDFS Federation, and a highly available NameNode, which make the Hadoop cluster much more efficient, powerful, and reliable. In this article, learn about the advantages YARN provides over the previous version of the distributed processing layer in Hadoop.
Read more at IBM developerWorks.
At Spotify, we have a company-wide culture of celebrating successes and … failures. Because we want to iterate fast, we realize that failures can happen. On the other hand, we cannot afford to make the same mistake more than once. One way of preventing that is sharing our failures, mistakes, and learnings across the company.
Today, however, I would like to share my failures … outside of the company ;) While my failures relate to my recent work with our Apache Hadoop cluster, I think that the lessons I have learned are generic enough that many people can benefit from them.
Posted in Presentations | Posted on 24-11-2013
I am very happy to present the slides from my presentation at Strata + Hadoop World 2013.
The presentation is titled “Hadoop adventures at Spotify”, and in it I talk about five real-world Hadoop issues that either broke our cluster at Spotify or made it very unstable. Each story comes from our JIRA dashboard and is based on facts! ;) To make it even more engaging, I show real graphs, numbers, and even our emails and conversations. For each story, I share the mistakes that we made and describe the lessons that we learned.
This also includes a mistake that I made and do not like to talk about, but today I will share it as well ;)
Posted in Monitoring | Posted on 06-10-2013
A couple of months ago, we got an email from Chris:
The Hadoop cluster has been a bit slow the past few days and I noticed that the bottleneck seems to be coming from the map tasks. We have separate map and reduce task capacities and it continuously looks like the mapper slots are all taken while there’s a surplus of open reduce slots. Is there any reason that we can’t open any of the free reduce slots to map tasks?
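For context, the rigidity Chris describes comes from how MRv1 is configured: each TaskTracker advertises a fixed number of map slots and a fixed number of reduce slots, set independently in mapred-site.xml, so idle reduce slots cannot be borrowed by waiting map tasks. A minimal sketch of the relevant settings (the slot counts here are illustrative, not our actual values):

```xml
<!-- mapred-site.xml (MRv1) -->
<configuration>
  <!-- Fixed number of map slots per TaskTracker -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
  <!-- Fixed number of reduce slots per TaskTracker;
       in MRv1 these cannot be reused for map tasks,
       even when they sit idle -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

This hard split between map and reduce capacity is exactly one of the limitations that YARN removes: instead of typed slots, YARN schedules generic resource containers that any task type can use.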