Posted in Tips, Uncategorized | Posted on 27-05-2015
I am excited to say that the blog post that I have co-authored, Avoiding The Mess In The Hadoop Cluster (Part 1), has been published by GetInData and the Apache Software Foundation.
In the first part of this blog series, we describe possible open-source solutions for data cataloguing, data discovery and process scheduling, such as Apache Hive, HCatalog and Apache Falcon.
If interested, please read more at Avoiding The Mess In The Hadoop Cluster (Part 1).
Recently, I have been refactoring one of our Hive scripts. Because I introduced significant changes, the query is resource-intensive (it processes almost two terabytes of data) and … I wanted to iterate fast, I decided to test it locally.
To make this easier, I implemented Beetest – a super simple utility that helps you test your Apache Hive scripts locally without any Java knowledge.
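Conceptually, such a local test boils down to running the query against a small fixture input and diffing its output against an expected "golden" file. Here is a minimal sketch of that idea – the file names and helper functions are hypothetical illustrations, not Beetest's actual interface:

```python
# Sketch of golden-file testing for a Hive query: run the script locally
# on small sample data, then compare its output rows with the expected rows.
# File names (query.hql, expected.tsv, actual.tsv) are made up for this example.
import subprocess


def normalize(rows):
    """Sort rows and strip trailing whitespace, so the comparison is
    insensitive to output ordering and formatting noise."""
    return sorted(line.rstrip() for line in rows if line.strip())


def outputs_match(expected_rows, actual_rows):
    """Return True when the query produced exactly the expected rows."""
    return normalize(expected_rows) == normalize(actual_rows)


def run_query_locally(script="query.hql"):
    """Run the Hive script in local mode (requires a local Hive installation)."""
    subprocess.run(
        ["hive", "--hiveconf", "mapreduce.framework.name=local", "-f", script],
        check=True,
    )


if __name__ == "__main__":
    run_query_locally()
    with open("expected.tsv") as exp, open("actual.tsv") as act:
        print("PASS" if outputs_match(exp, act) else "FAIL")
```

Because the comparison normalizes ordering, the test stays stable even when the number of local map tasks changes between runs.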
Posted in Presentations | Posted on 24-11-2013
I am very happy to present the slides from my presentation at Strata + Hadoop World 2013.
The presentation is titled “Hadoop adventures at Spotify” and I am simply talking about five real-world Hadoop issues that either broke our cluster at Spotify or made it very unstable. Each story comes from our JIRA dashboard and is based on facts! ;) To make it even more engaging, I am exposing real graphs, numbers, even our emails and conversations. For each story, I am sharing the mistakes that we made and describing the lessons that we learned.
This also includes a mistake that I made and do not like to talk about, but today I will share it as well ;)
Mysterious Mass Murder
This is one of the most bloodcurdling stories (and one of my favorites) that we have recently seen in our 190-square-meter Hadoopland. In a nutshell, some jobs were running surprisingly long because thousands of their tasks were constantly being killed for some unknown reason by someone (or something).
For example, a photo taken by our detectives shows a job running for 12 hrs 20 min that had spawned around 13,000 tasks up to that moment. However, only 4,118 map tasks had finished successfully, while 8,708 were killed (!) and … surprisingly, only 1 task failed (?) – obviously spreading panic in the Hadoopland.