A typical day of a data engineer at Spotify revolves around Hadoop and music. However, after some time of simultaneously developing MapReduce jobs, maintaining a large cluster, and listening to the perfect music for every moment, something surprising might happen…!
Well, after some time, a data engineer starts discovering Hadoop (and its related concepts) in the lyrics of many popular songs. How can Coldplay, Black Eyed Peas, Michael Jackson or Justin Timberlake sing about Hadoop?
Maybe it is some kind of illness? Definitely! A doctor could call it “inlusio elephans” ;)
Mysterious Mass Murder
This is one of the most bloodcurdling stories (and one of my favorites) that we have recently seen in our 190-square-meter Hadoopland. In a nutshell, some jobs were running surprisingly long because thousands of their tasks were constantly being killed, for unknown reasons, by someone (or something).
For example, a photo taken by our detectives shows a job that had been running for 12hrs:20min and had spawned around 13,000 tasks up to that moment. However, (only) 4,118 map tasks had finished successfully, while 8,708 were killed (!) and … surprisingly only 1 task failed (?), obviously spreading panic in the Hadoopland.
Posted in Presentations | Posted on 30-05-2013
These are the slides from my presentation about new features available in Apache Hadoop (YARN, NameNode HA, HDFS Federation), which I gave at the DataKRK meetup (Krakow, Poland) in April 2013.
I hope you will find them useful!
Posted in Troubleshooting | Posted on 26-05-2013
A couple of months ago, one of our data analysts kept running into trouble whenever he wanted to run more resource-intensive Hive queries. Surprisingly, his queries were valid and syntactically correct, and they ran successfully on small data, but they simply failed on larger datasets. On the other hand, other users were able to run the same queries successfully on the same large datasets. Obviously, it sounds like a permissions problem; however, the user had the right HDFS and Hive permissions.
A couple of weeks ago, we got a JIRA ticket complaining about the JobTracker being super slow (while it used to be super snappy most of the time). Obviously, in such a situation, developers and analysts are a bit annoyed, because it makes it difficult to submit MapReduce jobs and track their status (however, the side effect is having time for an unplanned coffee break, which should not be so bad ;)). Anyway, we are also a bit ashamed and sad, because we aim for a perfect Hadoop cluster and no unplanned coffee break interruptions.