Hakuna MapData! » 2013 » May

Slides: Apache Hadoop YARN, NameNode HA, HDFS Federation

| Posted in Presentations |


Here are the slides from my presentation about new features available in Apache Hadoop (YARN, NameNode HA, HDFS Federation), which I gave at the DataKRK meetup (Krakow, Poland) in April 2013.

I hope you will find them useful!

A user having surprising troubles running more resource-intensive Hive queries

| Posted in Troubleshooting |


The problem

A couple of months ago, one of our data analysts kept running into trouble when he wanted to run more resource-intensive Hive queries. Surprisingly, his queries were valid and syntactically correct, and they ran successfully on small data, but they simply failed on larger datasets. On the other hand, other users were able to run the same queries successfully on the same large datasets. Obviously, this sounds like some permissions problem; however, the user had the right HDFS and Hive permissions.
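One quick, user-specific check that goes beyond permissions is to compare the effective shell limits of the user whose queries fail with those of a user whose identical queries succeed, since limits such as the maximum number of processes or open files can make a job fail only at scale. Below is a minimal diagnostic sketch, assuming hypothetical user names and sudo access on a worker node; it illustrates one possible direction to investigate, not necessarily the actual root cause.

```python
#!/usr/bin/env python
# Hypothetical diagnostic sketch: compare the effective shell limits of a
# user whose Hive queries fail at scale with those of a user whose
# identical queries succeed. User names below are made up for illustration.
import subprocess

USERS = ["failing_analyst", "succeeding_analyst"]  # hypothetical names

def effective_limits(user):
    """Return the output of `ulimit -a` as seen by the given user."""
    return subprocess.check_output(
        ["sudo", "-u", user, "sh", "-c", "ulimit -a"])

for user in USERS:
    print("=== limits for %s ===" % user)
    print(effective_limits(user))
```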

JobTracker slowness, guesstimation and a data-driven answer

| Posted in Monitoring, Troubleshooting |


[Image: JobTracker slowness issue]

The problem

A couple of weeks ago, we got a JIRA ticket complaining that the JobTracker was super slow (while it used to be super snappy most of the time). Obviously, in such a situation developers and analysts are a bit annoyed, because it becomes difficult to submit MapReduce jobs and track their status (although a side effect is having time for an unplanned coffee break, which should not be so bad ;)). Anyway, we are also a bit ashamed and sad, because we aim for a perfect Hadoop cluster and no unplanned coffee-break interruptions.
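To replace guesstimation with data, a simple probe that periodically times requests to the JobTracker web UI turns "it feels slow" into a measured number that can be plotted over time. A minimal sketch, assuming a hypothetical JobTracker host name (port 50030 and jobtracker.jsp are the default web UI address in Hadoop 1.x):

```python
#!/usr/bin/env python
# A minimal latency probe: periodically time a request to the JobTracker
# web UI and log the response time, so slowness becomes a measured number
# instead of a guess. The host name below is a hypothetical example.
import time
import urllib2  # Python 2, contemporary with Hadoop 1.x / the JobTracker

JT_URL = "http://jobtracker.example.com:50030/jobtracker.jsp"  # hypothetical

while True:
    start = time.time()
    try:
        urllib2.urlopen(JT_URL, timeout=30).read()
        elapsed = time.time() - start
        print("%s jobtracker_ui_latency_seconds=%.2f" % (time.ctime(), elapsed))
    except Exception as e:
        print("%s request failed: %s" % (time.ctime(), e))
    time.sleep(60)  # one sample per minute, e.g. to feed into graphs
```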

Two memory-related issues on the Apache Hadoop cluster (memory swapping and the OOM killer)

| Posted in Monitoring, Troubleshooting |


In this blog post, I will describe two memory-related issues that we have recently experienced on our 190-node Apache Hadoop cluster at Spotify.

[Image: Hadoop unreachable nodes JIRA ticket]

We have noticed that some nodes were suddenly marked dead by both the NameNode and the JobTracker. Although we could ping them, we were unable to ssh into them, which often suggests a really heavy load on these machines. When looking at Ganglia graphs, we discovered that all the nodes marked dead had one issue in common – heavy swapping (in the case of Apache Hadoop, practice shows that heavy swapping of a JVM process usually means some kind of unresponsiveness or even the "death" of the process).

[Ganglia graph: servers swapping]
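Ganglia draws its swap graphs from the kernel's own counters, so the same signal can be sampled directly on a node. A minimal sketch that reads the pswpin/pswpout counters from /proc/vmstat; the 10-second sampling interval is an arbitrary choice:

```python
#!/usr/bin/env python
# Sample swap-in/swap-out page counters from /proc/vmstat, so that heavy
# swapping on a worker node shows up as a concrete number (Ganglia plots
# the same kernel counters as graphs).
import time

def swap_counters():
    """Return (pages swapped in, pages swapped out) since boot."""
    counters = {}
    with open("/proc/vmstat") as f:
        for line in f:
            key, value = line.split()
            counters[key] = int(value)
    return counters["pswpin"], counters["pswpout"]

in1, out1 = swap_counters()
time.sleep(10)  # arbitrary sampling interval
in2, out2 = swap_counters()
print("pages swapped in/out over 10s: %d / %d" % (in2 - in1, out2 - out1))
```

On Hadoop worker nodes, a common mitigation once heavy swapping is confirmed is to lower vm.swappiness, so that the kernel prefers dropping page cache over swapping out JVM heap pages.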