Hakuna MapData! » tasktracker

Be map slot or not to be: that is the question!

| Posted in Monitoring |


A couple of months ago, we got an email from Chris:

Hi!

The Hadoop cluster has been a bit slow the past few days and I noticed that the bottleneck seems to be coming from the map tasks. We have separate map and reduce task capacities and it continuously looks like the mapper slots are all taken while there’s a surplus of open reduce slots. Is there any reason that we can’t open any of the free reduce slots to map tasks?

Regards,
Chris
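To make the situation in Chris's email concrete, here is a minimal Python sketch, assuming the classic MRv1 model in which each TaskTracker has a fixed number of map slots and a fixed number of reduce slots (configured via `mapred.tasktracker.map.tasks.maximum` and `mapred.tasktracker.reduce.tasks.maximum`). The class name, the function name and all the numbers are hypothetical and only illustrate why free reduce slots do not help waiting map tasks.

```python
from dataclasses import dataclass


@dataclass
class SlotUsage:
    """Cluster-wide capacity and usage for one slot type (hypothetical helper)."""
    capacity: int
    occupied: int

    @property
    def free(self) -> int:
        return self.capacity - self.occupied

    @property
    def utilization(self) -> float:
        # Avoid division by zero on an empty cluster.
        return self.occupied / self.capacity if self.capacity else 0.0


def maps_that_must_wait(pending_maps: int, maps: SlotUsage, reduces: SlotUsage) -> int:
    """Number of pending map tasks that cannot start right now.

    Assumption: slots are typed, so free reduce slots are deliberately
    ignored here; only free map slots can take map tasks.
    """
    return max(0, pending_maps - maps.free)


if __name__ == "__main__":
    # Made-up numbers: every map slot is busy, most reduce slots are idle.
    maps = SlotUsage(capacity=100, occupied=100)
    reduces = SlotUsage(capacity=50, occupied=10)
    print(f"map utilization:    {maps.utilization:.0%}")
    print(f"reduce utilization: {reduces.utilization:.0%}")
    print(f"maps waiting:       {maps_that_must_wait(20, maps, reduces)}")
```

Under these assumptions, all 20 pending map tasks wait even though 40 reduce slots sit idle, which matches what Chris observed.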

Two memory-related issues on the Apache Hadoop cluster (memory swapping and the OOM killer)

| Posted in Monitoring, Troubleshooting |


In this blog post, I will describe two memory-related issues that we recently experienced on our 190-node Apache Hadoop cluster at Spotify.

Hadoop Unreachable Nodes Jira Ticket

We noticed that some nodes were suddenly marked dead by both the NameNode and the JobTracker. Although we could ping them, we were unable to ssh into them, which often suggests a really heavy load on these machines. Looking at the Ganglia graphs, we discovered that all the nodes marked dead had one issue in common: heavy swapping (with Apache Hadoop, experience shows that heavy swapping of a JVM process usually leads to some kind of unresponsiveness or even “death”).
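As a rough illustration of how swapping like this could be spotted on a single Linux node without Ganglia, here is a minimal Python sketch. It assumes two snapshots of `/proc/vmstat` taken some seconds apart and compares the `pswpin` and `pswpout` counters (pages swapped in and out since boot). The function names and the threshold of 1000 pages per second are hypothetical choices for the example, not values taken from our cluster.

```python
def parse_vmstat(text: str) -> dict:
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before: dict, after: dict, interval_s: float) -> tuple:
    """Return (pages swapped in per second, pages swapped out per second)."""
    pages_in = after.get("pswpin", 0) - before.get("pswpin", 0)
    pages_out = after.get("pswpout", 0) - before.get("pswpout", 0)
    return pages_in / interval_s, pages_out / interval_s


def is_swapping_heavily(before_text: str, after_text: str, interval_s: float,
                        threshold_pages_per_s: float = 1000.0) -> bool:
    """True if either swap rate exceeds the (hypothetical) threshold."""
    rate_in, rate_out = swap_rates(
        parse_vmstat(before_text), parse_vmstat(after_text), interval_s
    )
    return rate_in > threshold_pages_per_s or rate_out > threshold_pages_per_s


if __name__ == "__main__":
    import time

    interval = 5.0
    with open("/proc/vmstat") as f:
        first = f.read()
    time.sleep(interval)
    with open("/proc/vmstat") as f:
        second = f.read()
    print("heavy swapping:", is_swapping_heavily(first, second, interval))
```

A check like this only flags the symptom on one machine; it does not say which process is responsible for the swapping.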

Servers swapping