
JobTracker slowness, guesstimation and a data-driven answer

| Posted in Monitoring, Troubleshooting |

JobTracker slowness issue

The problem

A couple of weeks ago, we got a JIRA ticket complaining that the JobTracker was super slow (while it used to be super snappy most of the time). Obviously, in such a situation developers and analysts are a bit annoyed, because it becomes difficult to submit MapReduce jobs and track their status (although a side effect is time for an unplanned coffee break, which should not be so bad ;)). Anyway, we were also a bit ashamed and sad, because we aim for a perfect Hadoop cluster with no unplanned coffee-break interruptions.

The observations

After a quick look at the unresponsive JobTracker web interface (yes, it took a while!), we discovered that very large jobs were running on the cluster. Just the largest one at that time (a Hive query, by the way) ran 58,538 tasks and aimed to process 21.82 TB of data (1.5 years of data from one of our datasets). Apart from this one, there were a dozen or so jobs running tens of thousands of tasks, as well as tens of jobs running thousands of tasks. Most of them were ad-hoc Hive jobs – mostly because it is really easy to implement a large Hive job unintentionally, even if hive.mapred.mode is set to strict (we do not use the strict mode currently, though). Another (not so surprising) observation was that the very large Hive queries spawned many more map tasks (because they processed huge datasets) than reduce tasks (because the number of reducers is limited by hive.exec.reducers.max).
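This imbalance is easy to see with some back-of-the-envelope arithmetic. The sketch below is a simplification with assumed values (split size, bytes per reducer, reducer cap) rather than our production settings, and the real map count also depends on the input format and file layout, but it shows why a multi-terabyte query spawns tens of thousands of map tasks while its reducer count stays capped:

import math

# A rough, simplified estimate of a Hive job's task counts.
# The defaults below are assumptions for illustration, not our cluster settings.
def estimate_hive_tasks(input_bytes,
                        max_split_bytes=256 * 1024 ** 2,   # assumed mapred.max.split.size
                        bytes_per_reducer=10 ** 9,         # assumed hive.exec.reducers.bytes.per.reducer
                        reducers_max=999):                 # assumed hive.exec.reducers.max
    map_tasks = int(math.ceil(float(input_bytes) / max_split_bytes))
    reduce_tasks = min(int(math.ceil(float(input_bytes) / bytes_per_reducer)), reducers_max)
    return map_tasks, reduce_tasks

# ~21.82 TB of input, as in the largest job mentioned above
print(estimate_hive_tasks(21.82 * 1024 ** 4))  # tens of thousands of maps, only a few hundred reduces at most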

The dilemma

To attack this problem, we decided to limit the number of tasks per job using the mapred.jobtracker.maxtasks.per.job property. Unfortunately, there are no properties for limiting the number of map and reduce tasks separately (which would be very useful for us, since the JobTracker was overloaded mostly by map tasks). We had a small dilemma when we needed to choose what value to set mapred.jobtracker.maxtasks.per.job to, and it resulted in the following real-life dialogue:

Casper: What value should we use?
Gustav: Maybe 50K
Casper: Why not 30K?
Gustav: Because we might be running large production jobs and it is safer to start with a high value and eventually decrease it later.
Casper: Hmmm? Sounds good…! But maybe 40K, not 50K? ;)
Gustav: OK. Let’s meet in the middle…

The previous approach is a great example of “guesstimation” where negotiation skills usually win. The perfect song for such a moment could be one by The Black Eyed Peas ;)

The question

Yes, the dialogue above did happen, but quickly afterwards one more question was asked – should a real data-driven company make such a decision based on a guess, or on data? Actually, there can be only one answer – data!

The answer

To find data!, we looked at the JobTracker logs. Based on 521,313 jobs from the last 70 days, we found that our largest production job creates 22,609 tasks (and this number will keep increasing, because its input dataset grows each day). In consequence, the jobs that create more than 22,609 tasks are ad-hoc jobs (in our case, mostly Hive queries or streaming jobs implemented using the Python MapReduce module of Luigi, an open-sourced scheduler – please watch the video and browse the code). What is really interesting, these ad-hoc jobs usually fail or are killed for some reason (most often due to a parsing error, an out-of-memory exception, or a kill command invoked by a user). To give just one example – a job created 56,024 tasks, ran for 4h 24min and … was killed by a user (probably because he/she expected it to fail soon, or could not wait any longer and simply started the same job on a smaller input ;)). Now, with mapred.jobtracker.maxtasks.per.job set to 23,000 tasks, we will “completely prevent” such a heroic death ;) Just to give some more insight, our failed (or killed) jobs that are larger than 23,000 tasks fail (or are killed) 30 minutes after submission on average, so with mapred.jobtracker.maxtasks.per.job we save a lot of “cluster” time that would mostly be wasted otherwise.
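For the record, the analysis itself was nothing fancy. Below is a minimal sketch of it (not our exact script) – it assumes Hadoop 1.x job history files, where a job's summary lines carry JOBID, TOTAL_MAPS and TOTAL_REDUCES key-value pairs; adjust the regular expressions if your history format differs:

import re
import sys

# Minimal sketch: scan JobTracker job history files and print the jobs with
# the highest task counts. Assumes Hadoop 1.x history lines containing
# JOBID="...", TOTAL_MAPS="..." and TOTAL_REDUCES="..." key-value pairs.
JOBID_RE = re.compile(r'JOBID="([^"]+)"')
MAPS_RE = re.compile(r'TOTAL_MAPS="(\d+)"')
REDUCES_RE = re.compile(r'TOTAL_REDUCES="(\d+)"')

def max_tasks_per_job(history_files):
    tasks_per_job = {}
    for path in history_files:
        with open(path) as history:
            for line in history:
                job = JOBID_RE.search(line)
                maps = MAPS_RE.search(line)
                reduces = REDUCES_RE.search(line)
                if job and maps and reduces:
                    total = int(maps.group(1)) + int(reduces.group(1))
                    job_id = job.group(1)
                    tasks_per_job[job_id] = max(tasks_per_job.get(job_id, 0), total)
    return tasks_per_job

if __name__ == '__main__':
    # Usage: python job_sizes.py <job history files...>
    tasks_per_job = max_tasks_per_job(sys.argv[1:])
    for job_id, tasks in sorted(tasks_per_job.items(), key=lambda kv: kv[1], reverse=True)[:20]:
        print("%7d tasks  %s" % (tasks, job_id))

Sorting the jobs by task count (and filtering by user or job name) is then enough to tell regular production jobs from ad-hoc ones.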

With mapred.jobtracker.maxtasks.per.job set, one gets an exception on the JobTracker web interface when trying to submit a job larger than the limit.

Hadoop job_201305010824_140585 on jobtracker
 
User: kawaa
Job Name: select * from end_song_cleaned where dt...10(Stage-1)
Job File: hdfs://namenode:54310/user/kawaa/.staging/job_201305010824_140585/job.xml
Submit Host: gina
Submit Host Address: 10.255.3.21
Job-ACLs: All users are allowed
Job Setup: None
Status: Failed
Failure Info: Job initialization failed: java.io.IOException:
The number of tasks for this job 42987 exceeds the configured limit 23000
    at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:717)
    at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3750)
    at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Started at: Sun May 19 03:20:59 UTC 2013
Failed at: Sun May 19 03:21:00 UTC 2013
Failed in: 1sec
Job Cleanup: None

The possible future

Probably sooner rather than later, a very important and large-enough job will not be allowed to start due to such an exception. How will we respond to it? First, we will try to reduce the number of map and reduce tasks for this job by increasing the values of the mapred.max.split.size and hive.exec.reducers.bytes.per.reducer properties, respectively. If that does not help, we will probably consider increasing mapred.jobtracker.maxtasks.per.job (unfortunately, it would require restarting the JobTracker). Fortunately, we have seen no such email or JIRA ticket so far, which might mean that everybody has adapted to the limit and/or is able to successfully tune the number of tasks.
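The same back-of-the-envelope arithmetic as before shows how much headroom such tuning gives. The split sizes and bytes-per-reducer values below are assumptions chosen purely for illustration, not our production settings:

import math

# Illustration with assumed numbers: how raising mapred.max.split.size and
# hive.exec.reducers.bytes.per.reducer shrinks the task count of a ~21.82 TB job.
# (hive.exec.reducers.max would additionally cap the reducer count.)
input_bytes = 21.82 * 1024 ** 4

for split_mb, gb_per_reducer in [(256, 1), (512, 4)]:
    map_tasks = int(math.ceil(input_bytes / (split_mb * 1024 ** 2)))
    reduce_tasks = int(math.ceil(input_bytes / (gb_per_reducer * 1024 ** 3)))
    print("split=%d MB, %d GB/reducer -> ~%d map + %d reduce tasks"
          % (split_mb, gb_per_reducer, map_tasks, reduce_tasks))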
