Jason Lowe commented on YARN-2263:

1 is an appropriate lower bound since we don't ever want the maximum number of 
applications for a user to be zero or less.  (That would be a worthless queue 
since we could submit jobs to it but no jobs would activate.) 
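
To make the arithmetic concrete, here is a small standalone sketch.  It copies 
the formula from the issue description into a hypothetical class 
(MaxAppsDemo and withoutLowerBound are illustrative names, not part of YARN) 
and evaluates it for a few assumed input values:

```java
class MaxAppsDemo {
    // Same formula as quoted in the issue description, including the clamp to 1.
    static int computeMaxActiveApplicationsPerUser(
            int maxActiveApplications, int userLimit, float userLimitFactor) {
        return Math.max(
            (int) Math.ceil(
                maxActiveApplications * (userLimit / 100.0f) * userLimitFactor),
            1);
    }

    // Hypothetical variant without the lower bound, for comparison only.
    static int withoutLowerBound(
            int maxActiveApplications, int userLimit, float userLimitFactor) {
        return (int) Math.ceil(
            maxActiveApplications * (userLimit / 100.0f) * userLimitFactor);
    }

    public static void main(String[] args) {
        // Tiny queue: ceil(1 * 0.25 * 1.0) = 1, so one app per user may be active.
        System.out.println(computeMaxActiveApplicationsPerUser(1, 25, 1.0f));
        // Degenerate input: the raw product is 0, the clamp raises it to 1.
        System.out.println(computeMaxActiveApplicationsPerUser(0, 25, 1.0f));
        // Without the clamp the same input yields 0: jobs could be submitted
        // but none would ever activate.
        System.out.println(withoutLowerBound(0, 25, 1.0f));
        // Larger queue: ceil(40 * 0.25 * 1.0) = 10.
        System.out.println(computeMaxActiveApplicationsPerUser(40, 25, 1.0f));
    }
}
```

Because of the ceil, the clamp only matters when the raw product is zero or 
negative; any positive product already rounds up to at least 1.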

I'm assuming it only causes a deadlock in the case where the active job submits 
other jobs and waits for them to complete?  If it simply submits jobs and 
exits, then even if the queue is so tiny that only 1 active job per user is 
allowed, the jobs should eventually complete (assuming sufficient resources 
to launch an AM _and_ at least one task simultaneously if this is MapReduce).

If the concern is that the queue can be too small to allow running more than 
one application simultaneously for a user and some app frameworks might not 
like that, then yes, that could be an issue.  However, I'm not sure that is 
YARN's problem to solve.  I could have an application framework that for 
whatever reason requires 10 jobs to be running simultaneously to work.  There 
could definitely be a queue config that will not allow that to run properly 
because the queue is too small to support 10 simultaneous applications by a 
single user.  Should YARN handle this scenario?  If so, how would it detect it, 
and what should it do to mitigate it?  I would argue the same applies to the 
simpler job-launching-job-and-waiting scenario.  Some queues are going to be 
too small to support that.

Users can work around issues like this with smarter queue setups.  This is 
touched upon in MAPREDUCE-4304 and elsewhere for the Oozie case, which is a 
similar scenario.  We can set up one queue for the launcher jobs and a separate 
queue where the other jobs run.  That way we can't accidentally fill the 
cluster/queue with just launcher jobs and deadlock.
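
As a rough illustration of why the separate queue helps, here is a toy model.  
It is not YARN code: ToyQueue, runNested and the single-user, limit-of-1 setup 
are all assumptions made for the sketch.  It models a launcher that submits one 
child job and holds its slot until the child finishes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class QueueDeadlockDemo {
    /** Toy queue: at most maxActive apps of the (single) user are active at once. */
    static class ToyQueue {
        final int maxActive;
        int active = 0;
        final Deque<String> pending = new ArrayDeque<>();

        ToyQueue(int maxActive) { this.maxActive = maxActive; }

        /** Returns true if the app activates immediately, false if it is left pending. */
        boolean submit(String app) {
            if (active < maxActive) {
                active++;
                return true;
            }
            pending.add(app);
            return false;
        }

        void complete() { active--; }
    }

    /**
     * The launcher activates, submits a child, and waits for it.
     * Returns true if both finish, false if the launcher is stuck waiting.
     */
    static boolean runNested(ToyQueue launcherQueue, ToyQueue childQueue) {
        if (!launcherQueue.submit("launcher")) {
            return false;
        }
        if (!childQueue.submit("child")) {
            // In this toy model the child stays pending while the launcher
            // keeps its slot and waits, so neither makes progress.
            return false;
        }
        childQueue.complete();     // child runs and finishes
        launcherQueue.complete();  // launcher sees the child finish and exits
        return true;
    }

    public static void main(String[] args) {
        ToyQueue shared = new ToyQueue(1);
        // Same queue, limit 1: the launcher occupies the only slot.
        System.out.println(runNested(shared, shared));
        // Separate queues, limit 1 each: the child gets its own slot.
        System.out.println(runNested(new ToyQueue(1), new ToyQueue(1)));
    }
}
```

The model ignores resources, multiple users and multiple launchers.  It only 
shows that the launcher and its child do not compete for the same 
per-user slot when they are submitted to different queues.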

> CSQueueUtils.computeMaxActiveApplicationsPerUser may cause deadlock for 
> nested MapReduce jobs
> ---------------------------------------------------------------------------------------------
>                 Key: YARN-2263
>                 URL: https://issues.apache.org/jira/browse/YARN-2263
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 0.23.10, 2.4.1
>            Reporter: Chen He
> computeMaxActiveApplicationsPerUser() has a lower bound of 1. For a nested 
> MapReduce job which files new MapReduce jobs in its mapper/reducer, this will 
> cause the job to get stuck.
> public static int computeMaxActiveApplicationsPerUser(
>       int maxActiveApplications, int userLimit, float userLimitFactor) {
>     return Math.max(
>         (int)Math.ceil(
>             maxActiveApplications * (userLimit / 100.0f) * userLimitFactor),
>         1);
>   }
