Jian He created YARN-2456:

             Summary: Possible deadlock in CapacityScheduler when RM is recovering apps
                 Key: YARN-2456
                 URL: https://issues.apache.org/jira/browse/YARN-2456
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Jian He
            Assignee: Jian He

Consider this scenario:
1. The RM is configured with a single queue in which only one application can
be active at a time.
2. Submit App1, which uses up the queue's whole capacity.
3. Submit App2, which remains pending.
4. Restart the RM.
5. App2 is recovered before App1, so App2 is added to the activeApplications
list. App1 now remains pending because of the max-active-app limit.
6. When the NM registers, all of App1's containers are recovered and use up
the whole queue capacity again.
7. Since the queue is full, App2 cannot allocate its AM container.
8. Meanwhile, App1 cannot become active because of the max-active-app limit,
so neither application can make progress.
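
The scenario above can be reproduced with a small toy model. This is only a sketch for illustration: ToyQueue, its fields, and its methods are hypothetical and are not the real CapacityScheduler or LeafQueue code, and the capacity of 10 units and AM size of 1 unit are made-up numbers. It assumes, as the steps describe, that recovered containers are charged to the queue even when the owning app is not active.

```java
import java.util.ArrayList;
import java.util.List;

public class RecoveryDeadlockSketch {

    // Hypothetical stand-in for a single queue; not the real LeafQueue.
    static final class ToyQueue {
        final int capacity;       // total resource units in the queue
        final int maxActiveApps;  // max-active-app limit
        int used = 0;             // resource units currently charged
        final List<String> active = new ArrayList<>();
        final List<String> pending = new ArrayList<>();

        ToyQueue(int capacity, int maxActiveApps) {
            this.capacity = capacity;
            this.maxActiveApps = maxActiveApps;
        }

        // An app becomes active only while the active-app limit allows it.
        void addApp(String app) {
            if (active.size() < maxActiveApps) {
                active.add(app);
            } else {
                pending.add(app);
            }
        }

        // Assumption: recovered containers count against the queue
        // whether or not the owning app is active.
        void recoverContainers(int units) {
            used += units;
        }

        // An active app gets its AM container only if capacity is free.
        boolean canAllocateAm(String app, int amUnits) {
            return active.contains(app) && used + amUnits <= capacity;
        }

        // A pending app is activated only if a slot is free.
        boolean canActivate(String app) {
            return pending.contains(app) && active.size() < maxActiveApps;
        }
    }

    // Replays steps 5 to 8 and reports whether both apps are blocked.
    static boolean isStuck() {
        ToyQueue queue = new ToyQueue(10, 1);
        queue.addApp("App2");         // step 5: App2 recovered first, becomes active
        queue.addApp("App1");         // step 5: App1 stays pending
        queue.recoverContainers(10);  // step 6: App1's containers fill the queue
        boolean app2CanStart = queue.canAllocateAm("App2", 1);  // step 7
        boolean app1CanActivate = queue.canActivate("App1");    // step 8
        return !app2CanStart && !app1CanActivate;
    }

    public static void main(String[] args) {
        System.out.println("stuck = " + isStuck());
    }
}
```

In this model the state is stuck because App2 holds the only active slot without holding any capacity, while App1 holds all the capacity without holding an active slot.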

