[ 
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072414#comment-15072414
 ] 

Jian He commented on YARN-4138:
-------------------------------

I think it is true that this can lead to a deadlock, since the two paths below 
take the same locks in opposite order.
- CapacityScheduler#allocateContainersToNode will grab scheduler lock and then 
SchedulerApp's lock at LeafQueue#assignContainers.
- CapacityScheduler#rollbackContainerResource first acquires SchedulerApp's 
lock and then scheduler lock.  
-- The same ordering also occurs when the AM calls CapacityScheduler#allocate 
to decrease a container. This was introduced in YARN-1651. I had an earlier 
[comment|https://issues.apache.org/jira/browse/YARN-1651?focusedCommentId=14738568&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14738568]
  noting that every AM allocate call will hold the scheduler's and the queue's 
locks, which is too expensive, but I missed that this may also lead to deadlock. 
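
The lock-order inversion described above can be reproduced in isolation. The 
following is only a minimal sketch, not CapacityScheduler code: the class 
LockOrderSketch, the two ReentrantLock fields, and the helper methods are 
hypothetical stand-ins for the scheduler lock and the SchedulerApp lock. It 
uses tryLock with a timeout instead of a plain lock so that the demonstration 
terminates rather than hanging.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

final class LockOrderSketch {
    // Hypothetical stand-ins for the scheduler lock and the SchedulerApp lock.
    static final ReentrantLock schedulerLock = new ReentrantLock();
    static final ReentrantLock appLock = new ReentrantLock();

    /** Takes first, waits until the peer holds its own first lock, then tries second. */
    static Thread worker(ReentrantLock first, ReentrantLock second,
                         CyclicBarrier barrier, AtomicInteger blocked) {
        return new Thread(() -> {
            first.lock();
            try {
                barrier.await();               // both threads now hold their first lock
                if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                    second.unlock();
                } else {
                    blocked.incrementAndGet(); // a plain lock() would wait forever here
                }
                barrier.await();               // keep first held until the peer has tried too
            } catch (Exception e) {
                Thread.currentThread().interrupt();
            } finally {
                first.unlock();
            }
        });
    }

    /** Returns how many of the two threads could not obtain their second lock. */
    static int runOpposingOrders() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        AtomicInteger blocked = new AtomicInteger();
        // Mirrors the allocate path: scheduler lock, then app lock.
        Thread allocate = worker(schedulerLock, appLock, barrier, blocked);
        // Mirrors the rollback path: app lock, then scheduler lock.
        Thread rollback = worker(appLock, schedulerLock, barrier, blocked);
        allocate.start();
        rollback.start();
        try {
            allocate.join();
            rollback.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked.get();
    }

    public static void main(String[] args) {
        System.out.println("threads blocked on second lock: " + runOpposingOrders());
    }
}
```

In this sketch both threads fail to obtain their second lock, which is the 
situation where untimed locking would block both forever. Acquiring the two 
locks in the same order on every path removes the cycle.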

> Roll back container resource allocation after resource increase token expires
> -----------------------------------------------------------------------------
>
>                 Key: YARN-4138
>                 URL: https://issues.apache.org/jira/browse/YARN-4138
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, nodemanager, resourcemanager
>            Reporter: MENG DING
>            Assignee: MENG DING
>         Attachments: YARN-4138-YARN-1197.1.patch, 
> YARN-4138-YARN-1197.2.patch, YARN-4138.3.patch
>
>
> In YARN-1651, after container resource increase token expires, the running 
> container is killed.
> This ticket will change the behavior such that when a container resource 
> increase token expires, the resource allocation of the container will be 
> reverted back to the value before the increase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
