[ https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072414#comment-15072414 ]
Jian He commented on YARN-4138:
-------------------------------

I think it may be true that this will lead to deadlock.
- CapacityScheduler#allocateContainersToNode grabs the scheduler lock and then the SchedulerApp's lock at LeafQueue#assignContainers.
- CapacityScheduler#rollbackContainerResource first acquires the SchedulerApp's lock and then the scheduler lock.
-- This will also happen when the AM calls CapacityScheduler#allocate to decrease the container.

This was introduced in YARN-1651. I had a [comment|https://issues.apache.org/jira/browse/YARN-1651?focusedCommentId=14738568&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14738568] earlier that every AM allocate call will hold the scheduler's and the queue's locks, which is too expensive, but I missed that this may also lead to deadlock.

> Roll back container resource allocation after resource increase token expires
> -----------------------------------------------------------------------------
>
>                 Key: YARN-4138
>                 URL: https://issues.apache.org/jira/browse/YARN-4138
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, nodemanager, resourcemanager
>            Reporter: MENG DING
>            Assignee: MENG DING
>         Attachments: YARN-4138-YARN-1197.1.patch, YARN-4138-YARN-1197.2.patch, YARN-4138.3.patch
>
>
> In YARN-1651, after a container resource increase token expires, the running container is killed.
> This ticket will change the behavior such that when a container resource increase token expires, the resource allocation of the container will be reverted to the value before the increase.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
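
The lock-order inversion described in the comment above (scheduler lock then app lock on one path, app lock then scheduler lock on the other) can be illustrated with a minimal standalone sketch. Everything here is hypothetical: LockOrderSketch, lockInOrder, and wouldDeadlock are illustrative names, the two ReentrantLock objects only stand in for the scheduler's and the SchedulerApp's locks, and tryLock with a timeout is used purely so the demonstration terminates instead of hanging. It is not the CapacityScheduler code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the two code paths; not the real YARN classes.
class LockOrderSketch {
    static final long TIMEOUT_MS = 300;

    // Takes `first`, waits at the rendezvous, then tries `second` with a timeout.
    // Sets `stuck` if the second lock could not be acquired in time.
    static void lockInOrder(ReentrantLock first, ReentrantLock second,
                            CountDownLatch rendezvous, AtomicBoolean stuck) {
        first.lock();
        try {
            rendezvous.countDown();
            rendezvous.await(); // in the inverted case: both threads now hold their first lock
            if (second.tryLock(TIMEOUT_MS, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                stuck.set(true); // with plain blocking locks this thread would wait forever
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    // Runs an "allocate" path and a "rollback" path concurrently and reports
    // whether either one failed to obtain its second lock.
    static boolean wouldDeadlock(boolean invertedOrder) {
        ReentrantLock schedulerLock = new ReentrantLock();
        ReentrantLock appLock = new ReentrantLock();
        // A latch of 0 is already open, so the consistent case needs no rendezvous.
        CountDownLatch rendezvous = new CountDownLatch(invertedOrder ? 2 : 0);
        AtomicBoolean stuck = new AtomicBoolean(false);

        // Path 1: scheduler lock, then app lock.
        Thread allocate = new Thread(
            () -> lockInOrder(schedulerLock, appLock, rendezvous, stuck));
        // Path 2: app lock then scheduler lock (inverted), or the same order as path 1.
        Thread rollback = invertedOrder
            ? new Thread(() -> lockInOrder(appLock, schedulerLock, rendezvous, stuck))
            : new Thread(() -> lockInOrder(schedulerLock, appLock, rendezvous, stuck));

        allocate.start();
        rollback.start();
        try {
            allocate.join();
            rollback.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stuck.get();
    }

    public static void main(String[] args) {
        System.out.println("inverted order stuck: " + wouldDeadlock(true));
        System.out.println("consistent order stuck: " + wouldDeadlock(false));
    }
}
```

With the inverted order, at least one thread always times out on its second lock, because each thread releases its first lock only after its own attempt has finished. When both paths take the scheduler lock first, neither thread gets stuck, which is the usual motivation for enforcing a single lock order across all paths.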