[ https://issues.apache.org/jira/browse/YARN-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116648#comment-15116648 ]
Wangda Tan commented on YARN-4519:
----------------------------------

Thanks [~mding] for working on this patch.

IIUC, after this patch the increase/decrease container logic needs to acquire the LeafQueue's lock. Since container allocation/release acquires the LeafQueue's lock too, race conditions on containers/resources will be avoided.

One question not related to the patch: it looks safe to remove the synchronized lock on CS#completedContainerInternal, correct?

> potential deadlock of CapacityScheduler between decrease container and assign containers
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-4519
>                 URL: https://issues.apache.org/jira/browse/YARN-4519
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: sandflee
>            Assignee: MENG DING
>         Attachments: YARN-4519.1.patch, YARN-4519.2.patch, YARN-4519.3.patch
>
>
> In CapacityScheduler.allocate(), the FiCaSchedulerApp sync lock is acquired first, and
> the CapacityScheduler's sync lock may then be acquired in decreaseContainer().
> In the scheduler thread, the CapacityScheduler's sync lock is acquired first in
> allocateContainersToNode(), and the FiCaSchedulerApp sync lock may then be acquired in
> FiCaSchedulerApp.assignContainers().
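
The lock-order inversion described in the issue (one thread taking the app lock and then the scheduler lock, the other taking them in the opposite order) is usually avoided by making every code path acquire the locks in the same order. The Java sketch below illustrates that general idea only; it is not the YARN-4519 patch. The class `LockOrderSketch`, the locks `queueLock` and `appLock`, and the method `updateResources()` are hypothetical stand-ins for the LeafQueue and FiCaSchedulerApp locks, not actual Hadoop code.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {
    // Hypothetical stand-ins for the LeafQueue and FiCaSchedulerApp locks.
    private static final ReentrantLock queueLock = new ReentrantLock();
    private static final ReentrantLock appLock = new ReentrantLock();

    // Shared state guarded by both locks (an assumption for illustration).
    private static int resourceUpdates = 0;

    // Every caller takes queueLock first and appLock second, so no thread
    // can hold appLock while waiting for queueLock.
    private static void updateResources() {
        queueLock.lock();
        try {
            appLock.lock();
            try {
                resourceUpdates++;
            } finally {
                appLock.unlock();
            }
        } finally {
            queueLock.unlock();
        }
    }

    // Runs an "allocate" thread and a "scheduler" thread concurrently and
    // returns the number of updates performed once both have finished.
    public static int run(int iterations) {
        resourceUpdates = 0;
        Runnable work = () -> {
            for (int i = 0; i < iterations; i++) {
                updateResources();
            }
        };
        Thread allocateThread = new Thread(work, "allocate");
        Thread schedulerThread = new Thread(work, "scheduler");
        allocateThread.start();
        schedulerThread.start();
        try {
            allocateThread.join();
            schedulerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return resourceUpdates;
    }

    public static void main(String[] args) {
        System.out.println(run(1000));
    }
}
```

Because both threads use the same acquisition order, neither can hold one lock while waiting for the other in reverse, which is the condition the issue description identifies as the source of the deadlock.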