MENG DING commented on YARN-4519:

Hi, [~leftnoteasy]

bq. IIUC, after this patch, the increase/decrease container logic needs to acquire 
LeafQueue's lock. Since container allocation/release acquires LeafQueue's lock 
too, race conditions on containers/resources will be avoided.
Yes, exactly.

bq. One question not related to the patch: it looks safe to remove the synchronized 
lock from CS#completedContainerInternal, correct?
I think we don't need to synchronize the entire function on the CapacityScheduler 
lock, only the part that updates {{schedulerHealth}}. If you think this is worth 
fixing, I will log a separate ticket.
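
To illustrate what I mean by narrowing the lock, here is a minimal sketch. It is not the actual CapacityScheduler code: the class {{NarrowLockSketch}}, the {{queueLock}} object, and the counters are hypothetical stand-ins, with {{releasedContainers}} playing the role of {{schedulerHealth}}.

```java
// Hypothetical stand-in for the scheduler; NOT the real CapacityScheduler.
class NarrowLockSketch {
    // Stand-in for schedulerHealth: the only state guarded by the scheduler lock.
    private long releasedContainers = 0;

    // Hypothetical stand-in for the queue, which guards its own state.
    private final Object queueLock = new Object();
    private int usedResources = 10;

    // The method itself is not synchronized. Each step takes only the lock
    // that guards the state it touches, and the two locks are never nested.
    void completedContainerInternal(int released) {
        synchronized (queueLock) {
            usedResources -= released;   // queue bookkeeping under the queue lock
        }
        synchronized (this) {
            releasedContainers++;        // health update under the scheduler lock
        }
    }

    int getUsedResources() {
        synchronized (queueLock) {
            return usedResources;
        }
    }

    synchronized long getReleasedContainers() {
        return releasedContainers;
    }

    public static void main(String[] args) {
        NarrowLockSketch s = new NarrowLockSketch();
        s.completedContainerInternal(3);
        System.out.println(s.getUsedResources() + " " + s.getReleasedContainers());
    }
}
```

Because the two synchronized blocks are sequential rather than nested, this path never holds the scheduler lock while waiting for another lock.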

> potential deadlock of CapacityScheduler between decrease container and assign 
> containers
> ----------------------------------------------------------------------------------------
>                 Key: YARN-4519
>                 URL: https://issues.apache.org/jira/browse/YARN-4519
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: sandflee
>            Assignee: MENG DING
>         Attachments: YARN-4519.1.patch, YARN-4519.2.patch, YARN-4519.3.patch
> In CapacityScheduler.allocate(), the thread first takes the FiCaSchedulerApp sync 
> lock, and may then take the CapacityScheduler's sync lock in decreaseContainer().
> In the scheduler thread, it first takes the CapacityScheduler's sync lock in 
> allocateContainersToNode(), and may then take the FiCaSchedulerApp sync lock in 
> FiCaSchedulerApp.assignContainers(). 
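
The description above is a classic lock-order inversion: one thread takes the app lock and then the scheduler lock, while the other takes them in the opposite order. The sketch below shows one generic way to avoid it, by acquiring both locks in the same order on both paths. It is an illustration only and not what the attached patches do; {{LockOrderSketch}}, its two {{ReentrantLock}} fields, and {{runBoth}} are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of consistent lock ordering; NOT the YARN code.
class LockOrderSketch {
    private final ReentrantLock schedulerLock = new ReentrantLock(); // stand-in for the CapacityScheduler lock
    private final ReentrantLock appLock = new ReentrantLock();       // stand-in for the FiCaSchedulerApp lock
    private int decreases = 0;
    private int assignments = 0;

    // The deadlock-prone version of this path would take appLock first.
    // Here both paths take schedulerLock first, then appLock.
    void decreaseContainer() {
        schedulerLock.lock();
        try {
            appLock.lock();
            try {
                decreases++;
            } finally {
                appLock.unlock();
            }
        } finally {
            schedulerLock.unlock();
        }
    }

    void assignContainers() {
        schedulerLock.lock();
        try {
            appLock.lock();
            try {
                assignments++;
            } finally {
                appLock.unlock();
            }
        } finally {
            schedulerLock.unlock();
        }
    }

    // Runs both paths concurrently and returns {decreases, assignments}.
    static int[] runBoth(int iterations) {
        LockOrderSketch s = new LockOrderSketch();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.decreaseContainer();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.assignContainers();
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return new int[] { s.decreases, s.assignments };
    }

    public static void main(String[] args) {
        int[] r = runBoth(1000);
        System.out.println(r[0] + " " + r[1]);
    }
}
```

With a single global acquisition order, neither thread can hold one lock while waiting for a lock the other thread already holds, so the cycle needed for a deadlock cannot form.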

This message was sent by Atlassian JIRA
