[ https://issues.apache.org/jira/browse/YARN-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
MENG DING updated YARN-4671:
----------------------------
    Attachment: YARN-4671.1.patch

Attaching the initial patch for review.

* I need to make {{lastNodeUpdateTime}} volatile; otherwise findbugs complains that access to {{lastNodeUpdateTime}} is not always synchronized (a result of removing the CS lock from {{completedContainerInternal}}).
* Updated the test case {{testAllocateDoesNotBlockOnSchedulerLock}}. The test sequence is:
** Submit an application, and wait for the AM to be launched
** AM registers with RM
** AM allocates a new container
** Wait until the container is acquired and launched
** Grab the CS scheduler lock from another thread
** AM allocates with a release request
** Without this fix, the allocate call would block at {{CapacityScheduler.completedContainerInternal}}. With this fix, the allocate call does not block.

> There is no need to acquire CS lock when completing a container
> ---------------------------------------------------------------
>
>                 Key: YARN-4671
>                 URL: https://issues.apache.org/jira/browse/YARN-4671
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: MENG DING
>            Assignee: MENG DING
>         Attachments: YARN-4671.1.patch
>
>
> In YARN-4519, we discovered that there is no need to acquire the CS lock in
> CS#completedContainerInternal, because:
> * Access to the critical section is already guarded by the queue lock.
> * It is not essential to guard {{schedulerHealth}} with the CS lock in
> completedContainerInternal. All maps in schedulerHealth are concurrent maps.
> Even if schedulerHealth is not consistent at a given moment, it will be
> eventually consistent.
> With this fix, we can truly claim that CS#allocate doesn't require the CS lock.
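
For anyone who wants to see the locking argument in isolation, here is a minimal, self-contained sketch of the idea behind the test sequence above. It is not the CapacityScheduler code or the patch itself: {{SchedulerLockSketch}}, {{schedulerLock}}, {{queueLock}} and {{completeContainer}} are hypothetical stand-ins chosen for illustration. It assumes the completion path only needs the queue lock, and shows that such a call can finish while another thread holds the scheduler-wide lock. The {{volatile}} field mirrors the reasoning for {{lastNodeUpdateTime}}: once a field is no longer always accessed under one lock, it needs another visibility guarantee.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class SchedulerLockSketch {
    // Hypothetical stand-ins for the scheduler-wide lock and a per-queue lock.
    private final ReentrantLock schedulerLock = new ReentrantLock();
    private final ReentrantLock queueLock = new ReentrantLock();

    // Accessed without a common lock, so volatile provides visibility across threads.
    private volatile long lastNodeUpdateTime;

    // Guarded by queueLock only.
    private int completedContainers;

    void nodeUpdate() {
        lastNodeUpdateTime = System.nanoTime();
    }

    long getLastNodeUpdateTime() {
        return lastNodeUpdateTime;
    }

    /** Completes a container while holding only the queue lock. */
    void completeContainer() {
        queueLock.lock();
        try {
            completedContainers++;
        } finally {
            queueLock.unlock();
        }
    }

    /** Returns true if completeContainer() finishes while another thread holds schedulerLock. */
    static boolean completesWhileSchedulerLockHeld() throws InterruptedException {
        SchedulerLockSketch s = new SchedulerLockSketch();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Another thread grabs the scheduler lock and keeps it until told to release.
        Thread holder = new Thread(() -> {
            s.schedulerLock.lock();
            try {
                lockHeld.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                s.schedulerLock.unlock();
            }
        });
        holder.start();
        lockHeld.await();

        // The "release request" path: it must not wait on the scheduler lock.
        CountDownLatch done = new CountDownLatch(1);
        Thread caller = new Thread(() -> {
            s.completeContainer();
            done.countDown();
        });
        caller.start();
        boolean finished = done.await(5, TimeUnit.SECONDS);

        release.countDown();
        holder.join();
        caller.join();
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed without blocking: " + completesWhileSchedulerLockHeld());
    }
}
```

If {{completeContainer}} were changed to take {{schedulerLock}} first, the same check would time out, which corresponds to the "without this fix" case in the test sequence.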