[ https://issues.apache.org/jira/browse/YARN-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsuyoshi OZAWA updated YARN-2945:
---------------------------------
    Description:
After YARN-2910, assignContainer holds the WriteLock while sorting and the ReadLock while referencing runnableApps. This can cause interrupted assignment of containers regardless of the policy.

{code}
    writeLock.lock();
    try {
      Collections.sort(runnableApps, comparator);
    } finally {
      writeLock.unlock();
    }
    readLock.lock();
    try {
      for (FSAppAttempt sched : runnableApps) {
        if (SchedulerAppUtils.isBlacklisted(sched, node, LOG)) {
          continue;
        }
        assigned = sched.assignContainer(node);
        if (!assigned.equals(Resources.none())) {
          break;
        }
      }
    } finally {
      readLock.unlock();
    }
{code}

  was:
After YARN-2910, assignContainer holds the WriteLock while sorting and the ReadLock while referencing runnableApps. This can cause interrupted assignment of containers regardless of the result of policy.

{code}
    writeLock.lock();
    try {
      Collections.sort(runnableApps, comparator);
    } finally {
      writeLock.unlock();
    }
    readLock.lock();
    try {
      for (FSAppAttempt sched : runnableApps) {
        if (SchedulerAppUtils.isBlacklisted(sched, node, LOG)) {
          continue;
        }
        assigned = sched.assignContainer(node);
        if (!assigned.equals(Resources.none())) {
          break;
        }
      }
    } finally {
      readLock.unlock();
    }
{code}


> FSLeafQueue should hold lock before and after sorting runnableApps in assignContainer
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-2945
>                 URL: https://issues.apache.org/jira/browse/YARN-2945
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2945.1.patch
>
>
> After YARN-2910, assignContainer holds the WriteLock while sorting and the ReadLock
> while referencing runnableApps. This can cause interrupted assignment of
> containers regardless of the policy.
> {code}
>     writeLock.lock();
>     try {
>       Collections.sort(runnableApps, comparator);
>     } finally {
>       writeLock.unlock();
>     }
>     readLock.lock();
>     try {
>       for (FSAppAttempt sched : runnableApps) {
>         if (SchedulerAppUtils.isBlacklisted(sched, node, LOG)) {
>           continue;
>         }
>         assigned = sched.assignContainer(node);
>         if (!assigned.equals(Resources.none())) {
>           break;
>         }
>       }
>     } finally {
>       readLock.unlock();
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
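
The gap the description points at is the window between writeLock.unlock() and readLock.lock(): another thread can modify runnableApps there, so the loop may not see the order that was just sorted. A minimal sketch of one way to close that window is to keep a single lock held across both the sort and the scan. This is an illustration only and not necessarily what YARN-2945.1.patch does. LeafQueueSketch, addApp and canAssign are hypothetical names, and apps are plain strings rather than FSAppAttempt.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

/** Hypothetical stand-in for FSLeafQueue; not the real Hadoop class. */
class LeafQueueSketch {
  private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
  private final Lock writeLock = rwl.writeLock();
  private final List<String> runnableApps = new ArrayList<>();

  /** Adds an app under the write lock (hypothetical helper). */
  void addApp(String app) {
    writeLock.lock();
    try {
      runnableApps.add(app);
    } finally {
      writeLock.unlock();
    }
  }

  /**
   * Sorts and scans under one continuous write-lock hold, so no other
   * thread can change runnableApps between the sort and the iteration.
   * Returns the first app accepted by canAssign, or null if none is.
   */
  String assignContainer(Comparator<String> comparator,
                         Predicate<String> canAssign) {
    writeLock.lock();
    try {
      Collections.sort(runnableApps, comparator);
      for (String app : runnableApps) {
        if (canAssign.test(app)) {
          return app;
        }
      }
      return null;
    } finally {
      writeLock.unlock();
    }
  }
}
```

Holding the write lock for the whole scan blocks concurrent readers. ReentrantReadWriteLock also permits downgrading (acquire the read lock before releasing the write lock), which would keep the list stable while still letting other readers in during the loop.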