[ https://issues.apache.org/jira/browse/YARN-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei Yan updated YARN-2526:
--------------------------
    Attachment: YARN-2526-1.patch

> Scheduler Load Simulator may enter deadlock if lots of applications submitted
> to the RM at the same time
> --------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2526
>                 URL: https://issues.apache.org/jira/browse/YARN-2526
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>            Priority: Minor
>         Attachments: YARN-2526-1.patch
>
>
> The simulation may deadlock if the application simulators occupy every
> thread in the thread pool and all of them are waiting for AM container
> allocation. In that case, the AM simulators wait for the NM simulators to
> heartbeat so that resources can be allocated, while the NM simulators wait
> for the AM simulators to release some threads. Neither side can make
> progress, so the simulator is deadlocked.
> To resolve this deadlock, we need to remove the while() loop in
> MRAMSimulator.
> {code}
>     // waiting until the AM container is allocated
>     while (true) {
>       if (response != null && !response.getAllocatedContainers().isEmpty()) {
>         // get AM container
>         .....
>         break;
>       }
>       // this sleep time is different from HeartBeat
>       Thread.sleep(1000);
>       // send out empty request
>       sendContainerRequest();
>       response = responseQueue.take();
>     }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
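The idea behind removing the while() loop can be illustrated with a small, self-contained sketch. It is not the content of YARN-2526-1.patch and does not use the real simulator classes; NonBlockingAmSketch, AmTask, freeContainers and simulate() are hypothetical names chosen for illustration. Each AM task performs a single allocation check per run and, if no container is available yet, reschedules itself instead of sleeping inside a pool thread. The thread goes back to the pool, so the NM heartbeat task can run even when there are more applications than threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class NonBlockingAmSketch {

  /** Hypothetical stand-in for an AM simulator; not the real MRAMSimulator. */
  static final class AmTask implements Runnable {
    private final ScheduledExecutorService pool;
    private final AtomicInteger freeContainers;
    private final CountDownLatch allocated;

    AmTask(ScheduledExecutorService pool, AtomicInteger freeContainers,
           CountDownLatch allocated) {
      this.pool = pool;
      this.freeContainers = freeContainers;
      this.allocated = allocated;
    }

    @Override
    public void run() {
      // One step only: atomically claim a container if one is free.
      int before = freeContainers.getAndUpdate(n -> n > 0 ? n - 1 : 0);
      if (before > 0) {
        allocated.countDown(); // AM container obtained, this task is done
        return;
      }
      // Nothing allocated yet: release the thread and retry later,
      // rather than looping with Thread.sleep() inside the pool thread.
      pool.schedule(this, 10, TimeUnit.MILLISECONDS);
    }
  }

  /** Returns true if every application got its AM container before the timeout. */
  static boolean simulate(int threads, int apps) {
    ScheduledExecutorService pool = Executors.newScheduledThreadPool(threads);
    AtomicInteger freeContainers = new AtomicInteger(0);
    CountDownLatch allocated = new CountDownLatch(apps);
    try {
      for (int i = 0; i < apps; i++) {
        pool.execute(new AmTask(pool, freeContainers, allocated));
      }
      // Stand-in for an NM heartbeat, submitted last: it frees one container
      // per tick and can only run if the AM tasks give threads back.
      pool.scheduleAtFixedRate(freeContainers::incrementAndGet,
          0, 5, TimeUnit.MILLISECONDS);
      return allocated.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // More applications (8) than pool threads (2), the situation in the report.
    System.out.println("all AMs allocated: " + simulate(2, 8));
  }
}
```

With a blocking loop in run(), the two pool threads would be held by the first two AM tasks and the heartbeat would never be scheduled. Whether the actual patch reschedules the AM step this way or folds the check into the existing heartbeat path is not stated in this message.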