Wei Yan created YARN-2526:
-----------------------------
Summary: Scheduler Load Simulator may enter deadlock if lots of
applications are submitted to the RM at the same time
Key: YARN-2526
URL: https://issues.apache.org/jira/browse/YARN-2526
Project: Hadoop YARN
Issue Type: Bug
Reporter: Wei Yan
Assignee: Wei Yan
The simulation can deadlock when the application simulators occupy every
thread in the thread pool and each one is waiting for its AM container to be
allocated. In that case, all AM simulators wait for the NM simulators to send
heartbeats so that resources can be allocated, while all NM simulators wait for
the AM simulators to release a thread. Neither side can make progress, so the
simulator is deadlocked.
To resolve this deadlock, the while() loop in MRAMSimulator needs to be removed:
{code}
// waiting until the AM container is allocated
while (true) {
  if (response != null && !response.getAllocatedContainers().isEmpty()) {
    // get AM container
    .....
    break;
  }
  // this sleep time is different from HeartBeat
  Thread.sleep(1000);
  // send out empty request
  sendContainerRequest();
  response = responseQueue.take();
}
{code}
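One possible shape for the fix, as a rough sketch only and not the actual patch: each AM simulator checks for its container once per scheduled step and then returns its thread to the pool, so the NM heartbeat task always gets a chance to run. All names below (NonBlockingAmSketch, simulate, tryClaim, freeContainers) are hypothetical stand-ins and are not taken from the SLS code base; a shared counter plays the role of the containers granted through NM heartbeats.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: AM tasks poll once per step instead of looping
// inside a pool thread until the AM container arrives.
class NonBlockingAmSketch {

  // Atomically take one container if any is available.
  static boolean tryClaim(AtomicInteger freeContainers) {
    int before = freeContainers.getAndUpdate(n -> n > 0 ? n - 1 : 0);
    return before > 0;
  }

  // Runs amCount AM tasks plus one NM heartbeat task on a pool of
  // poolThreads threads. Returns true if every AM obtained its container
  // within five seconds.
  static boolean simulate(int poolThreads, int amCount) {
    ScheduledExecutorService pool =
        Executors.newScheduledThreadPool(poolThreads);
    AtomicInteger freeContainers = new AtomicInteger(0);
    CountDownLatch allAllocated = new CountDownLatch(amCount);
    try {
      for (int i = 0; i < amCount; i++) {
        AtomicBoolean allocated = new AtomicBoolean(false);
        // AM step: check once, then give the thread back to the pool.
        pool.scheduleWithFixedDelay(() -> {
          if (allocated.get()) {
            return; // AM container already obtained, nothing to do
          }
          if (tryClaim(freeContainers)) {
            allocated.set(true);
            allAllocated.countDown();
          }
          // otherwise: retry on the next scheduled step
        }, 0, 10, TimeUnit.MILLISECONDS);
      }
      // NM heartbeat: makes one more container available per beat.
      pool.scheduleWithFixedDelay(freeContainers::incrementAndGet,
          0, 10, TimeUnit.MILLISECONDS);
      return allAllocated.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // More AM tasks than pool threads: the case described in this issue.
    System.out.println("all AMs allocated: " + simulate(2, 4));
  }
}
```

With two pool threads and four AM tasks, a blocking loop like the one quoted above would let two AMs hold both threads indefinitely. In the sketch, no task holds a thread between steps, so the heartbeat task is never starved.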
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)