[ https://issues.apache.org/jira/browse/YARN-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128491#comment-14128491 ]

Hudson commented on YARN-2526:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1867 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1867/])
YARN-2526. SLS can deadlock when all the threads are taken by AMSimulators. 
(Wei Yan via kasha) (kasha: rev 28d99db99236ff2a6e4a605802820e2b512225f9)
* hadoop-yarn-project/CHANGES.txt
* hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/MRAMSimulator.java


> SLS can deadlock when all the threads are taken by AMSimulators
> ---------------------------------------------------------------
>
>                 Key: YARN-2526
>                 URL: https://issues.apache.org/jira/browse/YARN-2526
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler-load-simulator
>    Affects Versions: 2.5.1
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>            Priority: Critical
>             Fix For: 2.6.0
>
>         Attachments: YARN-2526-1.patch
>
>
> The simulation can deadlock when the application (AM) simulators occupy every
> thread in the thread pool and all of them are waiting for AM container
> allocation. In that case, every AM simulator waits for the NM simulators to
> send a heartbeat so that resources can be allocated, while every NM simulator
> waits for an AM simulator to release a thread. Neither side can make progress.
> To resolve the deadlock, the while() loop in MRAMSimulator needs to be removed:
> {code}
>     // waiting until the AM container is allocated
>     while (true) {
>       if (response != null && ! response.getAllocatedContainers().isEmpty()) {
>         // get AM container
>         .....
>         break;
>       }
>       // this sleep time is different from HeartBeat
>       Thread.sleep(1000);
>       // send out empty request
>       sendContainerRequest();
>       response = responseQueue.take();
>     }
> {code}
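
To illustrate why removing the loop helps, here is a minimal, self-contained
sketch of the non-blocking pattern. It is not the committed patch: the class
name NonBlockingAmSketch, the simulate() method, the string "container", and
the 10 ms heartbeat interval are all assumptions made for illustration. Each
AM task polls its response queue once per scheduled run and then returns, so
the pooled thread is handed back and the NM task gets a chance to run even
when the pool has a single thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; none of these names come from the SLS code base.
class NonBlockingAmSketch {

    // Returns true if every simulated AM obtained a container within 5 seconds.
    static boolean simulate(int poolSize, int amCount) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(poolSize);
        // Stand-in for the queue of allocation responses.
        BlockingQueue<String> responses = new LinkedBlockingQueue<>();
        CountDownLatch allocated = new CountDownLatch(amCount);

        // AM heartbeats: check once per run and return, never loop or sleep.
        for (int i = 0; i < amCount; i++) {
            AtomicBoolean done = new AtomicBoolean(false);
            pool.scheduleWithFixedDelay(() -> {
                if (done.get()) {
                    return; // this AM already has its container
                }
                String container = responses.poll(); // non-blocking
                if (container != null) {
                    done.set(true);
                    allocated.countDown();
                }
                // Otherwise retry on the next heartbeat; the thread is released now.
            }, 0, 10, TimeUnit.MILLISECONDS);
        }

        // NM heartbeat: hands out one container per run. It can only run if
        // the AM tasks give pool threads back.
        pool.scheduleWithFixedDelay(
            () -> responses.offer("container"), 0, 10, TimeUnit.MILLISECONDS);

        try {
            return allocated.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // More AMs than threads: the case that deadlocks with a blocking loop.
        System.out.println("all AMs allocated: " + simulate(1, 3));
    }
}
```

If each AM task instead looped with Thread.sleep() and a blocking take() as in
the quoted snippet, three AM tasks on a one-thread pool would never yield the
thread to the NM task, which is the deadlock described in the issue.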



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
