Beckham007 created YARN-2169:
--------------------------------
Summary: NMSimulator of sls should catch more Exception
Key: YARN-2169
URL: https://issues.apache.org/jira/browse/YARN-2169
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Beckham007
In the middleStep() method of NMSimulator, sending a heartbeat may throw
InterruptedException or another Exception when the load is heavy. If these
exceptions are not handled, the NMSimulator task cannot be added to the executor
queue again, so the NM is lost.
In my setup, the pool size is 4000, the number of NMs is 2000, and the number
of AMs is 1500. Some NMs are lost.
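
To illustrate the failure mode, here is a minimal, self-contained Java sketch.
It is not the SLS code: the NodeTask class, the Heartbeat interface, and the
simulate() helper are hypothetical names made up for this example, and the
queue is a plain in-memory queue rather than the real executor. It only shows
that a self-rescheduling task which lets an exception escape is never queued
again, while one that catches the exception keeps running.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class HeartbeatRescheduleSketch {

    /** Hypothetical stand-in for a heartbeat call that can fail under load. */
    interface Heartbeat {
        void send(int tick) throws Exception;
    }

    /** Hypothetical task that re-adds itself to the queue after each step. */
    static final class NodeTask implements Runnable {
        private final Queue<Runnable> queue;
        private final Heartbeat heartbeat;
        private final boolean guarded;
        private final int maxTicks;
        int ticks = 0;
        int failures = 0;

        NodeTask(Queue<Runnable> queue, Heartbeat heartbeat,
                 boolean guarded, int maxTicks) {
            this.queue = queue;
            this.heartbeat = heartbeat;
            this.guarded = guarded;
            this.maxTicks = maxTicks;
        }

        @Override
        public void run() {
            ticks++;
            try {
                heartbeat.send(ticks);
            } catch (Exception e) {
                if (!guarded) {
                    // Unguarded: the exception escapes and the re-queue
                    // below is skipped, so the task is lost.
                    throw new RuntimeException(e);
                }
                // Guarded: record the failure and carry on. Real code
                // would log it here.
                failures++;
            }
            if (ticks < maxTicks) {
                queue.add(this);
            }
        }
    }

    /** Runs one task until the queue drains; returns how many ticks ran. */
    static int simulate(boolean guarded, int maxTicks, int failingTick) {
        Queue<Runnable> queue = new ArrayDeque<>();
        Heartbeat hb = tick -> {
            if (tick == failingTick) {
                throw new InterruptedException("simulated overload");
            }
        };
        NodeTask task = new NodeTask(queue, hb, guarded, maxTicks);
        queue.add(task);
        while (!queue.isEmpty()) {
            Runnable next = queue.poll();
            try {
                next.run();
            } catch (RuntimeException e) {
                // The worker survives, but the task was not re-queued.
            }
        }
        return task.ticks;
    }

    public static void main(String[] args) {
        System.out.println("unguarded ticks: " + simulate(false, 5, 2));
        System.out.println("guarded ticks: " + simulate(true, 5, 2));
    }
}
```

With a failure injected on the second heartbeat, the unguarded task stops
after that tick, while the guarded task completes all five ticks.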