[ https://issues.apache.org/jira/browse/YARN-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14331960#comment-14331960 ]
Hudson commented on YARN-3238:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7175 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7175/])
YARN-3238. Connection timeouts to nodemanagers are retried at multiple (xgong: rev 92d67ace3248930c0c0335070cc71a480c566a36)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
* hadoop-yarn-project/CHANGES.txt


> Connection timeouts to nodemanagers are retried at multiple levels
> ------------------------------------------------------------------
>
>                 Key: YARN-3238
>                 URL: https://issues.apache.org/jira/browse/YARN-3238
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>             Fix For: 2.7.0
>
>         Attachments: YARN-3238.001.patch
>
>
> The IPC layer will retry connection timeouts automatically (see Client.java),
> but we are also retrying them with YARN's RetryPolicy put in place when the
> NM proxy is created. This causes a two-level retry mechanism where the IPC
> layer has already retried quite a few times (45 by default) for each YARN
> RetryPolicy error that is retried. The end result is that NM clients can
> wait a very, very long time for the connection to finally fail.
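
To make the multiplication described in the issue concrete, here is a minimal Java sketch of the worst-case wait when one retry loop is nested inside another. It is an illustrative model, not Hadoop code: the class and method names are hypothetical, and the 20-second connect timeout and the outer policy's 5 retries with 10-second sleeps are assumed values picked for the example. Only the figure of 45 IPC-level retries comes from the description.

```java
import java.time.Duration;

// Illustrative model only: none of these names are Hadoop APIs.
final class NestedRetryEstimate {

    // Connection attempts made by the inner (IPC) layer for one outer attempt:
    // the initial attempt plus its retries.
    static long innerAttempts(int innerRetries) {
        return 1L + innerRetries;
    }

    // Total connection attempts when every outer attempt (the initial one plus
    // outerRetries retries) triggers a full round of inner attempts.
    static long totalAttempts(int innerRetries, int outerRetries) {
        return innerAttempts(innerRetries) * (1L + outerRetries);
    }

    // Worst-case wait: every connection attempt runs to its timeout, and the
    // outer policy sleeps once before each of its retries.
    static Duration worstCaseWait(int innerRetries, Duration connectTimeout,
                                  int outerRetries, Duration outerSleep) {
        Duration connecting =
                connectTimeout.multipliedBy(totalAttempts(innerRetries, outerRetries));
        Duration sleeping = outerSleep.multipliedBy(outerRetries);
        return connecting.plus(sleeping);
    }

    public static void main(String[] args) {
        int ipcRetries = 45;                               // from the issue description
        Duration connectTimeout = Duration.ofSeconds(20);  // assumed value
        int outerRetries = 5;                              // hypothetical outer policy
        Duration outerSleep = Duration.ofSeconds(10);      // hypothetical outer policy

        System.out.println("IPC layer only:   "
                + worstCaseWait(ipcRetries, connectTimeout, 0, Duration.ZERO));
        System.out.println("Both retry levels: "
                + worstCaseWait(ipcRetries, connectTimeout, outerRetries, outerSleep));
    }
}
```

Under these assumed values the inner layer alone already spends about 15 minutes per outer attempt, so every retry added at the outer level costs roughly another 15 minutes before the caller sees a failure.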