[ https://issues.apache.org/jira/browse/YARN-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934439#comment-14934439 ]
Hadoop QA commented on YARN-4180:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | patch | 0m 1s | The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. |
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12764142/YARN-4180-branch-2.7.2.txt |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 9735afe |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9291/console |

This message was automatically generated.

> AMLauncher does not retry on failures when talking to NM
> ---------------------------------------------------------
>
>                 Key: YARN-4180
>                 URL: https://issues.apache.org/jira/browse/YARN-4180
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>            Priority: Critical
>         Attachments: YARN-4180-branch-2.7.2.txt, YARN-4180.001.patch,
> YARN-4180.002.patch, YARN-4180.002.patch, YARN-4180.002.patch
>
>
> We see issues with the RM trying to launch a container while an NM is
> restarting, and we get exceptions like NMNotReadyException. While YARN-3842
> added retry for other clients of the NM (AMs mainly), it is not used by
> AMLauncher in the RM, so these intermittent errors cause job failures. This
> can manifest during a rolling restart of NMs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)