[ 
https://issues.apache.org/jira/browse/YARN-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494344#comment-16494344
 ] 

Hudson commented on YARN-8362:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14312 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14312/])
YARN-8362.  Bugfix logic in container retries in node manager.           
(eyang: rev 135941e00d762a417c3b4cc524cdc59b0d1810b1)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestSlidingWindowRetryPolicy.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/SlidingWindowRetryPolicy.java


> Number of remaining retries are updated twice after a container failure in NM 
> ------------------------------------------------------------------------------
>
>                 Key: YARN-8362
>                 URL: https://issues.apache.org/jira/browse/YARN-8362
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: YARN-8362.001.patch, YARN-8362.002.patch
>
>
> With YARN-5015, {{shouldRetry(int errorCode)}} in {{ContainerImpl}} also 
> updates some fields in the retry context: remaining retries and restart times.
> This method is also called directly from outside the {{ContainerImpl}} class, in 
> {{ContainerLaunch.setContainerCompletedStatus}}. This causes the following 
> problems:
>  # {{remainingRetries}} is updated more than once after a failure. If 
> {{maxRetries = 1}}, a retry will not be triggered because of the multiple 
> calls to {{shouldRetry(int errorCode)}}.
>  # Writes to {{retryContext}} should be protected and made only while the write 
> lock is held.
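
The two problems described above can be illustrated with a minimal,
self-contained sketch. It is not the code from the attached patches: the
class RetrySketch, its nested RetryContext and Container classes, and the
methods shouldRetryMutating and recordFailure are hypothetical stand-ins,
and the retry logic is reduced to a single counter. The sketch assumes one
possible remedy, keeping the check read-only and moving the update into a
separate method that takes the write lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RetrySketch {

    /** Hypothetical stand-in for the retry context; not the real YARN class. */
    static final class RetryContext {
        int remainingRetries;

        RetryContext(int maxRetries) {
            this.remainingRetries = maxRetries;
        }
    }

    /** Problematic shape: the check also mutates, so every caller consumes a retry. */
    static boolean shouldRetryMutating(RetryContext ctx, int errorCode) {
        if (errorCode == 0) {
            return false; // a clean exit never needs a retry
        }
        if (ctx.remainingRetries > 0) {
            ctx.remainingRetries--; // side effect hidden inside a query
            return true;
        }
        return false;
    }

    /** Assumed remedy: a read-only check plus an explicit, write-locked update. */
    static final class Container {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final RetryContext ctx;

        Container(int maxRetries) {
            this.ctx = new RetryContext(maxRetries);
        }

        /** Safe to call any number of times; does not change state. */
        boolean shouldRetry(int errorCode) {
            lock.readLock().lock();
            try {
                return errorCode != 0 && ctx.remainingRetries > 0;
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Called once per failure; the only place the counter is written. */
        void recordFailure() {
            lock.writeLock().lock();
            try {
                if (ctx.remainingRetries > 0) {
                    ctx.remainingRetries--;
                }
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        // maxRetries = 1: the second call to the mutating check sees no retries left.
        RetryContext buggy = new RetryContext(1);
        boolean first = shouldRetryMutating(buggy, 1);
        boolean second = shouldRetryMutating(buggy, 1);
        System.out.println("mutating check: " + first + ", " + second);

        // With the read-only check, repeated calls agree until a failure is recorded.
        Container c = new Container(1);
        boolean a = c.shouldRetry(1);
        boolean b = c.shouldRetry(1);
        c.recordFailure();
        boolean afterUpdate = c.shouldRetry(1);
        System.out.println("read-only check: " + a + ", " + b + ", " + afterUpdate);
    }
}
```

In this sketch the mutating check returns true and then false for the same
failure, which mirrors the maxRetries = 1 case in the description, while
the read-only check returns the same answer to every caller until
recordFailure() is invoked.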



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
