[
https://issues.apache.org/jira/browse/YARN-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494344#comment-16494344
]
Hudson commented on YARN-8362:
------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14312 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/14312/])
YARN-8362. Bugfix logic in container retries in node manager.
(eyang: rev 135941e00d762a417c3b4cc524cdc59b0d1810b1)
* (edit)
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestSlidingWindowRetryPolicy.java
* (edit)
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* (edit)
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/SlidingWindowRetryPolicy.java
> Number of remaining retries are updated twice after a container failure in NM
> ------------------------------------------------------------------------------
>
> Key: YARN-8362
> URL: https://issues.apache.org/jira/browse/YARN-8362
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Chandni Singh
> Assignee: Chandni Singh
> Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8362.001.patch, YARN-8362.002.patch
>
>
> With YARN-5015, {{shouldRetry(int errorCode)}} in {{ContainerImpl}} also
> updates some fields in the retry context: the remaining retries and the
> restart times. This method is also called directly from outside the
> {{ContainerImpl}} class, in {{ContainerLaunch.setContainerCompletedStatus}}.
> This causes the following problems:
> # {{remainingRetries}} is updated more than once after a failure. If
> {{maxRetries = 1}}, a retry is not triggered because of the multiple
> calls to {{shouldRetry(int errorCode)}}.
> # Writes to {{retryContext}} should be protected and made only while the
> write lock is held.
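
The two problems listed in the description can be illustrated with a minimal
sketch. This is not the Hadoop code or the committed patch: the class
RetrySketch and the methods shouldRetryAndUpdate and recordRetry are
hypothetical names, and the retry context is reduced to a single counter. It
only assumes, as the description states, that the check was called from more
than one place and that each call also updated the remaining retries.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified stand-in for the retry bookkeeping; not ContainerImpl.
public class RetrySketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int remainingRetries;

    public RetrySketch(int maxRetries) {
        this.remainingRetries = maxRetries;
    }

    // Problematic shape: the query also mutates state (and takes no lock),
    // so every additional caller consumes one retry.
    public boolean shouldRetryAndUpdate(int errorCode) {
        if (errorCode == 0 || remainingRetries <= 0) {
            return false;
        }
        remainingRetries--;
        return true;
    }

    // Side-effect-free check: safe to call from several places.
    public boolean shouldRetry(int errorCode) {
        lock.readLock().lock();
        try {
            return errorCode != 0 && remainingRetries > 0;
        } finally {
            lock.readLock().unlock();
        }
    }

    // The single place that updates the counter, guarded by the write lock.
    public void recordRetry() {
        lock.writeLock().lock();
        try {
            if (remainingRetries > 0) {
                remainingRetries--;
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    public int getRemainingRetries() {
        lock.readLock().lock();
        try {
            return remainingRetries;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        // maxRetries = 1, two callers ask after one failure (exit code 1).
        RetrySketch buggy = new RetrySketch(1);
        System.out.println("mutating check, caller 1: " + buggy.shouldRetryAndUpdate(1));
        System.out.println("mutating check, caller 2: " + buggy.shouldRetryAndUpdate(1));

        RetrySketch fixed = new RetrySketch(1);
        System.out.println("pure check, caller 1: " + fixed.shouldRetry(1));
        System.out.println("pure check, caller 2: " + fixed.shouldRetry(1));
        fixed.recordRetry(); // one update per failure
        System.out.println("remaining after update: " + fixed.getRemainingRetries());
    }
}
```

In the sketch, the second caller of the mutating check is told not to retry
even though only one failure occurred, which mirrors the {{maxRetries = 1}}
case in the description. Separating the check from the update keeps the check
repeatable and confines the write to one locked method.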