[ 
https://issues.apache.org/jira/browse/YARN-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307257#comment-15307257
 ] 

Varun Vasudev commented on YARN-5015:
-------------------------------------

Thanks for the patch [~hex108]! I think you probably need to change your 
approach if we want to unify the AM and container restart policies. I think 
what's required is a common class, something like 
SlidingWindowContainerRetryPolicy, which takes a 
SlidingWindowContainerRetryContext consisting of the restart timestamps, the 
validity interval, the exit codes, the exit codes to ignore, and the remaining 
retry attempts. The SlidingWindowContainerRetryPolicy can then look at these 
parameters and tell you whether to retry the container or not.
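
To make the idea concrete, here is a rough sketch of the kind of decision 
logic I have in mind. It is not the proposed API: the class, field and method 
names are hypothetical, and two behaviours are assumptions on my part (an 
ignored exit code does not consume the retry budget, and the budget is 
expressed as a maximum number of restarts inside the validity interval rather 
than as a stored "remaining" counter).

```java
import java.util.List;
import java.util.Set;

// Illustrative sketch only; none of these names exist in YARN.
class SlidingWindowRetrySketch {

  /** Hypothetical context carrying the inputs the policy needs. */
  static final class RetryContext {
    final List<Long> restartTimestamps;  // millis of earlier restarts that counted
    final long validityIntervalMs;       // window length; <= 0 disables the window
    final int exitCode;                  // exit code of the failure being evaluated
    final Set<Integer> ignoredExitCodes; // failures that should not use up retries
    final int maxRetries;                // restarts allowed inside one window

    RetryContext(List<Long> restartTimestamps, long validityIntervalMs,
        int exitCode, Set<Integer> ignoredExitCodes, int maxRetries) {
      this.restartTimestamps = restartTimestamps;
      this.validityIntervalMs = validityIntervalMs;
      this.exitCode = exitCode;
      this.ignoredExitCodes = ignoredExitCodes;
      this.maxRetries = maxRetries;
    }
  }

  /** Returns true if the container (or AM) should be restarted. */
  static boolean shouldRetry(RetryContext ctx, long nowMs) {
    // Assumption: an ignored exit code always allows a retry.
    if (ctx.ignoredExitCodes.contains(ctx.exitCode)) {
      return true;
    }
    // Count only the restarts that still fall inside the sliding window.
    int counted = 0;
    for (long ts : ctx.restartTimestamps) {
      if (ctx.validityIntervalMs <= 0 || nowMs - ts < ctx.validityIntervalMs) {
        counted++;
      }
    }
    return counted < ctx.maxRetries;
  }
}
```

Both the AM and the container code paths could build such a context from 
their own bookkeeping and share the same decision method.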

You can look at the RetryPolicies class in org.apache.hadoop.io.retry to get an 
idea of what I'm talking about.

Once you have the common class, we can modify the AM code to use it 
(probably as a follow-up JIRA). Does that make sense?

> Unify restart policies across AM and container restarts
> -------------------------------------------------------
>
>                 Key: YARN-5015
>                 URL: https://issues.apache.org/jira/browse/YARN-5015
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Varun Vasudev
>            Assignee: Jun Gong
>         Attachments: YARN-5015.01.patch
>
>
> We support AM restarts and container restarts; however, the two have 
> slightly different capabilities. We should unify them. There's no reason 
> for them to be different.



