Jason Lowe commented on YARN-3998:

Is this really a feature that YARN needs to provide?  To me this is basically a 
case of container re-use which the application itself can control.  A primitive 
example would be an application that launches a container that wraps the real 
task in a wrapper shell script or Java program that spawns the real task and 
will respawn it some number of times if the real task fails before failing the 
entire container.  I'm not sure YARN is the best place to put this 
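
As a rough illustration of the wrapper idea above, here is a minimal sketch 
in Python. The function name run_with_retries and its max_retries parameter 
are assumptions made up for this example; they are not part of YARN or of any 
existing application. The wrapper runs inside the container, so the 
container's working directory and localized files stay in place across 
respawns.

```python
import subprocess
import sys


def run_with_retries(cmd, max_retries):
    """Run cmd, respawning it up to max_retries times if it exits non-zero.

    cmd is an argument list such as ["java", "-jar", "task.jar"].
    Returns the exit code of the last attempt, which the wrapper would
    then use as the container's own exit code.
    """
    retries = 0
    while True:
        # The child inherits the wrapper's working directory, so files it
        # downloaded on an earlier attempt are still there.
        exit_code = subprocess.call(cmd)
        if exit_code == 0 or retries >= max_retries:
            return exit_code
        retries += 1
        print(
            f"task exited with {exit_code}; respawn {retries} of {max_retries}",
            file=sys.stderr,
        )
```

The application would launch this wrapper as the container command and pass 
the real task's command line to it, for example 
run_with_retries(["./real_task.sh"], 3), where ./real_task.sh is a 
placeholder. The container only fails once the retries are exhausted.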

> Add retry-times to let NM re-launch container when it fails to run
> ------------------------------------------------------------------
>                 Key: YARN-3998
>                 URL: https://issues.apache.org/jira/browse/YARN-3998
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Jun Gong
>            Assignee: Jun Gong
> I'd like to add a field (retry-times) to ContainerLaunchContext. When the AM 
> launches containers, it can specify the value. The NM will then re-launch 
> the container up to 'retry-times' times when it fails to run (e.g. the exit 
> code is not 0). This saves a lot of time: it avoids container localization, 
> and the RM does not need to re-schedule the container. Local files in the 
> container's working directory are also left for re-use (if the container 
> has downloaded some big files, it does not need to re-download them when 
> running again). 
> We find this useful in systems like Storm.

This message was sent by Atlassian JIRA