[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795480#comment-13795480 ]
Bikas Saha commented on YARN-1040:
----------------------------------
This can be achieved in a backwards-compatible manner in the following way:
1) The StartContainer request will have a new flag that says whether the
container is attached to a process or not. The default value is true for
back-compat.
2) If the above flag is false, then the container is completed on the NM only
when
a) the RM terminates the container (this already happens today), or
b) the AM calls StopContainer on it (this is already supported).
The main change in the NM would be to not trigger end of container, i.e. keep
the container in a running state, when there is no process associated with the
container.
3) Create a new API called startProcess() that can be used to launch a new
process in a container. For the first cut, the NM can disallow starting a
process while another process is already running. This API would be secured
using the existing AMNM token.
No changes are expected to be needed in the RM since the NM will continue to
report this container as running to the RM. This should be a fairly localised
NM-only change.
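
To make the intended lifecycle concrete, here is a rough sketch of the three points above as a small standalone model. It is not NM code: the class ContainerSketch, the field name attachedToProcess and the boolean return value of startProcess() are all assumptions made for illustration, and the AMNM token check is left out.

```java
/** Hypothetical model of the proposed NM-side container lifecycle; not actual YARN code. */
class ContainerSketch {
    enum State { RUNNING, COMPLETED }

    // Assumed name for the new StartContainer flag; true is the back-compat default.
    private final boolean attachedToProcess;
    private State state = State.RUNNING;
    private String currentProcess; // null when no process is running

    ContainerSketch(boolean attachedToProcess, String initialCommand) {
        this.attachedToProcess = attachedToProcess;
        this.currentProcess = initialCommand;
    }

    State state() {
        return state;
    }

    /** Called when the running process exits. */
    void onProcessExit() {
        currentProcess = null;
        if (attachedToProcess) {
            state = State.COMPLETED; // today's behaviour: process exit ends the container
        }
        // Otherwise the container stays RUNNING with no process attached.
    }

    /** Models the proposed startProcess(); the first cut rejects overlapping processes. */
    boolean startProcess(String command) {
        if (state != State.RUNNING || currentProcess != null) {
            return false;
        }
        currentProcess = command;
        return true;
    }

    /** RM termination or AM StopContainer: always completes the container. */
    void stop() {
        currentProcess = null;
        state = State.COMPLETED;
    }

    public static void main(String[] args) {
        ContainerSketch c = new ContainerSketch(false, "regionserver");
        c.onProcessExit();
        System.out.println(c.state());                      // RUNNING
        System.out.println(c.startProcess("regionserver")); // true
        c.stop();
        System.out.println(c.state());                      // COMPLETED
    }
}
```

With the flag left at true the model behaves as the NM does today: the first process exit completes the container, and any later startProcess() call is refused.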
> Add ability to execute multiple programs in the same long-lived container
> -------------------------------------------------------------------------
>
> Key: YARN-1040
> URL: https://issues.apache.org/jira/browse/YARN-1040
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 3.0.0
> Reporter: Steve Loughran
>
> The AM should be able to exec >1 process in a container, rather than have the
> NM automatically release the container when the single process exits.
> This would let an AM restart a process on the same container repeatedly,
> which for HBase would offer locality on a restarted region server.
> We may also want the ability to exec multiple processes in parallel, so that
> something could be run in the container while a long-lived process was
> already running. This can be useful in monitoring and reconfiguring the
> long-lived process, as well as shutting it down.
--
This message was sent by Atlassian JIRA
(v6.1#6144)