[
https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167738#comment-15167738
]
Bikas Saha commented on YARN-1040:
----------------------------------
My guess is that YARN-4725 may be redundant after we do this work because then
we would have exposed primitives to apps to make that happen. The arguments for
YARN not doing it by itself would be the same: if it can be done easily by the
app and is very likely app-dependent, with no one-size-fits-all answer, then
let the app do it.
Coming back to this jira: yes, let's please track any first-class support for
the notion of upgrades separately; that can be done as a follow-up.
Perhaps we can put the design in a document and look at the next level of
details. We can send email to the dev list after adding a more detailed
document to this jira. Then, based on positive feedback, we could go ahead with
jiras/code. The devil is in the details :) This would be a significant change
and we could use more eyes for reviews.
For the startProcess identifier, it may be useful for the app to provide the
identifier in startProcess and then use it later to refer to the process, e.g.
in stopProcess, rather than YARN trying to come up with identifiers. This may
make the app's life easier because it could use meaningful terms based on its
own logic. We can discuss such details in the design document.
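To make the identifier idea concrete, here is a rough sketch of how it might
look from the app's side. Everything in it is hypothetical: the class
ContainerProcessSketch, the method signatures, and the identifiers
"regionserver" and "monitor" are made up for illustration and are not an
existing or proposed YARN API. It only tracks state in memory and does not
launch anything.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only: none of these names exist in YARN. */
public class ContainerProcessSketch {

    /** States of a process tracked inside one long-lived container. */
    enum State { RUNNING, STOPPED }

    // Maps the app-chosen identifier to the state of that process.
    private final Map<String, State> processes = new LinkedHashMap<>();

    /** Starts a process under an identifier chosen by the app, not by YARN. */
    public void startProcess(String appChosenId, List<String> command) {
        if (processes.get(appChosenId) == State.RUNNING) {
            throw new IllegalStateException("already running: " + appChosenId);
        }
        // A real NodeManager would launch `command` here; this sketch only
        // records that the process is running.
        processes.put(appChosenId, State.RUNNING);
    }

    /** Stops a process by the same identifier the app supplied at start. */
    public void stopProcess(String appChosenId) {
        if (processes.get(appChosenId) != State.RUNNING) {
            throw new IllegalArgumentException("not running: " + appChosenId);
        }
        // The container itself stays alive; only the process is marked stopped.
        processes.put(appChosenId, State.STOPPED);
    }

    /** Returns true if a process with this identifier is currently running. */
    public boolean isRunning(String appChosenId) {
        return processes.get(appChosenId) == State.RUNNING;
    }

    public static void main(String[] args) {
        ContainerProcessSketch container = new ContainerProcessSketch();

        // The app picks names that are meaningful in its own logic.
        container.startProcess("regionserver", List.of("start-regionserver"));
        container.startProcess("monitor", List.of("run-monitor"));

        // Later the app refers to the process by the name it chose.
        container.stopProcess("regionserver");

        // Restart in the same container, which is what preserves locality.
        container.startProcess("regionserver", List.of("start-regionserver"));

        System.out.println(container.isRunning("regionserver"));
        System.out.println(container.isRunning("monitor"));
    }
}
```

The point of the sketch is that the app never has to store or look up an
identifier minted by YARN; whether duplicate identifiers should be rejected,
as done here, is one of the details for the design document.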
> De-link container life cycle from the process and add ability to execute
> multiple processes in the same long-lived container
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-1040
> URL: https://issues.apache.org/jira/browse/YARN-1040
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 3.0.0
> Reporter: Steve Loughran
>
> The AM should be able to exec >1 process in a container, rather than have the
> NM automatically release the container when the single process exits.
> This would let an AM restart a process on the same container repeatedly,
> which for HBase would offer locality on a restarted region server.
> We may also want the ability to exec multiple processes in parallel, so that
> something could be run in the container while a long-lived process was
> already running. This can be useful in monitoring and reconfiguring the
> long-lived process, as well as shutting it down.