[
https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167889#comment-15167889
]
Bikas Saha commented on YARN-1040:
----------------------------------
Vinod, the plan you are suggesting has merits. But my initial impression is
that reworking allocations and containers is a much bigger change than what was
proposed earlier in this JIRA, not only internally in YARN but also externally,
in terms of how users of YARN think about the whole larger flow of allocations
and containers.
The proposal discussed earlier is of much smaller scope and, I believe,
sufficient to take us where we need to go. It does not require reworking the
RM-related flow of allocations and containers. E.g. it may not be necessary for
the RM to understand single-use vs multi-use vs concurrent-use allocations.
But for the RM-level changes you are suggesting, we may be on the path of
convergence.
At this point, the discussion is complex enough that we may want to gather
interested people, discuss it as a group outside JIRA comments, and then post
the outcome back.
> De-link container life cycle from the process and add ability to execute
> multiple processes in the same long-lived container
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-1040
> URL: https://issues.apache.org/jira/browse/YARN-1040
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 3.0.0
> Reporter: Steve Loughran
>
> The AM should be able to exec >1 process in a container, rather than have the
> NM automatically release the container when the single process exits.
> This would let an AM restart a process on the same container repeatedly,
> which for HBase would offer locality on a restarted region server.
> We may also want the ability to exec multiple processes in parallel, so that
> something could be run in the container while a long-lived process was
> already running. This can be useful in monitoring and reconfiguring the
> long-lived process, as well as shutting it down.