[
https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15221902#comment-15221902
]
Konstantinos Karanasos commented on YARN-2883:
----------------------------------------------
Let me clarify a few points, [~kasha].
To decide when to start a container from the queue, we need to know the
available resources, and only the {{ContainersMonitorImpl}} has this
knowledge. This is why I have added the queues to the Monitor.
Moreover, for the NM to estimate and send its expected queue wait time to the
RM (to eventually help with overcommitment or with queuing from the RM to the
NMs), it is much more convenient to have both running and queued containers in
the same class.
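To make the argument above concrete, here is a minimal sketch of a single class
that tracks both running and queued containers and uses the available resources
to decide which queued requests may start. It is an illustration only, not the
code in the patch: {{QueueingMonitorSketch}}, {{Request}}, and all method names
are hypothetical, memory is the only resource considered, and the FIFO policy
is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; none of these types are actual YARN classes.
final class QueueingMonitorSketch {
    /** Minimal stand-in for a container request: an id and a memory demand in MB. */
    static final class Request {
        final String id;
        final long memoryMb;

        Request(String id, long memoryMb) {
            this.id = id;
            this.memoryMb = memoryMb;
        }
    }

    private final long totalMemoryMb;
    private long usedMemoryMb = 0;
    // Running and queued containers live side by side in the same class.
    private final List<Request> running = new ArrayList<>();
    private final Deque<Request> queued = new ArrayDeque<>();

    QueueingMonitorSketch(long totalMemoryMb) {
        this.totalMemoryMb = totalMemoryMb;
    }

    long availableMemoryMb() {
        return totalMemoryMb - usedMemoryMb;
    }

    void enqueue(Request r) {
        queued.addLast(r);
    }

    /** Dequeues head requests while they fit (assumed FIFO) and returns them. */
    List<Request> pickStartable() {
        List<Request> startable = new ArrayList<>();
        while (!queued.isEmpty()
                && queued.peekFirst().memoryMb <= availableMemoryMb()) {
            Request r = queued.pollFirst();
            usedMemoryMb += r.memoryMb;
            running.add(r);
            startable.add(r);
        }
        return startable;
    }

    /** Releases the resources of a finished container. */
    void onFinished(Request r) {
        if (running.remove(r)) {
            usedMemoryMb -= r.memoryMb;
        }
    }

    int queueLength() {
        return queued.size();
    }
}
```

Because the queue length and the resources held by running containers are
visible in one place, an estimate of the queue wait time could be computed
there as well; how the estimate is actually computed is not covered by this
sketch.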
On the other hand, I do agree that the {{ContainersMonitorImpl}} should have a
more passive role. However, even at the moment, the Monitor is capable of
killing containers (when they exceed their allotted resources), so its role is
not that passive either. I kept the same logic: the Monitor does not actually
start or stop containers itself, but rather informs the
{{ContainerManagerImpl}} to do so.
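The division of responsibilities described above can be sketched as follows.
This is only an illustration under assumptions: {{DelegationSketch}},
{{Signal}}, {{LifecycleOwner}}, and {{RecordingManager}} are hypothetical names
and do not reflect the actual NM event types or the interface of the
{{ContainerManagerImpl}}.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these are not the actual NM classes or events.
final class DelegationSketch {
    enum Action { START, STOP }

    /** A request from the monitor asking the lifecycle owner to act. */
    static final class Signal {
        final Action action;
        final String containerId;

        Signal(Action action, String containerId) {
            this.action = action;
            this.containerId = containerId;
        }
    }

    /** Stand-in for the component that really starts and stops containers. */
    interface LifecycleOwner {
        void handle(Signal signal);
    }

    /** The monitor only decides and notifies; it never touches a container. */
    static final class Monitor {
        private final LifecycleOwner owner;

        Monitor(LifecycleOwner owner) {
            this.owner = owner;
        }

        void resourcesFreed(String queuedContainerId) {
            owner.handle(new Signal(Action.START, queuedContainerId));
        }

        void limitExceeded(String runningContainerId) {
            owner.handle(new Signal(Action.STOP, runningContainerId));
        }
    }

    /** Test double that records what it was asked to do. */
    static final class RecordingManager implements LifecycleOwner {
        final List<String> log = new ArrayList<>();

        @Override
        public void handle(Signal signal) {
            log.add(signal.action + " " + signal.containerId);
        }
    }
}
```

Keeping the start/stop calls behind a single interface means the decision
logic could later be moved out of the Monitor without changing the component
that owns the container lifecycle.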
That said, if we were to refactor a big part of the NM code, we could make
things even cleaner. This further step is what has been proposed in YARN-4597.
bq. Also, ContainersMonitorImpl will then have state that needs to be persisted
for a work-preserving NM restart.
If I'm not wrong, that should not be an issue, because I am keeping the queued
containers in the Context.
> Queuing of container requests in the NM
> ---------------------------------------
>
> Key: YARN-2883
> URL: https://issues.apache.org/jira/browse/YARN-2883
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Konstantinos Karanasos
> Assignee: Konstantinos Karanasos
> Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch,
> YARN-2883-trunk.006.patch, YARN-2883-trunk.007.patch,
> YARN-2883-trunk.008.patch, YARN-2883-yarn-2877.001.patch,
> YARN-2883-yarn-2877.002.patch, YARN-2883-yarn-2877.003.patch,
> YARN-2883-yarn-2877.004.patch
>
>
> We propose to add a queue in each NM, where queueable container requests can
> be held.
> Based on the available resources in the node and the containers in the queue,
> the NM will decide when to allow the execution of a queued container.
> In order to ensure the instantaneous start of a guaranteed-start container,
> the NM may decide to pre-empt/kill running queueable containers.