[
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365070#comment-15365070
]
Jason Lowe commented on YARN-5215:
----------------------------------
Maybe I'm missing something, but all of the proposed approaches have YARN
assuming it can leverage the unused resources on the node. That's sort of the
whole point: we want YARN to use those unused resources rather than just
hard-partitioning the node between YARN and the other system. Some of the
approaches start with the assumption that the whole node belongs to YARN and
YARN will scale back usage of the node based on utilization feedback, while
other approaches start with YARN assuming it has a smaller portion of the node
and can reach beyond it when utilization is low. It's the same scenario from
two perspectives.
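
To make the "two perspectives" point concrete, here is a rough arithmetic sketch. The class, method names, and numbers are hypothetical and are not taken from YARN or from any of the attached patches. It assumes a single memory dimension and that the other workload stays within the share set aside for it.

```java
/** Hypothetical illustration only; none of these names come from YARN. */
final class NodeHeadroom {

    /** Perspective 1: YARN starts with the whole node and scales back by external usage. */
    static long scaleBackFromWholeNode(long nodeCapacityMb, long externalUsedMb) {
        return Math.max(0, nodeCapacityMb - externalUsedMb);
    }

    /** Perspective 2: YARN starts with a smaller share and reaches into idle capacity. */
    static long reachBeyondGuaranteedShare(long nodeCapacityMb,
                                           long guaranteedMb,
                                           long externalUsedMb) {
        // Capacity nominally set aside for the other workload (assumed split).
        long externalReservedMb = nodeCapacityMb - guaranteedMb;
        // The part of that reservation the other workload is not using right now.
        long idleExternalMb = Math.max(0, externalReservedMb - externalUsedMb);
        return guaranteedMb + idleExternalMb;
    }
}
```

With a 65536 MB node, a 32768 MB starting share for YARN, and 20480 MB used by the other workload, both methods give YARN the same 45056 MB to work with. The sketch does not model what should happen once the other workload exceeds its share.
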

IIUC any of these approaches can react relatively quickly to the other
workload's demands by having the nodemanager take action directly (by
preempting containers) when the periodically monitored node utilization goes
above some configured limit. The original proposal in this JIRA doesn't do
that, which means it won't be super-responsive to the other subsystem. The RM
won't allocate any additional containers when the utilization gets high, but
some of the containers would have to exit on their own before YARN's existing
utilization would decrease. It sounds like the version Inigo has deployed in
production does do some sort of preemption, but it sounds like it comes from
the RM rather than the NM, which would give a slightly slower response time
than if the NM did it directly.
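
A minimal sketch of the NM-side reaction described above, assuming a single memory dimension. The types here are hypothetical stand-ins, not YARN classes, and the victim order (most recently started container first) is an assumption rather than anything proposed in this JIRA.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; these types are not part of YARN. */
final class NodeUtilizationGuard {

    /** Stand-in for a running container and its measured memory use. */
    static final class RunningContainer {
        final String id;
        final long usedMb;
        final long startTimeMs;

        RunningContainer(String id, long usedMb, long startTimeMs) {
            this.id = id;
            this.usedMb = usedMb;
            this.startTimeMs = startTimeMs;
        }
    }

    private final long limitMb; // configured node utilization limit (assumed setting)

    NodeUtilizationGuard(long limitMb) {
        this.limitMb = limitMb;
    }

    /**
     * Intended to be called on each monitoring tick. Returns the containers to
     * preempt so that node utilization falls to the limit or below, or an empty
     * list when the node is already within the limit.
     */
    List<RunningContainer> selectVictims(long nodeUsedMb, List<RunningContainer> running) {
        List<RunningContainer> victims = new ArrayList<>();
        long excessMb = nodeUsedMb - limitMb;
        if (excessMb <= 0) {
            return victims;
        }
        // Assumed policy: preempt the most recently started containers first.
        List<RunningContainer> candidates = new ArrayList<>(running);
        candidates.sort(
            Comparator.comparingLong((RunningContainer c) -> c.startTimeMs).reversed());
        for (RunningContainer c : candidates) {
            if (excessMb <= 0) {
                break;
            }
            victims.add(c);
            excessMb -= c.usedMb;
        }
        return victims;
    }
}
```

Because the decision is made locally on each monitoring tick, the reaction time is bounded by the monitoring interval rather than by a round trip through the RM.
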

If the latency demands of the other workload are so severe that it's impossible
for YARN to react quickly enough, then I don't see how YARN can leverage those
resources when they are unused. We'd have to resort to some kind of
hard-partitioning (either giving the nodemanager fewer resources than the node
actually has or using proxy containers in YARN on behalf of the other workload
to reserve the resources) and live with the underutilization of those resources
when the other workload is idle.

> Scheduling containers based on external load in the servers
> -----------------------------------------------------------
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the
> resources. The proposal is to use the utilization information in the node and
> the containers to estimate how much is consumed by external processes and
> schedule based on this estimation.