[
https://issues.apache.org/jira/browse/YARN-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162257#comment-16162257
]
Arun Suresh commented on YARN-7185:
-----------------------------------
Had an offline discussion with [~leftnoteasy].
The main issue here is that prior to YARN-6706, the NM would blindly trust the
RM's guaranteed container allocations and not perform pre-checks before
starting the container (if opportunistic scheduling is turned off). Post
YARN-6706, the NM performs the check for every container. [~leftnoteasy]'s fix
here would revert it to the old behavior.
IMHO, as Vinod suggested, the correct solution should be to force the NM and RM
to use the same Resource Calculator. Unfortunately, given that the
ResourceCalculator is currently a per-queue, per-scheduler setting, it is
difficult to set a "global" resource calculator. That said, I am not sure
whether anybody actually uses a configuration with different Resource
Calculators at different levels of the Queue Hierarchy.
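To make the mismatch concrete, here is a minimal sketch of the two kinds of
checks. It assumes the RM side compares memory only (as Hadoop's
DefaultResourceCalculator does) while the NM side compares both memory and
vcores. The class ResourceFitSketch, the Res type and both method names are
hypothetical stand-ins for illustration, not YARN code.

```java
// Hypothetical illustration only; none of these types are YARN classes.
class ResourceFitSketch {

    /** Simplified (memory, vcores) pair standing in for a resource vector. */
    static final class Res {
        final long memoryMb;
        final int vcores;

        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Memory-only check: vcores are ignored entirely (assumed RM-side view). */
    static boolean fitsMemoryOnly(Res request, Res available) {
        return request.memoryMb <= available.memoryMb;
    }

    /** Per-resource check: every dimension must fit (assumed NM-side view). */
    static boolean fitsAllResources(Res request, Res available) {
        return request.memoryMb <= available.memoryMb
            && request.vcores <= available.vcores;
    }

    public static void main(String[] args) {
        Res nodeTotal = new Res(8192, 4);
        Res allocated = new Res(4096, 4); // all 4 vcores are already in use
        Res available = new Res(nodeTotal.memoryMb - allocated.memoryMb,
                                nodeTotal.vcores - allocated.vcores);
        Res request = new Res(1024, 1);

        // The memory-only view accepts the container ...
        System.out.println("memory-only: " + fitsMemoryOnly(request, available));
        // ... while the per-resource view rejects it, since no vcores are left.
        System.out.println("all resources: " + fitsAllResources(request, available));
    }
}
```

With these numbers the first check prints true and the second prints false,
which is the situation where an allocation is handed out on one side and
refused on the other.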
> ContainerScheduler should only look at availableResource for GUARANTEED
> containers when opportunistic scheduling is enabled
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-7185
> URL: https://issues.apache.org/jira/browse/YARN-7185
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Sumana Sathish
> Assignee: Tan, Wangda
> Priority: Blocker
> Attachments: YARN-7185.001.patch
>
>
> Found an issue:
> When DefaultContainerCalculator is enabled and opportunistic container
> allocation is disabled, it is possible that for an NM:
> {code}
> Σ(allocated-container.vcores) > nm.configured-vcores.
> {code}
> When this happens, ContainerScheduler will report errors like:
> bq. ContainerScheduler
> (ContainerScheduler.java:pickOpportunisticContainersToKill(458)) - There are
> no sufficient resources to start guaranteed.
> This is an incompatible change after 2.8, because before YARN-6706
> containers could be started when DefaultContainerCalculator was configured
> and vcores were overallocated.