[
https://issues.apache.org/jira/browse/YARN-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402088#comment-16402088
]
Jason Lowe commented on YARN-8034:
----------------------------------
{quote}I observed that the Yarn RM immediately returns a container on a
different host in the next second after the request was made.
{quote}
I believe something like YARN-6344 is relevant here, even though that fix is
specific to the CapacityScheduler. The schedulers have a heuristic that assumes
an application making a small number of requests relative to the size of the
cluster should be biased towards responsiveness rather than locality. It has
been there a long time. I don't know the full history behind it, but I suspect
it derives from assuming a small request is for a small job, where interactivity
is more important than waiting for locality (since we are allowed to relax). See
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt#getLocalityWaitFactor
for the equivalent place in the FairScheduler for what is being discussed in
YARN-6344.
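To make the heuristic concrete, here is a rough, self-contained sketch of how a
wait factor of this kind can behave. It is not copied from the Hadoop source.
It assumes the factor is the ratio of distinct requested locations to cluster
size, capped at 1, and that locality is relaxed once the number of missed
scheduling opportunities exceeds the requested container count scaled by that
factor. The class and method names are hypothetical.

```java
// Simplified model of a locality wait-factor heuristic.
// Illustrative only: names and the exact comparison are assumptions,
// not the actual FSAppAttempt or CapacityScheduler code.
final class LocalityWaitSketch {

    private LocalityWaitSketch() {
    }

    /**
     * Fraction of the cluster the application is asking for, capped at 1.
     * A request naming few locations on a large cluster yields a tiny factor.
     */
    static float localityWaitFactor(int distinctRequestedLocations, int clusterNodes) {
        if (clusterNodes <= 0) {
            // Degenerate cluster size: treat as "wait the maximum".
            return 1.0f;
        }
        int required = Math.max(distinctRequestedLocations, 0);
        return Math.min((float) required / clusterNodes, 1.0f);
    }

    /**
     * Assumed relaxation rule: give up on locality once the missed
     * opportunities exceed requestedContainers * waitFactor.
     */
    static boolean mayRelaxLocality(long missedOpportunities,
                                    int requestedContainers,
                                    float waitFactor) {
        return requestedContainers * waitFactor < missedOpportunities;
    }
}
```

Under these assumptions, one container requested on one host plus its rack in
a 100-node cluster gives a factor of about 0.02, so a single missed
opportunity is already enough to relax locality. That would be consistent with
a container being returned on a different host almost immediately.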
{quote}The Samza AM can cancel and resubmit a request (either for a different
host or with relaxLocality=true) when a node trying to be allocated becomes
unusable.
{quote}
You will want to keep that logic even after updating the AM to monitor the node
updates. That will cover the case where the desired node is completely full
with long-running containers.
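As a rough sketch of what that combined AM-side logic could look like: the
decision below resubmits a relaxed request either when the node is reported
unusable or when the strict request has been outstanding too long. The timeout
is my assumption for handling the "node is full" case, and all names are
hypothetical. This is not Samza or YARN code.

```java
import java.time.Duration;

// Hypothetical AM-side fallback decision for a host-specific request that
// was submitted with relaxLocality=false. Illustrative sketch only.
final class HostRequestFallbackSketch {

    enum Action { KEEP_WAITING, RESUBMIT_RELAXED }

    private HostRequestFallbackSketch() {
    }

    /**
     * @param nodeUsable whether the latest node update says the preferred host is usable
     * @param waited     how long the strict request has been outstanding
     * @param maxWait    assumed upper bound on how long to insist on the preferred host
     */
    static Action decide(boolean nodeUsable, Duration waited, Duration maxWait) {
        if (!nodeUsable) {
            // Node updates say the host is gone: no point waiting for it.
            return Action.RESUBMIT_RELAXED;
        }
        if (waited.compareTo(maxWait) >= 0) {
            // Host is healthy but has not freed up, e.g. it is full of
            // long-running containers: fall back after the timeout.
            return Action.RESUBMIT_RELAXED;
        }
        return Action.KEEP_WAITING;
    }
}
```

Node updates alone only trigger the first branch. The timeout branch is what
keeps the AM from waiting indefinitely on a healthy but fully occupied host.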
> Clarification on preferredHost request with relaxedLocality
> -----------------------------------------------------------
>
> Key: YARN-8034
> URL: https://issues.apache.org/jira/browse/YARN-8034
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Jagadish
> Priority: Major
>
> I work on Apache Samza, a stateful stream-processing framework that leverages
> Yarn for resource management. The Samza AM requests resources on specific
> hosts to schedule stateful jobs. We set relaxLocality = true in these
> requests we make to Yarn. Often we have observed that we don't get containers
> on the hosts that we requested them on and the Yarn RM returns containers on
> arbitrary hosts.
> Do you know what the behavior of the FairScheduler/CapacityScheduler is when
> setting "relaxLocality = true"? I did play around by setting a high value for
> yarn.scheduler.capacity.node-locality-delay but it did not seem to matter.
> However, when setting relaxLocality = false, we get resources on the exact
> hosts we requested on.
> The behavior I want from Yarn is "Honor locality to the best possible extent
> and only return a container on an arbitrary host if the requested host is
> down". Is there a way to accomplish this?
> If you can point me to the Scheduler code, I'm happy to look at it as well.
> For context, we have continuous scheduling enabled in our clusters.