[ https://issues.apache.org/jira/browse/YARN-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401950#comment-16401950 ]
Jason Lowe commented on YARN-8034:
----------------------------------
If you need a specific host then set relaxLocality=false. Otherwise there's no
guarantee the request will be assigned to the requested host. The host could
be down, full of other containers, unhealthy, etc. When relaxLocality=true,
the RM assumes the application would rather get a container somewhere else in
a reasonably timely manner than wait indefinitely for a full node to free up
space. The node locality delay gives admins some control over how patiently
the RM will wait for locality.
bq. The behavior I want from Yarn is "Honor locality to the best possible
extent and only return a container on an arbitrary host if the requested host
is down". Is there a way to accomplish this?
Yes, although it will require some work on the Samza AM's part. Samza's AM can
make requests for specific nodes with relaxLocality=false, but it should also
monitor the updatedNodes field of each AllocateResponse. The RM uses that
field to notify applications when a node becomes unusable or becomes usable
again. When the node a request is waiting on becomes unusable, the Samza AM
can cancel that request and resubmit it, either for a different host or with
relaxLocality=true.
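
As a rough illustration of that cancel-and-resubmit bookkeeping, here is a
minimal, self-contained sketch. It does not use the Hadoop client library:
HostRequest, NodeUpdate and LocalityFallback are hypothetical stand-ins
invented for this example. In a real AM the node updates would come from
AllocateResponse.getUpdatedNodes(), and the cancel/resubmit steps would map to
removing and adding container requests on the AM's RM client; check the exact
calls against the Hadoop version in use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a container request (not the real YARN class).
final class HostRequest {
    final String host;            // host the container is wanted on
    final boolean relaxLocality;  // false = only this host is acceptable

    HostRequest(String host, boolean relaxLocality) {
        this.host = host;
        this.relaxLocality = relaxLocality;
    }
}

// Hypothetical stand-in for one entry of an AllocateResponse's updatedNodes.
final class NodeUpdate {
    final String host;
    final boolean unusable;  // true when the RM reports the node as unusable

    NodeUpdate(String host, boolean unusable) {
        this.host = host;
        this.unusable = unusable;
    }
}

// Tracks strict (relaxLocality=false) requests and relaxes them when the
// requested node is reported unusable.
final class LocalityFallback {
    private final Map<String, List<HostRequest>> pendingByHost = new HashMap<>();

    // Record a strict request for the given host and return it.
    HostRequest submitStrict(String host) {
        HostRequest request = new HostRequest(host, false);
        pendingByHost.computeIfAbsent(host, h -> new ArrayList<>()).add(request);
        return request;
    }

    // Process node updates from one heartbeat. Every strict request pending on
    // an unusable host is cancelled (dropped from the pending map) and a
    // replacement with relaxLocality=true is returned for resubmission.
    List<HostRequest> onNodeUpdates(List<NodeUpdate> updatedNodes) {
        List<HostRequest> replacements = new ArrayList<>();
        for (NodeUpdate update : updatedNodes) {
            if (!update.unusable) {
                continue;  // node is usable; keep waiting for strict locality
            }
            List<HostRequest> cancelled = pendingByHost.remove(update.host);
            if (cancelled == null) {
                continue;  // nothing was waiting on this host
            }
            for (HostRequest old : cancelled) {
                replacements.add(new HostRequest(old.host, true));
            }
        }
        return replacements;
    }

    // Number of strict requests still waiting on the given host.
    int pendingStrictCount(String host) {
        List<HostRequest> pending = pendingByHost.get(host);
        return pending == null ? 0 : pending.size();
    }
}
```

The sketch takes the relaxLocality=true option for the replacement request;
asking for a different host instead would be an equally valid policy. A real
AM would also need to drop entries from the pending map as containers are
allocated, which is left out here.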
> Clarification on preferredHost request with relaxedLocality
> -----------------------------------------------------------
>
> Key: YARN-8034
> URL: https://issues.apache.org/jira/browse/YARN-8034
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Jagadish
> Priority: Major
>
> I work on Apache Samza, a stateful stream-processing framework that leverages
> Yarn for resource management. The Samza AM requests resources on specific
> hosts to schedule stateful jobs. We set relaxLocality = true in these
> requests we make to Yarn. Often we have observed that we don't get containers
> on the hosts that we requested them on and the Yarn RM returns containers on
> arbitrary hosts.
> Do you know what the behavior of the FairScheduler/CapacityScheduler is when
> setting "relaxLocality = true"? I did play around by setting a high value for
> yarn.scheduler.capacity.node-locality-delay but it did not seem to matter.
> However, when setting relaxLocality = false, we get resources on the exact
> hosts we requested on.
> The behavior I want from Yarn is "Honor locality to the best possible extent
> and only return a container on an arbitrary host if the requested host is
> down". Is there a way to accomplish this?
> If you can point me to the Scheduler code, I'm happy to look at it as well.
> For context, we have continuous scheduling enabled in our clusters.