[ 
https://issues.apache.org/jira/browse/YARN-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402088#comment-16402088
 ] 

Jason Lowe commented on YARN-8034:
----------------------------------

{quote}I observed that the Yarn RM immediately returns a container on a 
different host in the next second after the request was made.
{quote}
I believe something like YARN-6344 is relevant here even though that fix is 
specific to the CapacityScheduler. The schedulers have a heuristic that 
assumes a small number of requests relative to the size of the cluster 
should bias towards responsiveness rather than locality. It's been there a long 
time. I don't know the full history behind it, but I suspect it derives from 
assuming a small request is for a small job and interactivity is more important 
than waiting for locality (since we are allowed to relax). See 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt#getLocalityWaitFactor
 for the equivalent place in the FairScheduler for what is being discussed in 
YARN-6344.
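
To make the heuristic concrete, here is a simplified sketch of how a locality wait factor of this kind could work. It is an illustration of the idea described above, not a copy of the Hadoop source: the class name, method names, and the exact formula are assumptions, and the real scheduler code may differ between versions.

```java
// Hypothetical, self-contained illustration; not the actual Hadoop code.
final class LocalityWaitSketch {

    // Fraction of the cluster's scheduling opportunities to skip before
    // relaxing locality. Assumed formula: few distinct asks relative to the
    // cluster size yields a small factor, i.e. almost no waiting.
    static float localityWaitFactor(int uniqueAsks, int clusterNodes) {
        if (clusterNodes <= 0) {
            return 1.0f; // degenerate cluster size: treat as "wait the maximum"
        }
        int required = Math.max(uniqueAsks - 1, 0);
        // Capped at 1: no point skipping more opportunities than there are nodes.
        return Math.min((float) required / clusterNodes, 1.0f);
    }

    // True once the request has been passed over more often than the scaled
    // threshold, at which point a non-local container may be handed out.
    static boolean mayRelaxLocality(long missedOpportunities,
                                    int uniqueAsks, int clusterNodes) {
        float threshold = clusterNodes * localityWaitFactor(uniqueAsks, clusterNodes);
        return missedOpportunities > threshold;
    }

    public static void main(String[] args) {
        // A small ask on a 100-node cluster relaxes almost immediately.
        System.out.println(mayRelaxLocality(2, 2, 100));
        // A larger ask on the same cluster keeps waiting for locality longer.
        System.out.println(mayRelaxLocality(10, 51, 100));
    }
}
```

Under these assumptions, a request naming only one or two locations on a large cluster gets a threshold close to zero, which would be consistent with seeing a container on a different host within a second.
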
{quote}The Samza AM can cancel and resubmit a request (either for a different 
host or with relaxLocality=true) when a node trying to be allocated becomes 
unusable.
{quote}
You will want to keep that logic even after updating the AM to monitor the node 
updates. That will cover the case where the desired node is completely full 
with long-running containers.
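
As a rough sketch of why both mechanisms are useful together: node updates can report that a host is unusable, but they will not report that a healthy host is simply full. The decision logic below is hypothetical. The type names, the method, and the idea of using a wait timeout are assumptions for illustration, not Samza or YARN code. In a real AM, the resubmit action would presumably map onto removing the pending request and adding a new one through the AMRMClient (removeContainerRequest followed by addContainerRequest).

```java
import java.time.Duration;

// Hypothetical, self-contained illustration of the AM-side fallback decision.
final class LocalityFallbackSketch {

    enum Action { KEEP_WAITING, RESUBMIT_RELAXED }

    // nodeUsable: what the AM learned from node updates about the preferred host.
    // waited:     how long the strict (relaxLocality=false) request has been pending.
    // maxWait:    assumed AM-side limit on how long to hold out for the preferred host.
    static Action decide(boolean nodeUsable, Duration waited, Duration maxWait) {
        if (!nodeUsable) {
            // Node updates say the host is gone or unhealthy: give up on it now.
            return Action.RESUBMIT_RELAXED;
        }
        if (waited.compareTo(maxWait) >= 0) {
            // Host is healthy but nothing was allocated in time, e.g. because it
            // is full of long-running containers. Node updates alone miss this.
            return Action.RESUBMIT_RELAXED;
        }
        return Action.KEEP_WAITING;
    }

    public static void main(String[] args) {
        Duration limit = Duration.ofSeconds(30);
        System.out.println(decide(true, Duration.ofSeconds(5), limit));
        System.out.println(decide(true, Duration.ofSeconds(45), limit));
        System.out.println(decide(false, Duration.ofSeconds(1), limit));
    }
}
```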

> Clarification on preferredHost request with relaxedLocality
> -----------------------------------------------------------
>
>                 Key: YARN-8034
>                 URL: https://issues.apache.org/jira/browse/YARN-8034
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jagadish
>            Priority: Major
>
> I work on Apache Samza, a stateful stream-processing framework that leverages 
> Yarn for resource management. The Samza AM requests resources on specific 
> hosts to schedule stateful jobs. We set relaxLocality = true in these 
> requests we make to Yarn. Often we have observed that we don't get containers 
> on the hosts that we requested them on and the Yarn RM returns containers on 
> arbitrary hosts. 
> Do you know what the behavior of the FairScheduler/CapacityScheduler is when 
> setting "relaxLocality = true"? I did play around by setting a high value for 
> yarn.scheduler.capacity.node-locality-delay but it did not seem to matter. 
> However, when setting relaxLocality = false, we get resources on the exact 
> hosts we requested on.
> The behavior I want from Yarn is "Honor locality to the best possible extent 
> and only return a container on an arbitrary host if the requested host is 
> down". Is there a way to accomplish this?
> If you can point me to the Scheduler code, I'm happy to look at it as well. 
> For context, we have continuous scheduling enabled in our clusters.


