Konstantinos Karanasos commented on YARN-2877:

[~wangda], regarding your question about how the AM will know which NM is more 
idle than others, this is related to YARN-2886. Each NM estimates its queue 
waiting time (based on the tasks currently running and those already waiting in 
its queue) and sends this estimate to the RM through the heartbeat. Note that 
this is just an integer, so it is very lightweight. The RM can then push this 
information to the rest of the NMs (again through the heartbeats). This way 
each node knows the queue status of the other NMs and can decide where to queue 
its queueable requests. However, since this information may not always be 
precise (due to bad estimation or stale info), we also introduce correction 
mechanisms for rebalancing the queues, if need be (YARN-2888).
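
To make the flow above concrete, here is a minimal Python sketch of it. It is only an illustration, not YARN code: the names estimate_wait, ClusterView, on_heartbeat and pick_node are hypothetical, and the estimate assumes tasks on a node run one at a time, so the wait is simply the remaining time of the running task(s) plus the durations of the tasks already queued.

```python
from dataclasses import dataclass, field


def estimate_wait(running_remaining_s, queued_durations_s):
    """Hypothetical NM-side estimate, in seconds, of the queue waiting time.

    Assumption: tasks run one at a time, so a new task waits for the
    remaining time of running tasks plus everything already queued.
    """
    return int(sum(running_remaining_s) + sum(queued_durations_s))


@dataclass
class ClusterView:
    """Hypothetical RM-side table of the latest estimate reported by each NM."""
    waits: dict = field(default_factory=dict)

    def on_heartbeat(self, node_id, wait_s):
        # NM -> RM: the heartbeat carries a single integer.
        self.waits[node_id] = wait_s
        # RM -> NM: the response carries a snapshot of all known estimates.
        return dict(self.waits)


def pick_node(snapshot):
    """Pick the node with the smallest (possibly stale) estimated wait."""
    return min(snapshot, key=snapshot.get)


view = ClusterView()
view.on_heartbeat("nm1", estimate_wait([30], [20, 10]))
view.on_heartbeat("nm2", estimate_wait([5], []))
snapshot = view.on_heartbeat("nm3", estimate_wait([40], [15]))
target = pick_node(snapshot)  # "nm2", the node with the shortest estimated wait
```

Since the snapshot a node holds is only as fresh as its last heartbeat, the choice made by pick_node can be wrong, which is why a separate rebalancing step is still needed.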

Regarding your other questions:
# Such "malicious" AMs are one of the main reasons we introduced the 
Local RM. The AMs can make queueable requests only to the Local RM, which can 
throttle "aggressive" AMs without even needing to reach the central RM. 
Clearly, as you mention, the central RM can also be involved for imposing 
elaborate fairness/capacity constraints, if those are needed.
# Promoting a queueable container to a guaranteed-start one is indeed 
interesting, and we have been investigating the cases for which it would bring 
benefits. One is the case you mention. Another is when a queueable container 
has been pre-empted/killed many times to make room for guaranteed-start requests.
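
As a rough illustration of the two points above, the following Python sketch shows a Local RM that caps the outstanding queueable requests per AM and flags a container for promotion after repeated kills. Everything here is hypothetical: the class and method names, the per-AM cap and the kill threshold are made up for the example, and the actual promotion policy is still being investigated.

```python
from collections import defaultdict


class LocalRM:
    """Hypothetical Local RM sketch; not an actual YARN class."""

    def __init__(self, max_outstanding_per_am=2, promote_after_kills=3):
        # Both limits are illustrative values, not proposed defaults.
        self.max_outstanding = max_outstanding_per_am
        self.promote_after = promote_after_kills
        self.outstanding = defaultdict(int)  # AM id -> outstanding queueable requests
        self.kills = defaultdict(int)        # container id -> times pre-empted/killed

    def request_queueable(self, am_id):
        """Accept the request unless the AM is over its cap (throttling)."""
        if self.outstanding[am_id] >= self.max_outstanding:
            return False  # throttled locally, no round trip to the central RM
        self.outstanding[am_id] += 1
        return True

    def on_container_finished(self, am_id):
        """Free one slot of the AM's quota when a queueable container ends."""
        self.outstanding[am_id] = max(0, self.outstanding[am_id] - 1)

    def on_container_killed(self, container_id):
        """Return True once the container has been killed often enough
        to be a candidate for promotion to guaranteed-start."""
        self.kills[container_id] += 1
        return self.kills[container_id] >= self.promote_after


lrm = LocalRM()
accepted = [lrm.request_queueable("am1") for _ in range(3)]
promote = [lrm.on_container_killed("c1") for _ in range(3)]
```

With the illustrative limits above, the third request from the same AM is rejected, and the container becomes a promotion candidate on its third kill.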

> Extend YARN to support distributed scheduling
> ---------------------------------------------
>                 Key: YARN-2877
>                 URL: https://issues.apache.org/jira/browse/YARN-2877
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Sriram Rao
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on 
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates 
> (i.e., task execution time is much shorter than the time required for 
> obtaining a container from the RM).

This message was sent by Atlassian JIRA
