[ 
https://issues.apache.org/jira/browse/YARN-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350731#comment-16350731
 ] 

Arun Suresh commented on YARN-7839:
-----------------------------------

[~sunilg], regarding the {{CandidateNodeSet}}, let's move the discussion to when 
we refactor the {{AppSchedulingInfo}}, since this patch is isolated to the 
algorithm.

[~kkaranasos]'s comment:
bq. However, what about the case that a node seems full but a container is 
about to finish (and will be finished until the allocate is done)? Should we 
completely reject such nodes, or simply give higher priority to nodes that 
already have available resources?

We are not rejecting those nodes. If a scheduling request cannot be 
satisfied by any node in an algorithm round, it will be retried in the next AM 
heartbeat, and hopefully some of those containers will have completed by then. 
We can set the number of retries to a higher value for clusters that are 
running at a higher utilization.
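
To make the retry behaviour concrete, here is a minimal sketch of one algorithm round with a retry limit. All names ({{RetrySketch}}, {{Pending}}, {{runRound}}, {{maxRetries}}) are hypothetical and are not taken from the patch, and rejecting a request once the limit is exceeded is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; these are not YARN classes.
final class RetrySketch {

    /** A scheduling request waiting to be placed, with its failed-attempt count. */
    static final class Pending {
        final String id;
        int attempts;

        Pending(String id) {
            this.id = id;
        }
    }

    /**
     * Runs one algorithm round over the requests queued at the start of the round.
     * Requests that cannot be placed are re-queued for the next round (the next AM
     * heartbeat) until they have failed more than maxRetries times.
     *
     * @return ids of requests given up on in this round (assumed behaviour)
     */
    static List<String> runRound(Deque<Pending> queue,
                                 Predicate<String> canPlace,
                                 int maxRetries) {
        List<String> rejected = new ArrayList<>();
        int inThisRound = queue.size();
        for (int i = 0; i < inThisRound; i++) {
            Pending p = queue.poll();
            if (canPlace.test(p.id)) {
                continue; // placed: nothing left to retry
            }
            p.attempts++;
            if (p.attempts > maxRetries) {
                rejected.add(p.id); // retries exhausted
            } else {
                queue.add(p); // try again in the next round
            }
        }
        return rejected;
    }
}
```

With a larger {{maxRetries}}, a request stays queued across more rounds, which gives running containers on a busy cluster more time to finish before the request is given up on.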

> Check node capacity before placing in the Algorithm
> ---------------------------------------------------
>
>                 Key: YARN-7839
>                 URL: https://issues.apache.org/jira/browse/YARN-7839
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun Suresh
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>         Attachments: YARN-7839-YARN-6592.001.patch
>
>
> Currently, the Algorithm assigns a node to a request purely based on whether the 
> constraints are met. It is later in the scheduling phase that the Queue 
> capacity and Node capacity are checked. If the request cannot be placed 
> because of unavailable Queue/Node capacity, the request is retried by the 
> Algorithm.
> For clusters that are running at high utilization, we can reduce the retries 
> if we perform the Node capacity check in the Algorithm as well. The Queue 
> capacity check and the other user limit checks can still be handled by the 
> scheduler (since queues and other limits are tied to the scheduler, and not 
> scheduler agnostic).
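
The idea in the description above can be sketched roughly as follows. This is a simplified illustration, not the code from the attached patch: {{NodeCapacitySketch}}, {{Res}}, {{Node}} and {{place}} are hypothetical stand-ins, and resources are reduced to memory and vcores. The sketch only checks capacity; it does not deduct resources between requests in the same round.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical illustration only; these are not YARN classes.
final class NodeCapacitySketch {

    /** Simplified resource vector: memory in MB and virtual cores. */
    static final class Res {
        final long memoryMb;
        final int vcores;

        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        /** True if this request fits within the given available resources. */
        boolean fitsIn(Res available) {
            return memoryMb <= available.memoryMb && vcores <= available.vcores;
        }
    }

    /** A candidate node and the resources it currently has unallocated. */
    static final class Node {
        final String name;
        final Res unallocated;

        Node(String name, Res unallocated) {
            this.name = name;
            this.unallocated = unallocated;
        }
    }

    /**
     * Picks the first candidate that satisfies the placement constraints AND
     * has enough unallocated capacity. Queue and user-limit checks are left
     * out on purpose, since those stay with the scheduler.
     */
    static Optional<Node> place(Res request,
                                Predicate<Node> constraintsMet,
                                List<Node> candidates) {
        for (Node node : candidates) {
            if (!constraintsMet.test(node)) {
                continue; // constraint check: what the Algorithm already does
            }
            if (!request.fitsIn(node.unallocated)) {
                continue; // node capacity check: the addition discussed here
            }
            return Optional.of(node);
        }
        return Optional.empty(); // no node this round; the request would be retried
    }
}
```

Skipping a full node inside the Algorithm avoids handing the scheduler a placement it would only bounce back for a retry.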



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
