[ 
https://issues.apache.org/jira/browse/YARN-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773299#comment-16773299
 ] 

Yufei Gu edited comment on YARN-9278 at 2/20/19 7:47 PM:
---------------------------------------------------------

Hi [~uranus], this looks like a performance issue on a busy, large cluster, caused by
the preemption implementation, which iterates over nodes and checks each one.
Setting a threshold on the number of nodes doesn't look elegant, but it is reasonable if
we can't change the iterate-and-check approach to identifying preemptable containers.
It may not be the only option, though.

Without introducing more complexity to FS preemption, which is already very
complicated, there are some workarounds you can try: increase the FairShare
Preemption Timeout and the FairShare Preemption Threshold to reduce the chance of
preemption. This is especially useful for a large cluster, since there is a better
chance of getting resources just by waiting.



was (Author: yufeigu):
Hi [~uranus], this looks like a performance issue on a busy, large cluster, caused by
the preemption implementation, which iterates over nodes and checks each one.

I would suggest lowering
{{yarn.scheduler.fair.preemption.cluster-utilization-threshold}} to let
preemption kick in earlier on a large cluster. The default value is 80%, which
means preemption won't kick in until 80% of the whole cluster's resources have
been used. Please be aware that a low utilization threshold may cause
unnecessary container churn, so you don't want it to be too low.
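
As a rough illustration of the utilization gate described above, the sketch below returns true only once cluster utilization exceeds the threshold. The class and method names are hypothetical, and it assumes utilization is taken as the larger of the memory and vcore ratios; it is not the FairScheduler source.

```java
// Hypothetical sketch of a cluster-utilization gate for preemption.
final class UtilizationGate {
    private UtilizationGate() {}

    /**
     * Returns true when the larger of the memory and vcore utilization
     * ratios is strictly above the threshold (for example 0.8f for 80%).
     */
    static boolean shouldAttemptPreemption(long usedMb, long totalMb,
                                           int usedVcores, int totalVcores,
                                           float threshold) {
        if (totalMb <= 0 || totalVcores <= 0) {
            return false; // empty cluster: nothing to preempt
        }
        float memoryUtilization = (float) usedMb / totalMb;
        float vcoreUtilization = (float) usedVcores / totalVcores;
        return Math.max(memoryUtilization, vcoreUtilization) > threshold;
    }
}
```

With a threshold of 0.8f, a cluster at 90% memory use passes the gate while one at 50% does not; lowering the threshold to 0.4f would let the second cluster pass as well.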

> Shuffle nodes when selecting to be preempted nodes
> --------------------------------------------------
>
>                 Key: YARN-9278
>                 URL: https://issues.apache.org/jira/browse/YARN-9278
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Zhaohui Xin
>            Assignee: Zhaohui Xin
>            Priority: Major
>
> We should *shuffle* the nodes so that the same nodes are not preempted repeatedly.
> We should also *limit* the number of nodes examined to make preemption more efficient.
> For example:
> {code:java}
> // We should not iterate over all nodes; that would be very slow.
> long maxTryNodeNum =
>     context.getPreemptionConfig().getToBePreemptedNodeMaxNumOnce();
> if (potentialNodes.size() > maxTryNodeNum) {
>   // Shuffle so that the same nodes are not picked every time.
>   Collections.shuffle(potentialNodes);
>   List<FSSchedulerNode> newPotentialNodes = new ArrayList<FSSchedulerNode>();
>   for (int i = 0; i < maxTryNodeNum; i++) {
>     newPotentialNodes.add(potentialNodes.get(i));
>   }
>   potentialNodes = newPotentialNodes;
> }
> {code}
>  
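
The shuffle-and-limit idea in the description can be tried outside the scheduler. The following is a minimal, self-contained sketch; `NodeSampler` and `sample` are hypothetical names, and the generic type stands in for `FSSchedulerNode`. Unlike the snippet above, it shuffles a copy, so the caller's list is left untouched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: pick at most maxNodes candidates in random order.
final class NodeSampler {
    private NodeSampler() {}

    static <T> List<T> sample(List<T> nodes, int maxNodes, Random rng) {
        if (maxNodes < 0) {
            throw new IllegalArgumentException("maxNodes must be >= 0");
        }
        // Work on a copy so the input list is never reordered.
        List<T> copy = new ArrayList<>(nodes);
        if (copy.size() <= maxNodes) {
            return copy; // already within the limit; keep the original order
        }
        Collections.shuffle(copy, rng);
        // Keep only the first maxNodes entries of the shuffled copy.
        return new ArrayList<>(copy.subList(0, maxNodes));
    }
}
```

Passing the `Random` explicitly makes the selection reproducible in tests. In production a shared or thread-local generator would probably be preferable.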



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

