Wangda Tan commented on YARN-8149:

Thanks [~cheersyang] for pointing to the original Jira. 

I would say this could be more harmful than useful: the re-reservation count 
can grow as large as MAX_INT, which means an app could reserve on many nodes 
even if it has only one pending large resource request. With preemption 
enhancements such as surgical preemption, I think we don't need this any more.

Still want to hear thoughts from others before taking action.

> Revisit behavior of Re-Reservation in Capacity Scheduler
> --------------------------------------------------------
>                 Key: YARN-8149
>                 URL: https://issues.apache.org/jira/browse/YARN-8149
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wangda Tan
>            Priority: Critical
> Frankly speaking, I'm not sure why we need the re-reservation. The formula is 
> not that easy to understand:
> Inside: 
> {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator#shouldAllocOrReserveNewContainer}}
> {code:java}
> starvation = (re-reservation / #reserved-container) *
>      (1 - min(requested-resource / max-alloc,
>               (max-alloc - min-alloc) / max-alloc))
> should_allocate = starvation + requiredContainers - reservedContainers > 0
> {code}
> I think we should be able to remove the starvation computation; just checking 
> requiredContainers > reservedContainers should be enough.
> In a large cluster, we can easily overflow re-reservation to MAX_INT, see 
> YARN-7636. 
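
To make the difference between the two checks concrete, here is a minimal 
standalone sketch. It is not the actual RegularContainerAllocator code: the 
class name, method names and plain float resource values are hypothetical, and 
it assumes the starvation term is read as (re-reservation / 
#reserved-container) * (1 - min(requested / max-alloc, (max-alloc - min-alloc) 
/ max-alloc)).

```java
// Hypothetical sketch; resources are simplified to a single float (e.g. MB).
public final class ReReservationSketch {
    private ReReservationSketch() {}

    /** Starvation term as read from the formula in the issue description. */
    public static int starvation(int reReservations, int reservedContainers,
                                 float requested, float minAlloc, float maxAlloc) {
        if (reservedContainers <= 0) {
            return 0; // nothing reserved yet, so no starvation bias
        }
        // Fraction of the maximum allocation that one request needs.
        float nodeFactor = requested / maxAlloc;
        // Caps nodeFactor so a request for the whole node still gets some bias.
        float minAllocFactor = (maxAlloc - minAlloc) / maxAlloc;
        return (int) ((reReservations / (float) reservedContainers)
                * (1.0f - Math.min(nodeFactor, minAllocFactor)));
    }

    /** Check with the starvation term included. */
    public static boolean checkWithStarvation(int starvation, int required, int reserved) {
        return starvation + required - reserved > 0;
    }

    /** Simplified check proposed in the issue description. */
    public static boolean checkSimplified(int required, int reserved) {
        return required > reserved;
    }

    public static void main(String[] args) {
        // One pending 4096 MB request, one container already reserved,
        // 10 re-reservations so far; min/max allocation 1024/8192 MB.
        int s = starvation(10, 1, 4096f, 1024f, 8192f);
        System.out.println("starvation = " + s);
        System.out.println("with starvation: " + checkWithStarvation(s, 1, 1));
        System.out.println("simplified:      " + checkSimplified(1, 1));
    }
}
```

In this example the app needs one container and already holds one reservation, 
yet the check with the starvation term still allows another reservation once 
re-reservations accumulate, while the simplified check does not.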
