Thomas Graves commented on YARN-3434:

And I have a question about the continuous reservation checking behavior, which 
may or may not be related to this issue: currently it will try to unreserve all 
containers under a user, but it will actually unreserve at most one container to 
allocate a new container. Do you think it is fine to change the logic to be:
When (continousReservation-enabled) && (user.usage + required - 
min(max-allocation, user.total-reserved) <= user.limit), assignContainers will 
continue. This will prevent attempting impossible allocations when a user has 
reserved lots of containers. (The same as the queue reservation checking.)
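
For illustration only, the proposed condition can be sketched in Java as below. 
This is not the actual CapacityScheduler code: the class name, method name, and 
parameters are hypothetical, all quantities are assumed to be memory in MB, and 
the fallback used when continuous reservation checking is disabled is an 
assumption, since the proposal only describes the enabled case.

```java
public class ReservationCheckSketch {

    /**
     * Hypothetical helper mirroring the proposed condition:
     * user.usage + required - min(max-allocation, user.total-reserved) <= user.limit
     * All arguments are plain memory amounts in MB (an assumption for this sketch).
     */
    static boolean canContinueAssigning(boolean continuousReservationEnabled,
                                        long userUsage,
                                        long required,
                                        long maxAllocation,
                                        long userTotalReserved,
                                        long userLimit) {
        if (!continuousReservationEnabled) {
            // Assumed fallback: a plain user-limit check with no unreserve credit.
            return userUsage + required <= userLimit;
        }
        // At most one reserved container is unreserved per new allocation, so the
        // credit is capped at the maximum allocation size, not the whole reserved total.
        long reclaimable = Math.min(maxAllocation, userTotalReserved);
        return userUsage + required - reclaimable <= userLimit;
    }

    public static void main(String[] args) {
        long container = 8L * 1024;        // one 8G container, in MB
        long reserved = 1000L * container; // about 1000 reserved containers
        long limit = 100L * 1024;          // hypothetical user limit of 100G

        // User exactly at the limit: unreserving one container makes room.
        System.out.println(canContinueAssigning(
                true, limit, container, container, reserved, limit));

        // User far over the limit: unreserving one container cannot help,
        // so the check short-circuits.
        System.out.println(canContinueAssigning(
                true, limit + 5 * container, container, container, reserved, limit));
    }
}
```

With these made-up numbers the first call allows assignContainers to continue 
and the second stops early, which is the short circuit the proposal is after.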

I do think the reservation checking and unreserving can be improved.  I 
basically started with a very simple approach and figured we could improve it 
later.  I'm not sure how much that check would help in practice.  I guess it 
might help in cases where you have one user in the queue, a second one shows 
up, and the first user's limit gets decreased by a lot.  In that case the check 
would let it short circuit here instead of continuing.  So it seems ok for that.

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --------------------------------------------------------------------------------------
>                 Key: YARN-3434
>                 URL: https://issues.apache.org/jira/browse/YARN-3434
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>         Attachments: YARN-3434.patch
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers of 8G each within about 5 seconds. I think this allowed the 
> logic in assignToUser() to let the user limit be surpassed.
