[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503919#comment-14503919
 ] 

Wangda Tan commented on YARN-3434:
----------------------------------

[~tgraves], thanks for updating. Some comments on the latest patch:
1) I suggest renaming LeafQueue.currentResourceLimit -> 
cachedResourceLimitsToComputeHeadroom (or another, shorter name) to make the 
scope of this field clear.

2) It would be better to copy currentResourceLimit in 
updateCurrentResourceLimits and save the copy to 
cachedResourceLimitsToComputeHeadroom. This is not strictly necessary, though, 
since the limit is already copied when it is passed down from ParentQueue and 
we don't change the value returned by getLimit in the subsequent computations.

3) If you agree with 2), we don't need to copy resourceLimit:
{code}
ResourceLimits userResourceLimits =
    new ResourceLimits(this.cachedResourceLimitsToComputeHeadroom.getLimit());
{code}
The ResourceLimits object is already copied when it is passed down to LeafQueue 
in ParentQueue.getResourceLimitsOfChild, so we don't need to copy it again here.
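
To illustrate points 2) and 3), here is a minimal sketch of why a second copy in 
LeafQueue is redundant once the parent has already copied the limit. All classes 
below are simplified, hypothetical stand-ins written for this example (a 
memory-only Resource, a trimmed ResourceLimits, toy queue classes); they are not 
the actual Hadoop classes, and the method bodies are assumptions rather than the 
patch's code.

```java
// Simplified stand-ins for illustration only; NOT the real Hadoop classes.
public class ResourceLimitsCopySketch {

    /** Hypothetical mutable resource that tracks memory only. */
    static final class Resource {
        private long memoryMb;
        Resource(long memoryMb) { this.memoryMb = memoryMb; }
        long getMemoryMb() { return memoryMb; }
        void setMemoryMb(long memoryMb) { this.memoryMb = memoryMb; }
        Resource copy() { return new Resource(memoryMb); }
    }

    /** Hypothetical wrapper around a limit, loosely modelled on ResourceLimits. */
    static final class ResourceLimits {
        private final Resource limit;
        ResourceLimits(Resource limit) { this.limit = limit; }
        Resource getLimit() { return limit; }
    }

    /** Toy parent queue: hands each child its own copy of the limit. */
    static final class ParentQueue {
        private final Resource parentLimit;
        ParentQueue(Resource parentLimit) { this.parentLimit = parentLimit; }

        // Assumed behaviour: the copy happens here, once, when passing down.
        ResourceLimits getResourceLimitsOfChild() {
            return new ResourceLimits(parentLimit.copy());
        }

        // Simulates the parent's own limit changing later.
        void setLimitMb(long memoryMb) { parentLimit.setMemoryMb(memoryMb); }
    }

    /** Toy leaf queue: caches what it was given, without copying again. */
    static final class LeafQueue {
        private ResourceLimits cachedResourceLimitsToComputeHeadroom;

        void updateCurrentResourceLimits(ResourceLimits currentResourceLimits) {
            // No defensive copy: the parent already gave us a private instance,
            // and nothing in this class mutates getLimit() afterwards.
            this.cachedResourceLimitsToComputeHeadroom = currentResourceLimits;
        }

        long cachedLimitMb() {
            return cachedResourceLimitsToComputeHeadroom.getLimit().getMemoryMb();
        }
    }

    public static void main(String[] args) {
        ParentQueue parent = new ParentQueue(new Resource(8192));
        LeafQueue leaf = new LeafQueue();
        leaf.updateCurrentResourceLimits(parent.getResourceLimitsOfChild());

        // A later change on the parent side does not leak into the leaf's cache.
        parent.setLimitMb(4096);
        System.out.println(leaf.cachedLimitMb()); // prints 8192
    }
}
```

The sketch only holds as long as the leaf never mutates the object returned by 
getLimit(); if that assumption changes, the defensive copy in 
updateCurrentResourceLimits from 2) becomes necessary.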

4) canAssignToUser parameter list: rename localResourceLimits -> 
currentResourceLimits so that it matches the other methods, and rename 
limit -> userLimit.

5) There are several variables named limitInfo; rename them to 
currentResourceLimits for consistency?

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-3434
>                 URL: https://issues.apache.org/jira/browse/YARN-3434
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>         Attachments: YARN-3434.patch, YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, 8G each, within about 5 seconds. I think this allowed the 
> logic in assignToUser() to let the userlimit be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
