Wangda Tan commented on YARN-1198:

Hi [~cwelch],
Sorry for this late response, I've just looked your ver.8 patch and comments,
My reply,
bq. -re "we don't need write HeadroomProvider for each scheduler" 
bq. Provider vs Reference
I agree with this, I think we need write different Headroom Provider and it's 
better to keep Provider since its more general.

bq. -re "As mentioned by Jason, currently ...
Agree, this can be done in a separated JIRA

bq. -re the cost of the calculation
Agree, it's just a small computation effort.

In the past, I suggest do as I mentioned 
 because I think that will make code more clean.
But according to your ver.8 patch, I realized that may not doable. In 
LeafQueue#computeUserLimit, it uses required to get user limit. In your patch, 
you save the lastRequired to user class. However, we need different required 
for different app under a same user. We can only do the calculate when app 
heartbeats (We can also loop and set all app's headroom, but that's a way we 
abandoned before). 

So basically, IMHO, I think your ver.7 is a more correct way to go. Which keeps 
complexity/efficiency balanced. 
Any thoughts? [~jianhe], [~cwelch].


> Capacity Scheduler headroom calculation does not work as expected
> -----------------------------------------------------------------
>                 Key: YARN-1198
>                 URL: https://issues.apache.org/jira/browse/YARN-1198
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Craig Welch
>         Attachments: YARN-1198.1.patch, YARN-1198.2.patch, YARN-1198.3.patch, 
> YARN-1198.4.patch, YARN-1198.5.patch, YARN-1198.6.patch, YARN-1198.7.patch, 
> YARN-1198.8.patch
> Today headroom calculation (for the app) takes place only when
> * New node is added/removed from the cluster
> * New container is getting assigned to the application.
> However there are potentially lot of situations which are not considered for 
> this calculation
> * If a container finishes then headroom for that application will change and 
> should be notified to the AM accordingly.
> * If a single user has submitted multiple applications (app1 and app2) to the 
> same queue then
> ** If app1's container finishes then not only app1's but also app2's AM 
> should be notified about the change in headroom.
> ** Similarly if a container is assigned to any applications app1/app2 then 
> both AM should be notified about their headroom.
> ** To simplify the whole communication process it is ideal to keep headroom 
> per User per LeafQueue so that everyone gets the same picture (apps belonging 
> to same user and submitted in same queue).
> * If a new user submits an application to the queue then all applications 
> submitted by all users in that queue should be notified of the headroom 
> change.
> * Also today headroom is an absolute number ( I think it should be normalized 
> but then this is going to be not backward compatible..)
> * Also  when admin user refreshes queue headroom has to be updated.
> These all are the potential bugs in headroom calculations

This message was sent by Atlassian JIRA

Reply via email to