Craig Welch created YARN-2848:

             Summary: (FICA) Applications should maintain an application 
specific 'cluster' resource to calculate headroom and userlimit
                 Key: YARN-2848
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: capacityscheduler
            Reporter: Craig Welch
            Assignee: Craig Welch

Likely solutions to [YARN-1680] (properly handling node and rack blacklisting 
with cluster level node additions and removals) will entail managing an 
application-level "slice" of the cluster resource available to the application 
for use in accurately calculating the application headroom and user limit.  
There is an assumption that events which impact this resource will occur less 
frequently than headroom, userlimit, etc. need to be calculated (a valid 
assumption, given that the calculation occurs on every allocation heartbeat).  
Given that, the application should (with assistance from cluster-level code...) 
detect changes to the composition of the cluster (node addition, removal) and, 
when those have occurred, calculate an application-specific cluster resource 
by comparing cluster nodes to its own blacklist (both rack and individual 
node).  I think 
it makes sense to include nodelabel considerations into this calculation as it 
will be efficient to do both at the same time and the single resource value 
reflecting both constraints could then be used for efficient frequent headroom 
and userlimit calculations while remaining highly accurate.  The application 
would need to be made aware of the nodelabel changes it is interested in (the 
addition or removal of labels of interest to the application to/from nodes).  
For this purpose, the application submission's nodelabel expression would be 
used to determine the nodelabel impact on the resource used to calculate 
userlimit and headroom.  (Cases where an application elects to request 
resources without using the application-level label expression are out of scope 
for this, but for the common use case of an application which uses a particular 
expression throughout, userlimit and headroom would be accurate.)  This could 
also provide 
an overall mechanism for handling application-specific resource constraints 
which might be added in the future.
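
As a rough illustration of the idea (not actual CapacityScheduler code), the 
sketch below caches an application-specific cluster resource and rescans the 
nodes only when something relevant has changed.  Every name in it (Node, 
Resource, AppView, clusterVersion) is hypothetical.  It assumes that 
cluster-level code bumps a version number on node addition/removal or label 
changes, and it simplifies the nodelabel expression to a single label, with 
null meaning no label constraint.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only: none of these types are real YARN classes. */
public class AppClusterResourceSketch {

    /** Simplified stand-in for a cluster node and its capacity. */
    static final class Node {
        final String host;
        final String rack;
        final Set<String> labels;
        final long memoryMb;
        final int vcores;

        Node(String host, String rack, long memoryMb, int vcores, String... labels) {
            this.host = host;
            this.rack = rack;
            this.memoryMb = memoryMb;
            this.vcores = vcores;
            this.labels = new HashSet<>(Arrays.asList(labels));
        }
    }

    /** Simplified two-dimensional resource (memory, vcores). */
    static final class Resource {
        final long memoryMb;
        final int vcores;

        Resource(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Per-application state holding the cached "slice" of the cluster. */
    static final class AppView {
        private final Set<String> blacklistedHosts = new HashSet<>();
        private final Set<String> blacklistedRacks = new HashSet<>();
        private final String label;      // assumed single label; null = no constraint
        private long seenClusterVersion; // cluster composition version last used
        private boolean dirty = true;    // set when the app's own blacklist changes
        private Resource cached = new Resource(0, 0);
        int recomputeCount = 0;          // exposed only to demonstrate the caching

        AppView(String label) {
            this.label = label;
        }

        void blacklistHost(String host) {
            if (blacklistedHosts.add(host)) dirty = true;
        }

        void blacklistRack(String rack) {
            if (blacklistedRacks.add(rack)) dirty = true;
        }

        /**
         * Intended to be called on every allocation heartbeat. The scan over
         * all nodes runs only when the cluster version or the blacklist has
         * changed since the previous call; otherwise the cached value is used.
         */
        Resource clusterResource(List<Node> nodes, long clusterVersion) {
            if (dirty || clusterVersion != seenClusterVersion) {
                long mem = 0;
                int cores = 0;
                for (Node n : nodes) {
                    boolean blacklisted = blacklistedHosts.contains(n.host)
                            || blacklistedRacks.contains(n.rack);
                    boolean labelMatches = label == null || n.labels.contains(label);
                    if (!blacklisted && labelMatches) {
                        mem += n.memoryMb;
                        cores += n.vcores;
                    }
                }
                cached = new Resource(mem, cores);
                seenClusterVersion = clusterVersion;
                dirty = false;
                recomputeCount++;
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new Node("n1", "r1", 8192, 8, "gpu"),
                new Node("n2", "r1", 8192, 8, "gpu"),
                new Node("n3", "r2", 4096, 4, "gpu"),
                new Node("n4", "r2", 4096, 4));
        AppView app = new AppView("gpu");
        app.blacklistHost("n1");
        Resource r = app.clusterResource(nodes, 1L);
        System.out.println(r.memoryMb + " MB, " + r.vcores + " vcores");
    }
}
```

In the example, n1 is blacklisted and n4 lacks the label, so the slice is 
n2 + n3 (12288 MB, 12 vcores).  Headroom and userlimit would then be 
calculated against this cached value rather than the full cluster resource.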

This message was sent by Atlassian JIRA
