Chen He commented on YARN-1680:

Hi [~cwelch], thank you for the comments. 

{quote}Unfortunately, I don't think that this can be solved with checks during 
addition and removal - I believe that we will need to keep a persistent picture 
of all blacklisted nodes for an application regardless of their cluster state 
because the two can vary independently and changes after a blacklist request 
may invalidate things{quote}

Yes, I agree. However, as [~jlowe] suggested, just simply fix the headroom 
calculation without introducing new mechanism or facts can help some clusters 
that are not using label scheduling. Can we leave the new feature and fact 
introduction in YARN-2848, and just fix the headroom calculation here? 

{quote} (for example, cluster blacklists just before app blacklists, the app 
blacklist request is discarded, the cluster reinstates but the app still cannot 
use the node for reasons different from the nodes cluster availability - we 
will still include that node in headroom incorrectly...).{quote}

Do you mean this scenario: the cluster adds Node A to its blacklist, and a 
millisecond later the app also requests to blacklist A, but the app's request 
is discarded. (Why would it be discarded, though? I try to enumerate all the 
possibilities below.)

I wrote out all possible conditions here:
1. The cluster adds A to its blacklist and the app never does: we simply 
remove A's resource from clusterResource. This is covered in my previous patch.
2. The cluster adds A and the app does too: under the previous solution, A's 
resource could be subtracted from clusterResource twice, which is incorrect; 
we should subtract it only once.
3. The cluster does not add A but the app does: this is the normal case and 
not a problem.
4. The cluster adds A and the app does too, but a millisecond later the 
cluster removes A (the node suddenly becomes healthy again): we still need to 
subtract A's resource only once.
5. The cluster does not add A, the app adds A, and the cluster then adds A 
during the headroom calculation: we may report an incorrect headroom for that 
moment, but a correct headroom is returned on the next heartbeat.

Please let me know if there is any scenario that I did not cover. 

> availableResources sent to applicationMaster in heartbeat should exclude 
> blacklistedNodes free memory.
> ------------------------------------------------------------------------------------------------------
>                 Key: YARN-1680
>                 URL: https://issues.apache.org/jira/browse/YARN-1680
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 2.2.0, 2.3.0
>         Environment: SuSE 11 SP2 + Hadoop-2.3 
>            Reporter: Rohith
>            Assignee: Chen He
>         Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
> YARN-1680-v2.patch, YARN-1680.patch
> There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. 
> Cluster slow start is set to 1.
> A job is running; its reducer tasks occupy 29GB of the cluster. One 
> NodeManager (NM-4) became unstable (3 map tasks were killed), so the 
> MRAppMaster blacklisted it. All reducer tasks are running in the cluster now.
> The MRAppMaster does not preempt the reducers because the reducer-preemption 
> calculation's headRoom includes the blacklisted node's memory. This makes 
> jobs hang forever (the ResourceManager does not assign any new containers on 
> blacklisted nodes, but the availableResources it returns still reflects the 
> whole cluster's free memory).
