Wangda Tan commented on YARN-1680:

Had some offline discussion with [~jianhe] and [~cwelch]. Some takeaways from my side:
- If we want an accurate headroom for an application that has blacklisted nodes, it looks unavoidable to compute sum(app.blacklisted_nodes.avail) while calculating headroom for the app. This requires that when a node heartbeats with a changed available resource, all apps that blacklisted the node be notified; when lots of applications blacklist a large number of nodes, a performance regression could happen.
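A minimal sketch of the accurate calculation described above, assuming we subtract the *available* resource of the app's blacklisted nodes from the queue headroom (all class, method, and node names here are hypothetical, not actual YARN APIs):

```java
import java.util.*;

public class AccurateHeadroom {
    // Hypothetical helper: queue headroom minus the available resource
    // of every node this app has blacklisted.
    static long headroomMb(long queueHeadroomMb,
                           Map<String, Long> nodeAvailableMb,
                           Set<String> blacklistedNodes) {
        long blacklistedAvail = 0;
        for (String node : blacklistedNodes) {
            // Needs the latest per-node availability, which is why every
            // heartbeat that changes a node's available resource would have
            // to be propagated to all apps that blacklisted that node.
            blacklistedAvail += nodeAvailableMb.getOrDefault(node, 0L);
        }
        return Math.max(0, queueHeadroomMb - blacklistedAvail);
    }

    public static void main(String[] args) {
        Map<String, Long> avail = new HashMap<>();
        avail.put("nm1", 0L);
        avail.put("nm2", 1024L);
        avail.put("nm3", 0L);
        avail.put("nm4", 2048L); // blacklisted by this app
        // 3072MB queue headroom, but 2048MB of it sits on the blacklisted node.
        System.out.println(
            headroomMb(3072, avail, Collections.singleton("nm4")));
        // prints 1024
    }
}
```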
- If we use sum(app.blacklisted_nodes.total) instead of sum(app.blacklisted_nodes.avail), the app's headroom could be underestimated. This could cause an app with blacklisted nodes to always receive 0 headroom on a large cluster with high resource utilization (like >99%).
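A worked illustration of this underestimation, using made-up numbers close to the scenario in the issue description (3GB cluster-wide headroom, one blacklisted 8GB node with 2GB actually free):

```java
public class TotalVsAvail {
    public static void main(String[] args) {
        long queueHeadroomMb = 3 * 1024;    // 3GB free cluster-wide
        long blacklistedTotalMb = 8 * 1024; // blacklisted node's total capacity
        long blacklistedAvailMb = 2 * 1024; // what is actually free on it

        // Subtracting the node's *total* wipes out the headroom entirely.
        long usingTotal = Math.max(0, queueHeadroomMb - blacklistedTotalMb);
        // Subtracting only its *available* resource keeps the genuinely
        // usable 1GB visible to the app.
        long usingAvail = Math.max(0, queueHeadroomMb - blacklistedAvailMb);

        System.out.println(usingTotal); // prints 0
        System.out.println(usingAvail); // prints 1024
    }
}
```

On a near-full cluster the total-based estimate stays pinned at 0 for as long as the node is blacklisted, even when other nodes free up small amounts of resource.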

Some fallback strategies:
# Only do the accurate headroom calculation when there are not too many blacklisted nodes or apps with blacklisted nodes.
# Tolerate underestimation of headroom
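Fallback #1 could be gated by simple thresholds, roughly as sketched below (threshold names and values are hypothetical, not existing YARN configuration):

```java
public class HeadroomPolicy {
    // Hypothetical cutoffs for when the accurate (per-node-avail)
    // calculation is considered cheap enough to run.
    static final int MAX_BLACKLISTED_NODES = 20;
    static final int MAX_APPS_WITH_BLACKLIST = 50;

    // Fall back to the cheap (possibly underestimating) calculation
    // once either count grows too large.
    static boolean useAccurateHeadroom(int blacklistedNodes,
                                       int appsWithBlacklist) {
        return blacklistedNodes <= MAX_BLACKLISTED_NODES
            && appsWithBlacklist <= MAX_APPS_WITH_BLACKLIST;
    }

    public static void main(String[] args) {
        System.out.println(useAccurateHeadroom(5, 10));   // prints true
        System.out.println(useAccurateHeadroom(500, 10)); // prints false
    }
}
```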

Some alternatives:
- MAPREDUCE-6302 targets preempting reducers even when the headroom reported to apps is inaccurate. The approach looks good to me.
- Move the headroom calculation to the application side. I think we cannot do that, at least for now: the application only receives an updated NodeReport when a node changes health status, not on regular heartbeats, and we cannot send that much data to the AM during heartbeat.

> availableResources sent to applicationMaster in heartbeat should exclude 
> blacklistedNodes free memory.
> ------------------------------------------------------------------------------------------------------
>                 Key: YARN-1680
>                 URL: https://issues.apache.org/jira/browse/YARN-1680
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>    Affects Versions: 2.2.0, 2.3.0
>         Environment: SuSE 11 SP2 + Hadoop-2.3 
>            Reporter: Rohith
>            Assignee: Craig Welch
>         Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
> YARN-1680-v2.patch, YARN-1680.patch
> There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster 
> slow start is set to 1.
> A job is running; reducer tasks occupy 29GB of the cluster. One NodeManager 
> (NM-4) became unstable (3 map tasks got killed), so MRAppMaster blacklisted the 
> unstable NodeManager (NM-4). All reducer tasks are now running in the cluster.
> MRAppMaster does not preempt the reducers because the headroom used for the 
> reducer-preemption calculation includes blacklisted nodes' memory. This makes 
> the job hang forever (ResourceManager does not assign any new containers on 
> blacklisted nodes, but the returned availableResource considers cluster-wide 
> free memory).

This message was sent by Atlassian JIRA
