[ 
https://issues.apache.org/jira/browse/YARN-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815950#comment-13815950
 ] 

Mck SembWever commented on YARN-80:
-----------------------------------

How can one debug this process? 
It was easy before with just `grep "Choosing" hadoop-xxx-jobtracker.log`.
I can't find any similar information in YARN log files.

Background: I just upgraded to YARN (hadoop-2.2.0), and despite setting 
yarn.scheduler.capacity.node-locality-delay=3 in capacity-scheduler.xml, 
data locality is poor. (It was 100% with hadoop-0.22 and the fair scheduler.)
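
For reference, the setting mentioned above would look roughly like this in capacity-scheduler.xml. This is only a sketch: the property name and the value 3 are the ones quoted above, and the surrounding `<configuration>` wrapper is assumed to be the standard Hadoop configuration file layout.

```xml
<configuration>
  <!-- Value taken from the comment above. -->
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>3</value>
  </property>
</configuration>
```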



> Support delay scheduling for node locality in MR2's capacity scheduler
> ----------------------------------------------------------------------
>
>                 Key: YARN-80
>                 URL: https://issues.apache.org/jira/browse/YARN-80
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>            Reporter: Todd Lipcon
>            Assignee: Arun C Murthy
>             Fix For: 2.0.2-alpha, 0.23.6
>
>         Attachments: YARN-80.patch, YARN-80.patch
>
>
> The capacity scheduler in MR2 doesn't support delay scheduling for achieving 
> node-level locality. So, jobs exhibit poor data locality even if they have 
> good rack locality. Especially on clusters where disk throughput is much 
> better than network capacity, this hurts overall job performance. We should 
> optionally support node-level delay scheduling heuristics similar to what the 
> fair scheduler implements in MR1.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
