[ https://issues.apache.org/jira/browse/YARN-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815950#comment-13815950 ]
Mck SembWever commented on YARN-80:
-----------------------------------
How can one debug this process?
It was easy before with just `grep "Choosing" hadoop-xxx-jobtracker.log`.
I can't find any similar information in YARN log files.
Background: I just upgraded to YARN (hadoop-2.2.0). Despite setting
yarn.scheduler.capacity.node-locality-delay=3 in capacity-scheduler.xml,
data locality is poor. (It was 100% with hadoop-0.22 and the fair scheduler.)
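
For reference, a setting like the one described above would typically appear in capacity-scheduler.xml roughly as follows. This is only a sketch of the fragment: the value 3 is the one quoted in this comment, not a recommendation, and the property is assumed to sit inside the file's usual `<configuration>` element.

```xml
<!-- capacity-scheduler.xml (fragment, goes inside <configuration>) -->
<property>
  <name>yarn.scheduler.capacity.node-locality-delay</name>
  <!-- value taken from the comment above; tune per cluster -->
  <value>3</value>
</property>
```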
> Support delay scheduling for node locality in MR2's capacity scheduler
> ----------------------------------------------------------------------
>
> Key: YARN-80
> URL: https://issues.apache.org/jira/browse/YARN-80
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacityscheduler
> Reporter: Todd Lipcon
> Assignee: Arun C Murthy
> Fix For: 2.0.2-alpha, 0.23.6
>
> Attachments: YARN-80.patch, YARN-80.patch
>
>
> The capacity scheduler in MR2 doesn't support delay scheduling for achieving
> node-level locality. So, jobs exhibit poor data locality even if they have
> good rack locality. Especially on clusters where disk throughput is much
> better than network capacity, this hurts overall job performance. We should
> optionally support node-level delay scheduling heuristics similar to what the
> fair scheduler implements in MR1.
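
The delay scheduling idea described above can be illustrated with a small sketch. This is a simplified model, not the CapacityScheduler's actual code: the function name `pick_locality` and its parameters are hypothetical, and it assumes the scheduler counts how many scheduling opportunities an application has passed up while waiting for a node-local slot.

```python
def pick_locality(missed_opportunities, node_locality_delay,
                  has_node_local, has_rack_local):
    """Decide which locality level to accept on one scheduling opportunity.

    Hypothetical helper for illustration only.
    """
    # A node-local match is always taken immediately.
    if has_node_local:
        return "node-local"
    # Fall back to rack-local only after enough opportunities were skipped.
    if has_rack_local and missed_opportunities >= node_locality_delay:
        return "rack-local"
    # Otherwise skip this opportunity and keep waiting for better locality.
    return None
```

With a delay of 3, an application that has skipped only two opportunities keeps waiting when offered a rack-local node, and accepts it once the third has been skipped.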