Harsh J created YARN-1200:
-----------------------------
Summary: Provide a central view for rack topologies
Key: YARN-1200
URL: https://issues.apache.org/jira/browse/YARN-1200
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Harsh J
It appears that with YARN, any AM (such as the MRv2 AM) that tries to do
rack-info-based work will need to resolve racks locally rather than get rack
info from YARN directly:
https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java#L1054
and its use of a simple implementation of
https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/RackResolver.java
This is a regression, as we've traditionally only had users maintain rack
mappings and their associated script on a single master role node (JobTracker),
not on every compute node. Task-spawning hosts have never done or needed rack
resolution of their own.
It is silly to have to maintain rack configs and their changes on all nodes. We
should have the RM host a stable service interface so that there's only a
single view of the topology across the cluster, and document that AMs should
use it instead of resolving racks locally.
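
To make the proposal concrete, here is a rough sketch of what such a central
view could look like. Everything in it is an assumption made for illustration:
ClusterTopologyView, InMemoryTopologyView and resolveRacks are hypothetical
names, not existing YARN APIs, and a real service would sit behind an RM RPC
protocol rather than an in-process map. The only detail borrowed from Hadoop is
the conventional "/default-rack" location for hosts with no mapping.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface (not an existing YARN API): the single view of the
// cluster topology that AMs would query instead of resolving racks locally.
interface ClusterTopologyView {
    /** Returns a host -> rack mapping for the requested hosts, in request order. */
    Map<String, String> resolveRacks(List<String> hosts);
}

// Hypothetical RM-side implementation backed by one host-to-rack table, so the
// mapping only has to be maintained in a single place.
final class InMemoryTopologyView implements ClusterTopologyView {
    // Hadoop's conventional rack for hosts that have no explicit mapping.
    static final String DEFAULT_RACK = "/default-rack";

    private final Map<String, String> hostToRack = new ConcurrentHashMap<>();

    /** Records or updates the rack for a host (for example, from the topology script). */
    void register(String host, String rack) {
        hostToRack.put(host, rack);
    }

    @Override
    public Map<String, String> resolveRacks(List<String> hosts) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String host : hosts) {
            // Unknown hosts fall back to the default rack instead of failing.
            result.put(host, hostToRack.getOrDefault(host, DEFAULT_RACK));
        }
        return result;
    }
}

// Stand-in for an AM asking the central view about the hosts it cares about.
final class TopologyViewDemo {
    public static void main(String[] args) {
        InMemoryTopologyView view = new InMemoryTopologyView();
        view.register("node1.example.com", "/rack1");
        view.register("node2.example.com", "/rack2");

        Map<String, String> racks = view.resolveRacks(
                Arrays.asList("node1.example.com", "node2.example.com", "unknown.example.com"));
        System.out.println(racks);
    }
}
```

With something of this shape, compute nodes would need neither the mapping
script nor its config; only the RM would.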