[ 
https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738647#comment-13738647
 ] 

shenhong commented on YARN-435:
-------------------------------

First, if the AM gets all nodes in the cluster, including their rack
information, by calling the RM, this will increase pressure on the RM's
network, for example when the cluster has more than 5000 datanodes.

Second, if the YARN cluster has only 100 NodeManagers, but the HDFS it
accesses is a cluster with more than 5000 datanodes, we can't get all the nodes
and their rack information. However, the AM needs all the datanode information
in its job.splitmetainfo file in order to initialize TaskAttempts. In this
case, we can't get all the nodes by calling the RM.
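
To make the second point concrete, here is a minimal sketch in plain Java (no
Hadoop dependencies). The class name, method name, and host names are all
hypothetical; the map stands in for whatever host-to-rack data the AM could
build from the RM's node reports, and the list stands in for the hosts recorded
in the split metadata. It only illustrates the coverage gap and is not how the
MR AM actually resolves racks.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TopologyCoverage {

    /**
     * Returns the split hosts that have no rack entry in the map built from
     * the RM's node reports (hypothetical helper, for illustration only).
     */
    public static Set<String> hostsWithoutRack(Map<String, String> rackByNodeManagerHost,
                                               List<String> splitHosts) {
        Set<String> missing = new TreeSet<>();
        for (String host : splitHosts) {
            if (!rackByNodeManagerHost.containsKey(host)) {
                missing.add(host);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed data: the RM only knows about the NodeManager hosts.
        Map<String, String> rackByNodeManagerHost = new HashMap<>();
        rackByNodeManagerHost.put("nm1", "/rack1");
        rackByNodeManagerHost.put("nm2", "/rack1");

        // Assumed data: split locations point at datanodes, most of which
        // run no NodeManager in this YARN cluster.
        List<String> splitHosts = Arrays.asList("nm1", "dn7", "dn9");

        // Hosts for which the AM cannot learn a rack from the RM.
        System.out.println(hostsWithoutRack(rackByNodeManagerHost, splitHosts));
    }
}
```

Any host returned here is one whose rack the AM would have to resolve some
other way than by asking the RM.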
                
> Make it easier to access cluster topology information in an AM
> --------------------------------------------------------------
>
>                 Key: YARN-435
>                 URL: https://issues.apache.org/jira/browse/YARN-435
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Hitesh Shah
>            Assignee: Omkar Vinit Joshi
>
> ClientRMProtocol exposes a getClusterNodes api that provides a report on all 
> nodes in the cluster including their rack information. 
> However, this requires the AM to open and establish a separate connection to 
> the RM in addition to one for the AMRMProtocol. 
