[jira] [Commented] (YARN-3666) Federation Intercepting and propagating AM-RM communications (part one: home RM only)

Subru Krishnan (JIRA) Thu, 27 Apr 2017 17:28:15 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987942#comment-15987942
 ]


Subru Krishnan commented on YARN-3666:
--------------------------------------

[~botong], the patch looks mostly good to me. I just have a few minor comments:
  * You should rebase it on the refactoring of common methods as 
requested in YARN-5531.
  * Looks like the *allocate* logic for a single sub-cluster (with 
re-registration on failover) can be moved to the common class.
  * Can you have a flag to modify the *registerApplicationMaster* behavior, with 
the default being the current RM response?
  * {code} for (ContainerId containerId : 
containerIdToSubClusterIdMap.keySet()) { {code} can be very costly for large 
applications with tens of thousands of containers, so can we do this only in 
debug mode? Also, it would be good to clarify in the comment why the id could 
be different (epoch).
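
To make the debug-mode suggestion above concrete, here is a minimal sketch of 
guarding the per-container loop behind a log-level check. Everything in it is 
an assumption for illustration, not code from the patch: the class name 
DebugGuardSketch, the method verifyContainers, the use of String keys and 
values in place of ContainerId and SubClusterId, and java.util.logging in 
place of whatever logger the patch actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: only walk a large map when debug logging is on. */
public class DebugGuardSketch {
  private static final Logger LOG =
      Logger.getLogger(DebugGuardSketch.class.getName());

  /**
   * Logs every container-to-sub-cluster mapping, but only at debug (FINE)
   * level. Returns the number of entries inspected, which is 0 when debug
   * logging is disabled, so the O(n) walk is skipped entirely.
   */
  public static int verifyContainers(
      Map<String, String> containerIdToSubClusterIdMap, Logger log) {
    // Cheap check first: skip the whole loop unless debug is enabled.
    if (!log.isLoggable(Level.FINE)) {
      return 0;
    }
    int inspected = 0;
    for (Map.Entry<String, String> entry
        : containerIdToSubClusterIdMap.entrySet()) {
      log.fine("container " + entry.getKey()
          + " -> sub-cluster " + entry.getValue());
      inspected++;
    }
    return inspected;
  }

  public static void main(String[] args) {
    // Placeholder ids, made up for the example.
    Map<String, String> map = new LinkedHashMap<>();
    map.put("container_1", "SC-1");
    map.put("container_2", "SC-1");

    LOG.setLevel(Level.INFO);   // debug off: the loop is skipped
    System.out.println(verifyContainers(map, LOG));

    LOG.setLevel(Level.FINE);   // debug on: every entry is visited
    System.out.println(verifyContainers(map, LOG));
  }
}
```

With this shape the common path pays only for one level check, regardless of 
how many containers the application holds.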
 
A few nits on docs: 
  * Clarify why we are storing _containerIdToSubClusterIdMap_.
  * Call out what exactly home and secondary mean.
  * {quote}This will be used for registering with the other UAMs later {quote} 
can be rephrased to {quote}This will be used for registering with secondary 
sub-clusters using UAMs later{quote}.
  * Typo _FederationManager_ --> _FederationInterceptor_.

Tests:
  * Please make sure to consolidate {{MockResourceManagerFacade}} with 
YARN-5531 and YARN-5411.
  * I don't see tests for allocate and RM failover scenarios.

> Federation Intercepting and propagating AM-RM communications (part one: home 
> RM only)
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-3666
>                 URL: https://issues.apache.org/jira/browse/YARN-3666
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Kishore Chaliparambil
>            Assignee: Botong Huang
>         Attachments: YARN-3666-YARN-2915.v1.patch, 
> YARN-3666-YARN-2915.v2.patch, YARN-3666-YARN-2915.v3.patch, 
> YARN-3666-YARN-2915.v4.patch, YARN-3666-YARN-2915.v5.patch
>
>
> In order to support transparent "spanning" of jobs across sub-clusters, all 
> AM-RM communications are proxied (via YARN-2884).
> This JIRA tracks federation-specific mechanisms that decide how to 
> "split/broadcast" requests to the RMs and "merge" answers to 
> the AM.
> This is the part one JIRA, which sets up the basic structure, without secondary 
> subclusters. All requests are forwarded to the home subcluster. Part two is in 
> YARN-6511.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

