[ 
https://issues.apache.org/jira/browse/YARN-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3672:
---------------------------------
    Attachment: YARN-3672-YARN-2915-v4.patch

Thanks [~jianhe] for your feedback. 

bq. How will the RM failover play together with YARN-3673 ? Let’s say 
subCluster1(RM1, RM2), subCluster2(RM3, RM4). Looks like the implementation 
will ignore cluster intra-failover and do cluster inter-failover only ?

The implementation handles only cluster intra-failover, because the RM failover 
proxy in YARN-3673 is seeded with a _subClusterId_. The information in the 
StateStore is updated as part of RM active services initialization 
(YARN-3671). In your example, the RM failover proxy is a _connection_ to 
subCluster1 that initially points to, say, RM1, the current primary. 
If RM1 fails over to RM2, RM2 will then heartbeat to the StateStore as 
subCluster1, and the proxy is automatically updated to connect to RM2 (by 
querying getSubClusterInfo(subCluster1) on the Facade).
Cluster inter-failover is determined by the policies (YARN-5323), since a 
policy defines how a queue spans multiple sub-clusters, and the 
{{Router/AMRMProxy}} will create an RM failover proxy per sub-cluster in the 
policy.
Does that make sense?
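
To make the flow above concrete, here is a minimal sketch of a proxy that is 
seeded with a sub-cluster id and re-resolves the active RM through the Facade. 
It is not the code in the patch: the class names ({{FacadeSketch}}, 
{{SubClusterProxySketch}}), the {{heartbeat}} and {{performFailover}} methods, 
and the use of plain strings for ids and RM addresses are all assumptions made 
for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the Facade over the Federation StateStore. */
final class FacadeSketch {
  // subClusterId -> address of the RM that last heartbeated as active
  private final Map<String, String> activeRm = new ConcurrentHashMap<>();

  /** Assumed hook: called when an RM's active services start or heartbeat. */
  void heartbeat(String subClusterId, String rmAddress) {
    activeRm.put(subClusterId, rmAddress);
  }

  /** Simplified lookup: returns only the active RM address of a sub-cluster. */
  String getSubClusterInfo(String subClusterId) {
    String address = activeRm.get(subClusterId);
    if (address == null) {
      throw new IllegalStateException("Unknown sub-cluster: " + subClusterId);
    }
    return address;
  }
}

/** Hypothetical failover proxy seeded with a subClusterId, not an RM address. */
final class SubClusterProxySketch {
  private final FacadeSketch facade;
  private final String subClusterId;
  private String currentRm;

  SubClusterProxySketch(FacadeSketch facade, String subClusterId) {
    this.facade = facade;
    this.subClusterId = subClusterId;
    this.currentRm = facade.getSubClusterInfo(subClusterId);
  }

  String currentRm() {
    return currentRm;
  }

  /** On failure, re-resolve the active RM within the same sub-cluster. */
  void performFailover() {
    currentRm = facade.getSubClusterInfo(subClusterId);
  }
}

final class FailoverDemo {
  public static void main(String[] args) {
    FacadeSketch facade = new FacadeSketch();
    facade.heartbeat("subCluster1", "RM1");
    SubClusterProxySketch proxy =
        new SubClusterProxySketch(facade, "subCluster1");
    System.out.println(proxy.currentRm()); // RM1

    facade.heartbeat("subCluster1", "RM2"); // RM1 failed over to RM2
    proxy.performFailover();
    System.out.println(proxy.currentRm()); // RM2
  }
}
```

The proxy never switches to a different sub-cluster in this sketch; spanning 
sub-clusters would mean creating one such proxy per sub-cluster named in the 
policy.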

bq. question for such API. It asks for a specific subCluster info, do we still 
need the filterInactiveSubClusters flag ? Even if it’s required, the behavior 
for the if/else is inconsistent, the if case is honoring the flag, while the 
else doesn’t. 

Good catch. I have removed filterInactiveSubClusters from getSubClusterInfo.
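
As a rough illustration of the resulting API shape (not the patch itself), a 
lookup by id takes only the id and leaves it to the caller to inspect the 
returned state. The sketch assumes a separate bulk listing method, here called 
{{getSubClusters}}, as the place where a filter flag still makes sense. The 
class name, the {{State}} enum, the {{register}} method and {{getSubClusters}} 
are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the lookup API after dropping the filter flag. */
final class SubClusterLookupSketch {
  enum State { ACTIVE, INACTIVE }

  private final Map<String, State> subClusters = new LinkedHashMap<>();

  void register(String id, State state) {
    subClusters.put(id, state);
  }

  /** Lookup by id: no filter flag. Returns null for an unknown id. */
  State getSubClusterInfo(String id) {
    return subClusters.get(id);
  }

  /** Assumed bulk listing: the only place the flag is applied. */
  List<String> getSubClusters(boolean filterInactiveSubClusters) {
    List<String> ids = new ArrayList<>();
    for (Map.Entry<String, State> e : subClusters.entrySet()) {
      if (!filterInactiveSubClusters || e.getValue() == State.ACTIVE) {
        ids.add(e.getKey());
      }
    }
    return ids;
  }
}
```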

bq. I think we should not reuse these two configs for retry, the default value 
of both is zero. 

Valid point. I was trying to reuse existing configs, since Federation already 
has to add a few on top of the many that exist. I looked at {{RMProxy}} and 
have replaced them with better-fitting ones.


I have updated the patch (v4) accordingly.

> Create Facade for Federation State and Policy Store
> ---------------------------------------------------
>
>                 Key: YARN-3672
>                 URL: https://issues.apache.org/jira/browse/YARN-3672
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Subru Krishnan
>            Assignee: Subru Krishnan
>         Attachments: YARN-3672-YARN-2915-v1.patch, 
> YARN-3672-YARN-2915-v2.patch, YARN-3672-YARN-2915-v3.patch, 
> YARN-3672-YARN-2915-v4.patch
>
>
> This JIRA proposes creating a facade for the Federation State and Policy 
> Store to simplify access and provide a common place for cache management, 
> etc., that can be used by both the Router and the AMRMProxy.


