[
https://issues.apache.org/jira/browse/YARN-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Subru Krishnan updated YARN-3672:
---------------------------------
Attachment: YARN-3672-YARN-2915-v4.patch
Thanks [~jianhe] for your feedback.
bq. How will the RM failover play together with YARN-3673 ? Let’s say
subCluster1(RM1, RM2), subCluster2(RM3, RM4). Looks like the implementation
will ignore cluster intra-failover and do cluster inter-failover only ?
The implementation will handle only cluster intra-failover, as the RM failover
proxy in YARN-3673 will be seeded with a _subClusterId_. The information in
the StateStore gets updated as part of RM active services initialization
(YARN-3671). In your example, the RM failover proxy will be a _connection_ to
subCluster1 which initially points to, say, RM1, the current primary. If RM1
fails over to RM2, RM2 will start heartbeating to the StateStore on behalf of
subCluster1, and we will auto-update the proxy to connect to RM2 (by querying
getSubClusterInfo(subCluster1) on the Facade).
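To make the re-resolution above concrete, here is a minimal sketch. It is not
the patch itself: apart from getSubClusterInfo, every name in it
(InMemoryStateStore, StateStoreFacade, SubClusterProxy, the "RM1"/"RM2"
address strings) is hypothetical, and the StateStore is modelled as an
in-memory map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the Federation StateStore: maps a subClusterId
// to the address of whichever RM last heartbeated for that sub-cluster.
class InMemoryStateStore {
  private final Map<String, String> activeRmBySubCluster = new ConcurrentHashMap<>();

  // Called by the RM that is currently active in the sub-cluster.
  void heartbeat(String subClusterId, String rmAddress) {
    activeRmBySubCluster.put(subClusterId, rmAddress);
  }

  String lookup(String subClusterId) {
    return activeRmBySubCluster.get(subClusterId);
  }
}

// Hypothetical facade exposing the lookup named in the discussion.
class StateStoreFacade {
  private final InMemoryStateStore store;

  StateStoreFacade(InMemoryStateStore store) {
    this.store = store;
  }

  String getSubClusterInfo(String subClusterId) {
    String address = store.lookup(subClusterId);
    if (address == null) {
      throw new IllegalStateException("Unknown sub-cluster: " + subClusterId);
    }
    return address;
  }
}

// Hypothetical proxy seeded with a subClusterId (assumption of this sketch).
class SubClusterProxy {
  private final String subClusterId;
  private final StateStoreFacade facade;
  private String currentRm;

  SubClusterProxy(String subClusterId, StateStoreFacade facade) {
    this.subClusterId = subClusterId;
    this.facade = facade;
    this.currentRm = facade.getSubClusterInfo(subClusterId);
  }

  String currentRm() {
    return currentRm;
  }

  // Invoked when a call to currentRm fails: re-resolve the active RM,
  // staying within the same sub-cluster.
  String failover() {
    currentRm = facade.getSubClusterInfo(subClusterId);
    return currentRm;
  }
}
```

In this sketch the proxy never looks at another sub-cluster; it only asks the
facade who is currently active for the id it was seeded with.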
The cluster inter-failover is determined by the policies (YARN-5323), as they
define how a queue spans multiple sub-clusters, and the {{Router/AMRMProxy}}
will create an RM failover proxy per subCluster in the policy.
Makes sense?
bq. question for such API. It asks for a specific subCluster info, do we still
need the filterInactiveSubClusters flag ? Even if it’s required, the behavior
for the if/else is inconsistent, the if case is honoring the flag, while the
else doesn’t.
Good catch. I have removed filterInactiveSubClusters from getSubClusterInfo.
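As a rough illustration of why the flag is not needed on a point lookup, here
is a small sketch. It is not the Facade code from the patch: the
SubClusterRegistry class, the ClusterState enum and the bulk getSubClusters
method are hypothetical and only mirror the names used in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sub-cluster states for this sketch.
enum ClusterState { ACTIVE, INACTIVE }

// Hypothetical registry standing in for the Facade's view of the StateStore.
class SubClusterRegistry {
  private final Map<String, ClusterState> states = new LinkedHashMap<>();

  void register(String subClusterId, ClusterState state) {
    states.put(subClusterId, state);
  }

  // Point lookup: the caller names the sub-cluster, so the entry is returned
  // as stored (null if unknown) and the caller inspects its state.
  ClusterState getSubClusterInfo(String subClusterId) {
    return states.get(subClusterId);
  }

  // Bulk lookup (hypothetical): here a filter flag is meaningful, because the
  // caller has not named the sub-clusters it is interested in.
  Map<String, ClusterState> getSubClusters(boolean filterInactiveSubClusters) {
    return states.entrySet().stream()
        .filter(e -> !filterInactiveSubClusters || e.getValue() == ClusterState.ACTIVE)
        .collect(Collectors.toMap(
            Map.Entry::getKey, Map.Entry::getValue, (a, b) -> a, LinkedHashMap::new));
  }
}
```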
bq. I think we should not reuse these two configs for retry, the default value
of both is zero.
Valid point. I was trying to reuse existing configs, since Federation already
has to add a few on top of the many that exist. I looked at the {{RMProxy}}
and have replaced them with better-fitting ones.
I have updated the patch (v4) accordingly.
> Create Facade for Federation State and Policy Store
> ---------------------------------------------------
>
> Key: YARN-3672
> URL: https://issues.apache.org/jira/browse/YARN-3672
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Subru Krishnan
> Assignee: Subru Krishnan
> Attachments: YARN-3672-YARN-2915-v1.patch,
> YARN-3672-YARN-2915-v2.patch, YARN-3672-YARN-2915-v3.patch,
> YARN-3672-YARN-2915-v4.patch
>
>
> This JIRA proposes creating a facade for Federation State and Policy Store to
> simplify access and have a common place for cache management etc. that can be
> used by both Router & AMRMProxy.