[ https://issues.apache.org/jira/browse/YARN-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796289#comment-16796289 ]
Íñigo Goiri commented on YARN-9399:
-----------------------------------
This is somewhat related to HDFS-4957 on the HDFS side.
The discussion here seems closely related:
https://github.com/apache-spark-on-k8s/kubernetes-HDFS/issues/48
[~fengnanli], in HDFS-14327 you are using FQDN addresses.
Should we cover this scenario there?
> Yarn Client may use stale DNS to connect to RM
> ----------------------------------------------
>
> Key: YARN-9399
> URL: https://issues.apache.org/jira/browse/YARN-9399
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.1
> Reporter: Leon zhang
> Assignee: Íñigo Goiri
> Priority: Major
> Labels: patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> This happens more frequently when running YARN in Kubernetes. When the YARN
> client tries to connect to the RM and the RM's DNS name is not resolvable
> because kube-dns has failed or is not ready, the client initializes itself
> with an unresolved InetSocketAddress in RMProxy#newProxyInstance(). The
> connection to the RM then fails with UnknownHostException. The YARN client
> retries the connection through RetryProxy, but it always uses the cached
> unresolved InetSocketAddress, so the retry never succeeds. The bug also occurs
> when the RM is rescheduled to another Kubernetes node, which changes the RM's
> IP. Currently the workaround is to restart the YARN client.
> This issue happens in both HA and non-HA RM setups. HDFS has similar issues:
> [https://github.com/apache-spark-on-k8s/kubernetes-HDFS/issues/48]
> I propose adding a new RMFailoverProxyProvider called
> AutoRefreshRMFailoverProxyProvider, which will resolve the DNS name in the
> overridden getProxy() method. This way, RetryProxy can re-resolve the DNS name
> each time it retries.
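
For illustration only, here is a minimal sketch of the underlying java.net behavior the description relies on. It is not the proposed AutoRefreshRMFailoverProxyProvider or any Hadoop code; the class name RefreshAddressSketch and the helper resolveAgain are hypothetical. An InetSocketAddress built with createUnresolved() never resolves itself later, so a retry that reuses the cached object keeps failing, while building a new InetSocketAddress from the same host string and port attempts a new lookup.

```java
import java.net.InetSocketAddress;

public class RefreshAddressSketch {

    /**
     * Hypothetical helper: builds a new address from the cached host string
     * and port. The InetSocketAddress(String, int) constructor attempts a
     * lookup when it is called, so a name that failed to resolve earlier
     * (or whose IP has changed) gets another chance.
     */
    public static InetSocketAddress resolveAgain(InetSocketAddress cached) {
        return new InetSocketAddress(cached.getHostString(), cached.getPort());
    }

    public static void main(String[] args) {
        // Simulates an address cached while DNS was unavailable.
        // "localhost" and port 8032 are placeholder values for this sketch.
        InetSocketAddress stale =
            InetSocketAddress.createUnresolved("localhost", 8032);
        System.out.println("cached unresolved: " + stale.isUnresolved());

        // Reusing 'stale' on every retry would never resolve it;
        // re-creating the address performs the lookup again.
        InetSocketAddress fresh = resolveAgain(stale);
        System.out.println("after refresh unresolved: " + fresh.isUnresolved());
    }
}
```

Note that the JVM caches lookups itself (see the networkaddress.cache.ttl and networkaddress.cache.negative.ttl security properties), so how quickly a refreshed address reflects a DNS change also depends on those settings.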