[
https://issues.apache.org/jira/browse/YARN-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332904#comment-17332904
]
Qi Zhu commented on YARN-10754:
-------------------------------
cc [~ebadger] [~epayne] [~Jim_Brennan] [~snemeth] [~pbacsko] [~gandras]
[~fengnanli]
What are your opinions on this?
Thanks.
> RM Renew Delegation token thread should timeout and retry should also
> consider app new submitted.
> -------------------------------------------------------------------------------------------------
>
> Key: YARN-10754
> URL: https://issues.apache.org/jira/browse/YARN-10754
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Qi Zhu
> Assignee: Qi Zhu
> Priority: Major
> Attachments: YARN-10754.001.patch, image-2021-04-27-11-38-29-162.png
>
>
> As described in YARN-9768:
> Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews
> HDFS tokens received to check for validity and expiration time.
> This call is made to an underlying HDFS NN or Router node (which exposes the
> same APIs as the HDFS NN). If one of the nodes is bad and the renew call gets
> stuck, the thread remains stuck indefinitely. Ideally, the thread should time
> out the renewToken call and retry from the client's perspective.
> But YARN-9768 only handles app recovery; it does not handle newly submitted apps:
> !image-2021-04-27-11-38-29-162.png|width=516,height=428!
> As a result, a newly submitted app is not retried when the token renewal
> (HDFS NameNode/Router) times out.
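
A minimal sketch of the behaviour requested above, to make the discussion
concrete. It is not the code in YARN-10754.001.patch or in
DelegationTokenRenewer.java. The names RenewWithTimeoutSketch, EventType and
renewWithRetry are hypothetical, and the renew call is modelled as a plain
Callable rather than a real HDFS token renewal. The idea is that each renew
attempt gets a bounded wait, and a timeout leads to a retry for a newly
submitted app just as it does for a recovered one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

public class RenewWithTimeoutSketch {

  /** Hypothetical event kinds; the report says only recovery is retried today. */
  public enum EventType { APP_RECOVERY, APP_SUBMISSION }

  /**
   * Runs renewCall with a per-attempt timeout and retries on timeout,
   * whatever the event type. Returns the value produced by renewCall
   * (standing in for the new expiration time), or -1 if every attempt
   * timed out. Failures other than a timeout are propagated to the caller.
   */
  public static long renewWithRetry(EventType type, Callable<Long> renewCall,
      long timeoutMs, int maxAttempts) throws Exception {
    // Daemon threads so a renew call that ignores interruption cannot block JVM exit.
    ExecutorService pool = Executors.newCachedThreadPool(r -> {
      Thread t = new Thread(r, "renew-" + type);
      t.setDaemon(true);
      return t;
    });
    try {
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        Future<Long> future = pool.submit(renewCall);
        try {
          // Bounded wait instead of blocking forever on a bad NN/Router.
          return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
          // Interrupt the stuck call, then retry. This branch applies to
          // APP_SUBMISSION as well as APP_RECOVERY.
          future.cancel(true);
        }
      }
      return -1L;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    AtomicInteger calls = new AtomicInteger();
    // Simulated renew call: hangs on the first attempt, succeeds afterwards.
    Callable<Long> flakyRenew = () -> {
      if (calls.incrementAndGet() == 1) {
        Thread.sleep(5_000);
      }
      return 42L;
    };
    long expiry = renewWithRetry(EventType.APP_SUBMISSION, flakyRenew, 200, 3);
    System.out.println("expiry=" + expiry + " attempts=" + calls.get());
  }
}
```

In this sketch the first attempt for the submitted app times out after 200 ms
and is cancelled, and the second attempt succeeds. A real change would also
need to decide how the retry is rescheduled and how many attempts to allow.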
--
This message was sent by Atlassian Jira
(v8.3.4#803005)