[ https://issues.apache.org/jira/browse/YARN-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ravi Prakash updated YARN-7450:
-------------------------------
Description:
We saw a stack trace (posted in the first comment) in the ResourceManager logs
showing that TimelineClientImpl was unable to relogin from keytab.
I'm guessing an intermittent network issue caused the Kerberos relogin from
keytab to fail. However, I'm assuming this was *not* retried because I
only saw one instance of this stack trace. I propose that this operation
should be retried.
It seems this caused events at the ResourceManager to queue up, and the
ResourceManager eventually stopped responding to even basic
{{yarn application -list}} commands.
was:
We saw a stack trace (posted in the first comment) in the ResourceManager logs
for the TimelineClientImpl not being able to relogin from keytab.
I'm guessing there was an intermittent network issue that failed the Kerberos
relogin from keytab. However, I'm assuming this was *not* retried because I
only saw one instance of this stack trace. I propose that this operation
should have been retried.
> ATS Client should retry on intermittent Kerberos issues.
> --------------------------------------------------------
>
> Key: YARN-7450
> URL: https://issues.apache.org/jira/browse/YARN-7450
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: ATSv2
> Affects Versions: 2.7.3
> Environment: Hadoop-2.7.3
> Reporter: Ravi Prakash
>
> We saw a stack trace (posted in the first comment) in the ResourceManager
> logs showing that TimelineClientImpl was unable to relogin from keytab.
> I'm guessing an intermittent network issue caused the Kerberos relogin from
> keytab to fail. However, I'm assuming this was *not* retried because I
> only saw one instance of this stack trace. I propose that this operation
> should be retried.
> It seems this caused events at the ResourceManager to queue up, and the
> ResourceManager eventually stopped responding to even basic
> {{yarn application -list}} commands.
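
To make the proposal concrete, here is a minimal sketch of the kind of bounded
retry with exponential backoff that the description asks for. It is not the
TimelineClientImpl code and not a patch for this issue. The class
ReloginRetry, the interface IoOperation, and the parameters maxAttempts and
initialDelayMs are hypothetical names chosen for illustration. The relogin
call is simulated here; in Hadoop the wrapped operation would presumably be
something like UserGroupInformation#checkTGTAndReloginFromKeytab(), but where
it would be wired in is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class ReloginRetry {

    /** An operation that may fail with a transient IOException. */
    @FunctionalInterface
    interface IoOperation<T> {
        T run() throws IOException;
    }

    private ReloginRetry() {}

    /**
     * Runs op up to maxAttempts times, sleeping between failed attempts and
     * doubling the delay each time. Gives up with an UncheckedIOException
     * that wraps the last failure.
     */
    static <T> T withRetries(IoOperation<T> op, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.run();
            } catch (IOException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no point sleeping after the final attempt
                }
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt status and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
                delayMs *= 2; // exponential backoff
            }
        }
        throw new UncheckedIOException("gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        // Simulated relogin that fails twice before succeeding.
        int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated transient Kerberos failure");
            }
            return "logged in";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Bounding the number of attempts matters here: an unbounded retry on the
event-handling path could itself cause events to queue up at the
ResourceManager, which is the symptom described above.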