Jason Lowe commented on YARN-3104:

bq. The only concern is if we don't do anything, it is possible for AMs to get 
authentication failures

If we don't do anything, then AMs that fail to update their token will continue 
to work as long as their RM connection isn't dropped.  Force-closing the 
connection or reauthenticating over the same connection just makes the auth 
failure happen sooner.  Apps will actually work better if we change nothing, 
since an app with a recently expired token will likely be able to keep talking 
to the RM for hours or days and usually avoid the auth failures we're worried 
about.

I agree we should find a way to "fail fast" for this scenario, but I also agree 
it's probably non-trivial to do so if we can't force-close the connection via 
the existing RPC API.  If that's not going to be fixed for 2.7, I'd rather put 
in a fix now so the RM stops regenerating the same token on every heartbeat and 
stops writing the corresponding log messages.
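
As a rough illustration of the kind of fix meant here (a sketch only, not the 
attached patch and not actual RM code): the RM could remember which key id the 
last token it issued to an app attempt was generated with, and create a new 
token only when that differs from the rolled-but-not-yet-activated key.  Every 
name below (AmrmTokenRollSketch, Token, onHeartbeat, the key-id parameters) is 
hypothetical and stands in for the real RM classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: generate at most one new token per attempt per key roll. */
public class AmrmTokenRollSketch {

    /** Minimal stand-in for an AMRM token; it only records the signing key id. */
    public static final class Token {
        public final String attemptId;
        public final int keyId;

        Token(String attemptId, int keyId) {
            this.attemptId = attemptId;
            this.keyId = keyId;
        }
    }

    // Last token issued to each app attempt (assumed bookkeeping, not a real RM field).
    private final Map<String, Token> lastIssued = new HashMap<>();
    private int generatedCount = 0;

    /**
     * Called once per AM heartbeat.
     *
     * @param attemptId       the app attempt sending the heartbeat
     * @param connectionKeyId key id the AM's existing connection authenticated with
     * @param nextKeyId       id of the rolled-but-not-yet-activated key, or null
     *                        when no roll is in progress
     * @return the token to include in the response, or null if none is needed
     */
    public Token onHeartbeat(String attemptId, int connectionKeyId, Integer nextKeyId) {
        if (nextKeyId == null || nextKeyId == connectionKeyId) {
            return null; // no roll pending, or the AM already connected with the new key
        }
        Token cached = lastIssued.get(attemptId);
        if (cached == null || cached.keyId != nextKeyId) {
            // First heartbeat since this roll: generate (and log) exactly once.
            cached = new Token(attemptId, nextKeyId);
            lastIssued.put(attemptId, cached);
            generatedCount++;
        }
        // Later heartbeats reuse the cached token instead of regenerating it.
        return cached;
    }

    /** Number of tokens actually generated, for checking the behavior. */
    public int getGeneratedCount() {
        return generatedCount;
    }
}
```

The sketch still returns the cached token on every heartbeat, because the 
connection keeps reporting the old key; it only avoids the repeated generation 
and logging.  It does nothing about the "fail fast" question.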

> RM generates new AMRM tokens every heartbeat between rolling and activation
> ---------------------------------------------------------------------------
>                 Key: YARN-3104
>                 URL: https://issues.apache.org/jira/browse/YARN-3104
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: YARN-3104.001.patch, YARN-3104.002.patch, 
> YARN-3104.003.patch
> When the RM rolls a new AMRM secret, it conveys this to the AMs when it 
> notices they are still connected with the old key.  However, neither the RM 
> nor the AM explicitly closes the connection or otherwise tries to reconnect 
> with the new secret.  Therefore the RM keeps thinking the AM doesn't have the 
> new token on every heartbeat and keeps sending new tokens for the period 
> between the key roll and the key activation.  Once the key is activated, the 
> RM no longer squawks in its logs about needing to generate a new token every 
> heartbeat (i.e., every second) for every app, but the apps can still be using 
> the old token.  The token is only checked upon connection to the RM.  The 
> apps don't reconnect when sent a new token, and the RM doesn't force them to 
> reconnect by closing the connection.
