[
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276194#comment-14276194
]
Vinod Kumar Vavilapalli commented on YARN-3021:
-----------------------------------------------
bq. So my question is, should YARN support running a job without validating a
token? (Though MR1 "supports" this because the token renewal is asynchronous as
Harsh pointed out).
As I proposed at the beginning of this JIRA, if we want to do this, it has to
be explicit in the app-submission. Several corner cases leak through, though:
even if the RM successfully validates a token at app-submission, the remote
service may be down at the time of renewal. So I think there are two API
questions:
# Should RM validate the token by renewing at the time of app-submission?
# Should RM fail the app if renewal fails any time during the app-execution?
Ostensibly, (1) and (2) are the same because, for now, checking a token means
renewing it successfully. But a user can always explicitly ask us not to renew
tokens at app-submission, for whatever reason.
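To make the two questions concrete, here is a minimal sketch of how they could be kept as separate, explicit choices at app-submission. Every name in it (TokenPolicySketch, TokenPolicy, TokenRenewer, Outcome) is hypothetical and made up for illustration; none of it is an existing YARN API, and the real RM code paths are more involved.

```java
import java.io.IOException;

/** Hypothetical sketch only; these types are not part of YARN. */
public class TokenPolicySketch {

    /** Stand-in for a renew call against the remote service that issued the token. */
    interface TokenRenewer {
        long renew(String token) throws IOException;
    }

    /** The two explicit choices an application could make at submission. */
    static final class TokenPolicy {
        final boolean validateAtSubmission;    // question (1)
        final boolean failAppOnRenewalFailure; // question (2)

        TokenPolicy(boolean validateAtSubmission, boolean failAppOnRenewalFailure) {
            this.validateAtSubmission = validateAtSubmission;
            this.failAppOnRenewalFailure = failAppOnRenewalFailure;
        }
    }

    enum Outcome { ACCEPTED, REJECTED, RUNNING, FAILED }

    /** Question (1): renew once at submission only if the app asked for validation. */
    static Outcome onSubmission(TokenPolicy policy, TokenRenewer renewer, String token) {
        if (!policy.validateAtSubmission) {
            return Outcome.ACCEPTED;
        }
        try {
            renewer.renew(token);
            return Outcome.ACCEPTED;
        } catch (IOException e) {
            return Outcome.REJECTED;
        }
    }

    /** Question (2): a renewal failure during execution fails the app only if requested. */
    static Outcome onScheduledRenewal(TokenPolicy policy, TokenRenewer renewer, String token) {
        try {
            renewer.renew(token);
            return Outcome.RUNNING;
        } catch (IOException e) {
            return policy.failAppOnRenewalFailure ? Outcome.FAILED : Outcome.RUNNING;
        }
    }

    public static void main(String[] args) {
        // A remote service that cannot be reached or does not trust the renewer.
        TokenRenewer unreachable = token -> { throw new IOException("cannot renew " + token); };
        TokenPolicy lenient = new TokenPolicy(false, false);
        TokenPolicy strict = new TokenPolicy(true, true);

        System.out.println(onSubmission(lenient, unreachable, "t1"));
        System.out.println(onSubmission(strict, unreachable, "t1"));
        System.out.println(onScheduledRenewal(lenient, unreachable, "t1"));
        System.out.println(onScheduledRenewal(strict, unreachable, "t1"));
    }
}
```

Keeping the two flags independent covers the corner case above: an app can pass validation at submission and still choose whether a later renewal failure should kill it.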
> YARN's delegation-token handling disallows certain trust setups to operate
> properly
> -----------------------------------------------------------------------------------
>
> Key: YARN-3021
> URL: https://issues.apache.org/jira/browse/YARN-3021
> Project: Hadoop YARN
> Issue Type: Bug
> Components: security
> Affects Versions: 2.3.0
> Reporter: Harsh J
> Attachments: YARN-3021.patch
>
>
> Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON,
> and B trusts COMMON (both are one-way trusts), and both A and B run HDFS + YARN
> clusters.
> Now if one logs in with a COMMON credential, and runs a job on A's YARN that
> needs to access B's HDFS (such as a DistCp), the operation fails in the RM,
> as it attempts a renewDelegationToken(…) synchronously during application
> submission (to validate the managed token before it adds it to a scheduler
> for automatic renewal). The call obviously fails because realm B will not trust
> A's credentials (here, the RM's principal is the renewer).
> In the 1.x JobTracker the same call is present, but it is done asynchronously
> and once the renewal attempt failed we simply ceased to schedule any further
> attempts of renewals, rather than fail the job immediately.
> We should change the logic such that we attempt the renewal but go easy on
> the failure and skip the scheduling alone, rather than bubble back an error
> to the client, failing the app submission. This way the old behaviour is
> retained.
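The behaviour requested in the description (attempt the renewal, but on failure only skip scheduling rather than reject the submission) could look roughly like the sketch below. It is an illustration under assumptions: LenientRenewalSketch, TokenRenewer and addToken are hypothetical names, and the list stands in for whatever the RM uses to schedule automatic renewals.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; these types are not part of YARN. */
public class LenientRenewalSketch {

    /** Stand-in for a renew call against the remote service that issued the token. */
    interface TokenRenewer {
        long renew(String token) throws IOException;
    }

    /** Tokens queued for automatic renewal (stand-in for a renewal scheduler). */
    private final List<String> scheduled = new ArrayList<>();
    private final TokenRenewer renewer;

    LenientRenewalSketch(TokenRenewer renewer) {
        this.renewer = renewer;
    }

    /**
     * Tries one renewal. On failure the token is simply not scheduled for
     * further renewals; no error is propagated to the submitting client.
     * Returns true if the token was scheduled.
     */
    boolean addToken(String token) {
        try {
            renewer.renew(token);
        } catch (IOException e) {
            System.err.println("Skipping automatic renewal for " + token + ": " + e.getMessage());
            return false;
        }
        scheduled.add(token);
        return true;
    }

    List<String> scheduledTokens() {
        return scheduled;
    }

    public static void main(String[] args) {
        // Simulates realm B refusing a renewal requested by A's RM principal.
        TokenRenewer untrusted = token -> { throw new IOException("renewer not trusted"); };
        LenientRenewalSketch tokens = new LenientRenewalSketch(untrusted);
        boolean wasScheduled = tokens.addToken("hdfs-token-for-B");
        System.out.println("scheduled=" + wasScheduled + ", queue=" + tokens.scheduledTokens());
    }
}
```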