[ 
https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828772#comment-15828772
 ] 

Jason Lowe commented on YARN-5910:
----------------------------------

bq. Currently, the RM DelegationTokenRewener will only add the tokens if 
security is enabled (code in RMAppManager#submitApplication), so I think with 
this existing implemtation, we can assume this feature is for security enabled 
only ?

Yeah, I'm thinking it's unnecessary to check both.  This new config has no 
value by default.   A user or admin would have to go out of their way to set 
it.  If they did, then they expect the confs to be in the submission context.

bq. On ther other hand, Varun chatted offline that we can add a limit config in 
RM to limit the size of configs, your opinion ?

A RM-side limit for configs may not be a bad idea to avoid a problematic client 
or user that sets ".*" as the conf filter. ;-)  It won't solve the problem of 
the gigantic RPC trying to come in, but at least the RM can quickly discard it 
before trying to persist it.

bq. So, in the latest patch I changed it to let all apps share the same 
renewerConf - this is based on the assumption that "dfs.nameservices" must have 
distint keys for each distinct cluster, so we won't have situation where two 
apps use different configs for the same cluster - it is true that unnecessary 
configs used by 1st app will be shared by subsequent apps. 

This is bad for two reasons.  One is the polluting problem -- there could be 
cases where the presence of a config key from a previous app is problematic for 
renewal in a subsequent app.  The other is a memory leak.  
Configuration.addResource will add a resource object to the list of resources 
for the config and never get rid of them.  This will cause every app-specific 
conf to be tracked by renewerConf forever, resulting in a memory leak.

One solution to this is to track the (partial) app configurations separately 
and then make a copy of the RM's conf and merge in the partial app conf 
"on-demand" when it's time to renew the token for the app.  Then we're not 
storing a full copy of the RM's configs for every app, just the parts that need 
to be per-app.  If doing the repetitive copy and merge of the conf is too 
expensive then we can derive a Configuration subclass that takes the app conf 
and RM conf in the constructor.  When we try to do property lookups it tries to 
find it in the app conf and falls back to the RM conf if necessary.  Then we 
don't have to make copies and merge each time.


> Support for multi-cluster delegation tokens
> -------------------------------------------
>
>                 Key: YARN-5910
>                 URL: https://issues.apache.org/jira/browse/YARN-5910
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: security
>            Reporter: Clay B.
>            Assignee: Jian He
>            Priority: Minor
>         Attachments: YARN-5910.01.patch, YARN-5910.2.patch, 
> YARN-5910.3.patch, YARN-5910.4.patch
>
>
> As an administrator running many secure (kerberized) clusters, some which 
> have peer clusters managed by other teams, I am looking for a way to run jobs 
> which may require services running on other clusters. Particular cases where 
> this rears itself are running something as core as a distcp between two 
> kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp 
> hdfs://LOCALCLUSTER/user/user292/test.out 
> hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, once can run for a while but if the delegation token for 
> the remote cluster needs renewal the job will fail[1]. One can pre-configure 
> their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes 
> available but that requires coordination that is not always feasible, 
> especially as a cluster's peers grow into the tens of clusters or across 
> management teams. Ideally, one could have core systems configured this way 
> but jobs could also specify their own handling of tokens and management when 
> needed?
> [1]: Example stack trace when the RM is unaware of a remote service:
> ----------------
> {code}
> 2016-03-23 14:59:50,528 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  application_1458441356031_3317 found existing hdfs token Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: 
> (HDFS_DELEGATION_TOKEN token
>  10927 for user292)
> 2016-03-23 14:59:50,557 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, 
> Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for 
> user292)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 
> 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a 
> failover proxy provider configured.
> at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
> at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
> at org.apache.hadoop.security.token.Token.renew(Token.java:377)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:425)
> ... 6 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to