[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373772#comment-16373772 ]

Greg Senia edited comment on YARN-5910 at 2/23/18 1:21 AM:
-----------------------------------------------------------

[~jlowe] I was able to resolve my issue with the following:

 

Example/Details: 

*Distcp with Kerberos between Secure Clusters without Cross-realm 
Authentication*

Two clusters with the realms: *Source Realm (PROD.HDP.EXAMPLE.COM) and 
Destination Realm (MODL.HDP.EXAMPLE.COM)*

Data moves from the *Source Cluster (hdfs://prod) to the Destination Cluster 
(hdfs://modl)*

*One-way cross-realm* trusts exist between the *Source Realm 
(PROD.HDP.EXAMPLE.COM) and Active Directory (NT.EXAMPLE.COM), and between the 
Destination Realm (MODL.HDP.EXAMPLE.COM) and Active Directory (NT.EXAMPLE.COM).*

Both the Source Cluster (prod) and Destination Cluster (modl) are running a 
Hadoop Distribution with the following patches: 

https://issues.apache.org/jira/browse/HDFS-7546 

https://issues.apache.org/jira/browse/YARN-3021

We set the mapreduce.job.hdfs-servers.token-renewal.exclude property, which 
instructs the ResourceManager on either cluster to skip delegation token 
renewal for the listed NameNode nameservices, and we also set the 
dfs.namenode.kerberos.principal.pattern property to * so that distcp works 
regardless of the principal patterns of the source and destination clusters.
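
For reference, the same two properties can also be set persistently in the 
client-side configuration instead of being passed per job. A minimal sketch, 
assuming the same nameservice name (modl) as in this example; note that 
mapreduce.job.hdfs-servers.token-renewal.exclude takes a comma-separated list 
of nameservices:

{code}
<!-- mapred-site.xml (or per-job configuration) on the submitting cluster -->
<property>
  <name>mapreduce.job.hdfs-servers.token-renewal.exclude</name>
  <!-- nameservice(s) the RM should NOT attempt token renewal for -->
  <value>modl</value>
</property>

<!-- hdfs-site.xml: relax the principal pattern check for cross-realm distcp -->
<property>
  <name>dfs.namenode.kerberos.principal.pattern</name>
  <value>*</value>
</property>
{code}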

*Example of Command that works:*

{code}
hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* \
  -Dmapreduce.job.hdfs-servers.token-renewal.exclude=modl \
  hdfs:///public_data hdfs://modl/public_data/gss_test2
{code}



> Support for multi-cluster delegation tokens
> -------------------------------------------
>
>                 Key: YARN-5910
>                 URL: https://issues.apache.org/jira/browse/YARN-5910
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: security
>            Reporter: Clay B.
>            Assignee: Jian He
>            Priority: Minor
>             Fix For: 2.9.0, 3.0.0-alpha4
>
>         Attachments: YARN-5910.01.patch, YARN-5910.2.patch, 
> YARN-5910.3.patch, YARN-5910.4.patch, YARN-5910.5.patch, YARN-5910.6.patch, 
> YARN-5910.7.patch
>
>
> As an administrator running many secure (kerberized) clusters, some which 
> have peer clusters managed by other teams, I am looking for a way to run jobs 
> which may require services running on other clusters. Particular cases where 
> this rears itself are running something as core as a distcp between two 
> kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp 
> hdfs://LOCALCLUSTER/user/user292/test.out 
> hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, one can run for a while, but if the delegation token for 
> the remote cluster needs renewal the job will fail[1]. One can pre-configure 
> their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes 
> available but that requires coordination that is not always feasible, 
> especially as a cluster's peers grow into the tens of clusters or across 
> management teams. Ideally, core systems could be configured this way, but 
> jobs could also specify their own token handling and management when needed.
> [1]: Example stack trace when the RM is unaware of a remote service:
> ----------------
> {code}
> 2016-03-23 14:59:50,528 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  application_1458441356031_3317 found existing hdfs token Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: 
> (HDFS_DELEGATION_TOKEN token
>  10927 for user292)
> 2016-03-23 14:59:50,557 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, 
> Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for 
> user292)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 
> 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a 
> failover proxy provider configured.
> at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
> at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
> at org.apache.hadoop.security.token.Token.renew(Token.java:377)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:425)
> ... 6 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
