[
https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527639#comment-14527639
]
Wangda Tan commented on YARN-3514:
----------------------------------
[~cnauroth], I think this causes other problems in the latest YARN as well. For
example:
If a user name has mixed case, for example "De", and we have a "/L" rule on the
Kerberos side that converts all names to lower case, then log aggregation in the
NM fails because the user names don't match (UserGroupInformation reports "de",
but the OS reports "De").
{code}
java.io.IOException: Owner 'De' for path /hadoop/yarn2/log/application_1428608050835_0013/container_1428608050835_0013_01_000006/stderr did not match expected owner 'de'
at org.apache.hadoop.io.SecureIOUtils.checkStat(SecureIOUtils.java:285)
at org.apache.hadoop.io.SecureIOUtils.forceSecureOpenForRead(SecureIOUtils.java:219)
at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:204)
at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogValue.secureOpenFile(AggregatedLogFormat.java:275)
at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogValue.write(AggregatedLogFormat.java:227)
at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.append(AggregatedLogFormat.java:448)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl$ContainerLogAggregator.doContainerLogAggregation(AppLogAggregatorImpl.java:534)
at
...
{code}
One possible solution is to ignore case when comparing user names, but that
will be problematic when users "De" and "de" both exist. Any thoughts,
[~cnauroth]?
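To make the trade-off concrete, here is a minimal sketch of the two comparison strategies. It is not the SecureIOUtils code: the class and method names (OwnerMatchSketch, matchesExact, matchesIgnoreCase, acceptedAccounts) and the list of local accounts are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; names are illustrative and not part of Hadoop.
final class OwnerMatchSketch {

    // Strict comparison: "De" and "de" are treated as different owners.
    static boolean matchesExact(String actualOwner, String expectedOwner) {
        return actualOwner.equals(expectedOwner);
    }

    // Relaxed comparison: case differences are ignored.
    static boolean matchesIgnoreCase(String actualOwner, String expectedOwner) {
        return actualOwner.equalsIgnoreCase(expectedOwner);
    }

    // Local accounts that the relaxed check would accept for one expected owner.
    static List<String> acceptedAccounts(List<String> localAccounts, String expectedOwner) {
        return localAccounts.stream()
                .filter(account -> matchesIgnoreCase(account, expectedOwner))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Strict check rejects the file owner reported by the OS.
        System.out.println(matchesExact("De", "de"));      // false
        // Relaxed check accepts it.
        System.out.println(matchesIgnoreCase("De", "de")); // true
        // If both accounts exist, the relaxed check cannot tell them apart.
        System.out.println(acceptedAccounts(Arrays.asList("De", "de", "yarn"), "de")); // [De, de]
    }
}
```

With the relaxed check, a file owned by "De" passes an ownership check meant for "de", which is the ambiguity raised above when both accounts exist on the node.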
> Active directory usernames like domain\login cause YARN failures
> ----------------------------------------------------------------
>
> Key: YARN-3514
> URL: https://issues.apache.org/jira/browse/YARN-3514
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.2.0
> Environment: CentOS6
> Reporter: john lilley
> Assignee: Chris Nauroth
> Priority: Minor
> Attachments: YARN-3514.001.patch, YARN-3514.002.patch
>
>
> We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is
> Kerberos-enabled and uses an external AD domain controller for the KDC. We
> are able to authenticate, browse HDFS, etc. However, YARN fails during
> localization because it seems to get confused by the presence of a \
> character in the local user name.
> Our AD authentication on the nodes goes through sssd and is configured to
> map AD users onto the form domain\username. For example, our test user has a
> Kerberos principal of [email protected] and that maps onto a CentOS user
> "domain\hadoopuser". We have no problem validating that user with PAM,
> logging in as that user, su-ing to that user, etc.
> However, when we attempt to run a YARN application master, the localization
> step fails when setting up the local cache directory for the AM. The error
> that comes out of the RM logs:
> 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication:
> ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED,
> diagnostics='Application application_1429295486450_0001 failed 1 times due to
> AM Container for appattempt_1429295486450_0001_000001 exited with exitCode:
> -1000 due to: Application application_1429295486450_0001 initialization
> failed (exitCode=255) with output: main : command provided 0
> main : user is DOMAIN\hadoopuser
> main : requested yarn user is domain\hadoopuser
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10
> at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347)
> .Failing this attempt.. Failing the application.'
> However, when we look on the node launching the AM, we see this:
> [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache
> [root@rpb-cdh-kerb-2 usercache]# ls -l
> drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser
> There appears to be different treatment of the \ character in different
> places. Something creates the directory as "domain\hadoopuser" but something
> else later attempts to use it as "domain%5Chadoopuser". I’m not sure where
> or why URL escaping converts the \ to %5C, or why this is not consistent.
> I should also mention, for the sake of completeness, our auth_to_local rule
> is set up to map [email protected] to domain\user:
> RULE:[1:$1@$0](^.*@DOMAIN\.COM$)s/^(.*)@DOMAIN\.COM$/domain\\$1/g
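The two spellings of the directory name can be reproduced outside Hadoop with a small sketch. It assumes the sed-style part of the rule behaves like Java's Matcher.replaceAll, and it uses java.net.URLEncoder only as one example of percent-encoding; which Hadoop code path actually produces the %5C form is not established here. The class name AuthToLocalSketch, its methods, and the sample principal hadoopuser@DOMAIN.COM are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Pattern;

// Hypothetical sketch; it does not call any Hadoop API.
final class AuthToLocalSketch {

    // Filter and substitution patterns copied from the rule in the report.
    // Java string literals need each backslash doubled.
    static final Pattern FILTER = Pattern.compile("^.*@DOMAIN\\.COM$");
    static final Pattern FROM = Pattern.compile("^(.*)@DOMAIN\\.COM$");
    // Replacement text as written in the rule: domain\\$1
    static final String TO = "domain\\\\$1";

    // Returns the mapped local name, or null when the filter does not match.
    static String toLocalName(String principal) {
        if (!FILTER.matcher(principal).matches()) {
            return null;
        }
        return FROM.matcher(principal).replaceAll(TO);
    }

    // Percent-encodes a name; a backslash is outside URLEncoder's safe set.
    static String percentEncode(String name) {
        try {
            return URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample principal, already in the $1@$0 form the rule builds.
        String local = toLocalName("hadoopuser@DOMAIN.COM");
        System.out.println(local);                // domain\hadoopuser
        System.out.println(percentEncode(local)); // domain%5Chadoopuser
    }
}
```

Under these assumptions the raw and percent-encoded forms of the same local name differ exactly as the two directory names in the report do, so any component that encodes the user name before building a path will look for a directory that another component created under the unencoded name.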
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)