[
https://issues.apache.org/jira/browse/YARN-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Varun Vasudev reassigned YARN-2426:
-----------------------------------
Assignee: Varun Vasudev
> NodeManager is not able to use WebHDFS token properly to talk to WebHDFS while
> localizing
> ---------------------------------------------------------------------------------------
>
> Key: YARN-2426
> URL: https://issues.apache.org/jira/browse/YARN-2426
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager, resourcemanager, webapp
> Affects Versions: 2.6.0
> Environment: Hadoop Kerberos (secure) cluster with
> LinuxContainerExecutor enabled
> With SPNEGO on for YARN's new RM web services for application submission
> While using kinit we are using -C (to specify the cache path).
> Then, before executing, we set export KRB5CCNAME=<path provided with -C option>
> There is no Kerberos ticket in the default KRB5 cache path, which is /tmp
> Reporter: Karam Singh
> Assignee: Varun Vasudev
>
> Encountered this issue while using YARN's new RM WS for application
> submission, on a single-node cluster, while submitting the Distributed Shell
> application through the RM WS (web service).
> For this we need to pass a custom script and the AppMaster jar, along with a
> webhdfs token, to the NodeManager for localization.
> The Distributed Shell application was failing because the node failed to
> localize the AppMaster jar.
> Following is the NM log while localizing the AppMaster jar:
> {code}
> 2014-08-18 01:53:52,434 INFO authorize.ServiceAuthorizationManager
> (ServiceAuthorizationManager.java:authorize(114)) - Authorization successful
> for testing (auth:TOKEN) for protocol=interface
> org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
> 2014-08-18 01:53:52,757 INFO localizer.ResourceLocalizationService
> (ResourceLocalizationService.java:update(1011)) - DEBUG: FAILED {
> webhdfs://<NAMENODEHOST>:<NAMENODEHTTPPORT>/user/<JARpPATH>, 1408352019488,
> FILE, null }, Authentication required
> 2014-08-18 01:53:52,758 INFO localizer.LocalizedResource
> (LocalizedResource.java:handle(203)) - Resource
> webhdfs://<NAMENODEHOST>:<NAMENODEHTTPPORT>/user/<JARPATH>(-><NM_LOCAL_DIR>/usercache/<APP_USER>/appcache/application_1408351986532_0001/filecache/10/DshellAppMaster.jar)
> transitioned from DOWNLOADING to FAILED
> 2014-08-18 01:53:52,758 INFO container.Container
> (ContainerImpl.java:handle(999)) - Container
> container_1408351986532_0001_01_000001 transitioned from LOCALIZING to
> LOCALIZATION_FAILED
> {code}
> This is similar to what we get when we try to access webhdfs in a secure
> (Kerberos) cluster without doing kinit.
> Whereas if we do curl -i -k -s
> 'http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH>?op=listStatus&delegation=<same
> webhdfs token used in app submission structure>'
> it works properly.
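For readers reproducing the curl check above, the request URL can be assembled programmatically. This is only a minimal sketch: the helper name `webhdfs_rest_url`, the host `namenode.example.com`, the port `50070`, the path, and the token string `TOKEN` are all placeholders assumed for illustration and do not come from the report. Only the `/webhdfs/v1` prefix and the `op` and `delegation` query parameters are taken from the command quoted above.

```python
from urllib.parse import quote, urlencode


def webhdfs_rest_url(host, http_port, path, op, delegation=None):
    """Build a WebHDFS REST URL (hypothetical helper, not part of Hadoop).

    `delegation` is the delegation token string; when given, it is passed
    as the `delegation` query parameter, as in the curl command above.
    """
    params = {"op": op}
    if delegation is not None:
        params["delegation"] = delegation
    # quote() keeps "/" intact and escapes other unsafe path characters.
    return f"http://{host}:{http_port}/webhdfs/v1{quote(path)}?{urlencode(params)}"


# Placeholder values, assumed for illustration only.
url = webhdfs_rest_url(
    "namenode.example.com", 50070, "/user/hadoopqa/app.jar",
    "LISTSTATUS", delegation="TOKEN",
)
print(url)
```

The resulting string can be passed to curl or any HTTP client. If the token is honoured, no Kerberos ticket should be needed for the request, which is the behaviour the reporter observed with curl.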
> I also tried using
> http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/hadoopqa/<JAR_PATH>
> in the app submission object instead of the webhdfs:// URI format.
> Then the NodeManager fails to localize because there is no FileSystem for the
> http scheme:
> {code}
> 2014-08-18 02:03:31,343 INFO authorize.ServiceAuthorizationManager
> (ServiceAuthorizationManager.java:authorize(114)) - Authorization successful
> for testing (auth:TOKEN) for protocol=interface org.apache.
> hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
> 2014-08-18 02:03:31,583 INFO localizer.ResourceLocalizationService
> (ResourceLocalizationService.java:update(1011)) - DEBUG: FAILED {
> http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH>
> 1408352576841, FILE, null }, No FileSystem for scheme: http
> 2014-08-18 02:03:31,583 INFO localizer.LocalizedResource
> (LocalizedResource.java:handle(203)) - Resource
> http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH>(-><NM_LOCAL_DIR>/usercache/<APP_USER>/appcache/application_1408352544163_0002/filecache/11/DshellAppMaster.jar)
> transitioned from DOWNLOADING to FAILED
> {code}
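The "No FileSystem for scheme: http" message in the log above indicates that the resource URI is resolved by its scheme, so a REST-style http:// URL has to be given in webhdfs:// form before submission. The following is a minimal sketch of that rewrite. The function name `rest_url_to_webhdfs_uri` and the example host, port, and path are assumptions made for illustration. Only the `/webhdfs/v1` prefix and the two URI shapes come from the report.

```python
from urllib.parse import urlsplit

REST_PREFIX = "/webhdfs/v1"


def rest_url_to_webhdfs_uri(rest_url):
    """Rewrite http://host:port/webhdfs/v1/<path> as webhdfs://host:port/<path>.

    Hypothetical helper for illustration; any query string is dropped.
    """
    parts = urlsplit(rest_url)
    if parts.scheme != "http" or not parts.path.startswith(REST_PREFIX + "/"):
        raise ValueError(f"not a WebHDFS REST URL: {rest_url}")
    # Keep host:port, strip the REST prefix from the path.
    return f"webhdfs://{parts.netloc}{parts.path[len(REST_PREFIX):]}"


# Placeholder URL, assumed for illustration only.
print(rest_url_to_webhdfs_uri(
    "http://namenode.example.com:50070/webhdfs/v1/user/hadoopqa/app.jar"
))
```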
> Now do kinit without providing the -C option for the KRB5 cache path, so the
> ticket goes to the default KRB5 cache in /tmp.
> Again submit the same application object to the YARN WS, with webhdfs:// URI
> format paths and the webhdfs token.
> This time the NM is able to download the jar and custom shell script, and the
> application runs fine.
> It looks like the following is happening:
> webhdfs is trying to look for a krb ticket on the NM while localizing.
> 1. In the first case there was no krb ticket in the default cache, so the
> application failed while localizing the AppMaster jar.
> 2. In the second case kinit had already been done and a krb ticket was present
> in /tmp (default KRB5 cache), so the AppMaster got localized successfully.
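The two cases above can be modelled with a small sketch of credential-cache lookup. Everything here is an assumption made to illustrate the reporter's hypothesis, not NodeManager or WebHDFS code. It assumes a process uses `KRB5CCNAME` when that variable is set in its own environment and otherwise falls back to `/tmp/krb5cc_<uid>` (the common MIT Kerberos default file name; the report itself only says `/tmp`). The function names, the uid `1000`, and the custom cache path are hypothetical.

```python
def effective_ccache(env, uid):
    """Return the credential cache path a process would consult.

    Assumption: KRB5CCNAME wins when set; otherwise the default
    /tmp/krb5cc_<uid> location is used.
    """
    name = env.get("KRB5CCNAME")
    if name:
        return name
    return f"/tmp/krb5cc_{uid}"


def ticket_visible(env, uid, existing_caches):
    """True if the cache this process would consult actually holds a ticket."""
    return effective_ccache(env, uid) in existing_caches


# Hypothetical values: the daemon's environment has no KRB5CCNAME,
# because the export was done only in the submitting user's shell.
daemon_env = {}
uid = 1000

# Case 1: kinit wrote the ticket to a custom cache path.
case1 = ticket_visible(daemon_env, uid, {"/home/user/custom_cc"})
# Case 2: kinit wrote the ticket to the default cache.
case2 = ticket_visible(daemon_env, uid, {"/tmp/krb5cc_1000"})
print(case1, case2)
```

Under these assumptions the first case finds no ticket and the second does, matching the observed failure and success. If the webhdfs token were being used as intended, neither lookup should matter.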
--
This message was sent by Atlassian JIRA
(v6.2#6252)