[
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812321#comment-15812321
]
Bibin A Chundatt commented on YARN-6072:
----------------------------------------
+1 from my side too. I tried the same change on my local cluster and it seems
to be working fine.
[~ajithshetty]
In addition to changing the order, please also update the logging and the
exception thrown. Currently we are losing the stack trace of the original
exception.
{code}
@@ -708,7 +708,7 @@ void refreshAll() throws ServiceFailedException {
}
refreshClusterMaxPriority();
} catch (Exception ex) {
+ LOG.error(ex);
- throw new ServiceFailedException(ex.getMessage());
+ throw new ServiceFailedException(ex.getMessage(), ex);
}
{code}
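To illustrate why passing {{ex}} as the cause matters, here is a minimal standalone sketch. The nested {{ServiceFailedException}} class is a hypothetical stand-in that mirrors only the two constructors used in the diff above; it is not the Hadoop class itself, and the helper method names are made up for illustration.

```java
public class CauseChainingSketch {
    // Hypothetical stand-in for org.apache.hadoop.ha.ServiceFailedException,
    // reduced to the two constructors used in the diff.
    static class ServiceFailedException extends Exception {
        ServiceFailedException(String message) {
            super(message);
        }

        ServiceFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Old behaviour: only the message is copied, so the original exception
    // and its stack trace are dropped.
    static ServiceFailedException wrapWithoutCause(Exception ex) {
        return new ServiceFailedException(ex.getMessage());
    }

    // Proposed behaviour: the original exception is kept as the cause.
    static ServiceFailedException wrapWithCause(Exception ex) {
        return new ServiceFailedException(ex.getMessage(), ex);
    }

    public static void main(String[] args) {
        // A NullPointerException created this way has a null message,
        // so the wrapper built from getMessage() alone carries no detail.
        Exception npe = new NullPointerException();

        ServiceFailedException lossy = wrapWithoutCause(npe);
        ServiceFailedException chained = wrapWithCause(npe);

        System.out.println("without cause: " + lossy.getCause());
        System.out.println("with cause:    " + chained.getCause());

        // The chained variant prints a "Caused by:" section with the
        // original stack trace.
        chained.printStackTrace();
    }
}
```

With the cause attached, the log would show the underlying NullPointerException under "Caused by:" instead of a bare ServiceFailedException with no message.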
> RM unable to start in secure mode
> ---------------------------------
>
> Key: YARN-6072
> URL: https://issues.apache.org/jira/browse/YARN-6072
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.8.0, 3.0.0-alpha2
> Reporter: Bibin A Chundatt
> Assignee: Ajith S
> Priority: Blocker
> Attachments: hadoop-secureuser-resourcemanager-vm1.log
>
>
> Resource manager is unable to start in secure mode
> {code}
> 2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found
> resource hadoop-policy.xml at
> file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
> 2017-01-08 14:27:29,918 INFO
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
> java.lang.NullPointerException
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,919 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed
> so firing fatal event
> org.apache.hadoop.ha.ServiceFailedException
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket
> Reader #1 for port 8033
> 2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector:
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll
> during transition to Active
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> ... 4 more
> Caused by: org.apache.hadoop.ha.ServiceFailedException
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> ... 5 more
> {code}
> ResourceManager services are added in the following order:
> # EmbeddedElector
> # AdminService
> During ResourceManager service start(), EmbeddedElector starts first and
> invokes {{AdminService#refreshAll()}}, but {{AdminService#serviceStart()}}
> runs only after the {{ActiveStandbyElectorBasedElectorService}} service start
> is complete. So {{AdminService#server}} is still *null*, which causes
> {{AdminService#refreshAll()}} to fail:
> {code}
> if (getConfig().getBoolean(
> CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION,
> false)) {
> refreshServiceAcls();
> }
> {code}
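The ordering problem described in the issue can be sketched in isolation as follows. This is a simplified model, not the ResourceManager code: the {{Service}}, {{Admin}} and {{Elector}} types and the {{startAll}} helper are hypothetical, and the elector here calls {{refreshAll()}} directly from {{start()}}, whereas the real elector does so from a ZooKeeper callback.

```java
import java.util.ArrayList;
import java.util.List;

public class StartOrderSketch {
    interface Service {
        void start();
    }

    // Hypothetical stand-in for AdminService: 'server' is only assigned
    // when start() runs.
    static class Admin implements Service {
        private Object server; // null until start() is called

        @Override
        public void start() {
            server = new Object();
        }

        void refreshAll() {
            // Dereferencing 'server' before start() throws
            // NullPointerException.
            server.hashCode();
        }
    }

    // Hypothetical stand-in for the embedded elector: becoming active
    // triggers a refresh on the admin service.
    static class Elector implements Service {
        private final Admin admin;

        Elector(Admin admin) {
            this.admin = admin;
        }

        @Override
        public void start() {
            admin.refreshAll();
        }
    }

    // Starts both services in the chosen order and reports whether
    // startup completed without a NullPointerException.
    static boolean startAll(boolean electorFirst) {
        Admin admin = new Admin();
        Elector elector = new Elector(admin);

        List<Service> services = new ArrayList<>();
        if (electorFirst) {
            services.add(elector);
            services.add(admin);
        } else {
            services.add(admin);
            services.add(elector);
        }

        try {
            for (Service service : services) {
                service.start();
            }
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("elector first: " + startAll(true));
        System.out.println("admin first:   " + startAll(false));
    }
}
```

In this model, starting the elector first fails and starting the admin service first succeeds, which is the motivation for changing the order in which the services are added.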