[ https://issues.apache.org/jira/browse/YARN-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079701#comment-14079701 ]
Jason Lowe commented on YARN-2208:
----------------------------------
This appears to have broken backwards compatibility with the previous release,
since the new RM cannot load an old AMRM token persisted in the state store. A
sample exception where the new RM starts with old RM state:
{noformat}
2014-07-30 11:09:17,041 FATAL [main] resourcemanager.ResourceManager (ResourceManager.java:main(1050)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: java.io.EOFException
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:837)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:877)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:874)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:874)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:918)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1047)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.hadoop.yarn.security.AMRMTokenIdentifier.readFields(AMRMTokenIdentifier.java:87)
    at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:142)
    at org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.addPersistedPassword(AMRMTokenSecretManager.java:205)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.recoverAppAttemptCredentials(RMAppAttemptImpl.java:740)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.recover(RMAppAttemptImpl.java:710)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.recover(RMAppImpl.java:676)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:312)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:425)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1030)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:489)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    ... 10 more
{noformat}
I realize this is supposed to be fixed eventually under YARN-668, but in the
interim, token changes like this one and YARN-2152 are routinely breaking the
ability to upgrade without wiping the cluster's YARN state stores.
Arguably, either this should be marked as an incompatible change or the release
note should state that the RM state store needs to be wiped when upgrading.
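
The failure mode in the trace (an EOFException from DataInputStream.readInt inside a readFields method) can be reproduced in isolation. The sketch below is not the AMRMTokenIdentifier code. It assumes the cause is a field appended to the serialized form, which the trace suggests but does not prove. The class TokenFormatSketch, its methods, and the field names (clusterTimestamp, attemptId, keyId) are all hypothetical. It shows a strict reader failing on bytes written in an older two-field layout, and a tolerant reader that falls back to a default when the trailing field is absent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class TokenFormatSketch {

    // Hypothetical "old release" layout: a long followed by one int.
    public static byte[] writeOldFormat(long clusterTimestamp, int attemptId)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(clusterTimestamp);
            out.writeInt(attemptId);
        }
        return bytes.toByteArray();
    }

    // Hypothetical "new release" reader that unconditionally expects an
    // extra trailing int (called keyId here purely for illustration).
    public static int readNewFormatStrict(byte[] data) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            in.readLong();       // clusterTimestamp
            in.readInt();        // attemptId
            return in.readInt(); // throws EOFException on old-format bytes
        }
    }

    // Tolerant variant: read the trailing int only if enough bytes remain,
    // otherwise return the caller-supplied default.
    public static int readNewFormatTolerant(byte[] data, int defaultKeyId)
            throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            in.readLong();
            in.readInt();
            return in.available() >= Integer.BYTES ? in.readInt() : defaultKeyId;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] oldBytes = writeOldFormat(1406718557041L, 1);
        try {
            readNewFormatStrict(oldBytes);
            System.out.println("strict reader: no exception");
        } catch (EOFException e) {
            System.out.println("strict reader: EOFException");
        }
        System.out.println("tolerant reader: keyId="
            + readNewFormatTolerant(oldBytes, -1));
    }
}
```

The tolerant reader only works here because the stream wraps an in-memory byte array, so available() reports exactly the bytes that remain. It is an illustration of the compatibility problem, not a proposal for how YARN-668 should address it.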
> AMRMTokenManager need to have a way to roll over AMRMToken
> ----------------------------------------------------------
>
> Key: YARN-2208
> URL: https://issues.apache.org/jira/browse/YARN-2208
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Xuan Gong
> Assignee: Xuan Gong
> Fix For: 2.6.0
>
> Attachments: YARN-2208.1.patch, YARN-2208.2.patch, YARN-2208.3.patch,
> YARN-2208.4.patch, YARN-2208.5.patch, YARN-2208.5.patch, YARN-2208.6.patch,
> YARN-2208.7.patch, YARN-2208.8.patch, YARN-2208.8.patch, YARN-2208.8.patch,
> YARN-2208.9.patch, YARN-2208.9.patch
>
>