[ 
https://issues.apache.org/jira/browse/YARN-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081191#comment-16081191
 ] 

Jason Lowe commented on YARN-6798:
----------------------------------

I don't know the full story behind these various version bumps, but we need to 
stop the habit of bumping the major version in the state store without 
providing a migration path for older versions.

If we really need to bump the state store major version to support a new 
feature, my preference would be to do this in a lazy fashion as much as 
possible, i.e.: the major version should not be updated in the state store 
until the new feature is enabled/used.  That way we don't lose the ability to 
rollback the release to the old version if something goes terribly wrong after 
the upgrade before the new feature is used.  If that's not possible for some 
reason then the code needs to recognize the older state store versions on 
startup and either do a one-time pass over the data on startup to migrate it to 
the new schema or otherwise deal with it on the fly during reading for recovery.

> NM startup failure with old state store due to version mismatch
> ---------------------------------------------------------------
>
>                 Key: YARN-6798
>                 URL: https://issues.apache.org/jira/browse/YARN-6798
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0-alpha4
>            Reporter: Ray Chiang
>
> YARN-6703 rolled back the state store version number for the RM from 2.0 to 
> 1.4.
> YARN-6127 bumped the version for the NM to 3.0
>     private static final Version CURRENT_VERSION_INFO = 
> Version.newInstance(3, 0);
> YARN-5049 bumped the version for the NM to 2.0
>     private static final Version CURRENT_VERSION_INFO = 
> Version.newInstance(2, 0);
> During an upgrade, all NMs died after upgrading a C6 cluster from alpha2 to 
> alpha4.
> {noformat}
> 2017-07-07 15:48:17,259 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
> Incompatible version for NM state: expecting NM state version 3.0, but 
> loading version 2.0
>         at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:246)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:307)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:748)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:809)
> Caused by: java.io.IOException: Incompatible version for NM state: expecting 
> NM state version 3.0, but loading version 2.0
>         at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.checkVersion(NMLeveldbStateStoreService.java:1454)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:1308)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:307)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         ... 5 more
> 2017-07-07 15:48:17,277 INFO 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NodeManager at xxx.gce.cloudera.com/aa.bb.cc.dd
> ************************************************************/
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to