[
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001633#comment-16001633
]
Vrushali C commented on YARN-6555:
----------------------------------
Thanks [~haibochen], I will use that jira YARN-6323 for ensuring a default is
set, which would help with rolling upgrades. This jira can be used for adding
in the code to read/write to NM State store.
> Enable flow context read (& corresponding write) for recovering application
> with NM restart
> --------------------------------------------------------------------------------------------
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Vrushali C
> Assignee: Vrushali C
>
> If timeline service v2 is enabled and NM is restarted with recovery enabled,
> then NM fails to start and throws an error as "flow context can't be null".
> This is happening because the flow context did not exist before but now that
> timeline service v2 is enabled, ApplicationImpl expects it to exist.
> This would also happen even if flow context existed before but since we are
> not persisting it / reading it during
> ContainerManagerImpl#recoverApplication, it does not get passed in to
> ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]