[
https://issues.apache.org/jira/browse/YARN-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966354#comment-13966354
]
Rohith commented on YARN-1924:
------------------------------
Hi, I applied this patch and am testing it. I hit the NPE below. The ZK cluster
was comparatively slow to respond.
{noformat}
2014-04-11 14:28:10,152 ERROR org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error storing/updating appAttempt: appattempt_1397200878504_0209_000003
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:613)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:675)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:766)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:761)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:662)
2014-04-11 14:28:10,162 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. Cause:
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:613)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:675)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:766)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:761)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:662)
{noformat}
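To make the scenario in the issue title concrete (an update arriving for an app attempt that was never stored), here is a minimal sketch. It is not the YARN-1924 patch and does not use the ZKRMStateStore or ZooKeeper APIs: `InMemoryStore`, `updateStrict`, and `updateOrCreate` are hypothetical names over a plain in-memory map, and the root cause of the NPE above is not established by it. It only contrasts an update that fails on a missing node with one that falls back to creating it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UpdateBeforeStoreSketch {

    /** Hypothetical in-memory stand-in for a state store; NOT the ZKRMStateStore API. */
    static final class InMemoryStore {
        private final Map<String, byte[]> nodes = new ConcurrentHashMap<>();

        /** Strict update: throws if the node was never stored. */
        void updateStrict(String path, byte[] data) {
            if (nodes.replace(path, data) == null) {
                throw new IllegalStateException("no node to update: " + path);
            }
        }

        /**
         * Tolerant update: updates the node if present, otherwise creates it.
         * Returns true if an existing node was updated, false if it was created.
         */
        boolean updateOrCreate(String path, byte[] data) {
            return nodes.put(path, data) != null;
        }

        byte[] get(String path) {
            return nodes.get(path);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        // Path is made up for the example; it is not the real znode layout.
        String path = "/attempts/appattempt_example_000003";

        try {
            // The update arrives before any store call, so the strict variant fails.
            store.updateStrict(path, new byte[] {1});
        } catch (IllegalStateException e) {
            System.out.println("strict update failed: " + e.getMessage());
        }

        // The tolerant variant creates the missing node instead of failing.
        boolean updated = store.updateOrCreate(path, new byte[] {2});
        System.out.println("existing node updated? " + updated);
    }
}
```

Whether a create-on-missing fallback is appropriate for the real store depends on how the RM orders store and update events, which this sketch does not model.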
> STATE_STORE_OP_FAILED happens when ZKRMStateStore tries to update
> app(attempt) before storing it
> ------------------------------------------------------------------------------------------------
>
> Key: YARN-1924
> URL: https://issues.apache.org/jira/browse/YARN-1924
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.4.0
> Reporter: Arpit Gupta
> Assignee: Jian He
> Priority: Critical
> Fix For: 2.4.1
>
> Attachments: YARN-1924.1.patch, YARN-1924.2.patch
>
>
> Noticed on an HA cluster: both RMs shut down with this error.