[ 
https://issues.apache.org/jira/browse/YARN-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537558#comment-14537558
 ] 

Rohith commented on YARN-3614:
------------------------------

YARN-3410 try to remove the application from RMStateStore which is used as RM 
start up arguments i.e {{./yarn resourcemanager 
-remove-application-from-state-store <appId>}}. 

I am wondering about the use case that why someone move this application folder 
manually?? OTOH, it is better either check for path existence of handle the 
exception and log WARN message instead of throwing exception which crashes the 
RM

> FileSystemRMStateStore throw exception when failed to remove application, 
> that cause resourcemanager to crash
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3614
>                 URL: https://issues.apache.org/jira/browse/YARN-3614
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.5.0
>            Reporter: lachisis
>            Priority: Critical
>
> FileSystemRMStateStore is only a accessorial plug-in of rmstore. 
> When it failed to remove application, I think warning is enough, but now 
> resourcemanager crashed.
> Recently, I configure 
> "yarn.resourcemanager.state-store.max-completed-applications"  to limit 
> applications number in rmstore. when applications number exceed the limit, 
> some old applications will be removed. If failed to remove, resourcemanager 
> will crash.
> The following is log: 
> 2015-05-11 06:58:43,815 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Removing 
> info for app: application_1430994493305_0053
> 2015-05-11 06:58:43,815 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore:
>  Removing info for app: application_1430994493305_0053 at: 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
> 2015-05-11 06:58:43,816 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> removing app: application_1430994493305_0053
> java.lang.Exception: Failed to delete 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:806)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:879)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:874)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-05-11 06:58:43,819 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a 
> org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type 
> STATE_STORE_OP_FAILED. Cause:
> java.lang.Exception: Failed to delete 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:806)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:879)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:874)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to