Arpit Gupta created YARN-1220:
---------------------------------
Summary: Yarn RM fs state store should handle safemode exceptions
Key: YARN-1220
URL: https://issues.apache.org/jira/browse/YARN-1220
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Arpit Gupta
Assignee: Vinod Kumar Vavilapalli
{code}
ons: 0
2013-09-18 05:41:13,542 ERROR recovery.RMStateStore (RMStateStore.java:handleStoreEvent(490)) - Error removing app: application_1379482521108_0003
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot delete /tmp/hadoop-yarn/yarn/system/rmstore/FSRMStateRoot/RMAppRoot/application_1379482521108_0003. Name node is in safe mode.
The reported blocks 1018 has reached the threshold 1.0000 of total blocks 1018. The number of live datanodes 5 has reached the minimum number 0. Safe mode will be turned off automatically in 20 seconds.
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:3124)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:3083)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3067)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:697)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:491)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$Clien
{code}
The issue is that if the NameNode is in safe mode while the RM is interacting
with the FS state store, the store cannot be updated. In this particular case
the delete failed with a SafeModeException, so the app was never removed from
the store, and after an RM restart the app was recovered when it did not need
to be.
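
One possible way to handle this, sketched below, is to retry the store operation while the failure is a safe-mode error, since the log above shows safe mode ending on its own after a short wait. This is only an illustration and not necessarily the fix that should go into RMStateStore: `SafeModeLikeException`, `StoreOp`, and `runWithSafeModeRetry` are hypothetical names, and the exception class is a stand-in for the HDFS SafeModeException so that the sketch compiles without Hadoop on the classpath.

```java
import java.io.IOException;

public class SafeModeRetrySketch {

    /** Hypothetical stand-in for the HDFS SafeModeException (assumption, not the real class). */
    static class SafeModeLikeException extends IOException {
        SafeModeLikeException(String message) {
            super(message);
        }
    }

    /** Hypothetical wrapper for one state-store operation, e.g. removing an app's file. */
    interface StoreOp {
        void run() throws IOException;
    }

    /**
     * Runs op, retrying only when it fails with the safe-mode stand-in.
     * Returns the number of attempts used. Once maxAttempts is reached the last
     * safe-mode failure is rethrown; any other IOException propagates immediately.
     */
    static int runWithSafeModeRetry(StoreOp op, int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                op.run();
                return attempt;
            } catch (SafeModeLikeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up; the caller decides how to surface the failure
                }
                Thread.sleep(sleepMillis); // wait for safe mode to be turned off
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated store: the first two delete attempts hit safe mode, the third succeeds.
        int attempts = runWithSafeModeRetry(() -> {
            if (++calls[0] < 3) {
                throw new SafeModeLikeException("Name node is in safe mode.");
            }
        }, 5, 10L);
        System.out.println("app removed after " + attempts + " attempts");
    }
}
```

In a real change the sleep interval and attempt limit would presumably come from configuration, and the RM would still need a policy for what to do if safe mode outlasts the retries.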