[
https://issues.apache.org/jira/browse/YARN-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583660#comment-13583660
]
Sandy Ryza commented on YARN-413:
---------------------------------
2013-02-21 13:27:24,307 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.NodeManager
    at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:199)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:322)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:359)
Caused by: org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl
    at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.start(ContainerManagerImpl.java:248)
    at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
    ... 3 more
Caused by: org.apache.hadoop.yarn.YarnException: Failed to create remoteLogDir [/tmp/logs]
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:207)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.start(LogAggregationService.java:132)
    at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
    ... 5 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/logs. Name node is in safe mode.
The reported blocks 7 has reached the threshold 0.9990 of total blocks 7. Safe mode will be turned off automatically in 25 seconds.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3067)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3045)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3024)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:667)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:468)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40995)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:482)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1018)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1778)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1774)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1488)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1772)
    at org.apache.hadoop.ipc.Client.call(Client.java:1237)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at $Proxy9.mkdirs(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:163)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:82)
    at $Proxy9.mkdirs(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:450)
    at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2115)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2086)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:540)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:204)
    ... 7 more
2013-02-21 13:27:24,308 INFO org.apache.hadoop.ipc.Server: Stopping server on 47223
2013-02-21 13:27:24,308 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService: org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit
2013-02-21 13:27:24,309 INFO org.apache.hadoop.ipc.Server: Stopping server on 8040
> With log aggregation on, nodemanager dies on startup if it can't connect to
> HDFS
> --------------------------------------------------------------------------------
>
> Key: YARN-413
> URL: https://issues.apache.org/jira/browse/YARN-413
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.0.3-alpha
> Reporter: Sandy Ryza
>
> If log aggregation is on, when the nodemanager starts up, it tries to create
> the remote log directory. If this fails, it kills itself. It doesn't seem
> like turning log aggregation on should ever cause the nodemanager to die.
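
For illustration only, here is one way startup could tolerate a failed remote log directory creation instead of dying. This is a minimal sketch and not the NodeManager's actual code or the fix adopted for this issue: the class RemoteLogDirSketch, the interface DirCreator, and the method tryCreateRemoteLogDir are hypothetical names, and the directory creation is simulated rather than issued against HDFS. The idea is to retry a few times (for example while the NameNode leaves safe mode) and, if that still fails, report the failure to the caller rather than throw.

```java
import java.io.IOException;

/** Hypothetical sketch; none of these names come from the Hadoop code base. */
final class RemoteLogDirSketch {

    /** Stand-in for the file system call that creates the remote log directory. */
    interface DirCreator {
        void create(String path) throws IOException;
    }

    /**
     * Tries to create the directory up to maxAttempts times, sleeping between
     * attempts. Returns true on success and false if every attempt failed or
     * the thread was interrupted. It never propagates the IOException, so the
     * caller can keep starting up with log aggregation disabled.
     */
    static boolean tryCreateRemoteLogDir(DirCreator creator, String path,
                                         int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                creator.create(path);
                return true;
            } catch (IOException e) {
                System.err.println("Attempt " + attempt + " to create " + path
                        + " failed: " + e.getMessage());
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt status and give up early.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated NameNode that rejects the first two calls, as if in safe mode.
        int[] calls = {0};
        DirCreator flaky = path -> {
            if (++calls[0] <= 2) {
                throw new IOException("Name node is in safe mode.");
            }
        };
        boolean enabled = tryCreateRemoteLogDir(flaky, "/tmp/logs", 5, 10L);
        System.out.println("log aggregation enabled: " + enabled);
    }
}
```

Returning a boolean rather than throwing leaves the policy decision (disable aggregation, retry later, or abort) to the caller, which is the separation the issue description argues for.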