Yesha Vora created YARN-8116:
--------------------------------
Summary: Nodemanager fails with NumberFormatException: For input string: ""
Key: YARN-8116
URL: https://issues.apache.org/jira/browse/YARN-8116
Project: Hadoop YARN
Issue Type: Bug
Reporter: Yesha Vora
Steps followed.
1) Update nodemanager debug delay config
{code}
<property>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>350</value>
</property>{code}
2) Launch distributed shell application multiple times
{code}
/usr/hdp/current/hadoop-yarn-client/bin/yarn jar hadoop-yarn-applications-distributedshell-*.jar \
  -shell_command "sleep 120" \
  -num_containers 1 \
  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
  -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=centos/httpd-24-centos7:latest \
  -shell_env YARN_CONTAINER_RUNTIME_DOCKER_DELAYED_REMOVAL=true \
  -jar hadoop-yarn-applications-distributedshell-*.jar{code}
3) Restart the NM
The NodeManager fails to start with the error below.
{code:title=NM log}
2018-03-23 21:32:14,437 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:serviceInit(181)) - ContainersMonitor enabled: true
2018-03-23 21:32:14,439 INFO logaggregation.LogAggregationService (LogAggregationService.java:serviceInit(130)) - rollingMonitorInterval is set as 3600. The logs will be aggregated every 3600 seconds
2018-03-23 21:32:14,455 INFO service.AbstractService (AbstractService.java:noteFailure(267)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl failed in state INITED
java.lang.NumberFormatException: For input string: ""
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Long.parseLong(Long.java:601)
        at java.lang.Long.parseLong(Long.java:631)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainerState(NMLeveldbStateStoreService.java:350)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainersState(NMLeveldbStateStoreService.java:253)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:365)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:464)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:899)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:960)
2018-03-23 21:32:14,458 INFO logaggregation.LogAggregationService (LogAggregationService.java:serviceStop(148)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit
2018-03-23 21:32:14,460 INFO service.AbstractService (AbstractService.java:noteFailure(267)) - Service NodeManager failed in state INITED
java.lang.NumberFormatException: For input string: ""
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Long.parseLong(Long.java:601)
        at java.lang.Long.parseLong(Long.java:631)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainerState(NMLeveldbStateStoreService.java:350)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainersState(NMLeveldbStateStoreService.java:253)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:365)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:464)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:899)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:960)
2018-03-23 21:32:14,463 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping NodeManager metrics system...
2018-03-23 21:32:14,464 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2018-03-23 21:32:14,468 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - NodeManager metrics system stopped.
2018-03-23 21:32:14,508 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - NodeManager metrics system shutdown complete.{code}
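For reference, the trace shows Long.parseLong receiving an empty string from inside loadContainerState during recovery. The log does not say which stored value was empty or why, so the standalone sketch below only shows how the exception arises and what a defensive parse could look like. The class and method names (EmptyValueParseSketch, describeFailure, parseOrDefault) are hypothetical and are not taken from the NodeManager code.

```java
public class EmptyValueParseSketch {

    // Reproduces the failing call from the trace: Long.parseLong rejects "".
    // Returns the exception message, or "parsed" if the input was a valid long.
    public static String describeFailure(String raw) {
        try {
            Long.parseLong(raw);
            return "parsed";
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    // Hypothetical guard (an assumption, not the actual fix): treat a missing
    // or empty stored value as absent and return a caller-supplied fallback.
    public static long parseOrDefault(String raw, long fallback) {
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        System.out.println(describeFailure(""));       // same message as in the NM log
        System.out.println(parseOrDefault("", -1L));   // falls back instead of throwing
        System.out.println(parseOrDefault("42", -1L)); // a normal value parses as usual
    }
}
```

Whether falling back to a default is appropriate for recovery depends on what the empty entry represents; the sketch is only meant to narrow down where the failure comes from.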