[ https://issues.apache.org/jira/browse/YARN-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109048#comment-16109048 ]

abhishek bharani commented on YARN-6914:
----------------------------------------

Below is the relevant output from the NodeManager (NM) logs:

2017-08-01 10:19:50,510 ERROR org.apache.spark.network.util.LevelDBProvider: error opening leveldb file /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb.  Creating new file, will not be able to recover state for existing applications
org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb/LOCK: No such file or directory
        at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
        at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
        at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
        at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:48)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:116)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:94)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:65)
        at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:166)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:261)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:495)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:543)
2017-08-01 10:19:50,511 WARN org.apache.spark.network.util.LevelDBProvider: error deleting /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb
2017-08-01 10:19:50,511 INFO org.apache.hadoop.service.AbstractService: Service spark_shuffle failed in state INITED; cause: java.io.IOException: Unable to create state store
java.io.IOException: Unable to create state store
        at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:77)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:116)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:94)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:65)
        at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:166)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:261)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:495)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:543)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb/LOCK: No such file or directory
        at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
        at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
        at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
        at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:75)
        ... 15 more
2017-08-01 10:19:50,513 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorPlugin : null
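
The trace above reports "No such file or directory" for the LOCK file under /usr/local/hadoop/tmp/nm-local-dir. One possible first check, offered only as a rough sketch, is whether that parent directory exists and is writable by the user running the NodeManager. The helper name check_state_dir is hypothetical, and the path is simply the one copied from the log; this does not claim to be the root cause.

```python
import os


def check_state_dir(local_dir):
    """Return a list of problems that would prevent creating files in local_dir.

    Hypothetical helper: it only inspects the directory itself and does not
    touch any LevelDB files.
    """
    problems = []
    if not os.path.isdir(local_dir):
        problems.append("missing directory: " + local_dir)
    elif not os.access(local_dir, os.W_OK | os.X_OK):
        problems.append("directory not writable: " + local_dir)
    return problems


if __name__ == "__main__":
    # Path copied from the NM log above; adjust for another installation.
    for problem in check_state_dir("/usr/local/hadoop/tmp/nm-local-dir"):
        print(problem)
```

An empty result only means the directory exists and is writable for the current user; it should be run as the same user that starts the NodeManager.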


> Application application_1501553373419_0001 failed 2 times due to AM Container 
> for appattempt_1501553373419_0001_000002 exited with exitCode: -1000
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6914
>                 URL: https://issues.apache.org/jira/browse/YARN-6914
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.7.3
>         Environment: Mac OS
>            Reporter: abhishek bharani
>            Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> I am getting the error below while running
> spark-shell --master yarn:
> Application application_1501553373419_0001 failed 2 times due to AM Container 
> for appattempt_1501553373419_0001_000002 exited with exitCode: -1000
> For more detailed output, check application tracking 
> page: http://abhisheks-mbp:8088/cluster/app/application_1501553373419_0001
> Then, click on links to logs of each attempt.
> Diagnostics: null
> Failing this attempt. Failing the application.
> Below are the contents of yarn-site.xml:
> <configuration>
>         <!-- Site specific YARN configuration properties -->
>         <property>
>                 <name>yarn.nodemanager.aux-services</name>
>                 <value>mapreduce_shuffle</value>
>         </property>
>         <property>
>                 <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
>                 <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>         </property>
>         <property>
>                 <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
>                 <value>org.apache.spark.network.yarn.YarnShuffleService</value>
>         </property>
>         <property>
>                 <name>yarn.log-aggregation-enable</name>
>                 <value>true</value>
>         </property>
>         <property>
>                 <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
>                 <value>3600</value>
>         </property>
>         <property>
>                 <name>yarn.resourcemanager.hostname</name>
>                 <value>localhost</value>
>         </property>
>         <property>
>                 <name>yarn.resourcemanager.resourcetracker.address</name>
>                 <value>${yarn.resourcemanager.hostname}:8025</value>
>                 <description>Enter your ResourceManager hostname.</description>
>         </property>
>         <property>
>                 <name>yarn.resourcemanager.scheduler.address</name>
>                 <value>${yarn.resourcemanager.hostname}:8035</value>
>                 <description>Enter your ResourceManager hostname.</description>
>         </property>
>         <property>
>                 <name>yarn.resourcemanager.address</name>
>                 <value>${yarn.resourcemanager.hostname}:8055</value>
>                 <description>Enter your ResourceManager hostname.</description>
>         </property>
>         <property>
>                 <description>The http address of the RM web application.</description>
>                 <name>yarn.resourcemanager.webapp.address</name>
>                 <value>${yarn.resourcemanager.hostname}:8088</value>
>         </property>
> </configuration>
> I tried several solutions, but none of them worked:
> 1. Added the property 
>    yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage 
>    to yarn-site.xml with the value 98.5.
> 2. Added the property below to yarn-site.xml:
>    yarn.nodemanager.aux-services.spark_shuffle.class = 
>    org.apache.spark.network.yarn.YarnShuffleService
> 3. Added the following property to spark-defaults.conf:
>    spark.yarn.jars=hdfs://localhost:50010/users/spark/jars/*.jar
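
The NodeManager looks up each name listed in yarn.nodemanager.aux-services through a matching yarn.nodemanager.aux-services.<name>.class property, so it can help to compare the two sets of names in the yarn-site.xml quoted above. The following is a rough sketch only: the function name aux_service_report is hypothetical, and it assumes the file is well-formed XML.

```python
import xml.etree.ElementTree as ET

PREFIX = "yarn.nodemanager.aux-services."
SUFFIX = ".class"


def aux_service_report(xml_text):
    """Return (declared, with_class) for a yarn-site.xml document.

    declared:   service names listed in yarn.nodemanager.aux-services
    with_class: names that appear in a yarn.nodemanager.aux-services.<name>.class key
    """
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value

    listed = props.get("yarn.nodemanager.aux-services", "")
    declared = {s.strip() for s in listed.split(",") if s.strip()}
    with_class = {
        key[len(PREFIX):-len(SUFFIX)]
        for key in props
        if key.startswith(PREFIX) and key.endswith(SUFFIX)
    }
    return declared, with_class
```

Calling aux_service_report on the contents of the file returns the two sets; names present in only one of them are the ones worth a second look.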


