[ https://issues.apache.org/jira/browse/YARN-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249706#comment-15249706 ]
tianyu commented on YARN-3524:
------------------------------
I am new to Hadoop.
Today I ran into a problem similar to the one described above.
I run Hadoop 2.7.1 on a CentOS cluster and use Eclipse on Windows 7 to
write the MapReduce program.
When I run my program, I get the following error:
2016-04-20 19:43:59,559 WARN util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-04-20 19:44:00,264 INFO Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - session.id is deprecated. Instead, use dfs.metrics.session-id
2016-04-20 19:44:00,264 INFO jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
Exception in thread "main" ExitCodeException exitCode=-1073741515:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:815)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:798)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:731)
at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:489)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:529)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:507)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:305)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:133)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:144)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at com.wxb.TestStart.main(TestStart.java:71)
================
I searched the Internet and found this discussion.
I look forward to your response.
Thank you very much.
======
I have changed from Hadoop 2.7.1 to Hadoop 2.7.2, but the error remains.
I have also tried installing MSVCR100.dll, but the error remains.
Thank you.
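
A note on the exit code, offered as a minimal sketch and not as a confirmed diagnosis: -1073741515 is a signed 32-bit value, and printed as unsigned hex it is 0xC0000135, which Windows defines as STATUS_DLL_NOT_FOUND. That suggests the process Hadoop launched could not load a DLL it depends on. Which process and which DLL is an assumption here; the trace only shows that RawLocalFileSystem.setPermission ran an external command. The class and method names below are hypothetical and only illustrate the conversion.

```java
public class ExitCodeDecoder {

    /**
     * Formats a signed 32-bit process exit code as an 8-digit unsigned
     * hex string, the form in which Windows NTSTATUS codes are documented.
     * (Hypothetical helper, not part of Hadoop.)
     */
    static String toNtStatusHex(int exitCode) {
        // %X treats a negative int as its unsigned two's-complement value.
        return String.format("0x%08X", exitCode);
    }

    public static void main(String[] args) {
        int exitCode = -1073741515; // value reported by ExitCodeException above
        System.out.println(toNtStatusHex(exitCode)); // prints 0xC0000135
    }
}
```

Looking up the hex value in the Windows NTSTATUS list is usually quicker than searching for the negative decimal number.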
> Mapreduce failed due to AM Container-Launch failure at NM on windows
> --------------------------------------------------------------------
>
> Key: YARN-3524
> URL: https://issues.apache.org/jira/browse/YARN-3524
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.5.2
> Environment: Windows server 2012 and Windows-8
> Hadoop-2.5.2
> Java-1.7
> Reporter: Kaveen Raajan
>
> I tried to run a Tez job on a Windows machine.
> I successfully built Tez-0.6.0 against Hadoop-2.5.2.
> Then I configured Tez-0.6.0 as described at http://tez.apache.org/install.html
> But I get the following error when running this command.
> Note: I'm using a Hadoop High Availability setup.
> {code}
> Running OrderedWordCount
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/C:/Hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/C:/Tez/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ]
> 15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: application_1429073725727_0005
> 15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> 15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/
> 15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/staging doesn't exist and is created
> 15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HACluster/tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't exist and is created
> 15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1429073725727_0005, dagName=OrderedWordCount
> 15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_1429073725727_0005
> 15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://MASTER_NN1:8088/proxy/application_1429073725727_0005/
> 15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running
> 15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
> OrderedWordCount failed with diagnostics: [Application application_1429073725727_0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_000002 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 1 file(s) moved.
> Container exited with a non-zero exit code -1073741515
> .Failing this attempt.. Failing the application.]
> {code}
> In the ResourceManager log:
> {code}
> 2015-04-19 21:49:57,533 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
> completedContainer container=Container: [ContainerId:
> container_1429505171727_0001_02_000001, NodeId: SLAVE1:57794,
> NodeHttpAddress: SLAVE1:8042, Resource: <memory:2048, vCores:1>, Priority: 0,
> Token: Token { kind: ContainerToken, service: 172.16.100.92:57794 }, ]
> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0,
> vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1,
> numContainers=0 cluster=<memory:8192, vCores:8>
> 2015-04-19 21:49:57,533 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0
> used=<memory:0, vCores:0> cluster=<memory:8192, vCores:8>
> 2015-04-19 21:49:57,533 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> Re-sorting completed queue: root.default stats: default: capacity=1.0,
> absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0,
> absoluteUsedCapacity=0.0, numApps=1, numContainers=0
> 2015-04-19 21:49:57,533 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
> Application attempt appattempt_1429505171727_0001_000002 released container
> container_1429505171727_0001_02_000001 on node: host: SLAVE1:57794
> #containers=0 available=8192 used=0 with event: FINISHED
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> Watcher event type: NodeDataChanged with state:UserConnected for
> path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1429505171727_0001/appattempt_1429505171727_0001_000002
> for Service
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
> Unregistering app attempt : appattempt_1429505171727_0001_000002
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1429505171727_0001_000002 State change from FINAL_SAVING to FAILED
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating
> application application_1429505171727_0001 with final state: FAILED
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1429505171727_0001 State change from ACCEPTED to FINAL_SAVING
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating
> info for app: application_1429505171727_0001
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
> Application Attempt appattempt_1429505171727_0001_000002 is done.
> finalState=FAILED
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo:
> Application application_1429505171727_0001 requests cleared
> 2015-04-19 21:49:57,580 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
> Application removed - appId: application_1429505171727_0001 user: SYSTEM
> queue: default #user-pending-applications: 0 #user-active-applications: 0
> #queue-pending-applications: 0 #queue-active-applications: 0
> 2015-04-19 21:49:57,611 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> Watcher event type: NodeDataChanged with state:UserConnected for
> path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1429505171727_0001 for
> Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore
> in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
> STARTED
> 2015-04-19 21:49:57,611 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application
> application_1429505171727_0001 failed 2 times due to AM Container for
> appattempt_1429505171727_0001_000002 exited with exitCode: -1073741515 due
> to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 1 file(s) moved.
> Container exited with a non-zero exit code -1073741515
> .Failing this attempt.. Failing the application.
> 2015-04-19 21:49:57,627 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1429505171727_0001 State change from FINAL_SAVING to FAILED
> 2015-04-19 21:49:57,627 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
> Application removed - appId: application_1429505171727_0001 user: SYSTEM
> leaf-queue of parent: root #applications: 0
> 2015-04-19 21:49:57,627 WARN
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=SYSTEM
> OPERATION=Application Finished - Failed TARGET=RMAppManager
> RESULT=FAILURE DESCRIPTION=App failed with state: FAILED
> PERMISSIONS=Application application_1429505171727_0001 failed 2 times due to
> AM Container for appattempt_1429505171727_0001_000002 exited with exitCode:
> -1073741515 due to: Exception from container-launch: ExitCodeException
> exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 1 file(s) moved.
> Container exited with a non-zero exit code -1073741515
> .Failing this attempt.. Failing the application.
> APPID=application_1429505171727_0001
> 2015-04-19 21:49:57,627 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary:
>
> appId=application_1429505171727_0001,name=OrderedWordCount,user=SYSTEM,queue=default,state=FAILED,trackingUrl=http://MASTER_NN1:8088/cluster/app/application_1429505171727_0001,appMasterHost=N/A,startTime=1429505386589,finishTime=1429505397580,finalStatus=FAILED
> 2015-04-19 21:49:58,580 INFO org.apache.hadoop.ipc.Server: Socket Reader #1
> for port 8032: readAndProcess from client 172.16.100.XX threw exception
> [java.io.IOException: An existing connection was forcibly closed by the
> remote host]
> {code}
> In the NodeManager log:
> {code}
> 2015-04-20 10:19:59,365 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> launchContainer: [C:\Hadoop\bin\winutils.exe, task, create,
> container_1429505171727_0001_02_000001, cmd /c
> /tmp/hadoop-SLAVE1$/nm-local-dir/usercache/SYSTEM/appcache/application_1429505171727_0001/container_1429505171727_0001_02_000001/default_container_executor.cmd]
> 2015-04-20 10:19:59,436 WARN
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code
> from container container_1429505171727_0001_02_000001 is : -1073741515
> 2015-04-20 10:19:59,437 WARN
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception
> from container-launch with container ID:
> container_1429505171727_0001_02_000001 and exit code: -1073741515
> ExitCodeException exitCode=-1073741515:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 2015-04-20 10:19:59,438 INFO
> org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: 1
> file(s) moved.
> 2015-04-20 10:19:59,439 WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
> Container exited with a non-zero exit code -1073741515
> 2015-04-20 10:19:59,439 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1429505171727_0001_02_000001 transitioned from RUNNING
> to EXITED_WITH_FAILURE
> 2015-04-20 10:19:59,440 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
> Cleaning up container container_1429505171727_0001_02_000001
> 2015-04-20 10:19:59,480 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting
> absolute path :
> /tmp/hadoop-SLAVE1$/nm-local-dir/usercache/SYSTEM/appcache/application_1429505171727_0001/container_1429505171727_0001_02_000001
> 2015-04-20 10:19:59,480 WARN
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=SYSTEM
> OPERATION=Container Finished - Failed TARGET=ContainerImpl
> RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE
> APPID=application_1429505171727_0001
> CONTAINERID=container_1429505171727_0001_02_000001
> 2015-04-20 10:19:59,481 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1429505171727_0001_02_000001 transitioned from
> EXITED_WITH_FAILURE to DONE
> 2015-04-20 10:19:59,481 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Removing container_1429505171727_0001_02_000001 from application
> application_1429505171727_0001
> 2015-04-20 10:19:59,481 INFO
> org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: ProcfsBasedProcessTree
> currently is supported only on Linux.
> {code}
> The problem might be that, while connecting to the NodeManager, it is unable
> to handshake with the ResourceManager.
> If I try on a single-node Hadoop cluster, it works correctly.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)