[ https://issues.apache.org/jira/browse/YARN-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Nauroth resolved YARN-3524.
---------------------------------
    Resolution: Not A Problem

Hello [~KaveenBigdata]. Nice debugging!

The native components for Hadoop on Windows are built using either Windows SDK 7.1 or Visual Studio 2010. Because of this, there is a runtime dependency on the C++ 2010 runtime DLL, which is MSVCR100.dll. You are correct that the fix in this case is to install the missing DLL. I believe this is the official download location:

https://www.microsoft.com/en-us/download/details.aspx?id=13523

Since this does not represent a bug in the Hadoop codebase, I'm resolving this issue as Not a Problem.

> Mapreduce failed due to AM Container-Launch failure at NM on windows
> --------------------------------------------------------------------
>
>                 Key: YARN-3524
>                 URL: https://issues.apache.org/jira/browse/YARN-3524
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.5.2
>         Environment: Windows server 2012 and Windows-8
> Hadoop-2.5.2
> Java-1.7
>            Reporter: Kaveen Raajan
>
> I tried to run a Tez job on a Windows machine.
> I successfully built Tez-0.6.0 against Hadoop-2.5.2,
> then configured Tez-0.6.0 as described at http://tez.apache.org/install.html,
> but I get the following error when running the job.
> Note: I'm using a Hadoop High Availability setup.
> {code}
> Running OrderedWordCount
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/C:/Hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/C:/Tez/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ]
> 15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: application_1429073725727_0005
> 15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> 15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/
> 15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/staging doesn't exist and is created
> 15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HACluster/tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't exist and is created
> 15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1429073725727_0005, dagName=OrderedWordCount
> 15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_1429073725727_0005
> 15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://MASTER_NN1:8088/proxy/application_1429073725727_0005/
> 15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running
> 15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
> OrderedWordCount failed with diagnostics: [Application application_1429073725727_0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_000002 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 1 file(s) moved.
> Container exited with a non-zero exit code -1073741515
> .Failing this attempt.. Failing the application.]
> {code}
> The ResourceManager log shows:
> {code}
> 2015-04-19 21:49:57,533 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1429505171727_0001_02_000001, NodeId: SLAVE1:57794, NodeHttpAddress: SLAVE1:8042, Resource: <memory:2048, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.16.100.92:57794 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=<memory:8192, vCores:8>
> 2015-04-19 21:49:57,533 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:8192, vCores:8>
> 2015-04-19 21:49:57,533 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
> 2015-04-19 21:49:57,533 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1429505171727_0001_000002 released container container_1429505171727_0001_02_000001 on node: host: SLAVE1:57794 #containers=0 available=8192 used=0 with event: FINISHED
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: NodeDataChanged with state:UserConnected for path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1429505171727_0001/appattempt_1429505171727_0001_000002 for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1429505171727_0001_000002
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1429505171727_0001_000002 State change from FINAL_SAVING to FAILED
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1429505171727_0001 with final state: FAILED
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1429505171727_0001 State change from ACCEPTED to FINAL_SAVING
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1429505171727_0001
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1429505171727_0001_000002 is done.
> finalState=FAILED
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1429505171727_0001 requests cleared
> 2015-04-19 21:49:57,580 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1429505171727_0001 user: SYSTEM queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
> 2015-04-19 21:49:57,611 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: NodeDataChanged with state:UserConnected for path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1429505171727_0001 for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-04-19 21:49:57,611 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1429505171727_0001 failed 2 times due to AM Container for appattempt_1429505171727_0001_000002 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 1 file(s) moved.
> Container exited with a non-zero exit code -1073741515
> .Failing this attempt.. Failing the application.
> 2015-04-19 21:49:57,627 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1429505171727_0001 State change from FINAL_SAVING to FAILED
> 2015-04-19 21:49:57,627 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1429505171727_0001 user: SYSTEM leaf-queue of parent: root #applications: 0
> 2015-04-19 21:49:57,627 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=SYSTEM OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1429505171727_0001 failed 2 times due to AM Container for appattempt_1429505171727_0001_000002 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515:
> ExitCodeException exitCode=-1073741515:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 1 file(s) moved.
> Container exited with a non-zero exit code -1073741515
> .Failing this attempt.. Failing the application. APPID=application_1429505171727_0001
> 2015-04-19 21:49:57,627 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1429505171727_0001,name=OrderedWordCount,user=SYSTEM,queue=default,state=FAILED,trackingUrl=http://MASTER_NN1:8088/cluster/app/application_1429505171727_0001,appMasterHost=N/A,startTime=1429505386589,finishTime=1429505397580,finalStatus=FAILED
> 2015-04-19 21:49:58,580 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8032: readAndProcess from client 172.16.100.XX threw exception [java.io.IOException: An existing connection was forcibly closed by the remote host]
> {code}
> The NodeManager log shows:
> {code}
> 2015-04-20 10:19:59,365 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [C:\Hadoop\bin\winutils.exe, task, create, container_1429505171727_0001_02_000001, cmd /c /tmp/hadoop-SLAVE1$/nm-local-dir/usercache/SYSTEM/appcache/application_1429505171727_0001/container_1429505171727_0001_02_000001/default_container_executor.cmd]
> 2015-04-20 10:19:59,436 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1429505171727_0001_02_000001 is : -1073741515
> 2015-04-20 10:19:59,437 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1429505171727_0001_02_000001 and exit code: -1073741515
> ExitCodeException exitCode=-1073741515:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 2015-04-20 10:19:59,438 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: 1 file(s) moved.
> 2015-04-20 10:19:59,439 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code -1073741515
> 2015-04-20 10:19:59,439 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1429505171727_0001_02_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
> 2015-04-20 10:19:59,440 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1429505171727_0001_02_000001
> 2015-04-20 10:19:59,480 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /tmp/hadoop-SLAVE1$/nm-local-dir/usercache/SYSTEM/appcache/application_1429505171727_0001/container_1429505171727_0001_02_000001
> 2015-04-20 10:19:59,480 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=SYSTEM OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1429505171727_0001 CONTAINERID=container_1429505171727_0001_02_000001
> 2015-04-20 10:19:59,481 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1429505171727_0001_02_000001 transitioned from EXITED_WITH_FAILURE to DONE
> 2015-04-20 10:19:59,481 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1429505171727_0001_02_000001 from application application_1429505171727_0001
> 2015-04-20 10:19:59,481 INFO org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
> {code}
> The problem might be that the NodeManager is unable to complete its handshake with the ResourceManager.
> If I run the same job on a single-node Hadoop cluster, it works correctly.
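
A note on reading the exit code in the logs above: -1073741515 is the signed 32-bit rendering of a Windows NTSTATUS value. Reinterpreted as unsigned it is 0xC0000135, which Windows defines as STATUS_DLL_NOT_FOUND, consistent with the missing MSVCR100.dll described in the resolution. The following is only a minimal sketch in Python; the helper names `to_ntstatus_hex` and `describe_exit_code` and the tiny lookup table are illustrative assumptions, not part of Hadoop, YARN, or Tez.

```python
# Hypothetical helpers for illustration only; not a Hadoop or YARN API.

# A deliberately tiny table of well-known NTSTATUS codes.
KNOWN_NTSTATUS = {
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_ntstatus_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    return "0x{:08X}".format(exit_code & 0xFFFFFFFF)


def describe_exit_code(exit_code: int) -> str:
    """Return the hex form plus a symbolic name when the code is in the table."""
    unsigned = exit_code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return "{} ({})".format(to_ntstatus_hex(exit_code), name)


if __name__ == "__main__":
    # Exit code reported by DefaultContainerExecutor in the NodeManager log.
    print(describe_exit_code(-1073741515))
```

When a Windows container launch fails with a large negative exit code, converting it this way and looking up the NTSTATUS name is usually a quicker first step than reading the scheduler logs.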