[ https://issues.apache.org/jira/browse/YARN-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
lqjack updated YARN-8231:
-------------------------
Comment: was deleted
(was: When I try to reproduce the issue, I get the following output in the console:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
)
> Dshell application fails when one of the docker container gets killed
> ---------------------------------------------------------------------
>
> Key: YARN-8231
> URL: https://issues.apache.org/jira/browse/YARN-8231
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Reporter: Yesha Vora
> Priority: Critical
>
> 1) Launch dshell application
> {code}
> yarn jar hadoop-yarn-applications-distributedshell-*.jar -shell_command
> "sleep 300" -num_containers 2 -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker
> -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=centos/httpd-24-centos7:latest
> -keep_containers_across_application_attempts -jar
> /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell-*.jar{code}
> 2) Kill container_1524681858728_0012_01_000002
> Expected behavior:
> Application should start a new instance and finish successfully.
> Actual behavior:
> Application failed as soon as the container was killed.
> {code:title=AM log}
> 18/04/27 23:05:12 INFO distributedshell.ApplicationMaster: Got response from
> RM for container ask, completedCnt=1
> 18/04/27 23:05:12 INFO distributedshell.ApplicationMaster:
> appattempt_1524681858728_0012_000001 got container status for
> containerID=container_1524681858728_0012_01_000002, state=COMPLETE,
> exitStatus=137, diagnostics=[2018-04-27 23:05:09.310]Container killed on
> request. Exit code is 137
> [2018-04-27 23:05:09.331]Container exited with a non-zero exit code 137.
> [2018-04-27 23:05:09.332]Killed by external signal
> 18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Got response from
> RM for container ask, completedCnt=1
> 18/04/27 23:08:46 INFO distributedshell.ApplicationMaster:
> appattempt_1524681858728_0012_000001 got container status for
> containerID=container_1524681858728_0012_01_000003, state=COMPLETE,
> exitStatus=0, diagnostics=
> 18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Container
> completed successfully., containerId=container_1524681858728_0012_01_000003
> 18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Application
> completed. Stopping running containers
> 18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Application
> completed. Signalling finish to RM
> 18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Diagnostics.,
> total=2, completed=2, allocated=2, failed=1
> 18/04/27 23:08:46 INFO impl.AMRMClientImpl: Waiting for application to be
> successfully unregistered.{code}
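
For anyone reading the AM log above, here is a minimal sketch of how the reported numbers can be interpreted. It assumes the usual shell convention that an exit status above 128 encodes "128 + signal number" (so 137 would correspond to signal 9, SIGKILL), and it assumes, purely for illustration, that the application only counts as successful when every container completed and none failed. The class and method names are hypothetical and are not taken from the distributed shell source.

```java
public class ExitStatusSketch {

    /**
     * Returns the signal number encoded in a shell-style exit status,
     * or -1 if the status does not look like "128 + signal".
     */
    static int signalOf(int exitStatus) {
        return (exitStatus > 128 && exitStatus < 256) ? exitStatus - 128 : -1;
    }

    /**
     * Hypothetical success rule based on the counters printed in the
     * "Diagnostics." log line (total, completed, failed). This is an
     * assumption for illustration, not the actual AM logic.
     */
    static boolean succeeded(int total, int completed, int failed) {
        return failed == 0 && completed == total;
    }

    public static void main(String[] args) {
        // exitStatus=137 from the log: 137 - 128 = 9
        System.out.println(signalOf(137));
        // total=2, completed=2, failed=1 from the log
        System.out.println(succeeded(2, 2, 1));
    }
}
```

Under these assumptions the killed container is tallied as a failure rather than replaced, which would be consistent with failed=1 in the diagnostics line and with the application not finishing successfully.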