[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433405#comment-16433405 ]
Eric Yang commented on YARN-8142:
---------------------------------

[~cheersyang] If the AM is hanging, it is unlikely to terminate gracefully on SIGTERM. I think SIGKILL would be the right way to handle this, letting the RM restart it.

> yarn service application stops when AM is killed with SIGTERM
> -------------------------------------------------------------
>
> Key: YARN-8142
> URL: https://issues.apache.org/jira/browse/YARN-8142
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Reporter: Yesha Vora
> Assignee: Billie Rinaldi
> Priority: Major
>
> Steps:
> 1) Launch sleeper job (non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch fault-test-am-sleeper /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition from local FS: /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: application_1522887500374_0010
> Exit Code: 0
> {code}
> 2) Wait for the sleeper component to be up
> 3) Kill the AM process by PID
>
> Expected behavior:
> A new AM attempt is started. The pre-existing container keeps running.
>
> Actual behavior:
> The application finishes with State : FINISHED and Final-State : ENDED.
> A new attempt is never launched.
>
> Note:
> When the AM gets a SIGTERM and gracefully shuts itself down, it shuts the entire app down instead of letting it continue to run for another attempt.
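
To illustrate the difference discussed above between a graceful SIGTERM shutdown and a SIGKILL, here is a toy model in Java. It is only a sketch and not YARN code: `AmSignalSketch`, `ToyResourceManager`, `stopAm`, and the `unregisterOnSigterm` flag are all hypothetical names. It assumes the RM treats an AM that unregisters as a finished application, and starts a new attempt when the AM exits without unregistering and attempts remain.

```java
public class AmSignalSketch {
    enum Signal { SIGTERM, SIGKILL }
    enum AppState { RUNNING, FINISHED, FAILED }

    /** Hypothetical stand-in for the RM's view of a single application. */
    static final class ToyResourceManager {
        private final int maxAttempts;
        private int attempt = 1;
        private AppState state = AppState.RUNNING;

        ToyResourceManager(int maxAttempts) {
            this.maxAttempts = maxAttempts;
        }

        /** Assumption: an AM that unregisters ends the whole application. */
        void amUnregistered() {
            state = AppState.FINISHED;
        }

        /** Assumption: an AM that just disappears is retried while attempts remain. */
        void amExitedWithoutUnregistering() {
            if (attempt < maxAttempts) {
                attempt++;
            } else {
                state = AppState.FAILED;
            }
        }

        int attempt() { return attempt; }
        AppState state() { return state; }
    }

    /**
     * Models stopping the AM with the given signal. Only SIGTERM gives the
     * process a chance to run shutdown logic; SIGKILL cannot be handled.
     */
    static ToyResourceManager stopAm(Signal signal, boolean unregisterOnSigterm) {
        ToyResourceManager rm = new ToyResourceManager(2);
        boolean shutdownLogicRuns = (signal == Signal.SIGTERM);
        if (shutdownLogicRuns && unregisterOnSigterm) {
            rm.amUnregistered();
        } else {
            rm.amExitedWithoutUnregistering();
        }
        return rm;
    }

    public static void main(String[] args) {
        for (Signal signal : Signal.values()) {
            ToyResourceManager rm = stopAm(signal, true);
            System.out.println(signal + " -> state=" + rm.state()
                    + ", attempt=" + rm.attempt());
        }
    }
}
```

In this model, calling `stopAm(Signal.SIGTERM, false)` behaves like SIGKILL: the application stays RUNNING and moves to a second attempt, which corresponds to the expected behavior in the report.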