Weiwei Yang commented on YARN-8142:

I am thinking of another scenario: what if an app's AM hangs? If the user wants 
to restart the AM without restarting the entire app, would it be reasonable to 
allow the AM to be killed and then automatically restarted as a new instance? To 
terminate the application (especially a long-running service), we would need to 
call a stop command instead of killing the AM directly. Does that make sense?
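
To make the proposed semantics concrete, here is a rough sketch of the decision I have in mind. It is only a toy model, not YARN code: `ExitCause`, `AttemptOutcome`, `decide_outcome` and the `max_attempts` parameter are hypothetical names chosen for illustration, and the real AM/RM interaction is more involved.

```python
from dataclasses import dataclass
from enum import Enum


class ExitCause(Enum):
    """Why the AM process is going away (hypothetical categories)."""
    STOP_COMMAND = "stop_command"  # user explicitly asked to stop the service
    SIGTERM = "sigterm"            # AM was killed, e.g. because it hung


@dataclass
class AttemptOutcome:
    unregister_from_rm: bool    # True ends the whole application
    containers_kept: bool       # True leaves component containers running
    new_attempt_expected: bool  # True if another AM attempt may be launched


def decide_outcome(cause, attempts_used, max_attempts):
    """Toy policy: only an explicit stop ends the application."""
    if cause is ExitCause.STOP_COMMAND:
        # Explicit stop: unregister and tear everything down.
        return AttemptOutcome(True, False, False)
    # Killed AM: do not unregister, so a retry is possible
    # while attempts remain (assumed limit, passed in by the caller).
    retry = attempts_used < max_attempts
    return AttemptOutcome(False, retry, retry)
```

Under this model, `decide_outcome(ExitCause.SIGTERM, 1, 2)` leaves the containers running and expects a second attempt, while `decide_outcome(ExitCause.STOP_COMMAND, 1, 2)` ends the application.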

> yarn service application stops when AM is killed with SIGTERM
> -------------------------------------------------------------
>                 Key: YARN-8142
>                 URL: https://issues.apache.org/jira/browse/YARN-8142
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-native-services
>            Reporter: Yesha Vora
>            Assignee: Billie Rinaldi
>            Priority: Major
> Steps:
> 1) Launch sleeper job (non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> fault-test-am-sleeper 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: 
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
> Expected behavior:
> A new AM attempt will be started. The pre-existing container will keep 
> running.
> Actual behavior:
> Application finishes with State : FINISHED and Final-State : ENDED
> New attempt was never launched
> Note: 
> When the AM gets a SIGTERM, it gracefully shuts itself down, but in doing so 
> it shuts the entire app down instead of letting it continue to run for 
> another attempt.
