[ 
https://issues.apache.org/jira/browse/YARN-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969861#comment-13969861
 ] 

Billie Rinaldi commented on YARN-1944:
--------------------------------------

Having looked into YARN-1922 recently, I wouldn't have expected processes to 
stay around after YARN killed their containers.  What OS are you using, and is 
the "setsid" command available on it?  I think YARN only kills the entire 
process group if setsid is available.
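
One way to check this on the NodeManager host is to try running setsid from 
Java. The following is only a rough sketch: the class and method names are 
made up for illustration, and the probe command is an assumption, not 
necessarily what the NodeManager itself runs.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

public class SetsidCheck {

    /**
     * Returns true if the command can be started and exits with status 0
     * within five seconds. Hypothetical helper, not part of the YARN API.
     */
    public static boolean runsSuccessfully(String... command) {
        try {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .start();
            if (!p.waitFor(5, TimeUnit.SECONDS)) {
                p.destroyForcibly();   // probe hung; treat as unavailable
                return false;
            }
            return p.exitValue() == 0;
        } catch (IOException e) {
            return false;              // command not found or not executable
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed probe: start a trivial shell in a new session via setsid.
        boolean available = runsSuccessfully("setsid", "bash", "-c", "echo $$");
        System.out.println("setsid available: " + available);
    }
}
```

If this reports false, child processes such as ping may not be in a process 
group of their own, which would fit the behaviour described below.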

> Application Container commands fail to stop when application is killed
> ----------------------------------------------------------------------
>
>                 Key: YARN-1944
>                 URL: https://issues.apache.org/jira/browse/YARN-1944
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.3.0
>            Reporter: Oleg Zhurakousky
>
> When launching a YARN application with a command that never exits (e.g., ping 
> google.com), the application container stops while the command(s) continue to 
> run.
> For example:
> Command: ping google.com; 4 containers
> Submit app:
> {code}
> ApplicationId appId = this.yarnClient.submitApplication(appContext);
> {code}
> Kill app:
> {code}
> this.yarnClient.killApplication(appId);
> {code}
> Produces the following output:
> {code}
> 13:10:22,017 ERROR IPC Server handler 48 on 8035 
> resourcemanager.ApplicationMasterService:328 - Application doesn't exist in 
> cache appattempt_1397581697363_0002_000001
> {code}
> Why is it telling me that it doesn't exist when I am using the same AppId 
> that was returned by the YarnClient?
> Also, I can see that after the kill the actual application containers stopped:
> {code}
> 13:10:22,128  WARN ContainersLauncher #6 
> nodemanager.DefaultContainerExecutor:207 - Exit code from container 
> container_1397581697363_0002_01_000002 is : 143
> 13:10:22,151  WARN ContainersLauncher #7 
> nodemanager.DefaultContainerExecutor:207 - Exit code from container 
> container_1397581697363_0002_01_000003 is : 143
> 13:10:22,175  WARN ContainersLauncher #8 
> nodemanager.DefaultContainerExecutor:207 - Exit code from container 
> container_1397581697363_0002_01_000004 is : 143
> 13:10:22,198  WARN ContainersLauncher #9 
> nodemanager.DefaultContainerExecutor:207 - Exit code from container 
> container_1397581697363_0002_01_000005 is : 143
> {code}
> Meanwhile I have 4 pings running.



