[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt updated YARN-4744:
-----------------------------------
    Description: 
Install an HA cluster in secure mode.
Enable LCE with cgroups.
Start the server as user dsperf.
Submit a MapReduce application (terasort/teragen) as user yarn/dsperf.
Too many signal-to-container failures are logged.

When the application is submitted as one of these users, the following exception is thrown:

{noformat}
2014-03-02 09:20:38,689 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for testing (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
2014-03-02 09:20:40,158 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_e02_1393731146548_0001_01_000013
2014-03-02 09:20:43,071 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container container_e02_1393731146548_0001_01_000009 succeeded
2014-03-02 09:20:43,072 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_e02_1393731146548_0001_01_000009 transitioned from RUNNING to EXITED_WITH_SUCCESS
2014-03-02 09:20:43,073 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_e02_1393731146548_0001_01_000009
2014-03-02 09:20:43,075 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime
2014-03-02 09:20:43,081 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. Privileged Execution Operation Output:
main : command provided 2
main : run as user is yarn
main : requested yarn user is yarn
Full command array for failed execution:
[/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 9370, 15]
2014-03-02 09:20:43,081 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. Exception:
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9:
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
        at java.lang.Thread.run(Thread.java:745)
Caused by: ExitCodeException exitCode=9:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
        at org.apache.hadoop.util.Shell.run(Shell.java:838)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
        ... 9 more
2014-03-02 09:20:43,113 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=yarn OPERATION=Container Finished - Succeeded        TARGET=ContainerImpl    RESULT=SUCCESS  APPID=application_1393731146548_0001    CONTAINERID=container_e02_1393731146548_0001_01_000009
2014-03-02 09:20:43,115 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_e02_1393731146548_0001_01_000009 transitioned from EXITED_WITH_SUCCESS to DONE
2014-03-02 09:20:43,115 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_e02_1393731146548_0001_01_000009 from application application_1393731146548_0001

{noformat}
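
To judge how often the failure occurs, the "Full command array for failed execution" entries can be pulled out of a NodeManager log. The following is only a rough sketch, not part of YARN: the function and field names are hypothetical, and reading the last two array elements as the target pid and the signal number is an assumption based on the log above (command 2, then 9370, then 15).

```python
import re
from typing import List, NamedTuple


class SignalAttempt(NamedTuple):
    """One failed container-executor invocation, as printed in the log."""
    run_as_user: str
    requested_user: str
    command: int
    pid: int     # assumption: fifth array element is the target pid
    signal: int  # assumption: sixth array element is the signal number


# The array may be wrapped across lines in mailed logs, hence DOTALL.
_ARRAY_RE = re.compile(
    r"Full command array for failed execution:\s*\[(.*?)\]", re.DOTALL
)


def failed_signal_attempts(log_text: str) -> List[SignalAttempt]:
    """Return every six-element failed command array found in log_text."""
    attempts = []
    for match in _ARRAY_RE.finditer(log_text):
        fields = [field.strip() for field in match.group(1).split(",")]
        if len(fields) != 6:
            continue  # not the layout seen in this report; skip it
        _executable, run_as, requested, command, pid, sig = fields
        try:
            attempts.append(
                SignalAttempt(run_as, requested, int(command), int(pid), int(sig))
            )
        except ValueError:
            continue  # non-numeric field; skip rather than guess
    return attempts
```

Calling len(failed_signal_attempts(text)) on the contents of a NodeManager log gives a count of the failed attempts, and the pid values can be compared against the containers that had already reported success.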


Checked the same scenario in version 2.7.2 (the failure is not seen there).



  was:
Install an HA cluster in secure mode.
Enable LCE with cgroups.
Start the server as user dsperf.
Submit a MapReduce application (terasort/teragen) as user yarn/dsperf.
Too many signal-to-container failures are logged.

When the application is submitted as one of these users, the following exception is thrown:

{noformat}
2014-03-01 14:10:32,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime
2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. Privileged Execution Operation Output:
main : command provided 2
main : run as user is yarn
main : requested yarn user is yarn
Full command array for failed execution:
[/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 28575, 15]
2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. Exception:
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9:
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
        at java.lang.Thread.run(Thread.java:745)
Caused by: ExitCodeException exitCode=9:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
        at org.apache.hadoop.util.Shell.run(Shell.java:838)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
        ... 9 more

{noformat}


Checked the same scenario in version 2.7.2 (the failure is not seen there).




> Too many signal to container failure in case of LCE
> ---------------------------------------------------
>
>                 Key: YARN-4744
>                 URL: https://issues.apache.org/jira/browse/YARN-4744
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Bibin A Chundatt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
