[ 
https://issues.apache.org/jira/browse/YARN-11959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18079942#comment-18079942
 ] 

ASF GitHub Bot commented on YARN-11959:
---------------------------------------

hadoop-yetus commented on PR #8474:
URL: https://github.com/apache/hadoop/pull/8474#issuecomment-4417646467

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   8m 25s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  29m 39s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 48s |  |  trunk passed with JDK Red Hat, Inc.-21.0.11+10-LTS  |
   | +1 :green_heart: |  compile  |   1m 48s |  |  trunk passed with JDK Red Hat, Inc.-17.0.19+10-LTS  |
   | +1 :green_heart: |  checkstyle  |   1m 29s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK Red Hat, Inc.-21.0.11+10-LTS  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK Red Hat, Inc.-17.0.19+10-LTS  |
   | +1 :green_heart: |  spotbugs  |   1m 56s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  18m 31s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 57s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 56s |  |  the patch passed with JDK Red Hat, Inc.-21.0.11+10-LTS  |
   | +1 :green_heart: |  cc  |   0m 56s |  |  the patch passed  |
   | +1 :green_heart: |  golang  |   0m 56s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 57s |  |  the patch passed with JDK Red Hat, Inc.-17.0.19+10-LTS  |
   | +1 :green_heart: |  cc  |   0m 57s |  |  the patch passed  |
   | +1 :green_heart: |  golang  |   0m 57s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 57s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 37s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 32s |  |  the patch passed with JDK Red Hat, Inc.-21.0.11+10-LTS  |
   | +1 :green_heart: |  javadoc  |   0m 32s |  |  the patch passed with JDK Red Hat, Inc.-17.0.19+10-LTS  |
   | +1 :green_heart: |  spotbugs  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 21s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  23m 25s |  |  hadoop-yarn-server-nodemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 37s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 115m 13s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8474/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/8474 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets cc golang |
   | uname | Linux 6e91e43aa13b 5.15.0-173-generic #183-Ubuntu SMP Fri Mar 6 13:29:34 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 643ae1fdd2c0d9540b8e0284a812a4e199cfe5db |
   | Default Java | Red Hat, Inc.-17.0.19+10-LTS |
   | Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-21.0.11.0.10-1.el8_10.x86_64:Red Hat, Inc.-21.0.11+10-LTS /usr/lib/jvm/java-17-openjdk-17.0.19.0.10-1.el8_10.x86_64:Red Hat, Inc.-17.0.19+10-LTS |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8474/2/testReport/ |
   | Max. process+thread count | 629 (vs. ulimit of 10000) |
   | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8474/2/console |
   | versions | git=2.43.7 maven=3.9.15 spotbugs=4.9.7 |
   | Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> NodeManager becomes unhealthy when container exits with code 22 or 24
> ---------------------------------------------------------------------
>
>                 Key: YARN-11959
>                 URL: https://issues.apache.org/jira/browse/YARN-11959
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: KWON BYUNGCHANG
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: YARN-11959.001.patch
>
>
> When a user container exits with code 22 or 24, the NodeManager becomes 
> unhealthy and no more containers are allocated to that node. This situation 
> can be resolved by restarting the NodeManager.
>  
>  
> It can be reproduced immediately by running a Scala Spark wordcount job that 
> exits with code 22.
>  
>  
> I propose to fix this by mapping exit code 22 or 24 to a different exit 
> code, so that the ConfigurationException that causes the NodeManager to become 
> unhealthy is not triggered.
>  
> {noformat}
> 2024-09-23 18:50:14,360 INFO  nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(532)) - Obtaining the exit code...
> 2024-09-23 18:50:14,360 INFO  nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(532)) - Docker inspect command: /usr/bin/docker inspect --format {{.State.ExitCode}} container_e161_1711009858797_8304894_01_000015
> 2024-09-23 18:50:14,360 INFO  nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(532)) - Exit code from docker inspect: 22
> 2024-09-23 18:50:14,360 INFO  nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(532)) - Wrote the exit code 22 to /data6/hadoop/yarn/local/nmPrivate/application_1711009858797_8304894/container_e161_1711009858797_8304894_01_000015/container_e161_1711009858797_8304894_01_000015.pid.exitcode
> 2024-09-23 18:50:14,381 ERROR launcher.ContainerLaunch (ContainerLaunch.java:call(340)) - Failed to launch container due to configuration error.
> org.apache.hadoop.yarn.exceptions.ConfigurationException: Linux Container Executor reached unrecoverable exception
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleExitCode(LinuxContainerExecutor.java:615)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:573)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:479)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:513)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:323)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:106)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:1099)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:166)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:564)
>         ... 8 more
> {noformat}
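
The fix proposed in the description (reporting a different exit code when a user container exits with 22 or 24) could look roughly like the sketch below. This is not the code from YARN-11959.001.patch or PR #8474. The class name `ExitCodeRemap`, the method `remap`, and the offset `REMAP_OFFSET` are hypothetical names chosen for illustration, and where such a mapping would actually be applied in the NodeManager or the container executor is left open.

```java
import java.util.Set;

/** Illustrative only: remaps user exit codes that the NodeManager treats specially. */
public final class ExitCodeRemap {

    // Exit codes that, per the issue description, make the NodeManager unhealthy
    // when a user container returns them.
    private static final Set<Integer> RESERVED_CODES = Set.of(22, 24);

    // Hypothetical offset (an assumption, not taken from the patch). It keeps the
    // remapped value inside the 0-255 range of a process exit status.
    static final int REMAP_OFFSET = 200;

    private ExitCodeRemap() {
    }

    /** Returns a substitute code for 22 and 24, and the original code otherwise. */
    public static int remap(int userExitCode) {
        if (RESERVED_CODES.contains(userExitCode)) {
            return REMAP_OFFSET + userExitCode;
        }
        return userExitCode;
    }

    public static void main(String[] args) {
        for (int code : new int[] {0, 1, 22, 24}) {
            System.out.println(code + " -> " + remap(code));
        }
    }
}
```

With a mapping of this kind, only the two codes named in the issue change. Every other exit code, including 0, passes through untouched, so existing handling of ordinary failures is not affected.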



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
