[ https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bilwa S T updated YARN-10670:
-----------------------------
    Description:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters in RM yarn-site.xml:
<property>
  <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
  <value>true</value>
</property>
 # Set this in NM[s] yarn-site.xml:
<property>
  <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
  <value>30</value>
</property>

Test Steps:
Job Command:
yarn org.apache.hadoop.yarn.applications.distributedshell.Client jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC -promote_opportunistic_after_start

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Application Failure: desired = 20, completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 22:11:48.440]Container killed to make room for Guaranateed container.
{noformat}
Expected Result: Distributed Shell Yarn Job should not fail.

  was:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters in RM yarn-site.xml:
<property>
  <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
  <value>true</value>
</property>
 # Set this in NM[s] yarn-site.xml:
<property>
  <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
  <value>30</value>
</property>

Test Steps:
Job Command:
yarn org.apache.hadoop.yarn.applications.distributedshell.Client jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC -promote_opportunistic_after_start

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
Expected Result: Distributed Shell Yarn Job should not fail.

> YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for guaranteed containers which is not correct to fail an application
> ------------------------------------------------------------------------
>
>                 Key: YARN-10670
>                 URL: https://issues.apache.org/jira/browse/YARN-10670
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: distributed-shell
>    Affects Versions: 3.1.1
>            Reporter: Sushanta Sen
>            Assignee: Bilwa S T
>            Priority: Major

--
This message was sent by Atlassian Jira (v8.3.4#803005)
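
The expected behaviour above can be illustrated with a small, self-contained sketch. It is not the patch for this issue and does not use the Hadoop classes. The class name, the method names and the locally defined exit-status constants are hypothetical, and the numeric values are assumptions modelled on YARN's `ContainerExitStatus`. It is also an assumption that the NodeManager reports the "killed to make room for Guaranteed container" case with a distinct exit status that an application master could inspect. The idea is only that a completed container killed by the container scheduler would not be counted towards `failed`:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not code from the distributed shell.
class OpportunisticFailureSketch {
    // Assumed values, modelled on YARN's ContainerExitStatus constants.
    static final int SUCCESS = 0;
    static final int ABORTED = -100;
    static final int KILLED_BY_CONTAINER_SCHEDULER = -108;

    // A completed container counts as a failure only if it exited abnormally
    // for a reason other than being aborted or killed by the scheduler.
    static boolean countsAsFailure(int exitStatus) {
        return exitStatus != SUCCESS
                && exitStatus != ABORTED
                && exitStatus != KILLED_BY_CONTAINER_SCHEDULER;
    }

    // Counts the completed containers that should be reported as failed.
    static int countFailures(List<Integer> exitStatuses) {
        int failed = 0;
        for (int status : exitStatuses) {
            if (countsAsFailure(status)) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Two successes and one opportunistic container killed by the scheduler.
        List<Integer> statuses =
                Arrays.asList(SUCCESS, SUCCESS, KILLED_BY_CONTAINER_SCHEDULER);
        System.out.println("failed = " + countFailures(statuses));
    }
}
```

With this rule, a run like the one in the report would end with failed = 0 for the killed opportunistic container, provided the kill is reported with the scheduler exit status assumed here.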