[
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bilwa S T updated YARN-10670:
-----------------------------
Description:
Preconditions:
# A secure three-node Hadoop 3.1.1 cluster is installed.
# Set the following parameter in the RM yarn-site.xml:
<property>
<name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
<value>true</value>
</property>
# Set the following parameter in the NM(s) yarn-site.xml:
<property>
<name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
<value>30</value>
</property>
Test Steps:
Job Command: yarn
org.apache.hadoop.yarn.applications.distributedshell.Client jar
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
-shell_command sleep -shell_args 20 -num_containers 20 -container_type
OPPORTUNISTIC -promote_opportunistic_after_start
Actual Result: The Distributed Shell YARN job failed with the diagnostics message below:
{noformat}
Application Failure: desired = 20, completed = 20, allocated = 20, failed = 1,
diagnostics = [2021-02-09 22:11:48.440]Container killed to make room for
Guaranteed container.
{noformat}
Expected Result: The Distributed Shell YARN job should not fail.
was:
Preconditions:
# A secure three-node Hadoop 3.1.1 cluster is installed.
# Set the following parameter in the RM yarn-site.xml:
<property>
<name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
<value>true</value>
</property>
# Set the following parameter in the NM(s) yarn-site.xml:
<property>
<name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
<value>30</value>
</property>
Test Steps:
Job Command: yarn
org.apache.hadoop.yarn.applications.distributedshell.Client jar
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
-shell_command sleep -shell_args 20 -num_containers 20 -container_type
OPPORTUNISTIC -promote_opportunistic_after_start
Actual Result: The Distributed Shell YARN job failed with the diagnostics message below:
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
Expected Result: The Distributed Shell YARN job should not fail.
> YARN: Opportunistic Container: In a distributed shell job, the application
> is marked as failed if any container is killed. In this case the containers
> are killed to make room for guaranteed containers, so failing the
> application is not correct
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
> Issue Type: Bug
> Components: distributed-shell
> Affects Versions: 3.1.1
> Reporter: Sushanta Sen
> Assignee: Bilwa S T
> Priority: Major
>
> Preconditions:
> # A secure three-node Hadoop 3.1.1 cluster is installed.
> # Set the following parameter in the RM yarn-site.xml:
> <property>
> <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
> <value>true</value>
> </property>
> # Set the following parameter in the NM(s) yarn-site.xml:
> <property>
> <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
> <value>30</value>
> </property>
>
> Test Steps:
> Job Command: yarn
> org.apache.hadoop.yarn.applications.distributedshell.Client jar
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
> -shell_command sleep -shell_args 20 -num_containers 20 -container_type
> OPPORTUNISTIC -promote_opportunistic_after_start
> Actual Result: The Distributed Shell YARN job failed with the diagnostics
> message below:
> {noformat}
> Application Failure: desired = 20, completed = 20, allocated = 20, failed =
> 1, diagnostics = [2021-02-09 22:11:48.440]Container killed to make room for
> Guaranteed container.
> {noformat}
> Expected Result: The Distributed Shell YARN job should not fail.
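
One way to read the expected result: a container that is killed to make room for a guaranteed container would not be counted toward the application's failed total. The Java sketch below illustrates only that idea and is not the patch for this issue. The class name, the method names and the exit-status values are hypothetical and are declared locally so the snippet compiles without Hadoop on the classpath. A real ApplicationMaster would presumably compare each completed container's exit status against the constants in YARN's ContainerExitStatus instead.

```java
import java.util.List;

/**
 * Hypothetical sketch: decide whether a completed container should count
 * toward an application's failure total. Not taken from the Hadoop source.
 */
final class ContainerFailureCounter {

    // Illustrative exit-status values (assumptions, declared locally).
    // Check YARN's ContainerExitStatus for the authoritative constants.
    static final int SUCCESS = 0;
    static final int KILLED_BY_CONTAINER_SCHEDULER = -108; // assumed value

    private ContainerFailureCounter() {
    }

    /** Returns true only for exits that the application itself caused. */
    static boolean countsAsFailure(int exitStatus) {
        if (exitStatus == SUCCESS) {
            return false;
        }
        // A kill issued to free resources for a guaranteed container is not
        // the application's fault, so it is excluded from the failure count.
        return exitStatus != KILLED_BY_CONTAINER_SCHEDULER;
    }

    /** Counts the exit statuses that should be reported as failures. */
    static int countFailures(List<Integer> exitStatuses) {
        int failed = 0;
        for (int status : exitStatuses) {
            if (countsAsFailure(status)) {
                failed++;
            }
        }
        return failed;
    }
}
```

Under these assumptions, a run in which the only non-zero exit status is the scheduler kill would report zero failures, while a container that exits with status 1 would still be counted.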