Sushanta Sen created YARN-10670:
-----------------------------------

             Summary: YARN: Opportunistic Container: Distributed shell job 
fails the application when containers are killed, even when they were 
killed to make room for guaranteed containers, in which case failing the 
application is not correct
                 Key: YARN-10670
                 URL: https://issues.apache.org/jira/browse/YARN-10670
             Project: Hadoop YARN
          Issue Type: Bug
          Components: distributed-shell
    Affects Versions: 3.1.1
            Reporter: Sushanta Sen


Preconditions:
 # Secure Hadoop 3.1.1 c3 Nodes cluster is installed
 # Set the following parameter in the RM:
 <property>
 <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
 <value>true</value>
 </property>
 # Set the following parameter in the NM(s):
 <property>
 <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
 <value>30</value>
 </property>

 
Test Steps:


Job Command:

yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC

Actual Result: The Distributed Shell YARN job failed with the following diagnostics message:

{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
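
The diagnostics above show one container counted as failed although it was de-queued before launch. As a rough illustration of the expected behavior (not the actual distributed shell ApplicationMaster code), the sketch below excludes scheduler-killed containers from the failure count. The class name, the method names and the hard-coded status value are hypothetical. The value -108 is assumed to correspond to ContainerExitStatus.KILLED_BY_CONTAINER_SCHEDULER, and whether a de-queued container actually reports that status should be verified against the Hadoop version in use.

```java
import java.util.List;

/** Hypothetical sketch only; not taken from the Hadoop code base. */
public class OpportunisticFailureSketch {
    // Assumption: mirrors ContainerExitStatus.KILLED_BY_CONTAINER_SCHEDULER.
    // Hard-coded so the sketch compiles without Hadoop on the classpath.
    static final int KILLED_BY_CONTAINER_SCHEDULER = -108;
    static final int SUCCESS = 0;

    /** A container counts as failed only if it exited abnormally on its own. */
    static boolean countsAsFailure(int exitStatus) {
        return exitStatus != SUCCESS
            && exitStatus != KILLED_BY_CONTAINER_SCHEDULER;
    }

    /** Counts the completed containers that should fail the application. */
    static long countFailures(List<Integer> exitStatuses) {
        return exitStatuses.stream()
            .filter(OpportunisticFailureSketch::countsAsFailure)
            .count();
    }

    public static void main(String[] args) {
        // Two successful containers and one killed by the container scheduler.
        List<Integer> statuses = List.of(0, 0, KILLED_BY_CONTAINER_SCHEDULER);
        System.out.println("failed = " + countFailures(statuses));
    }
}
```

Under these assumptions, a container removed by the NM container scheduler would presumably be re-requested rather than counted toward the failed total.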

 



