[ 
https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488563#comment-16488563
 ] 

Hudson commented on YARN-8346:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14279 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14279/])
YARN-8346. Upgrading to 3.1 kills running containers with error 
(rohithsharmaks: rev 4cc0c9b0baa93f5a1c0623eee353874e858a7caa)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/security/TestYARNTokenIdentifier.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/ContainerTokenIdentifier.java


> Upgrading to 3.1 kills running containers with error "Opportunistic container 
> queue is full"
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-8346
>                 URL: https://issues.apache.org/jira/browse/YARN-8346
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.1.0, 3.0.2
>            Reporter: Rohith Sharma K S
>            Assignee: Jason Lowe
>            Priority: Blocker
>             Fix For: 3.1.0, 2.10.0, 3.2.0, 3.0.3
>
>         Attachments: YARN-8346.001.patch
>
>
> During a rolling upgrade from 2.8.4 to the 3.1 release, all the 
> running containers are killed and a second attempt is launched for the 
> application. The diagnostics message gives "Opportunistic container queue is 
> full" as the reason the containers were killed. 
> In the NM log, I see the following entry after a container is recovered.
> {noformat}
> 2018-05-23 17:18:50,655 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler:
>  Opportunistic container [container_e06_1527075664705_0001_01_000001] will 
> not be queued at the NMsince max queue length [0] has been reached
> {noformat}
> The following steps were executed for the rolling upgrade:
> # Install a 2.8.4 cluster and launch an MR job with distributed cache enabled.
> # Stop the 2.8.4 RM. Start the 3.1.0 RM with the same configuration.
> # Stop the 2.8.4 NMs batch by batch. Start the 3.1.0 NMs batch by batch. 
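
To check whether a NodeManager hit this symptom after the upgrade, its log can be scanned for the message quoted above. The following is a minimal sketch, not part of the patch: the class name NmLogScan, the Rejection holder, and the parse method are hypothetical, and the pattern is derived only from the single log entry in this report (including the missing space in "NMsince"), so other releases may word the message differently.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NmLogScan {

    /** Hypothetical holder for the two values carried by the log message. */
    public static final class Rejection {
        public final String containerId;
        public final int maxQueueLength;

        Rejection(String containerId, int maxQueueLength) {
            this.containerId = containerId;
            this.maxQueueLength = maxQueueLength;
        }
    }

    // Pattern taken from the one log entry quoted in the report. "NM ?since"
    // tolerates both the observed "NMsince" and a correctly spaced variant.
    private static final Pattern REJECTED = Pattern.compile(
        "Opportunistic container \\[(container_[^\\]]+)\\] will not be queued "
        + "at the NM ?since max queue length \\[(\\d+)\\] has been reached");

    /** Returns the rejected container and queue limit, or empty if the line does not match. */
    public static Optional<Rejection> parse(String line) {
        // Collapse runs of whitespace so a wrapped or padded line still matches.
        String normalized = line.replaceAll("\\s+", " ");
        Matcher m = REJECTED.matcher(normalized);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Rejection(m.group(1), Integer.parseInt(m.group(2))));
    }

    public static void main(String[] args) {
        String line = "2018-05-23 17:18:50,655 INFO "
            + "org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: "
            + "Opportunistic container [container_e06_1527075664705_0001_01_000001] will "
            + "not be queued at the NMsince max queue length [0] has been reached";
        parse(line).ifPresent(r ->
            System.out.println(r.containerId + " maxQueueLength=" + r.maxQueueLength));
    }
}
```

A match with a max queue length of 0 on a recovered container corresponds to the symptom described in this issue.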



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
