[
https://issues.apache.org/jira/browse/YARN-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jian He updated YARN-1783:
--------------------------
Attachment: YARN-1783.3.patch
Thanks for catching this!
The new patch records the previously completed containers in a separate
collection when getNodeStatus is called, and removes from the context only
those completed containers.
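
As a rough illustration of that approach (this is not the patch itself): the sketch below assumes a simplified node status updater in which the context is just a map from container ID to state. Every name other than getNodeStatus is hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a node status updater.
class NodeStatusSketch {
    enum State { RUNNING, COMPLETE }

    // Stand-in for the context's container map (assumption: ID -> state).
    private final Map<String, State> context = new ConcurrentHashMap<>();

    // Separate collection: completed containers seen by the last getNodeStatus call.
    private final Set<String> previousCompletedContainers = new HashSet<>();

    void launch(String id) { context.put(id, State.RUNNING); }

    void complete(String id) { context.replace(id, State.COMPLETE); }

    // Builds the status report and remembers which completed containers it included.
    synchronized List<String> getNodeStatus() {
        List<String> report = new ArrayList<>();
        previousCompletedContainers.clear();
        for (Map.Entry<String, State> e : context.entrySet()) {
            report.add(e.getKey() + ":" + e.getValue());
            if (e.getValue() == State.COMPLETE) {
                previousCompletedContainers.add(e.getKey());
            }
        }
        return report;
    }

    // Removes only the containers that were reported as completed, so a
    // container that finished after the report was built is not dropped.
    synchronized void removeReportedCompletedContainers() {
        for (String id : previousCompletedContainers) {
            context.remove(id);
        }
        previousCompletedContainers.clear();
    }

    boolean isTracked(String id) { return context.containsKey(id); }
}
```

In this sketch, a container that completes between getNodeStatus and the removal step stays in the context and shows up as completed in the next report, instead of being removed without ever having been reported.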
> yarn application does not make any progress even when no other application is
> running when RM is being restarted in the background
> ----------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-1783
> URL: https://issues.apache.org/jira/browse/YARN-1783
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.4.0
> Reporter: Arpit Gupta
> Assignee: Jian He
> Priority: Critical
> Attachments: YARN-1783.1.patch, YARN-1783.2.patch, YARN-1783.3.patch
>
>
> Noticed that during HA tests, some tests took over 3 hours to run when the
> test failed.
> Looking at the logs, I see the application made no progress for a very long
> time. However, if I look at the application log from YARN, it actually ran in
> 5 minutes.
> I am seeing the same behavior when the RM was being restarted in the
> background and when both the RM and AM were being restarted. This does not
> happen for all applications, but a few will hit this in the nightly run.