Rohith Sharma K S commented on YARN-3535:

[~peng.zhang] I rebased the patch to trunk and added an FT test. The test 
simulates the reported scenario and fails with a timeout if this fix is not 
present. With the fix, the test passes. 
About your previous patch, I have one question: why is the method call below 
removed in both FS and CS? Is there a specific reason?
-    recoverResourceRequestForContainer(cont);

>  ResourceRequest should be restored back to scheduler when RMContainer is 
> killed at ALLOCATED
> ---------------------------------------------------------------------------------------------
>                 Key: YARN-3535
>                 URL: https://issues.apache.org/jira/browse/YARN-3535
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Peng Zhang
>            Assignee: Peng Zhang
>            Priority: Critical
>         Attachments: 0003-YARN-3535.patch, YARN-3535-001.patch, 
> YARN-3535-002.patch, syslog.tgz, yarn-app.log
> During a rolling update of the NM, the AM's start of a container on the NM 
> failed, and the job then hung.
> AM logs are attached.

This message was sent by Atlassian JIRA
