[
https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuqi Wang updated YARN-2402:
----------------------------
Attachment: YARN-2402-v2.patch
Corrected the coding style.
The issue is that a recovered container always exits with a failed status
because the NM cannot find its exitCodeFile and therefore cannot get the
actual exit code.
I have tested the patch by getting the exit code from a container that was
recovered and then failed, and from one that was recovered and then succeeded.
I also checked and found no existing unit test for getting the exit code from
the exitCodeFile on Unix or for getting the pid from the pidFile on Windows;
perhaps the script is simple enough that testing it was considered trivial.
If a unit test is needed, I can add it afterwards. :)
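For illustration only, the failure mode described above can be sketched
roughly as follows. This is not the NodeManager code and not the patch: the
function name, the sentinel value, and the one-integer file format are all
assumptions made for the example. It only shows why a missing exitCodeFile
makes a container that actually succeeded look like a failed one.

```python
# Hypothetical sentinel for "exit code could not be determined";
# the real NodeManager defines its own constants.
EXIT_CODE_UNKNOWN = -1


def read_recovered_exit_code(exit_code_file):
    """Return the exit code stored in exit_code_file.

    Assumes the file holds a single integer. If the file is missing or
    unparsable, the caller only sees EXIT_CODE_UNKNOWN, which is
    indistinguishable from a real failure.
    """
    try:
        with open(exit_code_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return EXIT_CODE_UNKNOWN
```

With this sketch, a file containing 0 is reported as success, while a path
that does not exist yields the sentinel value no matter how the container
actually ended.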
> NM restart: Container recovery for Windows
> ------------------------------------------
>
> Key: YARN-2402
> URL: https://issues.apache.org/jira/browse/YARN-2402
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.6.0
> Reporter: Jason Lowe
> Attachments: YARN-2402-v1.patch, YARN-2402-v2.patch
>
>
> We should add container recovery for NM restart on Windows.