Jason Lowe created YARN-1337:
--------------------------------
Summary: Recover active container state upon nodemanager restart
Key: YARN-1337
URL: https://issues.apache.org/jira/browse/YARN-1337
Project: Hadoop YARN
Issue Type: Sub-task
Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
To support work-preserving NM restart, we need to recover the state of the
containers that were active when the nodemanager went down. This includes
informing the RM of containers that exited in the interim, a strategy for
handling the exit codes of those containers, and a way to reacquire the
containers that are still active and determine their exit codes when they
terminate.
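
As a rough illustration of the cases described above, the sketch below sorts
recovered containers into three groups: still running (reacquire), exited
with a known exit code (report to the RM), and exited with no recoverable
exit code (report as lost). This is only one possible shape for the logic,
not a proposed design. Every name in it (ContainerRecoverySketch,
SavedContainer, Action, classify, recover) is hypothetical and is not an
existing YARN class or method, and it assumes the NM persisted a
per-container record before going down.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical sketch only: none of these types are real YARN classes.
class ContainerRecoverySketch {

    /** What the restarted NM would do with one recovered container. */
    enum Action { REACQUIRE, REPORT_COMPLETED, REPORT_LOST }

    /** Assumed per-container record that the NM persisted before going down. */
    static final class SavedContainer {
        final String id;
        final boolean processAlive;      // is the process still running after the restart?
        final OptionalInt savedExitCode; // exit code found on disk, if any

        SavedContainer(String id, boolean processAlive, OptionalInt savedExitCode) {
            this.id = id;
            this.processAlive = processAlive;
            this.savedExitCode = savedExitCode;
        }
    }

    /** Decide how to handle a single recovered container. */
    static Action classify(SavedContainer c) {
        if (c.processAlive) {
            return Action.REACQUIRE;       // keep monitoring until it terminates
        }
        return c.savedExitCode.isPresent()
                ? Action.REPORT_COMPLETED  // exited in the interim, exit code known
                : Action.REPORT_LOST;      // exited in the interim, exit code unknown
    }

    /** Build a recovery plan keyed by container id, preserving input order. */
    static Map<String, Action> recover(List<SavedContainer> saved) {
        Map<String, Action> plan = new LinkedHashMap<>();
        for (SavedContainer c : saved) {
            plan.put(c.id, classify(c));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<SavedContainer> saved = Arrays.asList(
                new SavedContainer("container_1", true, OptionalInt.empty()),
                new SavedContainer("container_2", false, OptionalInt.of(0)),
                new SavedContainer("container_3", false, OptionalInt.empty()));
        System.out.println(recover(saved));
    }
}
```

How the "lost" case should be reported to the RM is exactly the exit-code
strategy question this issue raises; the sketch only marks where that
decision would be made.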
--
This message was sent by Atlassian JIRA
(v6.1#6144)