Container re-use is tracked in a separate JIRA, which has no code behind it yet: https://issues.apache.org/jira/browse/YARN-1040

All that YARN-1336 gives you is that on an NM restart the containers stay up and the NM reconnects to them.

This actually forces the Slider code to add some more logic to handle the situation "NM goes down and stays down, the container failure report triggers a new container allocation, but the existing container stays up and keeps heartbeating to our AM."

We handle this by recognising an unknown container checking in, and sending a message to its Python agent saying "you are no longer live, kill yourself and your processes."

On 17 December 2014 at 09:57, Li Shengmei wrote:

> Hi,
>
> I want to ask some questions about YARN-1336. As we know, we can
> recover container after NM Restart as YARN-1336 described.
>
> I want to persist the container after the container finished after one
> iteration not after NM restart.
>
> I want to persist the container and the immediate values after the
> container finished, and reuse the container and immediate values in the
> future, may be next iteration run. Can I use the implementation of
> YARN-1336? Does anyone give some hints?
>
> My understand is that the immediate values are stored in proto.
> Right? And maybe I need to add another status of container?
>
> Thanks a lot.
>
> May
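
For illustration, the unknown-container handling described in the reply can be sketched roughly as follows. This is not the Slider code: the class, the method names, and the CONTINUE/TERMINATE command strings are all hypothetical, and the real AM/agent protocol carries far more state. It only shows the idea that a heartbeat from a container the AM no longer tracks is answered with a "shut yourself down" command.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical command names, not Slider's real wire protocol.
CONTINUE = "CONTINUE"
TERMINATE = "TERMINATE"


@dataclass
class AppMasterSketch:
    """Toy model of an AM that tracks which containers it considers live."""

    live_containers: Set[str] = field(default_factory=set)

    def on_container_allocated(self, container_id: str) -> None:
        # A newly allocated container becomes part of the live set.
        self.live_containers.add(container_id)

    def on_container_failed(self, container_id: str) -> None:
        # A failure report (e.g. the NM went down and stayed down)
        # removes the container from the live set, even though the
        # process itself may still be running on the node.
        self.live_containers.discard(container_id)

    def on_agent_heartbeat(self, container_id: str) -> str:
        # Known containers carry on; unknown ones are told to kill
        # themselves and their child processes.
        if container_id in self.live_containers:
            return CONTINUE
        return TERMINATE


am = AppMasterSketch()
am.on_container_allocated("container_01")
am.on_container_failed("container_01")     # NM down, failure reported
am.on_container_allocated("container_02")  # replacement container
stale_reply = am.on_agent_heartbeat("container_01")
fresh_reply = am.on_agent_heartbeat("container_02")
```

Here the stale container gets the terminate command while its replacement carries on as normal.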
