Re: Question about YARN NodeManager and ApplicationMaster failures

Dustin Cote Thu, 03 Mar 2016 04:59:52 -0800

-dev since this is more of a user question

The NodeManager is the parent process of every container on its node, so any
containers (including application master containers) running on the node
whose NodeManager failed will die.  If an application master fails, a new
attempt is started, up to your limit (set by
yarn.resourcemanager.am.max-attempts).  The other containers associated
with the application are supposed to continue running and reconnect to the
newly started application master, provided the application asked to keep
its containers across attempts when it was submitted.  The ResourceManager
takes care of the bookkeeping needed to make this happen.  I'd suggest you
have a look at the series of blog posts starting at
<http://blog.cloudera.com/blog/2015/09/untangling-apache-hadoop-yarn-part-1/>
for a more in-depth look at the mechanics.
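
For reference, the attempt limit mentioned above lives in yarn-site.xml on
the ResourceManager.  A minimal sketch, assuming you want to allow four
attempts (the value 4 is only an illustration; the shipped default is 2):

```xml
<!-- yarn-site.xml (ResourceManager side) -->
<property>
  <!-- Global upper bound on application master attempts per application -->
  <name>yarn.resourcemanager.am.max-attempts</name>
  <!-- Illustrative value only; the default is 2 -->
  <value>4</value>
</property>
```

This is a cluster-wide ceiling: an individual application can ask for fewer
attempts, but not for more than this value.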


-Dustin

On Wed, Mar 2, 2016 at 8:26 PM, Sadystio Ilmatunt <[email protected]>
wrote:

> Hello,
>
> I have some questions regarding failure of the NodeManager and the
> Application Master.
> What happens if the NodeManager running on the same node as the
> Application Master fails?
> Does the Application Master fail as well?
>
> Also, how is Application Master failure handled with respect to its
> (child) containers?
> Do these containers fail too?
> If yes, is there a way these containers can be assigned to a new
> instance of the Application Master that might come up on some other node?
>



-- 
Dustin Cote
Customer Operations Engineer
<http://www.cloudera.com>
