[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jian He updated YARN-713:
-------------------------
Attachment: YARN-713.3.patch
New patch:
- Move NMToken creation to the same place where the container token is
created.
- Catch the exception and return an empty allocation for that container if
either container token or NMToken creation fails because DNS is unavailable.
- Add a new field, nmTokens, to Allocation.java.
- Change AMContainerAllocatedTransition to retry if the AM container cannot be
fetched because token creation failed.
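To illustrate the approach in the list above, here is a minimal, self-contained
sketch. It is not code from the patch: TokenAllocationSketch, allocate,
allocateWithRetry, and the tokenFactory parameter are hypothetical names, and
the sketch assumes a DNS failure surfaces as an IllegalArgumentException. The
bounded retry loop is a simplification of retrying from a state-machine
transition.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; none of these types are the real YARN classes.
class TokenAllocationSketch {

    // Simplified stand-in for an allocation that carries both kinds of token.
    static final class Allocation {
        final List<String> containerTokens;
        final List<String> nmTokens;

        Allocation(List<String> containerTokens, List<String> nmTokens) {
            this.containerTokens = containerTokens;
            this.nmTokens = nmTokens;
        }

        boolean isEmpty() {
            return containerTokens.isEmpty() && nmTokens.isEmpty();
        }
    }

    // Creates both tokens in one place. If either creation fails (assumed here
    // to throw IllegalArgumentException when the host cannot be resolved), an
    // empty allocation is returned instead of letting the exception escape.
    static Allocation allocate(String nodeHost, Function<String, String> tokenFactory) {
        try {
            String containerToken = tokenFactory.apply("container@" + nodeHost);
            String nmToken = tokenFactory.apply("nm@" + nodeHost);
            return new Allocation(
                Collections.singletonList(containerToken),
                Collections.singletonList(nmToken));
        } catch (IllegalArgumentException e) {
            return new Allocation(
                Collections.<String>emptyList(),
                Collections.<String>emptyList());
        }
    }

    // Calls allocate up to maxAttempts times, stopping at the first non-empty
    // result. The caller sees an empty allocation if every attempt fails.
    static Allocation allocateWithRetry(String nodeHost,
                                        Function<String, String> tokenFactory,
                                        int maxAttempts) {
        Allocation result = allocate(nodeHost, tokenFactory);
        for (int attempt = 1; attempt < maxAttempts && result.isEmpty(); attempt++) {
            result = allocate(nodeHost, tokenFactory);
        }
        return result;
    }
}
```

With a token factory that throws on its first two calls, allocateWithRetry
with maxAttempts set to 3 returns a populated allocation on the third attempt,
while a factory that always throws yields an empty allocation rather than an
uncaught exception.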
> ResourceManager can exit unexpectedly if DNS is unavailable
> -----------------------------------------------------------
>
> Key: YARN-713
> URL: https://issues.apache.org/jira/browse/YARN-713
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.1.0-beta
> Reporter: Jason Lowe
> Assignee: Jian He
> Priority: Critical
> Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch,
> YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch,
> YARN-713.3.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch,
> YARN-713.patch
>
>
> As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could
> lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and
> that ultimately would cause the RM to exit. The RM should not exit during
> DNS hiccups.