[ https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703820#comment-13703820 ]
Omkar Vinit Joshi commented on YARN-299:
----------------------------------------
My bad, I missed it. Canceling the patch. I think we need to remove the
LOCALIZED state transition, but let [~vinodkv] confirm. As Vinod pointed out,
we can't get the LOCALIZED / FAILED events in the DONE state at all: we have a
single dispatcher thread, so all the events generated from LocalizedResource
will be processed before the Container receives the
"ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP" event in the
"ContainerState.LOCALIZATION_FAILED" state.
However, there is one situation in the PRIVATE/APPLICATION resource localizer
in which we do notify the container of a failed localization: when the
localizer thread (for that container) fails for some reason. See the code
below in ResourceLocalizationService.java:
{code}
dispatcher.getEventHandler().handle(
new ContainerResourceFailedEvent(cId, null, e.getMessage()));
{code}
This is what happened in YARN-820 too, as is evident from the logs. Sorry for
missing it.
In summary, we will have to keep the FAILED transition but remove the LOCALIZED
transition.
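A minimal sketch of that end state, written as a toy transition table rather than the real ContainerImpl state machine. The class name, the trimmed-down enums, and the doTransition helper are hypothetical and exist only for illustration. The LOCALIZATION_FAILED to DONE step on CONTAINER_RESOURCES_CLEANEDUP is assumed from the discussion above.

```java
import java.util.EnumMap;
import java.util.Map;

/** Toy model of the proposal; NOT the actual ContainerImpl code. */
class DoneStateSketch {
    // Hypothetical, trimmed-down versions of the container states and events.
    enum State { LOCALIZATION_FAILED, DONE }
    enum Event { RESOURCE_LOCALIZED, RESOURCE_FAILED, CONTAINER_RESOURCES_CLEANEDUP }

    // Allowed transitions: current state -> (event -> next state).
    static final Map<State, Map<Event, State>> TRANSITIONS = new EnumMap<>(State.class);

    static {
        Map<Event, State> localizationFailed = new EnumMap<>(Event.class);
        // Assumption: cleanup completion moves the container to DONE.
        localizationFailed.put(Event.CONTAINER_RESOURCES_CLEANEDUP, State.DONE);
        TRANSITIONS.put(State.LOCALIZATION_FAILED, localizationFailed);

        Map<Event, State> done = new EnumMap<>(Event.class);
        // Kept: a failing localizer thread can still report RESOURCE_FAILED late.
        done.put(Event.RESOURCE_FAILED, State.DONE);
        // Deliberately no RESOURCE_LOCALIZED entry: that transition is removed.
        TRANSITIONS.put(State.DONE, done);
    }

    /** Returns the next state, or throws if the event is not registered for the state. */
    static State doTransition(State current, Event event) {
        State next = TRANSITIONS.get(current).get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        return next;
    }
}
```

With this table, RESOURCE_FAILED at DONE is absorbed quietly, while RESOURCE_LOCALIZED at DONE is still rejected, so a late LOCALIZED event would show up as a bug instead of being masked.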
> Node Manager throws
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event:
> RESOURCE_FAILED at DONE
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-299
> URL: https://issues.apache.org/jira/browse/YARN-299
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.0.1-alpha, 2.0.0-alpha
> Reporter: Devaraj K
> Assignee: Mayank Bansal
> Attachments: YARN-299-trunk-1.patch, YARN-299-trunk-2.patch
>
>
> {code:xml}
> 2012-12-31 10:36:27,844 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Can't handle this event at current state: Current: [DONE], eventType: [RESOURCE_FAILED]
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:819)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:71)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:504)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:497)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
>     at java.lang.Thread.run(Thread.java:662)
> 2012-12-31 10:36:27,845 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1356792558130_0002_01_000001 transitioned from DONE to null
> {code}