[ https://issues.apache.org/jira/browse/YARN-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nathan Roberts updated YARN-212:
--------------------------------
Attachment: yarn-212.txt
Fixed a timing issue in TestLogAggregationService.
> NM state machine ignores an APPLICATION_CONTAINER_FINISHED event when it
> shouldn't
> ----------------------------------------------------------------------------------
>
> Key: YARN-212
> URL: https://issues.apache.org/jira/browse/YARN-212
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 0.23.4, 2.0.1-alpha
> Reporter: Nathan Roberts
> Assignee: Nathan Roberts
> Priority: Blocker
> Attachments: yarn-212.txt, yarn-212.txt
>
>
> The NM state machines can hit the following two invalid state transitions
> when a speculative attempt is killed shortly after it gets started. When this
> happens, the NM keeps the log aggregation context open for this application
> and therefore chews up FDs and leases on the NN, eventually running the NN
> out of FDs and bringing down the entire cluster.
> 2012-11-07 05:36:33,774 [AsyncDispatcher event handler] WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event:
> APPLICATION_CONTAINER_FINISHED at INITING
> 2012-11-07 05:36:33,775 [AsyncDispatcher event handler] WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Can't handle this event at current state: Current: [DONE], eventType:
> [INIT_CONTAINER]
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event:
> INIT_CONTAINER at DONE
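
To illustrate why the first warning above appears, here is a minimal, self-contained sketch of a table-driven state machine that rejects any event with no registered transition for the current state. It is not the NodeManager's actual implementation: the class name AppStateSketch, the reduced set of states and events, and the use of IllegalStateException in place of Hadoop's InvalidStateTransitonException are all assumptions made for illustration. Only the INITING state and the APPLICATION_CONTAINER_FINISHED event are taken from the log.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified model; not the real NodeManager Application state machine.
class AppStateSketch {
    // Only INITING appears in the log above; the other states are assumed.
    enum State { NEW, INITING, RUNNING, FINISHED }

    // Only APPLICATION_CONTAINER_FINISHED appears in the log; the rest are assumed.
    enum Event { INIT_APPLICATION, APPLICATION_INITED, APPLICATION_CONTAINER_FINISHED, FINISH_APPLICATION }

    // Transition table: current state -> (event -> next state).
    private static final Map<State, Map<Event, State>> TABLE = new EnumMap<>(State.class);

    static void add(State from, Event on, State to) {
        TABLE.computeIfAbsent(from, s -> new EnumMap<>(Event.class)).put(on, to);
    }

    static {
        add(State.NEW, Event.INIT_APPLICATION, State.INITING);
        add(State.INITING, Event.APPLICATION_INITED, State.RUNNING);
        add(State.RUNNING, Event.APPLICATION_CONTAINER_FINISHED, State.RUNNING);
        add(State.RUNNING, Event.FINISH_APPLICATION, State.FINISHED);
        // Deliberately no entry for (INITING, APPLICATION_CONTAINER_FINISHED):
        // this models the gap that the warning in the log points at.
    }

    // Returns the next state, or throws if the event is not registered for the current state.
    static State transition(State current, Event event) {
        Map<Event, State> row = TABLE.get(current);
        if (row == null || !row.containsKey(event)) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        return row.get(event);
    }

    public static void main(String[] args) {
        State s = transition(State.NEW, Event.INIT_APPLICATION);
        try {
            // A container that is killed before initialization completes.
            s = transition(s, Event.APPLICATION_CONTAINER_FINISHED);
        } catch (IllegalStateException e) {
            // The event is dropped here, so any cleanup tied to it never runs.
            System.out.println("WARN Can't handle this event at current state: " + e.getMessage());
        }
        System.out.println("state = " + s);
    }
}
```

In this sketch the rejected event is only logged and then dropped, so the state is unchanged afterwards. That mirrors the symptom described in the report, where the ignored event leaves the log aggregation context open.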