[ https://issues.apache.org/jira/browse/YARN-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nathan Roberts updated YARN-212:
--------------------------------
    Attachment: yarn-212.txt

Fixed timing issue in TestLogAggregationService

> NM state machine ignores an APPLICATION_CONTAINER_FINISHED event when it shouldn't
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-212
>                 URL: https://issues.apache.org/jira/browse/YARN-212
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 0.23.4, 2.0.1-alpha
>            Reporter: Nathan Roberts
>            Assignee: Nathan Roberts
>            Priority: Blocker
>         Attachments: yarn-212.txt, yarn-212.txt
>
>
> The NM state machines can make the following two invalid state transitions
> when a speculative attempt is killed shortly after it gets started. When this
> happens the NM keeps the log aggregation context open for this application
> and therefore chews up FDs and leases on the NN, eventually running the NN
> out of FDs and bringing down the entire cluster.
>
> 2012-11-07 05:36:33,774 [AsyncDispatcher event handler] WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: APPLICATION_CONTAINER_FINISHED at INITING
>
> 2012-11-07 05:36:33,775 [AsyncDispatcher event handler] WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Can't handle this event at current state: Current: [DONE], eventType: [INIT_CONTAINER]
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: INIT_CONTAINER at DONE
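
For illustration only, here is a minimal Java sketch of how a table-driven state machine ends up rejecting an event such as APPLICATION_CONTAINER_FINISHED at INITING. Only those two names are taken from the report. The class AppStateMachineSketch, its methods, and the other states and events are hypothetical; this is neither Hadoop's state machine API nor the attached patch.

```java
import java.util.EnumMap;
import java.util.Map;

// Toy table-driven state machine (hypothetical; NOT Hadoop's implementation).
class AppStateMachineSketch {
    // Only INITING and APPLICATION_CONTAINER_FINISHED come from the report;
    // the remaining names are simplified assumptions.
    enum State { NEW, INITING, RUNNING }
    enum Event { INIT_APPLICATION, APPLICATION_INITED, APPLICATION_CONTAINER_FINISHED }

    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.NEW;

    // Register a transition: in state `from`, event `on` moves to state `to`.
    AppStateMachineSketch addTransition(State from, Event on, State to) {
        table.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class)).put(on, to);
        return this;
    }

    State getState() {
        return current;
    }

    // Returns true if the event was handled; false if no transition is
    // registered for (current state, event). The state is left unchanged then.
    boolean handle(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            System.err.println("Invalid event: " + event + " at " + current);
            return false;
        }
        current = next;
        return true;
    }

    // A table with no entry for APPLICATION_CONTAINER_FINISHED at INITING.
    static AppStateMachineSketch basic() {
        return new AppStateMachineSketch()
            .addTransition(State.NEW, Event.INIT_APPLICATION, State.INITING)
            .addTransition(State.INITING, Event.APPLICATION_INITED, State.RUNNING)
            .addTransition(State.RUNNING, Event.APPLICATION_CONTAINER_FINISHED, State.RUNNING);
    }
}
```

In this sketch, a container-finished event that arrives while the application is still INITING is dropped, so any cleanup tied to it never runs. Registering a transition for that pair (for example a self-transition on INITING) is one way such a machine could accept the event; whether the actual fix does this is not stated in the message.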