[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Lowe updated YARN-4051:
-----------------------------
    Summary: ContainerKillEvent lost when container is still recovering and application finishes  (was: ContainerKillEvent is lost when container is In New State and is recovering)

Thanks for updating the patch!  I'm OK with fixing the lost kill-from-AM event in a separate JIRA, but I adjusted the headline of this one to avoid confusion.

Should we use NMNotYetReadyException in the case where the AM tries to kill a container still recovering?  We already throw it in similar situations where the NM isn't ready to handle the request.

Nits:
- ",because " should be " because "
- ContainerImpl#isRecovering should check recoveredStatus before container state, since recoveredStatus is the cheaper check and is likely to avoid a subsequent state check and the corresponding lock acquisition.

> ContainerKillEvent lost when container is still recovering and application finishes
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-4051
>                 URL: https://issues.apache.org/jira/browse/YARN-4051
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: sandflee
>            Assignee: sandflee
>            Priority: Critical
>         Attachments: YARN-4051.01.patch, YARN-4051.02.patch, YARN-4051.03.patch, YARN-4051.04.patch, YARN-4051.05.patch, YARN-4051.06.patch, YARN-4051.07.patch
>
>
> As in YARN-4050, the NM event dispatcher is blocked and the container is in the New
> state. When we finish the application, the container is still alive even after the NM
> event dispatcher is unblocked.
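
To illustrate the ContainerImpl#isRecovering nit from the review comment, here is a minimal standalone sketch of the suggested check ordering. It is not the code from any YARN-4051 patch. The class RecoveringContainerSketch, its two enums and their values, and the lock-acquisition counter are all hypothetical stand-ins, and the sketch assumes that REQUESTED means the container was not restored from a previous NM run.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for ContainerImpl; NOT the real NodeManager class.
class RecoveringContainerSketch {
    // Assumed, simplified value sets; the real enums may differ.
    enum RecoveredStatus { REQUESTED, LAUNCHED, COMPLETED }
    enum State { NEW, RUNNING, DONE }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Counts read-lock acquisitions, only so the effect of the ordering is visible.
    private final AtomicInteger readLockAcquisitions = new AtomicInteger();
    // Fixed at construction, so reading it needs no lock.
    private final RecoveredStatus recoveredStatus;
    private State state = State.NEW;

    RecoveringContainerSketch(RecoveredStatus recoveredStatus) {
        this.recoveredStatus = recoveredStatus;
    }

    // The more expensive check: reading the state takes the read lock.
    State getContainerState() {
        lock.readLock().lock();
        readLockAcquisitions.incrementAndGet();
        try {
            return state;
        } finally {
            lock.readLock().unlock();
        }
    }

    void setState(State newState) {
        lock.writeLock().lock();
        try {
            state = newState;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Cheap field comparison first: && short-circuits, so the locked state
    // read is skipped whenever the container was never recovered.
    boolean isRecovering() {
        return recoveredStatus != RecoveredStatus.REQUESTED
            && getContainerState() == State.NEW;
    }

    int readLockAcquisitions() {
        return readLockAcquisitions.get();
    }

    public static void main(String[] args) {
        RecoveringContainerSketch fresh =
            new RecoveringContainerSketch(RecoveredStatus.REQUESTED);
        System.out.println("fresh isRecovering: " + fresh.isRecovering());
        System.out.println("fresh lock acquisitions: " + fresh.readLockAcquisitions());
    }
}
```

With this ordering, a container that was never recovered answers isRecovering() without touching the lock; reversing the two operands would take the read lock on every call.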