[
https://issues.apache.org/jira/browse/YARN-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Minni Mittal updated YARN-10822:
--------------------------------
Attachment: YARN-10822.v1.patch
> Containers going from New to Scheduled transition even though container is
> killed before NM restart when NM recovery is enabled
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-10822
> URL: https://issues.apache.org/jira/browse/YARN-10822
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Minni Mittal
> Assignee: Minni Mittal
> Priority: Major
> Attachments: YARN-10822.v1.patch
>
>
> INFO [91] ContainerImpl: Container
> container_e1171_1623422468672_2229_01_000738 transitioned from NEW to
> LOCALIZING
> INFO [91] ContainerImpl: Container
> container_e1171_1623422468672_2229_01_000738 transitioned from LOCALIZING to
> SCHEDULED
> INFO [91] ContainerScheduler: Opportunistic container
> container_e1171_1623422468672_2229_01_000738 will be queued at the NM.
> INFO [127] ContainerManagerImpl: Stopping container with container Id:
> container_e1171_1623422468672_2229_01_000738
> INFO [91] ContainerImpl: Container
> container_e1171_1623422468672_2229_01_000738 transitioned from SCHEDULED to
> KILLING
> INFO [91] ContainerImpl: Container
> container_e1171_1623422468672_2229_01_000738 transitioned from KILLING to
> CONTAINER_CLEANEDUP_AFTER_KILL
> INFO [91] NMAuditLogger: USER=defaultcafor1stparty OPERATION=Container
> Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS
> APPID=application_1623422468672_2229
> CONTAINERID=container_e1171_1623422468672_2229_01_000738
> INFO [91] ApplicationImpl: Removing
> container_e1171_1623422468672_2229_01_000738 from application
> application_1623422468672_2229
> INFO [91] ContainersMonitorImpl: Stopping resource-monitoring for
> container_e1171_1623422468672_2229_01_000738
> INFO [163] NodeStatusUpdaterImpl: Removed completed containers from NM
> context:[container_e1171_1623422468672_2229_01_000738]
> The NM then restarted and attempted recovery:
>
> INFO [1] ContainerManagerImpl: Recovering
> container_e1171_1623422468672_2229_01_000738 in state QUEUED with exit code
> -1000
> INFO [1] ApplicationImpl: Adding
> container_e1171_1623422468672_2229_01_000738 to application
> application_1623422468672_2229
> INFO [89] ContainerImpl: Container
> container_e1171_1623422468672_2229_01_000738 transitioned from NEW to
> SCHEDULED
> INFO [89] ContainerImpl: Container
> container_e1171_1623422468672_2229_01_000738 transitioned from SCHEDULED to
> KILLING
> INFO [89] ContainerImpl: Container
> container_e1171_1623422468672_2229_01_000738 transitioned from KILLING to
> CONTAINER_CLEANEDUP_AFTER_KILL
> Ideally, when a container was killed before the restart, recovery should
> finish the container immediately rather than move it from NEW to SCHEDULED.
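
The expected behavior can be illustrated with a small, self-contained sketch. It is not the attached patch and does not use Hadoop classes: the State enum, RecoveredContainer, and the finishKilledImmediately flag are hypothetical stand-ins, and the sketch assumes that the kill was recorded somewhere recovery can read it. It only contrasts the state path seen in the log with the path the report asks for.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these types are Hadoop APIs.
final class RecoverySketch {

    // Simplified stand-ins for the container states named in the log.
    enum State { NEW, SCHEDULED, KILLING, CONTAINER_CLEANEDUP_AFTER_KILL, DONE }

    // Hypothetical view of what was recovered for one container.
    static final class RecoveredContainer {
        final String id;
        final boolean queued; // recovered in state QUEUED
        final boolean killed; // a kill was recorded before the restart (assumption)

        RecoveredContainer(String id, boolean queued, boolean killed) {
            this.id = id;
            this.queued = queued;
            this.killed = killed;
        }
    }

    // Returns the sequence of states the container passes through on recovery.
    static List<State> recover(RecoveredContainer c, boolean finishKilledImmediately) {
        List<State> path = new ArrayList<>();
        path.add(State.NEW);
        if (c.killed && finishKilledImmediately) {
            // Requested behavior: finish at once, never re-enter SCHEDULED.
            path.add(State.DONE);
            return path;
        }
        if (c.queued) {
            path.add(State.SCHEDULED);
        }
        if (c.killed) {
            // Reported behavior: the kill is replayed after rescheduling.
            path.add(State.KILLING);
            path.add(State.CONTAINER_CLEANEDUP_AFTER_KILL);
        }
        return path;
    }

    public static void main(String[] args) {
        RecoveredContainer c = new RecoveredContainer(
                "container_e1171_1623422468672_2229_01_000738", true, true);
        System.out.println("reported:  " + recover(c, false));
        System.out.println("requested: " + recover(c, true));
    }
}
```

With the flag off, the modeled path matches the recovery log above (NEW, SCHEDULED, KILLING, CONTAINER_CLEANEDUP_AFTER_KILL). With it on, the container leaves NEW directly for a terminal state.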