[
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308259#comment-15308259
]
Varun Saxena edited comment on YARN-5156 at 5/31/16 6:13 PM:
-------------------------------------------------------------
Looked at the code. NMTimelinePublisher publishes the YARN_CONTAINER_FINISHED
event on ApplicationContainerFinishedEvent, and this event is posted from
ContainerImpl.
The issue here seems to be that we clone the container status and post an
ApplicationContainerFinishedEvent before the transition has completed and the
container state has been set to DONE. This means the container state is
reported as RUNNING. ContainerImpl#sendFinishedEvents, which posts the
ApplicationContainerFinishedEvent, is called from all the transitions that
lead to the state being changed to DONE. So in
NMTimelinePublisher#publishContainerFinishedEvent we can simply set
STATE_EVENT_INFO to DONE.
Or, since we know that a container finished event always leads to a state of
DONE, there is no need to send STATE_EVENT_INFO at all. Thoughts?
{code:title=ContainerImpl.java|borderStyle=solid}
  @SuppressWarnings("unchecked")
  private void sendFinishedEvents() {
    // Inform the application
    @SuppressWarnings("rawtypes")
    EventHandler eventHandler = dispatcher.getEventHandler();
    ContainerStatus containerStatus = cloneAndGetContainerStatus();
    eventHandler.handle(new ApplicationContainerFinishedEvent(containerStatus));
    // Remove the container from the resource-monitor
    eventHandler.handle(new ContainerStopMonitoringEvent(containerId));
    // Tell the logService too
    eventHandler.handle(new LogHandlerContainerFinishedEvent(
        containerId, exitCode));
  }
{code}
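To illustrate the ordering problem, here is a minimal standalone sketch. It is not the NodeManager code: FinishedEventSketch, Container, StatusSnapshot, and publishedState are hypothetical names made up for this example, and the state machine is reduced to two states. It assumes, as described above, that the status is cloned inside the transition, before the state field is updated, and it shows the first option (the publisher records DONE regardless of what the snapshot says).

```java
import java.util.ArrayList;
import java.util.List;

public class FinishedEventSketch {
  enum State { RUNNING, DONE }

  /** Immutable snapshot, standing in for a cloned container status (hypothetical). */
  static final class StatusSnapshot {
    final State state;
    final int exitCode;

    StatusSnapshot(State state, int exitCode) {
      this.state = state;
      this.exitCode = exitCode;
    }
  }

  /** Hypothetical container whose state changes only after the finish hook has run. */
  static final class Container {
    private State state = State.RUNNING;
    final List<StatusSnapshot> posted = new ArrayList<>();

    void finish(int exitCode) {
      // The snapshot is taken while the transition is still in progress,
      // so it captures the pre-transition state.
      posted.add(new StatusSnapshot(state, exitCode));
      // Only now does the state actually change.
      state = State.DONE;
    }

    State getState() {
      return state;
    }
  }

  /** Option 1 (assumed fix): ignore the snapshot's state and always publish DONE. */
  static String publishedState(StatusSnapshot snapshot) {
    return State.DONE.name();
  }

  public static void main(String[] args) {
    Container container = new Container();
    container.finish(0);
    StatusSnapshot snapshot = container.posted.get(0);
    System.out.println("snapshot state:  " + snapshot.state);
    System.out.println("actual state:    " + container.getState());
    System.out.println("published state: " + publishedState(snapshot));
  }
}
```

The snapshot carries RUNNING even though the container ends up in DONE, which matches what the timeline entity shows. The second option would simply omit the state from the published event info.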
Naga, will you be handling this?
> YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state
> -------------------------------------------------------------------------
>
> Key: YARN-5156
> URL: https://issues.apache.org/jira/browse/YARN-5156
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Li Lu
>
> On container finish, we're reporting YARN_CONTAINER_STATE: "RUNNING". Is
> this by design, or is it a bug?
> {code}
> {
>   metrics: [ ],
>   events: [
>     {
>       id: "YARN_CONTAINER_FINISHED",
>       timestamp: 1464213765890,
>       info: {
>         YARN_CONTAINER_EXIT_STATUS: 0,
>         YARN_CONTAINER_STATE: "RUNNING",
>         YARN_CONTAINER_DIAGNOSTICS_INFO: ""
>       }
>     },
>     {
>       id: "YARN_NM_CONTAINER_LOCALIZATION_FINISHED",
>       timestamp: 1464213761133,
>       info: { }
>     },
>     {
>       id: "YARN_CONTAINER_CREATED",
>       timestamp: 1464213761132,
>       info: { }
>     },
>     {
>       id: "YARN_NM_CONTAINER_LOCALIZATION_STARTED",
>       timestamp: 1464213761132,
>       info: { }
>     }
>   ],
>   id: "container_e15_1464213707405_0001_01_000018",
>   type: "YARN_CONTAINER",
>   createdtime: 1464213761132,
>   info: {
>     YARN_CONTAINER_ALLOCATED_PRIORITY: "20",
>     YARN_CONTAINER_ALLOCATED_VCORE: 1,
>     YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS: "10.22.16.164:0",
>     UID: "yarn_cluster!application_1464213707405_0001!YARN_CONTAINER!container_e15_1464213707405_0001_01_000018",
>     YARN_CONTAINER_ALLOCATED_HOST: "10.22.16.164",
>     YARN_CONTAINER_ALLOCATED_MEMORY: 1024,
>     SYSTEM_INFO_PARENT_ENTITY: {
>       type: "YARN_APPLICATION_ATTEMPT",
>       id: "appattempt_1464213707405_0001_000001"
>     },
>     YARN_CONTAINER_ALLOCATED_PORT: 64694
>   },
>   configs: { },
>   isrelatedto: { },
>   relatesto: { }
> }
> {code}