[ https://issues.apache.org/jira/browse/YARN-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702823#comment-13702823 ]
Omkar Vinit Joshi commented on YARN-592:
----------------------------------------
I just looked at your patch. I need more information to understand it
better.
* Are you assuming that after the NM restarts, an application whose containers
were running on that node manager will again get a new container on the same
node manager? At present the NM doesn't remember across a restart which
applications were running on it. Also, the RM doesn't inform the NM about all
the running applications in the cluster.
* Across an NM restart, an application might still be running, or it might have
finished just before the restart. Do you want to upload the logs in both
scenarios? At present we upload logs only when the application finishes.
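
To make the two points above concrete, here is a toy model of the scenario.
It is only a sketch: ToyNodeManager, its methods, and simulate_restart are
hypothetical names made up for illustration. They are not the NodeManager API
and are not taken from YARN-592.patch. The model assumes that application
tracking lives only in memory, that local logs survive a restart on disk, and
that upload and cleanup happen when the application finishes.

```python
class ToyNodeManager:
    """Hypothetical toy model of an NM; not the real NodeManager API."""

    def __init__(self, local_logs):
        # Local log files survive a restart (assumption: they live on disk).
        self.local_logs = local_logs
        # In-memory application tracking; lost whenever the NM restarts.
        self.known_apps = {}

    def start_container(self, app_id, container_id):
        self.known_apps.setdefault(app_id, set()).add(container_id)
        self.local_logs[(app_id, container_id)] = "log of " + container_id

    def finish_application(self, app_id, remote_store):
        # Upload happens only at application finish, and only for the
        # containers this NM instance still remembers.
        for container_id in sorted(self.known_apps.pop(app_id, set())):
            key = (app_id, container_id)
            remote_store[key] = self.local_logs[key]
        # Cleanup removes every local log of the application, including
        # logs of containers that ran before the restart.
        for key in [k for k in self.local_logs if k[0] == app_id]:
            del self.local_logs[key]


def simulate_restart():
    disk = {}
    nm = ToyNodeManager(disk)
    nm.start_container("app_1", "container_1")  # runs before the restart

    nm = ToyNodeManager(disk)  # restart: in-memory state is gone
    nm.start_container("app_1", "container_2")  # launched after the restart

    remote = {}
    nm.finish_application("app_1", remote)
    return remote, disk
```

In this model the log of container_1 ends up neither in the remote store nor
on local disk. If no new container of app_1 lands on the node after the
restart, the toy NM has no record of app_1 at all, which is the case the first
question is about.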
> Container logs lost for the application when NM gets restarted
> --------------------------------------------------------------
>
> Key: YARN-592
> URL: https://issues.apache.org/jira/browse/YARN-592
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.0.1-alpha, 2.0.3-alpha
> Reporter: Devaraj K
> Assignee: Devaraj K
> Priority: Critical
> Attachments: YARN-592.patch
>
>
> While running a big job, if the NM goes down for some reason and comes
> back, it does log aggregation for the newly launched containers and then
> deletes all the containers for the application. In this case we can't get
> the container logs, from HDFS or from local disk, for the containers that
> were launched and completed before the restart.