[jira] [Commented] (YARN-4216) Container logs not shown for newly assigned containers after NM recovery

Jason Lowe (JIRA) Mon, 05 Oct 2015 11:16:53 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943781#comment-14943781
 ]


Jason Lowe commented on YARN-4216:
----------------------------------

The container logs should not be uploaded on NM stop if we are doing recovery.  
That is intentional.  Decommission + NM restart doesn't make sense to me.  
Either we are decommissioning a node and don't expect it to return, or we are 
going to restart it and expect it to return shortly.  For the former, we want 
the NM to linger a bit to try to finish log aggregation.  For the latter, it 
should not.

If we are decommissioning the node, then context.getDecommissioned() in the 
boolean clause above should be true, which means shouldAbort would be false.  
In that case the NM should not do the same thing as it does for a shutdown 
under supervision.  My apologies if I'm missing something.
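
To make the reasoning above concrete, here is a minimal, self-contained sketch of the decision. It assumes the clause has the shape "recovery is possible AND the node is not decommissioned"; the class and method names are hypothetical stand-ins, not the actual NodeManager code, and the real clause may include other conditions.

```java
// Hypothetical sketch only; not the real NodeManager or LogAggregationService.
final class ShutdownDecisionSketch {

    // Assumed shape of the clause: abort outstanding log aggregation on stop
    // only when the NM can recover AND the node is not being decommissioned.
    static boolean shouldAbort(boolean canRecover, boolean decommissioned) {
        return canRecover && !decommissioned;
    }

    public static void main(String[] args) {
        // Restart with recovery: leave aggregation to the restarted NM.
        System.out.println("recovery, restart:        " + shouldAbort(true, false));
        // Decommission: linger and try to finish log aggregation.
        System.out.println("recovery, decommissioned: " + shouldAbort(true, true));
        // No recovery: nothing will pick the logs up later, so do not abort.
        System.out.println("no recovery:              " + shouldAbort(false, false));
    }
}
```

Under this assumption, only the "recovery enabled, not decommissioned" case skips the upload at stop, which matches the intentional behavior described above.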

> Container logs not shown for newly assigned containers after NM recovery
> ------------------------------------------------------------------------
>
>                 Key: YARN-4216
>                 URL: https://issues.apache.org/jira/browse/YARN-4216
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation, nodemanager
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: NMLog, ScreenshotFolder.png, yarn-site.xml
>
>
> Steps to reproduce
> # Start 2 NodeManagers with NM recovery enabled
> # Submit a pi job with 20 maps
> # Once 5 maps have completed on NM1, stop the NM (yarn daemon stop nodemanager)
> (Logs of all completed containers get aggregated to HDFS)
> # Now start NM1 again and wait for job completion
> *The newly assigned container logs on NM1 are not shown*
> *hdfs log dir state*
> # When logs are aggregated to HDFS during stop, the file is named (localhost_38153)
> # On log aggregation after starting the NM, the newly assigned container logs get 
> uploaded with the name (localhost_38153.tmp)
> In the History Server the logs are not shown for the new task attempts



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
