[ https://issues.apache.org/jira/browse/YARN-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943781#comment-14943781 ]
Jason Lowe commented on YARN-4216:
----------------------------------

The container logs should not be uploaded on NM stop if we are doing recovery. That is intentional. Decommission + NM restart doesn't make sense to me. Either we are decommissioning a node and don't expect it to return, or we are going to restart it and expect it to return shortly. For the former, we want the NM to linger a bit to try to finish log aggregation. For the latter it should not.

If we are decommissioning the node then context.getDecommissioned() in the boolean clause above should be true, which means shouldAbort would be false. That means it should not do the same thing as a shutdown under supervision. My apologies if I'm missing something.

> Container logs not shown for newly assigned containers after NM recovery
> --------------------------------------------------------------------------
>
>                 Key: YARN-4216
>                 URL: https://issues.apache.org/jira/browse/YARN-4216
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation, nodemanager
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: NMLog, ScreenshotFolder.png, yarn-site.xml
>
>
> Steps to reproduce
> # Start 2 nodemanagers with NM recovery enabled
> # Submit pi job with 20 maps
> # Once 5 maps have completed on NM1, stop the NM (yarn daemon stop nodemanager)
> (Logs of all completed containers get aggregated to HDFS)
> # Now start NM1 again and wait for job completion
> *The newly assigned container logs on NM1 are not shown*
> *hdfs log dir state*
> # When logs are aggregated to HDFS during stop, the file is named (localhost_38153)
> # On log aggregation after starting the NM, the newly assigned container logs get
> uploaded with the name (localhost_38153.tmp)
> In the history server, the logs are not shown for new task attempts

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)