[ https://issues.apache.org/jira/browse/YARN-11698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18021893#comment-18021893 ]
ASF GitHub Bot commented on YARN-11698:
---------------------------------------

zeekling commented on code in PR #6845:
URL: https://github.com/apache/hadoop/pull/6845#discussion_r2369180733


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java:
##########
@@ -783,15 +783,11 @@ public void removeVeryOldStoppedContainersFromCache() {
           break;
         }
         if (!context.getContainers().containsKey(cid)) {
-          ApplicationId appId =
-              cid.getApplicationAttemptId().getApplicationId();
-          if (isApplicationStopped(appId)) {

Review Comment:
   I carefully reviewed the code again, and this change should not cause log aggregation failures.
   
   By the way, what about aggregating the logs of containers that completed more than 30 minutes ago (or some longer threshold) onto HDFS in advance, rather than waiting for the job to complete? Would that implementation be better than the current one?


> Finished containers shouldn't be stored indefinitely in the NM state store
> --------------------------------------------------------------------------
>
>                 Key: YARN-11698
>                 URL: https://issues.apache.org/jira/browse/YARN-11698
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 3.4.0
>            Reporter: Adam Binford
>            Priority: Major
>              Labels: pull-request-available
>
> https://issues.apache.org/jira/browse/YARN-4771 updated the container 
> tracking in the state store to only remove containers when their application 
> ends, in order to make sure all container logs get aggregated even during NM 
> restarts. This can lead to a significant number of containers building up in 
> the state store and a lot of things to recover. Since this was purely for 
> making sure logs get aggregated, it could be done in a smarter way that takes 
> into account both rolling log aggregation and not having log aggregation 
> enabled at all.
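
For illustration only, the kind of decision the issue description above asks for could be sketched roughly as follows. This is not code from PR #6845 or from Hadoop: the class, enum, method, and parameter names are all hypothetical, and the retention window and the "logs already uploaded" flag are assumptions made for the sake of the example.

```java
// Hypothetical sketch: none of these names exist in Hadoop.
final class FinishedContainerRetention {

  /** Assumed log handling modes for the application that owns the container. */
  enum LogAggregationMode { DISABLED, ROLLING, AT_APPLICATION_END }

  private FinishedContainerRetention() {
  }

  /**
   * Decides whether a finished container may be dropped from the state store.
   *
   * @param retentionMillis assumed minimum time to keep a finished container
   */
  static boolean canRemoveFromStateStore(
      LogAggregationMode mode,
      boolean applicationStopped,
      boolean logsAlreadyUploaded,
      long completedAtMillis,
      long nowMillis,
      long retentionMillis) {
    // Keep every finished container for at least the retention window.
    if (nowMillis - completedAtMillis < retentionMillis) {
      return false;
    }
    switch (mode) {
      case DISABLED:
        // No logs will ever be aggregated, so nothing needs to be recovered.
        return true;
      case ROLLING:
        // Safe once this container's logs are uploaded, or the application ended.
        return logsAlreadyUploaded || applicationStopped;
      case AT_APPLICATION_END:
      default:
        // Logs are only aggregated when the application ends, so wait for that.
        return applicationStopped;
    }
  }
}
```

Under these assumptions, only the last branch keeps a container until its application ends; with log aggregation disabled, or once a container's logs have been uploaded, the entry could be dropped after the retention window.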
--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org