[ https://issues.apache.org/jira/browse/YARN-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated YARN-7083:
-----------------------------
    Attachment: YARN-7083.001.patch

Here's a patch that should fix the issue. It closes the try-with-resources 
block for the writer at the same place we used to call close() explicitly 
before YARN-6288.  It looks like a lot of change, but it's really just adding 
a new outer try block and re-indenting the existing code inside it.  The same 
diff with whitespace-only changes ignored looks like this:
{noformat}
$ git diff -b trunk
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
index 1601c3f..af655fc 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
@@ -293,6 +293,8 @@ private void uploadLogsForContainers(boolean appFinished) {
     logAggregationTimes++;
     String diagnosticMessage = "";
     boolean logAggregationSucceedInThisCycle = true;
+    try {
+      boolean uploadedLogsInThisCycle = false;
       try (LogWriter writer = createLogWriter()) {
         try {
           writer.initialize(this.conf, this.remoteNodeTmpLogFileForApp,
@@ -308,7 +310,6 @@ private void uploadLogsForContainers(boolean appFinished) {
           return;
         }
 
-      boolean uploadedLogsInThisCycle = false;
         for (ContainerId container : pendingContainerInThisCycle) {
           ContainerLogAggregator aggregator = null;
           if (containerLogAggregators.containsKey(container)) {
@@ -343,6 +344,7 @@ private void uploadLogsForContainers(boolean appFinished) {
           cleanOldLogs();
           cleanupOldLogTimes++;
         }
+      }
 
       long currentTime = System.currentTimeMillis();
       final Path renamedPath = getRenamedPath(currentTime);
{noformat}
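
For anyone who wants to see the ordering in isolation, here is a minimal standalone sketch of what the patch restores: the writer is scoped to an inner try-with-resources block, so its implicit close() runs before the file is renamed or deleted. This is not the NodeManager code. The class name CloseBeforeRename, the method uploadCycle, and the use of java.nio.file in place of the Hadoop FileSystem API are all assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

public class CloseBeforeRename {

  // Hypothetical stand-in for one upload cycle: write to a temp file,
  // then rename it if anything was written, otherwise delete it.
  static Path uploadCycle(Path tmpFile, Path finalFile, List<String> lines)
      throws IOException {
    // Declared outside the writer's scope so it is still visible afterwards,
    // mirroring where the patch moves uploadedLogsInThisCycle.
    boolean uploadedLogsInThisCycle = false;

    // Inner scope: the writer is open only inside this block.
    try (BufferedWriter writer =
        Files.newBufferedWriter(tmpFile, StandardCharsets.UTF_8)) {
      for (String line : lines) {
        writer.write(line);
        writer.newLine();
        uploadedLogsInThisCycle = true;
      }
    } // implicit writer.close() happens here, before any rename or delete

    if (uploadedLogsInThisCycle) {
      // The file is already closed, so the rename never touches an open file.
      Files.move(tmpFile, finalFile, StandardCopyOption.REPLACE_EXISTING);
      return finalFile;
    }
    // Nothing was written in this cycle: remove the (closed) temp file.
    Files.delete(tmpFile);
    return null;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("log-agg-sketch");
    Path result = uploadCycle(dir.resolve("app.tmp"), dir.resolve("app.log"),
        Arrays.asList("container_1 stdout"));
    System.out.println(Files.readAllLines(result, StandardCharsets.UTF_8));
  }
}
```

If the rename and delete were instead placed inside the try-with-resources block, they would run while the writer was still open and close() would then operate on a file that had already moved or disappeared, which is the situation described in this issue.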

No unit test for it yet.  I'll try to get some time to see if that's a 
straightforward thing to add.

> Log aggregation deletes/renames while file is open
> --------------------------------------------------
>
>                 Key: YARN-7083
>                 URL: https://issues.apache.org/jira/browse/YARN-7083
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.8.2
>            Reporter: Daryn Sharp
>            Priority: Critical
>         Attachments: YARN-7083.001.patch
>
>
> YARN-6288 changes the log aggregation writer to be an AutoCloseable.  
> Unfortunately the try-with-resources block for the writer will either rename 
> or delete the log while it is still open.
> Assuming the NM's behavior is correct, deleting open files only results in 
> ominous WARNs in the nodemanager log and increases the rate of logging in the 
> NN when the implicit try-with-resources close fails.  These red herrings 
> complicate debugging efforts.



