[
https://issues.apache.org/jira/browse/YARN-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated YARN-7083:
-----------------------------
Attachment: YARN-7083.001.patch
Here's a patch that should fix the issue: it closes the
try-with-resources block for the writer at the same place we used to call
close() explicitly before YARN-6288. It looks like a large change, but it's
really just a new outer try block plus the re-indentation for that block. The
same diff, ignoring whitespace-only changes, looks like this:
{noformat}
$ git diff -b trunk
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
index 1601c3f..af655fc 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
@@ -293,6 +293,8 @@ private void uploadLogsForContainers(boolean appFinished) {
logAggregationTimes++;
String diagnosticMessage = "";
boolean logAggregationSucceedInThisCycle = true;
+ try {
+ boolean uploadedLogsInThisCycle = false;
try (LogWriter writer = createLogWriter()) {
try {
writer.initialize(this.conf, this.remoteNodeTmpLogFileForApp,
@@ -308,7 +310,6 @@ private void uploadLogsForContainers(boolean appFinished) {
return;
}
- boolean uploadedLogsInThisCycle = false;
for (ContainerId container : pendingContainerInThisCycle) {
ContainerLogAggregator aggregator = null;
if (containerLogAggregators.containsKey(container)) {
@@ -343,6 +344,7 @@ private void uploadLogsForContainers(boolean appFinished) {
cleanOldLogs();
cleanupOldLogTimes++;
}
+ }
long currentTime = System.currentTimeMillis();
final Path renamedPath = getRenamedPath(currentTime);
{noformat}
No unit test for it yet. I'll try to get some time to see if that's a
straightforward thing to add.
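
For illustration only, here is a minimal standalone sketch of the scoping idea: the try-with-resources block ends, and the writer is therefore closed, before the file is renamed. It is not the NodeManager code. TrackingWriter, ScopedWriterSketch, and uploadThenRename are hypothetical names, and it uses java.nio on a local file rather than the real LogWriter and remote filesystem.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ScopedWriterSketch {

  /** Hypothetical stand-in for the aggregation writer; not the real LogWriter. */
  static final class TrackingWriter implements AutoCloseable {
    private final Writer out;
    private boolean open = true;

    TrackingWriter(Path target) throws IOException {
      this.out = Files.newBufferedWriter(target, StandardCharsets.UTF_8);
    }

    void append(String line) throws IOException {
      out.write(line);
      out.write('\n');
    }

    boolean isOpen() {
      return open;
    }

    @Override
    public void close() throws IOException {
      open = false;
      out.close();
    }
  }

  /**
   * Writes one line to tmp, then renames tmp to dest.
   * Returns true if the writer was already closed when the rename started.
   */
  static boolean uploadThenRename(Path tmp, Path dest) throws IOException {
    TrackingWriter seen = null;
    try (TrackingWriter writer = new TrackingWriter(tmp)) {
      seen = writer;
      writer.append("container log line");
    } // implicit close() runs here, before any rename or delete

    boolean closedBeforeRename = !seen.isOpen();
    Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING);
    return closedBeforeRename;
  }
}
```

Moving the Files.move call inside the try-with-resources body would reproduce the shape of the bug: the rename would happen while the writer is still open, and the implicit close() would run against a path that no longer exists.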
> Log aggregation deletes/renames while file is open
> --------------------------------------------------
>
> Key: YARN-7083
> URL: https://issues.apache.org/jira/browse/YARN-7083
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.8.2
> Reporter: Daryn Sharp
> Priority: Critical
> Attachments: YARN-7083.001.patch
>
>
> YARN-6288 changes the log aggregation writer to be an AutoCloseable.
> Unfortunately the try-with-resources block for the writer either renames
> or deletes the log while it is still open.
> Assuming the NM's behavior is correct, deleting open files only results in
> ominous WARNs in the nodemanager log and increases the rate of logging in the
> NN when the implicit try-with-resources close fails. These red herrings
> complicate debugging efforts.