wy created YARN-11963:
-------------------------

             Summary: Rolling log aggregation only deletes the last container's 
uploaded local files per cycle due to single-variable overwrite in 
uploadLogsForContainers loop
                 Key: YARN-11963
                 URL: https://issues.apache.org/jira/browse/YARN-11963
             Project: Hadoop YARN
          Issue Type: Bug
          Components: log-aggregation, yarn
    Affects Versions: 3.4.3, 3.5.0, 3.3.6, 3.2.4
         Environment: * OS: Ubuntu 24.04 (WSL2)
 * Java: OpenJDK 21
 * Hadoop: 3.4.2
 * Cluster: Single-node (localhost), pseudo-distributed mode
            Reporter: wy


h3. Problem

In {{AppLogAggregatorImpl.uploadLogsForContainers()}}, when rolling log 
aggregation processes multiple containers per node in a single cycle, only the 
*last* container's uploaded files are scheduled for local deletion. Files 
uploaded for all other containers in the same cycle are silently leaked on the 
local disk until the application finishes (at which point 
{{doAppLogAggregationPostCleanUp()}} deletes the entire app log directory).

This is a regression introduced by YARN-8273.

h3. Root Cause

YARN-8273 (commit {{b22f56c471}}, 2018-05-22) moved the 
{{delService.delete(deletionTask)}} call from *inside* the per-container loop 
to *outside* the loop (in the {{finally}} block), to avoid deleting local files 
when HDFS write fails. However, it kept {{deletionTask}} as a single variable 
that gets overwritten on each loop iteration.

*Before YARN-8273* (correct, introduced by YARN-6366):
{code:java}
for (ContainerId container : pendingContainerInThisCycle) {
    // ...
    if (uploadedFilePathsInThisCycle.size() > 0) {
        DeletionTask deletionTask = new FileDeletionTask(...);  // local variable
        delService.delete(deletionTask);                         // called per container ✓
    }
}
{code}

*After YARN-8273* (buggy):
{code:java}
DeletionTask deletionTask = null;                         // declared outside loop
for (ContainerId container : pendingContainerInThisCycle) {
    // ...
    if (uploadedFilePathsInThisCycle.size() > 0) {
        deletionTask = new FileDeletionTask(...);          // overwrites previous value!
    }
}
// ...
if (logAggregationSucceedInThisCycle && deletionTask != null) {
    delService.delete(deletionTask);                       // only last container's task survives
}
{code}

Additionally, {{pendingContainerInThisCycle}} is a {{HashSet<ContainerId>}} 
(line 297), so its iteration order is unspecified: which container "wins" 
the deletion slot can vary across cycles.
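
The overwrite and one possible fix can be modelled without any Hadoop classes. The sketch below is illustrative only: {{DeletionTaskSketch}}, {{Task}}, {{buggyCycle}}, {{fixedCycle}} and {{sampleCycle}} are hypothetical names, {{Task}} merely stands in for {{FileDeletionTask}}, and a {{LinkedHashMap}} is used instead of a {{HashSet}} so that the surviving container is predictable. It is not the actual {{AppLogAggregatorImpl}} code or a proposed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Hadoop-free model of the per-cycle deletion logic.
final class DeletionTaskSketch {

    // Stand-in for FileDeletionTask: only remembers what it would delete.
    static final class Task {
        final String containerId;
        final List<String> paths;

        Task(String containerId, List<String> paths) {
            this.containerId = containerId;
            this.paths = paths;
        }
    }

    // Post-YARN-8273 shape: a single variable, overwritten on every iteration.
    static List<Task> buggyCycle(Map<String, List<String>> uploaded, boolean succeeded) {
        Task deletionTask = null;
        for (Map.Entry<String, List<String>> e : uploaded.entrySet()) {
            if (!e.getValue().isEmpty()) {
                // The task built for the previous container is dropped here.
                deletionTask = new Task(e.getKey(), e.getValue());
            }
        }
        List<Task> submitted = new ArrayList<>();
        if (succeeded && deletionTask != null) {
            submitted.add(deletionTask);
        }
        return submitted;
    }

    // One possible fix: keep a task per container, submit all of them
    // only if the cycle's remote write succeeded.
    static List<Task> fixedCycle(Map<String, List<String>> uploaded, boolean succeeded) {
        List<Task> pending = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : uploaded.entrySet()) {
            if (!e.getValue().isEmpty()) {
                pending.add(new Task(e.getKey(), e.getValue()));
            }
        }
        return succeeded ? pending : Collections.<Task>emptyList();
    }

    // Made-up cycle: three containers, each with one uploaded file.
    static Map<String, List<String>> sampleCycle() {
        Map<String, List<String>> uploaded = new LinkedHashMap<>();
        uploaded.put("container_01", Arrays.asList("AppMaster.stderr"));
        uploaded.put("container_02", Arrays.asList("stderr"));
        uploaded.put("container_03", Arrays.asList("stderr"));
        return uploaded;
    }
}
```

With the sample cycle, {{buggyCycle}} submits one task (for the last container iterated) while {{fixedCycle}} submits three. Both submit nothing when the cycle is marked as failed, which preserves the intent of YARN-8273.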

h3. Impact

For applications with N containers on a single node (e.g., Spark with 1 AM + 
multiple executors):
* Per rolling cycle: N containers' files are uploaded, but only 1 container's 
files are deleted locally
* The discarded {{FileDeletionTask}} objects go out of scope silently — no 
warning is logged
* Over the lifetime of a long-running application, local disk usage grows with 
(N-1) containers' logs accumulating per cycle
* At app finish, {{doAppLogAggregationPostCleanUp()}} deletes the entire app 
directory, masking the per-cycle bug

For single-container applications, this bug does not manifest (the loop runs 
once, no overwrite occurs).

h3. Reproduction Steps

*Prerequisites:*
* Hadoop 3.4.2 single-node cluster (HDFS + YARN)
* TRACE logging enabled:
{noformat}
log4j.logger.org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl=TRACE
log4j.logger.org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor=DEBUG
{noformat}

*Steps:*

# Configure {{yarn-site.xml}} with TFile format (to isolate from the separate 
IndexedFormat bug) and rolling aggregation:
{code:xml}
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.log-aggregation.file-formats</name>
  <value>TFile</value>
</property>
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>15</value>
</property>
{code}
# Create a test script that writes to stderr:
{code:bash}
#!/bin/bash
# test-app.sh
for i in $(seq 1 90); do echo "line_$i" >&2; sleep 1; done
{code}
# Submit via DistributedShell with multiple containers (this creates 1 AM 
container + 3 task containers = 4 containers on a single node):
{code:bash}
yarn jar hadoop-yarn-applications-distributedshell-*.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  --jar hadoop-yarn-applications-distributedshell-*.jar \
  --shell_script test-app.sh --shell_args "90" \
  --num_containers 3 --container_memory 256 --master_memory 256 \
  --rolling_log_pattern "stderr"
{code}
# After completion, check NM log for the first rolling cycle. Compare the 
number of "Uploaded the following files for container_..." TRACE entries vs the 
number of "Deleting path" entries for stderr files:
{code:bash}
grep "$APP_ID" $NM_LOG | grep "Uploaded the following files"
# Shows uploads for ALL containers (e.g., container_000001 and container_000002)

grep "$APP_ID" $NM_LOG | grep "Deleting path.*stderr"
# Shows deletion for only ONE container per cycle (the last one iterated)
{code}
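
The comparison in the last step can also be automated. The following is a rough sketch, assuming the NM log lines contain the literal fragments "Uploaded the following files for" and "Deleting path" quoted above, and that container IDs match {{container_\w+}}. The class and method names ({{NmLogCheckSketch}}, {{containersMatching}}, {{uploadedButNotDeleted}}, {{sampleLines}}) and the sample lines are hypothetical. It should be fed the lines of a single rolling cycle, since a deletion in a later cycle would otherwise hide an earlier leak.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: lists containers whose files were uploaded but
// never scheduled for deletion within the given log lines.
final class NmLogCheckSketch {

    // Assumed shape of a container ID in the log.
    private static final Pattern CONTAINER = Pattern.compile("container_\\w+");

    // Collects the first container ID on every line that contains the marker.
    static Set<String> containersMatching(List<String> lines, String marker) {
        Set<String> ids = new TreeSet<>();
        for (String line : lines) {
            if (!line.contains(marker)) {
                continue;
            }
            Matcher m = CONTAINER.matcher(line);
            if (m.find()) {
                ids.add(m.group());
            }
        }
        return ids;
    }

    // Uploaded containers minus containers that had a path deleted.
    static Set<String> uploadedButNotDeleted(List<String> lines) {
        Set<String> leaked = containersMatching(lines, "Uploaded the following files for");
        leaked.removeAll(containersMatching(lines, "Deleting path"));
        return leaked;
    }

    // Made-up lines shaped like the entries described in this report.
    static List<String> sampleLines() {
        return Arrays.asList(
            "TRACE Uploaded the following files for container_1_0001_01_000001: "
                + "[/logs/container_1_0001_01_000001/AppMaster.stderr]",
            "TRACE Uploaded the following files for container_1_0001_01_000002: "
                + "[/logs/container_1_0001_01_000002/stderr]",
            "INFO  Deleting path : /logs/container_1_0001_01_000002/stderr");
    }
}
```

For the sample lines, {{uploadedButNotDeleted}} returns only the first container, which mirrors the pattern described in the step above.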

h3. Observed Test Result

In our test with 2 containers (1 AM + 1 task), the first rolling cycle at 
01:48:28 showed:
{noformat}
TRACE Uploaded the following files for container_...000001: [.../AppMaster.stderr]
TRACE Uploaded the following files for container_...000002: [.../stderr]
INFO  Deleting path : .../container_...000002/stderr
{noformat}

container_000001's {{AppMaster.stderr}} was uploaded but *not deleted*. Only 
container_000002 (last in the {{HashSet}} iteration) had its file deleted.

h3. Expected Behavior

After each rolling cycle, uploaded local files for *all* containers should be 
scheduled for deletion (when {{enableLocalCleanup=true}} and the HDFS write 
succeeded).

h3. Actual Behavior

Only the last container's uploaded files (per {{HashSet}} iteration order) are 
deleted. All other containers' files remain on local disk until app finish.


