[
https://issues.apache.org/jira/browse/YARN-11932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shilun Fan resolved YARN-11932.
-------------------------------
Fix Version/s: 3.5.0
Hadoop Flags: Reviewed
Resolution: Fixed
> Fix TestYarnFederationWithFairScheduler timeout caused by shared NodeLabel
> storage
> ----------------------------------------------------------------------------------
>
> Key: YARN-11932
> URL: https://issues.apache.org/jira/browse/YARN-11932
> Project: Hadoop YARN
> Issue Type: Bug
> Components: router
> Affects Versions: 3.5.1
> Reporter: Shilun Fan
> Assignee: Shilun Fan
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.0
>
>
> *Problem*
>
> TestYarnFederationWithFairScheduler#testMetricsInfo intermittently times out.
>
> The root cause is that multiple test subclusters share the same NodeLabel
> storage directory ({{/tmp/hadoop-yarn-$USER/node-labels}}) by default. When
> tests run sequentially, residual editlog entries containing "delete default
> label" operations from previous tests cause the ResourceManager to fail
> during startup recovery with the error:
> {code:java}
> Node label=default to be removed doesn't existed in cluster node labels collection
> {code}
> *Solution*
>
> Set an isolated NodeLabel storage directory for each subcluster startup to
> avoid reusing old editlog files.
>
> In {{TestMockSubCluster.java}}, configure a unique directory per subcluster
> using:
> * {{GenericTestUtils.getTestDir()}} to create test-specific directories
> * Directory naming pattern: {{node-labels-{subClusterId}-{timestamp}}}
> * Configuration key: {{YarnConfiguration.FS_NODE_LABELS_STORE_ROOT_DIR}}
>
> *Test Results*
>
> After the fix, all 38 tests in TestYarnFederationWithFairScheduler pass
> successfully:
> * Tests run: 38, Failures: 0, Errors: 0, Skipped: 0
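
A minimal sketch of the per-subcluster directory setup described under *Solution* above, not the code from the actual patch. It uses only the JDK: a plain Map stands in for Hadoop's Configuration, the class and method names (NodeLabelDirSketch, isolatedNodeLabelDir, configure) are hypothetical, and the base directory is passed in where the patch would use GenericTestUtils.getTestDir(). The key string is assumed to match the value of YarnConfiguration.FS_NODE_LABELS_STORE_ROOT_DIR.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class NodeLabelDirSketch {

    // Assumed to equal the value of YarnConfiguration.FS_NODE_LABELS_STORE_ROOT_DIR.
    static final String FS_NODE_LABELS_STORE_ROOT_DIR =
        "yarn.node-labels.fs-store.root-dir";

    // Builds <testRoot>/node-labels-{subClusterId}-{timestamp}.
    static Path isolatedNodeLabelDir(Path testRoot, String subClusterId, long timestamp) {
        return testRoot.resolve("node-labels-" + subClusterId + "-" + timestamp);
    }

    // Stand-in for conf.set(key, value) on a real Hadoop Configuration.
    static Map<String, String> configure(Path testRoot, String subClusterId, long timestamp) {
        Map<String, String> conf = new HashMap<>();
        Path dir = isolatedNodeLabelDir(testRoot, subClusterId, timestamp);
        conf.put(FS_NODE_LABELS_STORE_ROOT_DIR, dir.toAbsolutePath().toString());
        return conf;
    }

    public static void main(String[] args) {
        // Hypothetical base directory; the real test would ask GenericTestUtils for it.
        Path testRoot = Paths.get("target", "test-dir");
        for (String id : new String[] {"SC-1", "SC-2"}) {
            Map<String, String> conf = configure(testRoot, id, System.currentTimeMillis());
            System.out.println(id + " -> " + conf.get(FS_NODE_LABELS_STORE_ROOT_DIR));
        }
    }
}
```

Because the subcluster ID and a timestamp are both part of the directory name, each subcluster start gets a fresh store location, so no editlog left behind by an earlier test is replayed during ResourceManager recovery.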