[ https://issues.apache.org/jira/browse/YARN-11809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17942380#comment-17942380 ]
ASF GitHub Bot commented on YARN-11809:
---------------------------------------

hadoop-yetus commented on PR #7589:
URL: https://github.com/apache/hadoop/pull/7589#issuecomment-2791803973

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 24s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 24m 8s | | trunk passed |
| +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | compile | 0m 29s | | trunk passed with JDK Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 30s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 30s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 33s | | trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 0m 28s | | trunk passed with JDK Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 17s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 12s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 25s | | the patch passed |
| +1 :green_heart: | compile | 0m 29s | | the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javac | 0m 29s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 24s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 29s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 11s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 13s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 159m 9s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7589/5/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 :green_heart: | asflicense | 0m 25s | | The patch does not generate ASF License warnings. |
| | | 236m 44s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerMultiNodes |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7589/5/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/7589 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 525a454aeea8 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 79f25034c5acd260a93c07cbbdd3b0f975192526 |
| Default Java | Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7589/5/testReport/ |
| Max. process+thread count | 954 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7589/5/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> Support application backoff mechanism for CapacityScheduler
> -----------------------------------------------------------
>
> Key: YARN-11809
> URL: https://issues.apache.org/jira/browse/YARN-11809
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Labels: pull-request-available
>
> Currently, when an application repeatedly fails to schedule tasks due to
> resource constraints or other issues, it continues to be considered in every
> scheduling cycle, potentially causing unnecessary scheduling overhead and
> resource contention. This can lead to inefficient resource utilization and
> increased scheduling latency. This is especially impactful in global
> scheduling where the scheduler needs to consider resources across the entire
> cluster. The number of allocated containers per second may drop from 1000+ to
> 200+, when the scheduler is overwhelmed with repeated scheduling attempts for
> applications that cannot be satisfied.
> Thus it's necessary to introduce a new application backoff mechanism in the
> Capacity Scheduler to temporarily skip applications that fail to schedule
> tasks after a certain number of opportunities, improving the scheduling
> efficiency.
> h2. Solution
> Implement an application backoff mechanism that:
> * Tracks missed scheduling opportunities for each application
> * Temporarily skips applications that exceed a configurable threshold of
> missed opportunities
> * Automatically resumes scheduling after a configurable backoff period
> * Provides configurable parameters at both global and queue levels
> h3. Configuration Parameters
> h3. Global Configuration
> * yarn.scheduler.capacity.app-backoff.enabled: Enable/disable backoff
> mechanism globally (default: false)
> * yarn.scheduler.capacity.app-backoff.interval-ms: Global backoff duration
> in milliseconds (default: 3000ms)
> * yarn.scheduler.capacity.app-backoff.missed-threshold: Global number of
> missed opportunities before backoff (default: 3)
> h3. Queue-Specific Configuration
> * yarn.scheduler.capacity.<queue-path>.app-backoff.enabled: Enable/disable
> backoff mechanism for a specific queue. When enabled, applications in this
> queue will be temporarily skipped if they fail to schedule tasks after
> reaching the missed opportunities threshold. This setting can be configured
> independently for each queue, allowing for fine-grained control over which
> queues use the backoff mechanism. If not specified, it inherits the global
> setting from yarn.scheduler.capacity.app-backoff.enabled.
> * yarn.scheduler.capacity.<queue-path>.app-backoff.interval-ms: Backoff
> duration in milliseconds for a specific queue. If not specified, it inherits
> the global setting from yarn.scheduler.capacity.app-backoff.interval-ms.
> * yarn.scheduler.capacity.<queue-path>.app-backoff.missed-threshold: Number
> of missed opportunities before backoff for a specific queue. If not
> specified, it inherits the global setting from
> yarn.scheduler.capacity.app-backoff.missed-threshold.
> Queue-specific configurations take precedence over global configurations. If
> a queue-specific configuration is not set, the queue will inherit the global
> configuration values.
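
For readers skimming the issue, the backoff behaviour described under "Solution" can be sketched roughly as follows. This is an illustration only and is not the code from PR #7589: the class AppBackoffSketch and its methods recordMissed, recordAllocated and shouldSkip are hypothetical names. The sketch also assumes that backoff starts once the missed count reaches the threshold, that the counter resets on a successful allocation, and that the caller supplies timestamps. The constructor arguments stand in for the missed-threshold and interval-ms settings.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-application backoff; not the PR's implementation. */
class AppBackoffSketch {
  private final int missedThreshold; // stands in for app-backoff.missed-threshold
  private final long intervalMs;     // stands in for app-backoff.interval-ms

  // Consecutive missed scheduling opportunities per application id.
  private final Map<String, Integer> missed = new HashMap<>();
  // Time (ms) until which an application is skipped.
  private final Map<String, Long> backoffUntil = new HashMap<>();

  AppBackoffSketch(int missedThreshold, long intervalMs) {
    this.missedThreshold = missedThreshold;
    this.intervalMs = intervalMs;
  }

  /** Returns true while the application is inside its backoff window. */
  boolean shouldSkip(String appId, long nowMs) {
    Long until = backoffUntil.get(appId);
    if (until == null) {
      return false;
    }
    if (nowMs < until) {
      return true;
    }
    // Backoff period elapsed: resume scheduling automatically.
    backoffUntil.remove(appId);
    return false;
  }

  /** Records a scheduling attempt that allocated nothing for the application. */
  void recordMissed(String appId, long nowMs) {
    int count = missed.merge(appId, 1, Integer::sum);
    // Assumption: backoff starts when the count reaches the threshold.
    if (count >= missedThreshold) {
      missed.remove(appId);
      backoffUntil.put(appId, nowMs + intervalMs);
    }
  }

  /** Records a successful allocation, which clears the missed counter. */
  void recordAllocated(String appId) {
    missed.remove(appId);
  }
}
```

With the defaults quoted above (threshold 3, interval 3000 ms), an application that misses three opportunities in a row would be skipped for the next three seconds in this sketch and then considered again.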