[ 
https://issues.apache.org/jira/browse/YARN-11809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17942380#comment-17942380
 ] 

ASF GitHub Bot commented on YARN-11809:
---------------------------------------

hadoop-yetus commented on PR #7589:
URL: https://github.com/apache/hadoop/pull/7589#issuecomment-2791803973

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 24s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  24m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  trunk passed with JDK Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  |  trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  trunk passed with JDK Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 17s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 12s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04  |
   | +1 :green_heart: |  javac  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  |  the patch passed with JDK Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 24s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 13s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 159m  9s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7589/5/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) |  hadoop-yarn-server-resourcemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 25s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 236m 44s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerMultiNodes |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7589/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/7589 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 525a454aeea8 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 79f25034c5acd260a93c07cbbdd3b0f975192526 |
   | Default Java | Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06~us1-0ubuntu1~20.04-b06 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7589/5/testReport/ |
   | Max. process+thread count | 954 (vs. ulimit of 5500) |
   | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7589/5/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Support application backoff mechanism for CapacityScheduler
> -----------------------------------------------------------
>
>                 Key: YARN-11809
>                 URL: https://issues.apache.org/jira/browse/YARN-11809
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, when an application repeatedly fails to schedule tasks due to 
> resource constraints or other issues, it is still considered in every 
> scheduling cycle, potentially causing unnecessary scheduling overhead and 
> resource contention. This can lead to inefficient resource utilization and 
> increased scheduling latency. The impact is greatest in global scheduling, 
> where the scheduler must consider resources across the entire cluster: the 
> number of containers allocated per second may drop from 1000+ to 200+ when 
> the scheduler is overwhelmed by repeated scheduling attempts for 
> applications that cannot be satisfied. 
> It is therefore necessary to introduce an application backoff mechanism in 
> the Capacity Scheduler that temporarily skips applications that fail to 
> schedule tasks after a certain number of opportunities, improving 
> scheduling efficiency.
> h2. Solution
> Implement an application backoff mechanism that:
>  * Tracks missed scheduling opportunities for each application
>  * Temporarily skips applications that exceed a configurable threshold of 
> missed opportunities
>  * Automatically resumes scheduling after a configurable backoff period
>  * Provides configurable parameters at both global and queue levels
> h2. Configuration Parameters
> h3. Global Configuration
>  * yarn.scheduler.capacity.app-backoff.enabled: Enable/disable backoff 
> mechanism globally (default: false)
>  * yarn.scheduler.capacity.app-backoff.interval-ms: Global backoff duration 
> in milliseconds (default: 3000ms)
>  * yarn.scheduler.capacity.app-backoff.missed-threshold: Global number of 
> missed opportunities before backoff (default: 3)
> h3. Queue-Specific Configuration
>  * yarn.scheduler.capacity.<queue-path>.app-backoff.enabled: Enable/disable 
> backoff mechanism for a specific queue. When enabled, applications in this 
> queue will be temporarily skipped if they fail to schedule tasks after 
> reaching the missed opportunities threshold. This setting can be configured 
> independently for each queue, allowing for fine-grained control over which 
> queues use the backoff mechanism. If not specified, it inherits the global 
> setting from yarn.scheduler.capacity.app-backoff.enabled.
>  * yarn.scheduler.capacity.<queue-path>.app-backoff.interval-ms: Backoff 
> duration in milliseconds for a specific queue. If not specified, it inherits 
> the global setting from yarn.scheduler.capacity.app-backoff.interval-ms.
>  * yarn.scheduler.capacity.<queue-path>.app-backoff.missed-threshold: Number 
> of missed opportunities before backoff for a specific queue. If not 
> specified, it inherits the global setting from 
> yarn.scheduler.capacity.app-backoff.missed-threshold.
> Queue-specific configurations take precedence over global configurations. If 
> a queue-specific configuration is not set, the queue will inherit the global 
> configuration values.
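
The behaviour described in the issue above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the code from PR #7589: the class `AppBackoffSketch`, its methods, and the use of a `Map` in place of Hadoop's configuration classes are all hypothetical. Only the property names and defaults are taken from the description. The sketch also assumes that backoff starts once the missed count reaches the threshold and that a successful allocation resets the count.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-queue app backoff; not the PR's implementation. */
final class AppBackoffSketch {
    static final String PREFIX = "yarn.scheduler.capacity.";

    /** Queue-specific key wins; otherwise the global key; otherwise the default. */
    static String resolve(Map<String, String> conf, String queuePath,
                          String suffix, String defaultValue) {
        String queueValue = conf.get(PREFIX + queuePath + "." + suffix);
        if (queueValue != null) {
            return queueValue;
        }
        String globalValue = conf.get(PREFIX + suffix);
        return globalValue != null ? globalValue : defaultValue;
    }

    final boolean enabled;
    final long intervalMs;
    final int missedThreshold;
    private final Map<String, Integer> missed = new HashMap<>();
    private final Map<String, Long> backoffUntil = new HashMap<>();

    AppBackoffSketch(Map<String, String> conf, String queuePath) {
        // Defaults follow the issue description: false, 3000 ms, 3.
        enabled = Boolean.parseBoolean(
            resolve(conf, queuePath, "app-backoff.enabled", "false"));
        intervalMs = Long.parseLong(
            resolve(conf, queuePath, "app-backoff.interval-ms", "3000"));
        missedThreshold = Integer.parseInt(
            resolve(conf, queuePath, "app-backoff.missed-threshold", "3"));
    }

    /** Returns true if the app should be skipped in this scheduling cycle. */
    boolean shouldSkip(String appId, long nowMs) {
        if (!enabled) {
            return false;
        }
        Long until = backoffUntil.get(appId);
        if (until == null) {
            return false;
        }
        if (nowMs < until) {
            return true;
        }
        backoffUntil.remove(appId); // backoff period elapsed, resume scheduling
        return false;
    }

    /** Records one missed opportunity; starts backoff at the threshold (assumed). */
    void recordMissed(String appId, long nowMs) {
        if (!enabled) {
            return;
        }
        int count = missed.merge(appId, 1, Integer::sum);
        if (count >= missedThreshold) {
            missed.remove(appId);
            backoffUntil.put(appId, nowMs + intervalMs);
        }
    }

    /** A successful allocation resets the missed counter (assumed). */
    void recordAllocated(String appId) {
        missed.remove(appId);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PREFIX + "app-backoff.enabled", "true");
        conf.put(PREFIX + "root.a.app-backoff.interval-ms", "1000");
        AppBackoffSketch queueA = new AppBackoffSketch(conf, "root.a");
        for (int i = 0; i < queueA.missedThreshold; i++) {
            queueA.recordMissed("app1", 0L);
        }
        System.out.println("skip at 500 ms: " + queueA.shouldSkip("app1", 500L));
        System.out.println("skip at 1000 ms: " + queueA.shouldSkip("app1", 1000L));
    }
}
```

In this sketch the queue `root.a` overrides only the interval, so it inherits the global `enabled` value and falls back to the default threshold of 3.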



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
