[ https://issues.apache.org/jira/browse/YARN-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17724402#comment-17724402 ]
ASF GitHub Bot commented on YARN-7720:
--------------------------------------
hadoop-yetus commented on PR #5672:
URL: https://github.com/apache/hadoop/pull/5672#issuecomment-1555201867
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 48s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 15m 46s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 24m 55s | | trunk passed |
| +1 :green_heart: | compile | 7m 32s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | compile | 6m 44s | | trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| +1 :green_heart: | checkstyle | 1m 45s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 40s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 36s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javadoc | 2m 22s | | trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| +1 :green_heart: | spotbugs | 5m 55s | | trunk passed |
| +1 :green_heart: | shadedclient | 25m 8s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 23s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 52s | | the patch passed |
| +1 :green_heart: | compile | 6m 56s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javac | 6m 56s | | the patch passed |
| +1 :green_heart: | compile | 6m 41s | | the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| +1 :green_heart: | javac | 6m 41s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 39s | | the patch passed |
| +1 :green_heart: | mvnsite | 2m 28s | | the patch passed |
| +1 :green_heart: | javadoc | 2m 20s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javadoc | 2m 10s | | the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| +1 :green_heart: | spotbugs | 6m 5s | | the patch passed |
| +1 :green_heart: | shadedclient | 24m 33s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 3s | | hadoop-yarn-api in the patch passed. |
| +1 :green_heart: | unit | 5m 20s | | hadoop-yarn-common in the patch passed. |
| +1 :green_heart: | unit | 101m 28s | | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 :green_heart: | asflicense | 0m 48s | | The patch does not generate ASF License warnings. |
| | | 262m 22s | | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5672/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5672 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
| uname | Linux 2fef0dde8e6f 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 830d7af6ef27524f201f1581cc17e5508146acc7 |
| Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5672/3/testReport/ |
| Max. process+thread count | 899 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5672/3/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
This message was automatically generated.
> Race condition between second app attempt and UAM timeout when first attempt node is down
> -----------------------------------------------------------------------------------------
>
> Key: YARN-7720
> URL: https://issues.apache.org/jira/browse/YARN-7720
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Botong Huang
> Assignee: Shilun Fan
> Priority: Major
> Attachments: YARN-7720.v1.patch, YARN-7720.v2.patch
>
>
> In Federation, multiple attempts of an application share the same UAM in each
> secondary sub-cluster. When the first attempt fails, we rely on the fact that
> the secondary RM won't kill the existing UAM before the AM heartbeat timeout
> (10 min by default). When the second attempt comes up in the home sub-cluster,
> it picks up the UAM token from the Yarn Registry and resumes the UAM heartbeat
> to the secondary RMs.
> The default heartbeat timeouts for NM and AM are both 10 min. The problem is
> that when the first attempt's node goes down or loses connectivity, the home
> RM marks the first attempt as failed only after 10 min, and only then
> schedules the second attempt on some other node. By then the UAMs in the
> secondaries are already timing out, and they might not survive until the
> second attempt comes up.
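
The timing argument in the description above can be checked with a small model. This is only an illustrative sketch, not code from the patch: the `Timeline` class, its field names, and the 30-second rescheduling delay are all hypothetical, and the only figures taken from the issue are the two 10-minute default timeouts. It assumes the node is lost at the same moment the last UAM heartbeat reaches the secondary RM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timeline:
    """Toy model of the race; all times are seconds since the node was lost."""

    nm_expiry_s: int = 600         # NM heartbeat timeout (10 min default per the issue)
    am_expiry_s: int = 600         # AM/UAM heartbeat timeout (10 min default per the issue)
    reschedule_delay_s: int = 30   # hypothetical time to launch the second attempt

    def uam_deadline(self) -> int:
        # The secondary RM expires the UAM this long after its last heartbeat,
        # assumed here to coincide with the node loss at t = 0.
        return self.am_expiry_s

    def second_attempt_resume(self) -> int:
        # The home RM only notices the lost node after the NM timeout, then
        # needs some time before the new attempt can resume UAM heartbeats.
        return self.nm_expiry_s + self.reschedule_delay_s

    def slack(self) -> int:
        # Positive: the UAM is still alive when the second attempt resumes.
        return self.uam_deadline() - self.second_attempt_resume()

    def uam_survives(self) -> bool:
        return self.slack() > 0


if __name__ == "__main__":
    defaults = Timeline()
    print("slack with defaults (s):", defaults.slack())
    print("UAM survives with defaults:", defaults.uam_survives())
```

Under these assumptions the slack is negative whenever the two timeouts are equal and rescheduling takes any time at all, which matches the race described above. Making the AM timeout longer than the NM timeout plus the rescheduling delay flips the result in this model; whether that is the approach taken in the patch is not stated here.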
--
This message was sent by Atlassian Jira
(v8.20.10#820010)