[
https://issues.apache.org/jira/browse/YARN-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297004#comment-17297004
]
Hadoop QA commented on YARN-10678:
----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m
34s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m
0s{color} | {color:green}{color} | {color:green} No case conflicting files
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m
0s{color} | {color:green}{color} | {color:green} The patch does not contain any
@author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m
0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to
include any new or modified tests. Please justify why no new tests are needed
for this patch. Also please list what manual steps were performed to verify
this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m
38s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m
31s{color} | {color:green}{color} | {color:green} trunk passed with JDK
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m
26s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m
19s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m
29s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}
14m 48s{color} | {color:green}{color} | {color:green} branch has no errors when
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m
26s{color} | {color:green}{color} | {color:green} trunk passed with JDK
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m
25s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m
46s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m
44s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m
24s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m
21s{color} | {color:green}{color} | {color:green} the patch passed with JDK
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m
21s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m
19s{color} | {color:green}{color} | {color:green} the patch passed with JDK
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m
19s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m
15s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m
22s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m
0s{color} | {color:green}{color} | {color:green} The patch has no whitespace
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}
12m 55s{color} | {color:green}{color} | {color:green} patch has no errors when
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m
22s{color} | {color:green}{color} | {color:green} the patch passed with JDK
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m
21s{color} | {color:green}{color} | {color:green} the patch passed with JDK
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m
50s{color} | {color:green}{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} || ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m
57s{color} | {color:green}{color} | {color:green} hadoop-sls in the patch
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m
29s{color} | {color:green}{color} | {color:green} The patch does not generate
ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 12s{color} |
{color:black}{color} | {color:black}{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.41 ServerAPI=1.41 base:
https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/740/artifact/out/Dockerfile
|
| JIRA Issue | YARN-10678 |
| JIRA Patch URL |
https://issues.apache.org/jira/secure/attachment/13021795/YARN-10678.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite
unit shadedclient findbugs checkstyle |
| uname | Linux d558d8447d89 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9cb51bf1068 |
| Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| Multi-JDK versions |
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
/usr/lib/jvm/java-8-openjdk-amd64:Private
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| Test Results |
https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/740/testReport/ |
| Max. process+thread count | 536 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-sls U: hadoop-tools/hadoop-sls |
| Console output |
https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/740/console |
| versions | git=2.25.1 maven=3.6.3 findbugs=4.0.6 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
This message was automatically generated.
> Try blocks without catch blocks in SLS scheduler classes can swallow other
> exceptions
> -------------------------------------------------------------------------------------
>
> Key: YARN-10678
> URL: https://issues.apache.org/jira/browse/YARN-10678
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-10678-unchecked-exception-from-FS-allocate.diff,
> YARN-10678-unchecked-exception-from-FS-allocate_fixed.diff,
> YARN-10678.001.patch,
> org.apache.hadoop.yarn.sls.TestReservationSystemInvariants__testSimulatorRunning_modified.log,
>
> org.apache.hadoop.yarn.sls.TestReservationSystemInvariants__testSimulatorRunning_original.log
>
>
> In SLSFairScheduler, we have this try-finally block (without catch block) in
> the allocate method:
> https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L109-L123
> Similarly, in SLSCapacityScheduler:
> https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java#L116-L131
> In the finally block, the updateQueueWithAllocateRequest is invoked:
> https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L118
> In our internal environment, there was a situation when an NPE was logged
> from this method:
> {code}
> java.lang.NullPointerException
> at
> org.apache.hadoop.yarn.sls.scheduler.SLSFairScheduler.updateQueueWithAllocateRequest(SLSFairScheduler.java:262)
> at
> org.apache.hadoop.yarn.sls.scheduler.SLSFairScheduler.allocate(SLSFairScheduler.java:118)
> at
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:288)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
> at
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:436)
> at
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352)
> at
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
> at
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348)
> at
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212)
> at
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> This can happen if the following events occur:
> 1. A runtime exception is thrown in FairScheduler or CapacityScheduler's
> allocate method
> 2. In this case, the local variable called 'allocation' remains null:
> https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L110
> 3. In updateQueueWithAllocateRequest, this null object will be dereferenced
> here:
> https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L262
> 4. Then, we have an NPE here:
> https://github.com/apache/hadoop/blob/9cb51bf106802c78b1400fba9f1d1c7e772dd5e7/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSFairScheduler.java#L117-L122
> In this case, we lost the original exception thrown from
> FairScheduler#allocate.
> In order to fix this, a catch-block should be introduced and the exception
> needs to be logged.
> The whole thing applies to SLSCapacityScheduler as well.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]