Elen Chatikyan created YARN-11666:
-------------------------------------
Summary: NullPointerException in TestSLSRunner.testSimulatorRunning
Key: YARN-11666
URL: https://issues.apache.org/jira/browse/YARN-11666
Project: Hadoop YARN
Issue Type: Bug
Environment: {*}Operating System{*}: macOS (Sonoma 14.2.1 (23C71))
{*}Hardware{*}: MacBook Air 2023
{*}IDE{*}: IntelliJ IDEA (2023.3.2 (Ultimate Edition))
{*}Java Version{*}: OpenJDK version "1.8.0_292"
Reporter: Elen Chatikyan
*What happened:*
In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Simulated Load
Scheduler) framework, a *NullPointerException* is thrown during the teardown
process of parameterized tests. This exception is thrown when the stop method
is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This issue
occurs under test conditions that involve mismatches between trace types
(RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios
where the rm object may not be properly initialized before the stop method is
invoked.
*Buggy code:*
The issue is located in the *{{RMRunner.java}}* file within the *{{stop}}*
method:
{code:java}
public void stop() {
  rm.stop();
}
{code}
The root cause of the *{{NullPointerException}}* is the lack of a null check
for the {{rm}} object before calling its {{stop}} method. Under any condition
where the *{{ResourceManager}}* fails to initialize correctly, attempting to
stop the *{{ResourceManager}}* leads to a null pointer dereference.
The {{stop}} method in TaskRunner dereferences its {{executor}} object in the same way, so once {*}RMRunner.java{*} is fixed, TaskRunner should be fixed as well.
+TaskRunner.java+
{code:java}
public void stop() throws InterruptedException {
  executor.shutdownNow();
  executor.awaitTermination(20, TimeUnit.SECONDS);
}
{code}
*How to trigger this bug:*
* Change the data method of the parameterized unit test (TestSLSRunner.java) to
include one or both of the following test cases:
** {capScheduler, "SYNTH", rumenTraceFile, nodeFile }
** {capScheduler, "SYNTH", slsTraceFile, nodeFile }
* Execute the *TestSLSRunner* test suite, particularly the
*testSimulatorRunning* method.
* Observe the resulting *NullPointerException* in the test output (triggered in
RMRunner.java).
{panel:title=Example stack trace from the test output:}
[ERROR] testSimulatorRunning[Testing with: SYNTH,
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler,
(nodeFile null)](org.apache.hadoop.yarn.sls.TestSLSRunner) Time elapsed: 3.027
s <<< ERROR!
java.lang.NullPointerException
at org.apache.hadoop.yarn.sls.RMRunner.stop(RMRunner.java:127)
at org.apache.hadoop.yarn.sls.SLSRunner.stop(SLSRunner.java:320)
at org.apache.hadoop.yarn.sls.BaseSLSRunnerTest.tearDown(BaseSLSRunnerTest.java:68)
...
{panel}
_______________________________________________________________________
_The bug can be fixed by adding a null check for the {{rm}} object in the
*{{RMRunner.java}}* {{stop}} method before calling any methods on it. The same
applies to the {{executor}} object in TaskRunner.java._
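
A minimal sketch of the proposed guards, offered as an illustration rather than the actual patch. FakeResourceManager, RMRunnerSketch, TaskRunnerSketch and NullSafeStopSketch are hypothetical stand-ins so that the example compiles without Hadoop on the classpath; only the bodies of the two stop methods mirror the snippets quoted in this report, and the executor field is assumed to be an ExecutorService.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class NullSafeStopSketch {

  /** Hypothetical stand-in for ResourceManager; only stop() matters here. */
  static class FakeResourceManager {
    boolean stopped = false;

    void stop() {
      stopped = true;
    }
  }

  /** Sketch of RMRunner.stop() with the proposed null check. */
  static class RMRunnerSketch {
    FakeResourceManager rm; // remains null if initialization failed

    void stop() {
      if (rm != null) {
        rm.stop();
      }
    }
  }

  /** Sketch of TaskRunner.stop() with the same guard on the executor. */
  static class TaskRunnerSketch {
    ExecutorService executor; // remains null if the runner was never started

    void stop() throws InterruptedException {
      if (executor != null) {
        executor.shutdownNow();
        executor.awaitTermination(20, TimeUnit.SECONDS);
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Teardown after a failed initialization: both calls return quietly.
    new RMRunnerSketch().stop();
    new TaskRunnerSketch().stop();

    // The normal path is unchanged: an initialized rm is still stopped.
    RMRunnerSketch runner = new RMRunnerSketch();
    runner.rm = new FakeResourceManager();
    runner.stop();
    System.out.println(runner.rm.stopped); // prints "true"

    // An initialized executor is still shut down.
    TaskRunnerSketch tasks = new TaskRunnerSketch();
    tasks.executor = Executors.newSingleThreadExecutor();
    tasks.stop();
    System.out.println(tasks.executor.isShutdown()); // prints "true"
  }
}
```

With guards of this shape, the tearDown call in the stack trace above would no longer fail when a mismatched trace type leaves rm unset. The guard only silences the teardown failure; it does not make a mismatched trace type and trace file a valid configuration.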