Susheel Gupta created YARN-11801:
------------------------------------

             Summary: NPE in FifoCandidatesSelector.selectCandidates when preempting resources for an auto-created queue without child queues
                 Key: YARN-11801
                 URL: https://issues.apache.org/jira/browse/YARN-11801
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
    Affects Versions: 3.4.0, 3.5.0
            Reporter: Susheel Gupta

When the ProportionalCapacityPreemptionPolicy is enabled in the YARN SchedulingMonitor, the preemption check fails with a NullPointerException in {{FifoCandidatesSelector.selectCandidates}}. This happens when an auto-created queue exists but has no child queues.

NullPointerException stack trace:
{code:java}
2025-03-24 08:36:12,593 ERROR monitor.SchedulingMonitor: Exception raised while executing preemption checker, skip this run..., exception=
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.FifoCandidatesSelector.selectCandidates(FifoCandidatesSelector.java:104)
    at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:515)
    at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:344)
    at org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:100)
    at org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PolicyInvoker.run(SchedulingMonitor.java:112)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748){code}

Capacity scheduler configuration used:
{noformat}
<configuration>
  <property>
    <name>yarn.scheduler.capacity.mapping-rule-json</name>
    <value/>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.client.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.client.leaf-queue-template.capacity</name>
    <value>0</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.client.maximum-capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.client.auto-create-child-queue.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.client.leaf-queue-template.maximum-capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,client</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.webservice.mutation-api.version</name>
    <value>1742806178771</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-am-resource-percent</name>
    <value>0.2</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.mapping-rule-format</name>
    <value>json</value>
  </property>
</configuration>{noformat}

Add this to yarn-site.xml:
{code:xml}
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy,org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AutoCreatedQueueDeletionPolicy,org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.QueueManagementDynamicEditPolicy</value>
</property>{code}
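
For illustration only, here is a minimal sketch of the kind of failure the stack trace suggests and of a defensive null check. The report does not show the code at FifoCandidatesSelector.java:104, so the cause is an assumption: a queue is iterated as a leaf, but the per-queue lookup for it returns null. All names below ({{PreemptionGuardSketch}}, {{QueueSnapshot}}, {{selectCandidateQueues}}) are hypothetical stand-ins, not YARN classes or the actual fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types exist in Hadoop YARN.
class PreemptionGuardSketch {

  /** Minimal per-queue record, standing in for whatever the policy tracks per queue. */
  static final class QueueSnapshot {
    final String name;
    final boolean preemptionDisabled;

    QueueSnapshot(String name, boolean preemptionDisabled) {
      this.name = name;
      this.preemptionDisabled = preemptionDisabled;
    }
  }

  /**
   * Returns the names of queues eligible for preemption. A queue name with no
   * snapshot is skipped rather than dereferenced, which is the guard this
   * sketch is meant to show.
   */
  static List<String> selectCandidateQueues(List<String> leafQueueNames,
      Map<String, QueueSnapshot> snapshots) {
    List<String> candidates = new ArrayList<>();
    for (String queueName : leafQueueNames) {
      QueueSnapshot snapshot = snapshots.get(queueName);
      if (snapshot == null) {
        // Assumed failure mode: the queue is listed as a leaf but was never
        // snapshotted. Without this check, the next line would throw an NPE.
        continue;
      }
      if (snapshot.preemptionDisabled) {
        continue;
      }
      candidates.add(queueName);
    }
    return candidates;
  }

  public static void main(String[] args) {
    Map<String, QueueSnapshot> snapshots = new HashMap<>();
    snapshots.put("root.default", new QueueSnapshot("root.default", false));
    // "root.client" is deliberately left without a snapshot.
    List<String> leaves = Arrays.asList("root.default", "root.client");
    System.out.println(selectCandidateQueues(leaves, snapshots));
  }
}
```

Skipping the queue keeps the rest of the preemption run alive instead of aborting it, as the "skip this run" log message shows happens today. Whether skipping is the right behaviour, or whether such a queue should never be reported as a leaf in the first place, is a question the sketch does not answer.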