caozhiqiang created YARN-11488:
----------------------------------

             Summary: Handling CONTAINER_EXPIRED event will throw NEP if the reservation is removed from node
                 Key: YARN-11488
                 URL: https://issues.apache.org/jira/browse/YARN-11488
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 3.4.0
            Reporter: caozhiqiang
            Assignee: caozhiqiang

In AbstractYarnScheduler::completeOustandingUpdatesWhichAreReserved(), the scheduler can remove the reserved container from the node after getReservedContainer() has been called. When that happens, the method throws a NullPointerException (NPE) while handling the CONTAINER_EXPIRED event, and the ResourceManager exits, as shown in the log below.

{code:java}
2023-05-07 02:04:38,201 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type CONTAINER_EXPIRED to the Event Dispatcher
java.lang.NullPointerException
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completeOustandingUpdatesWhichAreReserved(AbstractYarnScheduler.java:725)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:686)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1927)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:172)
	at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:74)
	at java.lang.Thread.run(Thread.java:748)
2023-05-07 02:04:38,201 INFO [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..
{code}
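
The race described in this report can be illustrated with a small, self-contained sketch. This is not the YARN code or the committed fix: the classes Node and Reservation and the method completesReservedUpdate are hypothetical stand-ins, and the sketch only assumes that the reservation can be cleared concurrently. It shows one common defensive pattern: read the reserved container once into a local variable and null-check that local before dereferencing it.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ReservedContainerGuard {

    /** Hypothetical stand-in for a reserved container (not YARN's RMContainer). */
    static final class Reservation {
        private final String containerToUpdate;

        Reservation(String containerToUpdate) {
            this.containerToUpdate = containerToUpdate;
        }

        String getContainerToUpdate() {
            return containerToUpdate;
        }
    }

    /** Hypothetical stand-in for a scheduler node whose reservation may be cleared by another thread. */
    static final class Node {
        private final AtomicReference<Reservation> reserved = new AtomicReference<>();

        void reserve(Reservation r) {
            reserved.set(r);
        }

        void unreserve() {
            reserved.set(null);
        }

        Reservation getReservedContainer() {
            return reserved.get();
        }
    }

    /**
     * Returns true only if the node still holds a reservation whose update
     * target is the expired container. The reservation is read exactly once,
     * so a concurrent unreserve() cannot turn a non-null check into a later NPE.
     */
    static boolean completesReservedUpdate(Node node, String expiredContainerId) {
        if (node == null || expiredContainerId == null) {
            return false;
        }
        Reservation snapshot = node.getReservedContainer(); // single read
        if (snapshot == null) {
            return false; // reservation already removed: nothing to complete
        }
        return expiredContainerId.equals(snapshot.getContainerToUpdate());
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.reserve(new Reservation("container_1"));
        System.out.println(completesReservedUpdate(node, "container_1"));
        node.unreserve(); // simulates the scheduler removing the reservation
        System.out.println(completesReservedUpdate(node, "container_1"));
    }
}
```

Whether the real method should skip the event, log it, or take a scheduler lock when the reservation has disappeared is a design decision for the actual patch; the sketch only shows how to avoid dereferencing a reference that may have been cleared.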