[
https://issues.apache.org/jira/browse/YARN-11900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Susheel Gupta reassigned YARN-11900:
------------------------------------
Assignee: Susheel Gupta
> NullPointerException in ZKConfigurationStore during RM startup when HA
> enabled and configuration store is ZK
> ------------------------------------------------------------------------------------------------------------
>
> Key: YARN-11900
> URL: https://issues.apache.org/jira/browse/YARN-11900
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.5.0
> Reporter: Susheel Gupta
> Assignee: Susheel Gupta
> Priority: Major
>
> RM startup fails intermittently when YARN RM HA is enabled and the scheduler
> configuration store is set to ZK.
> During RM restarts, one of the RMs occasionally fails to initialize the
> CapacityScheduler with the following exception:
> {code:java}
> 2025-11-18 16:50:23,398 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server quasar-tiwwno-3.vpc.cloudera.com/10.65.54.198:2182, session id = 0x30000621e760015, negotiated timeout = 60000
> 2025-11-18 16:50:23,399 INFO org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED
> 2025-11-18 16:50:23,487 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.YarnConfigurationStore: Loaded configuration store version info null
> 2025-11-18 16:50:23,487 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.YarnConfigurationStore: Storing configuration store version info 0.1
> 2025-11-18 16:50:23,541 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore: Exception while deserializing scheduler configuration from store
> java.lang.NullPointerException
>     at java.base/java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:108)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore.deserializeObject(ZKConfigurationStore.java:317)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.ZKConfigurationStore.retrieve(ZKConfigurationStore.java:214)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.MutableCSConfigurationProvider.init(MutableCSConfigurationProvider.java:83)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:295)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:403)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:875)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1293)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:334)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1580)
> 2025-11-18 16:50:23,549 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler failed in state INITED
> java.lang.NullPointerException
>     at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:842)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.MutableCSConfigurationProvider.loadConfiguration(MutableCSConfigurationProvider.java:102)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:296)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:403)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:875)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1293)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:334)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1580)
> {code}
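
The first trace fails inside the ByteArrayInputStream constructor, which is what happens when the byte array passed to it is null, so the bytes read back from the store were presumably null at that point. The sketch below illustrates one way to guard a deserialization step against that case. It is not the Hadoop code and not a proposed patch: the class NullSafeDeserializer and its methods are hypothetical names, it uses only JDK serialization, and treating null or empty bytes as "nothing stored yet" is an assumption about the desired behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Hadoop.
final class NullSafeDeserializer {
  private NullSafeDeserializer() {}

  // Serialize any Serializable value to a byte array.
  static byte[] serialize(Object value) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(value);
    }
    return buffer.toByteArray();
  }

  // Assumption: null or empty bytes mean "nothing stored yet", so return
  // an empty Optional instead of letting ByteArrayInputStream throw an NPE.
  static Optional<Object> deserialize(byte[] bytes)
      throws IOException, ClassNotFoundException {
    if (bytes == null || bytes.length == 0) {
      return Optional.empty();
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      return Optional.ofNullable(in.readObject());
    }
  }

  public static void main(String[] args) throws Exception {
    HashMap<String, String> conf = new HashMap<>();
    conf.put("yarn.scheduler.capacity.root.queues", "default");

    // Round trip of a populated map.
    System.out.println(deserialize(serialize(conf)).isPresent());
    // Missing data is reported as absent rather than as an exception.
    System.out.println(deserialize(null).isPresent());
  }
}
```

With a guard of this kind the caller has to decide explicitly what to do when nothing is stored (for example, fall back to the file-based configuration or retry), instead of passing a null configuration on to later initialization, which is the likely cause of the second NullPointerException in Configuration.<init>.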