[ https://issues.apache.org/jira/browse/YARN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632017#comment-14632017 ]
Ming Ma commented on YARN-3934:
-------------------------------

This is caused by the size of a single ASC (ApplicationSubmissionContext) object. You can reproduce it with the RM starting from an empty state, so it is different from YARN-2962.

> Application with large ApplicationSubmissionContext can cause RM to exit when ZK store is used
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-3934
>                 URL: https://issues.apache.org/jira/browse/YARN-3934
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Ming Ma
>
> Use the following steps to test.
> 1. Set up ZK as the RM HA store.
> 2. Submit a job that refers to lots of distributed cache files with long HDFS paths, which will cause the app state size to exceed ZK's max object size limit.
> 3. RM can't write to ZK and exits with the following exception.
> {noformat}
> 2015-07-10 22:21:13,002 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>         at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:935)
>         at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:944)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:941)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1083)
> {noformat}
> In this case, RM could have rejected the app during the submitApplication RPC if the size of ApplicationSubmissionContext is too large.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
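
To illustrate the rejection idea from the issue description, here is a minimal sketch of a size check that could run before the app state is written to the store. It is not the actual RM code: the class `SubmissionSizeGuard`, its method names, and the 64 KB headroom are hypothetical and chosen only for illustration. The limit constant assumes ZooKeeper's documented default for `jute.maxbuffer` (0xfffff bytes); a real deployment may configure a different value. In the RM, the byte count would presumably come from the serialized ApplicationSubmissionContext, which the sketch takes as a plain byte array.

```java
/** Hypothetical helper for illustration; not part of Hadoop YARN. */
final class SubmissionSizeGuard {
    // ZooKeeper's documented default for jute.maxbuffer (just under 1 MB).
    static final int DEFAULT_ZK_MAX_BYTES = 0xfffff;

    // Assumed headroom for the rest of the stored app state (not from the issue).
    static final int HEADROOM_BYTES = 64 * 1024;

    private SubmissionSizeGuard() {}

    /** Returns true if a payload of the given size leaves the assumed headroom. */
    static boolean fitsInStore(int serializedSize, int maxBytes) {
        return serializedSize >= 0 && serializedSize <= maxBytes - HEADROOM_BYTES;
    }

    /** Throws if the serialized context is too large for the store limit. */
    static void checkSize(byte[] serializedContext, int maxBytes) {
        if (!fitsInStore(serializedContext.length, maxBytes)) {
            throw new IllegalArgumentException(
                "Serialized submission context is " + serializedContext.length
                + " bytes, which exceeds the usable store limit of "
                + (maxBytes - HEADROOM_BYTES) + " bytes");
        }
    }

    public static void main(String[] args) {
        byte[] small = new byte[10 * 1024];        // stand-in for a typical context
        byte[] large = new byte[2 * 1024 * 1024];  // stand-in for an oversized context

        checkSize(small, DEFAULT_ZK_MAX_BYTES);
        System.out.println("small context accepted");

        try {
            checkSize(large, DEFAULT_ZK_MAX_BYTES);
        } catch (IllegalArgumentException e) {
            // The submission is rejected up front instead of failing the store write.
            System.out.println("large context rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, the oversized submission fails with an error to the client rather than surfacing later as a STATE_STORE_OP_FAILED fatal event in the RM.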