[
https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912544#comment-16912544
]
Prabhu Joseph commented on YARN-9755:
-------------------------------------
[~eyang] Thanks for the update. FileSystemBasedConfigurationProvider accepts
only the files listed below, and placing them in HDFS is optional. The server
first reads the configs from the local filesystem, then combines them with the
configs present in HDFS, with the HDFS values overriding the local ones. So
the user can decide which configs need to be maintained in HDFS.
This feature is mainly useful for capacity-scheduler.xml as it provides
consistency during RM failover.
{code}
resource-types.xml, dynamic-resources.xml, capacity-scheduler.xml,
hadoop-policy.xml, yarn-site.xml, core-site.xml
{code}
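The local-then-HDFS order described above can be sketched roughly as follows. This is only an illustration, not the ResourceManager's loading code: the class and method names are hypothetical, and it assumes the override is applied per property rather than per file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the ResourceManager's actual loading code.
class ConfigOverlaySketch {

    // Start from the local values, then let any value present in the
    // remote (HDFS) copy override the local one. Assumes per-property override.
    static Map<String, String> overlay(Map<String, String> local,
                                       Map<String, String> remote) {
        Map<String, String> merged = new LinkedHashMap<>(local);
        merged.putAll(remote);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> local = new LinkedHashMap<>();
        local.put("yarn.scheduler.capacity.root.queues", "default");
        local.put("yarn.scheduler.capacity.root.default.capacity", "100");

        // Only the keys kept in HDFS are overridden; the rest stay local.
        Map<String, String> remote = new LinkedHashMap<>();
        remote.put("yarn.scheduler.capacity.root.queues", "default,batch");

        System.out.println(overlay(local, remote));
    }
}
```

Keys that exist only locally survive the merge, which is why only the configs that need to stay consistent across failover have to be kept in HDFS.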
Let me know if I need to reattach patch 003.
> RM fails to start with FileSystemBasedConfigurationProvider
> -----------------------------------------------------------
>
> Key: YARN-9755
> URL: https://issues.apache.org/jira/browse/YARN-9755
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9755-001.patch, YARN-9755-002.patch,
> YARN-9755-003.patch, YARN-9755-004.patch
>
>
> RM fails to start with the exception below when
> FileSystemBasedConfigurationProvider is used.
> *Exception:*
> {code}
> 2019-08-16 12:05:33,802 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: java.io.IOException: Filesystem closed
> at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
> at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567)
> Caused by: java.io.IOException: java.io.IOException: Filesystem closed
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> ... 14 more
> Caused by: java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682)
> at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
> at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701)
> at org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56)
> {code}
> FileSystemBasedConfigurationProvider uses the cached FileSystem instance,
> which causes the issue.
> *Configs:*
> {code}
> <property><name>yarn.resourcemanager.configuration.provider-class</name><value>org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider</value></property>
> <property><name>yarn.resourcemanager.configuration.file-system-based-store</name><value>/yarn/conf</value></property>
> [yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
> -rw-r--r-- 3 yarn supergroup 4138 2019-08-16 13:09 /yarn/conf/capacity-scheduler.xml
> -rw-r--r-- 3 yarn supergroup 494 2019-08-16 11:41 /yarn/conf/core-site.xml
> -rw-r--r-- 3 yarn supergroup 11392 2019-08-16 11:52 /yarn/conf/hadoop-policy.xml
> -rw-r--r-- 3 yarn supergroup 11492 2019-08-16 11:41 /yarn/conf/yarn-site.xml
> {code}
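The "Filesystem closed" failure attributed to the cached FileSystem in the description can be modelled with a small stand-in. Hadoop's FileSystem.get(...) hands out a shared, cached instance by default, so a close() by one holder affects every other holder, while FileSystem.newInstance(...) returns a separate instance. The sketch below does not use Hadoop classes: Handle, getCached and newInstance are hypothetical stand-ins, and the "hdfs://nn" key is made up. Whether the attached patches take this approach is not stated here.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Stand-in classes for illustration; these are NOT Hadoop's FileSystem classes.
class CachedHandleSketch {

    static class Handle {
        private boolean open = true;

        void close() {
            open = false;
        }

        boolean exists(String path) throws IOException {
            if (!open) {
                throw new IOException("Filesystem closed");
            }
            return true; // pretend every path exists
        }
    }

    private static final Map<String, Handle> CACHE = new HashMap<>();

    // Returns one shared handle per key, like a process-wide cache.
    static Handle getCached(String key) {
        return CACHE.computeIfAbsent(key, k -> new Handle());
    }

    // Returns a private handle that no other caller holds.
    static Handle newInstance() {
        return new Handle();
    }

    public static void main(String[] args) throws IOException {
        Handle provider = getCached("hdfs://nn"); // hypothetical key
        Handle other = getCached("hdfs://nn");    // same object as provider
        other.close(); // another component closes the shared handle

        try {
            provider.exists("/yarn/conf/capacity-scheduler.xml");
        } catch (IOException e) {
            System.out.println("shared handle failed: " + e.getMessage());
        }

        // A private handle is unaffected by the close() above.
        Handle privateHandle = newInstance();
        System.out.println("private handle ok: "
                + privateHandle.exists("/yarn/conf/capacity-scheduler.xml"));
    }
}
```

The trade-off with a private instance is that its owner becomes responsible for closing it.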