[ 
https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912200#comment-16912200
 ] 

Prabhu Joseph commented on YARN-9755:
-------------------------------------

[~eyang] The RM server creates a {{Configuration}} object that reads the proxy 
ACL list from the local core-site.xml ({{hadoop.proxyuser.yarn.groups}}), which 
is overridden by the HDFS core-site.xml. These proxy settings are in turn 
overridden by the local yarn-site.xml 
({{yarn.resourcemanager.proxyuser.yarn.groups}}), which is overridden by the 
HDFS yarn-site.xml.

The override order is:

Local core-site.xml ({{hadoop.proxyuser.yarn.groups}}) -> HDFS core-site.xml -> 
Local yarn-site.xml ({{yarn.resourcemanager.proxyuser.yarn.groups}}) -> HDFS 
yarn-site.xml
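
To make the chain above concrete, here is a minimal sketch that models the four 
files as plain Python values rather than using the Hadoop {{Configuration}} API. 
The layer values are hypothetical, and {{None}} stands for a file that does not 
set the proxy groups property. It only illustrates that the last file in the 
chain that sets the property decides the final value.

```python
# Hypothetical layer contents, listed in the override order described above.
# None means the file does not set the proxy groups property at all.
LAYERS = [
    ("local core-site.xml", "hadoop"),
    ("HDFS core-site.xml", None),
    ("local yarn-site.xml", "hadoop,hbase"),
    ("HDFS yarn-site.xml", "users"),
]


def effective_value(layers):
    """Return (source, value) for the last layer that sets the property."""
    source, value = None, None
    for name, layer_value in layers:
        if layer_value is not None:
            # A later layer that sets the property replaces the earlier value.
            source, value = name, layer_value
    return source, value


if __name__ == "__main__":
    print(effective_value(LAYERS))
```

With these example values the HDFS yarn-site.xml wins, even though the local 
yarn-site.xml lists a group for hbase.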

The above issue happens if the final value of {{hadoop.proxyuser.yarn.groups}} 
after all overrides does not allow the hbase user. Users can maintain the proxy 
ACL list in any of the four files above, which is error-prone if each one has a 
different value.

Can you check whether the final value of {{hadoop.proxyuser.yarn.groups}} after 
all overrides contains a list that allows the hbase user?
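
As a rough illustration of that check, the sketch below tests a final 
comma-separated groups value against a user's groups. It is a simplified model, 
not the Hadoop implementation: the function name {{groups_allow}} and the group 
names are hypothetical, and it assumes only that {{*}} acts as a wildcard and 
that an unset value allows nobody.

```python
def groups_allow(acl_value, user_groups):
    """Return True if the groups ACL value admits any of the user's groups.

    acl_value is the final comma-separated value of the proxy groups
    property, or None if it is unset. user_groups are the groups the user
    being checked belongs to (hypothetical values in this sketch).
    """
    if acl_value is None:
        return False
    allowed = {g.strip() for g in acl_value.split(",") if g.strip()}
    # "*" is treated as a wildcard that admits every group.
    return "*" in allowed or bool(allowed & set(user_groups))


if __name__ == "__main__":
    # Assumes the hbase user belongs to a group named "hbase".
    print(groups_allow("hadoop,hbase", ["hbase"]))
    print(groups_allow("users", ["hbase"]))
```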

I have attached [^YARN-9755-004.patch], which reads the HDFS yarn-site.xml 
before the proxy user refresh. Irrespective of patch 4, the above scenario 
should work fine.

> RM fails to start with FileSystemBasedConfigurationProvider
> -----------------------------------------------------------
>
>                 Key: YARN-9755
>                 URL: https://issues.apache.org/jira/browse/YARN-9755
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.3.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: YARN-9755-001.patch, YARN-9755-002.patch, 
> YARN-9755-003.patch, YARN-9755-004.patch
>
>
> RM fails to start with the exception below when 
> FileSystemBasedConfigurationProvider is used.
> *Exception:*
> {code}
> 2019-08-16 12:05:33,802 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
> ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
> java.io.IOException: Filesystem closed
>         at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379)
>         at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567)
> Caused by: java.io.IOException: java.io.IOException: Filesystem closed
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>         ... 14 more
> Caused by: java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475)
>         at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701)
>         at 
> org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56)
> {code}
> FileSystemBasedConfigurationProvider uses the cached FileSystem, which causes 
> the issue.
> *Configs:*
> {code}
> <property><name>yarn.resourcemanager.configuration.provider-class</name><value>org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider</value></property>
> <property><name>yarn.resourcemanager.configuration.file-system-based-store</name><value>/yarn/conf</value></property>
> [yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
> -rw-r--r--   3 yarn supergroup       4138 2019-08-16 13:09 
> /yarn/conf/capacity-scheduler.xml
> -rw-r--r--   3 yarn supergroup        494 2019-08-16 11:41 
> /yarn/conf/core-site.xml
> -rw-r--r--   3 yarn supergroup      11392 2019-08-16 11:52 
> /yarn/conf/hadoop-policy.xml
> -rw-r--r--   3 yarn supergroup      11492 2019-08-16 11:41 
> /yarn/conf/yarn-site.xml
> {code}



