[
https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724356#comment-15724356
]
Jian He commented on YARN-5734:
-------------------------------
[~mshen], [~jhung], [~zhz], very useful feature! Thanks for the contribution.
Some questions I had about the design:
- Does add/remove also support a fully qualified queue name, not just a
hierarchical structure? I think supporting a single fully qualified queue name
would be handy, especially for CLI add/remove.
- IIUC, the xml file will still be used for initialization on startup, even if
the API-based approach is enabled? Then, if the RM gets restarted, will the RM
honor the xml file or the config store for initialization? I feel both
scenarios may be possible:
-- If it is a crash-and-restart, we should probably honor the config
store.
-- If the RM is going through a rolling upgrade, the user may need to provide
a new queue structure for initialization; the xml file will then conflict with
what's in the config store.
- Is the implementation such that the caller blocks until the update is
completed, both in the store and in memory?
- IIUC, the EmbededDerbyDatabase is suitable for a single RM only. Do you run
RM HA in your cluster? Also, I guess Derby does not support fencing? If so, we
could potentially have two RMs writing at the same time in a split-brain
situation, causing data inconsistency. Therefore, I think ZKRMStateStore might
be a better default store option, especially because of RM HA.
- Regarding PluggableConfigurationPolicy for authorization, has the
implementation considered using YarnAuthorizationProvider?
YarnAuthorizationProvider is an interface that can be implemented by other
authorization plugins (e.g. Apache Ranger). Ranger has a nice web portal where
one can define arbitrary authorization policies, such as restricting certain
users/groups from doing certain operations. It would be useful if it did, as
the Ranger plugin would just need to implement the necessary interface to get
config authorization for free.
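
To make the first question above concrete, here is a minimal sketch of how a
single fully qualified queue name could be expanded into the hierarchical form.
The function name expand_queue_path and the nested-dict layout are hypothetical
and not taken from the design doc; the sketch only assumes that queue paths are
dot-separated and rooted at "root", as in CapacityScheduler configuration keys.

```python
def expand_queue_path(full_name):
    """Expand a dot-separated queue path into a nested dict.

    Hypothetical helper for illustration only; not part of YARN.
    """
    parts = full_name.split(".")
    # Reject empty components such as "root..a" or a trailing dot.
    if any(not part for part in parts):
        raise ValueError(f"invalid queue name: {full_name!r}")
    # Assumption: every fully qualified queue name starts at the root queue.
    if parts[0] != "root":
        raise ValueError("fully qualified queue names must start with 'root'")
    tree = {}
    node = tree
    for part in parts:
        # Create one nesting level per path component.
        node[part] = {}
        node = node[part]
    return tree
```

With something like this, a CLI add/remove could accept a name such as
root.eng.hadoop and hand the resulting structure to the same code path that
already handles the hierarchical form.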
> OrgQueue for easy CapacityScheduler queue configuration management
> ------------------------------------------------------------------
>
> Key: YARN-5734
> URL: https://issues.apache.org/jira/browse/YARN-5734
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Min Shen
> Assignee: Min Shen
> Attachments: OrgQueue_API-Based_Config_Management_v1.pdf,
> OrgQueue_Design_v0.pdf
>
>
> The current xml based configuration mechanism in CapacityScheduler makes it
> very inconvenient to apply any changes to the queue configurations. We saw 2
> main drawbacks in the file based configuration mechanism:
> # This makes it very inconvenient to automate queue configuration updates.
> For example, in our cluster setup, we leverage the queue mapping feature from
> YARN-2411 to route users to their dedicated organization queues. It could be
> extremely cumbersome to keep updating the config file to manage the very
> dynamic mapping between users and organizations.
> # Even if a user has admin permission on a specific queue, that user is
> unable to make any queue configuration changes, such as resizing the
> subqueues, changing queue ACLs, or creating new queues. All these operations
> need to be performed in a centralized manner by the cluster administrators.
> With these current limitations, we realized the need for a more flexible
> configuration mechanism that allows queue configurations to be stored and
> managed more dynamically. We developed the feature internally at LinkedIn
> which introduces the concept of MutableConfigurationProvider. What it
> essentially does is provide a set of configuration mutation APIs that
> allow queue configurations to be updated externally with a set of REST APIs.
> When performing the queue configuration changes, the queue ACLs will be
> honored, which means only queue administrators can make configuration changes
> to a given queue. MutableConfigurationProvider is implemented as a pluggable
> interface, and we have one implementation of this interface which is based on
> Derby embedded database.
> This feature has been deployed on LinkedIn's Hadoop cluster for a year now,
> and has gone through several iterations of gathering feedback from users
> and improving accordingly. With this feature, cluster administrators are able
> to automate many of the queue configuration management tasks, such as setting
> the queue capacities to adjust cluster resources between queues based on
> established resource consumption patterns, or updating the user to
> queue mappings. We have attached our design documentation to this ticket
> and would like to receive feedback from the community regarding how to best
> integrate it with the latest version of YARN.
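
As an illustration of the ACL-honoring mutation described in the issue text
above, here is a rough sketch under stated assumptions. QueueConfigStore and
its methods are hypothetical names, not the MutableConfigurationProvider
interface from the design doc, and the store is a plain in-memory dict rather
than Derby. It also assumes that an admin of a parent queue counts as an admin
of its subqueues.

```python
class QueueConfigStore:
    """Hypothetical in-memory stand-in for a pluggable configuration store."""

    def __init__(self):
        self._conf = {}    # queue path -> {property: value}
        self._admins = {}  # queue path -> set of admin user names

    def set_admins(self, queue, users):
        self._admins[queue] = set(users)

    def is_admin(self, user, queue):
        # Walk from the queue up to the root. Assumption: an admin of any
        # ancestor queue is treated as an admin of the queue itself.
        parts = queue.split(".")
        while parts:
            if user in self._admins.get(".".join(parts), set()):
                return True
            parts.pop()
        return False

    def update(self, user, queue, changes):
        # Reject the mutation unless the caller administers the queue.
        if not self.is_admin(user, queue):
            raise PermissionError(f"{user} is not an admin of {queue}")
        self._conf.setdefault(queue, {}).update(changes)
        # Return a copy so callers cannot mutate the stored configuration.
        return dict(self._conf[queue])
```

For example, after store.set_admins("root.eng", {"alice"}), a call such as
store.update("alice", "root.eng.hadoop", {"capacity": "40"}) is accepted, while
the same call from a user who administers no ancestor queue raises
PermissionError.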