[ 
https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15727137#comment-15727137
 ] 

Jonathan Hung commented on YARN-5734:
-------------------------------------

Hi [~jianhe], thanks for the feedback.
bq. Does add/remove also support a fully qualified queue name, not just a 
hierarchical structure? I think supporting a single fully qualified queue name 
would be handy, especially for CLI add/remove
Sure, I think it makes sense to support both.
bq. User may need to provide a new queue structure for initialization, then, 
the xml file will conflict with what's in config store.
I don't think I understand this part, can you explain why the user needs to 
provide a new queue structure?
Initialization will be done from the xml file even if the API-based approach is 
enabled. Then on crash/restart the config store will be honored. Basically, 
once the store is initialized, it will be used as the source of truth (and the 
xml is no longer useful).
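As a rough sketch of that startup behavior (not the actual patch; ConfStore and 
loadOnStartup are hypothetical names, and the store here is just an in-memory 
stand-in for the real persistent store):

```java
import java.util.HashMap;
import java.util.Map;

public class ConfBootstrapSketch {

    /** Hypothetical in-memory stand-in for the persistent config store. */
    static class ConfStore {
        private Map<String, String> conf; // null until the store is initialized

        boolean isInitialized() {
            return conf != null;
        }

        void initialize(Map<String, String> seed) {
            conf = new HashMap<>(seed); // copy, so later xml edits do not leak in
        }

        Map<String, String> load() {
            return new HashMap<>(conf);
        }
    }

    /** Returns the configuration the scheduler should start with. */
    static Map<String, String> loadOnStartup(ConfStore store,
                                             Map<String, String> xmlConf) {
        if (!store.isInitialized()) {
            // First start: seed the store from the xml file.
            store.initialize(xmlConf);
        }
        // Every start, including after a crash: the store is the source of truth.
        return store.load();
    }

    public static void main(String[] args) {
        ConfStore store = new ConfStore();
        Map<String, String> xml = new HashMap<>();
        xml.put("yarn.scheduler.capacity.root.queues", "a,b");
        System.out.println(loadOnStartup(store, xml));

        // A later edit to the xml is ignored once the store has been seeded.
        xml.put("yarn.scheduler.capacity.root.queues", "a,b,c");
        System.out.println(loadOnStartup(store, xml));
    }
}
```

With this shape, an xml file that is edited after the first start cannot 
conflict with the store, because it is never read again.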
bq. Is the implementation that the caller will block until the update is 
completed, both in store and memory?
Yes, the plan is to block until the update is completed for both. This is to 
prevent the scenario where the client sends a configuration change, an event is 
queued, and the call returns, then RM crashes, at which point the configuration 
change is lost.
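A minimal sketch of the blocking behavior, assuming a synchronous call that 
persists first and only then updates memory (ConfStore, Scheduler and 
updateConfiguration are hypothetical names, not the actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class BlockingConfigUpdateSketch {

    /** Hypothetical stand-in for the persistent config store. */
    interface ConfStore {
        void persist(Map<String, String> changes);
    }

    /** In-memory store used only to make the sketch runnable. */
    static class InMemoryStore implements ConfStore {
        final Map<String, String> persisted = new HashMap<>();

        @Override
        public void persist(Map<String, String> changes) {
            persisted.putAll(changes);
        }
    }

    static class Scheduler {
        private final ConfStore store;
        private final Map<String, String> liveConf = new HashMap<>();

        Scheduler(ConfStore store) {
            this.store = store;
        }

        /**
         * Returns only after the change is in the store AND in memory.
         * If persisting fails, the exception propagates to the caller and
         * the in-memory configuration is left untouched.
         */
        synchronized void updateConfiguration(Map<String, String> changes) {
            store.persist(changes);    // 1. make the change durable
            liveConf.putAll(changes);  // 2. then apply it in memory
        }

        synchronized String get(String key) {
            return liveConf.get(key);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        Scheduler scheduler = new Scheduler(store);
        Map<String, String> changes = new HashMap<>();
        changes.put("yarn.scheduler.capacity.root.a.capacity", "40");
        scheduler.updateConfiguration(changes); // blocks until both steps finish
        System.out.println(scheduler.get("yarn.scheduler.capacity.root.a.capacity"));
    }
}
```

Because nothing is queued, a successful return tells the client the change 
has already been written to the store.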
bq. IIUC, the EmbededDerbyDatabase is suitable for single RM only. Do you run 
RM HA in your cluster? Also, I guess Derby does not support fencing? If so, we 
could potentially have two RMs writing together in a split-brain situation and 
cause data inconsistency. Therefore, I think ZKRMStateStore might be a better 
store option by default, especially because of RM HA.
Currently we are not running RM HA. We have Derby as the default because we 
currently run it in production, so we know it works well for single-RM 
clusters. We don't have a working implementation that supports RM HA.
bq. Regarding PluggableConfigurationPolicy for authorization, has the 
implementation considered using YarnAuthorizationProvider?
Took a look at this. I have a couple comments about it, let me know if it's not 
what you had in mind.
* If I understand correctly, YarnAuthorizationProvider currently only supports 
authorization based on queue ACLs (submit/administer queue). We would need to 
extend the implementation to support things like fine-grained ACLs (e.g. ACLs 
by configuration key). In this case we would just extend 
YarnAuthorizationProvider with something like 
"SchedulerConfigurationAuthorizationProvider". If this is true, then each 
component using an authorization provider would need to configure its own 
implementation, since the SchedulerConfigurationAuthorizationProvider does not 
apply to all components (and it seems all components currently use the same 
provider, determined by yarn.authorization-provider).
* We will probably still need the new pluggable configuration policy, at least 
for configuration change validation to make sure the proposed configuration 
changes make sense.
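
To make the two points above concrete, here is a hedged sketch of a policy 
that combines ACLs by configuration key with change validation. It does not 
use or extend YarnAuthorizationProvider; every class and method name below is 
hypothetical, and the capacity range check is only one example of a 
validation rule.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ConfigurationPolicySketch {

    // Hypothetical ACL table: configuration key prefix -> users allowed to edit.
    private final Map<String, Set<String>> adminsByKeyPrefix = new HashMap<>();

    void allow(String keyPrefix, String user) {
        adminsByKeyPrefix.computeIfAbsent(keyPrefix, k -> new HashSet<>()).add(user);
    }

    /** Authorization: the user must be listed under a prefix of the key. */
    boolean isAuthorized(String user, String key) {
        for (Map.Entry<String, Set<String>> e : adminsByKeyPrefix.entrySet()) {
            if (key.startsWith(e.getKey()) && e.getValue().contains(user)) {
                return true;
            }
        }
        return false;
    }

    /** Validation: as an example, a ".capacity" value must be in [0, 100]. */
    boolean isValid(String key, String value) {
        if (!key.endsWith(".capacity")) {
            return true; // no rule for other keys in this sketch
        }
        try {
            float capacity = Float.parseFloat(value);
            return capacity >= 0f && capacity <= 100f;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** A change set is accepted only if every entry passes both checks. */
    boolean accept(String user, Map<String, String> changes) {
        for (Map.Entry<String, String> change : changes.entrySet()) {
            if (!isAuthorized(user, change.getKey())
                || !isValid(change.getKey(), change.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ConfigurationPolicySketch policy = new ConfigurationPolicySketch();
        // The trailing dot keeps "root.a." from also matching "root.ab".
        policy.allow("yarn.scheduler.capacity.root.a.", "alice");

        Map<String, String> changes = new HashMap<>();
        changes.put("yarn.scheduler.capacity.root.a.capacity", "40");
        System.out.println(policy.accept("alice", changes));
        System.out.println(policy.accept("bob", changes));
    }
}
```

The validation half is needed regardless of which authorization mechanism is 
chosen, which is why the sketch keeps the two checks as separate methods.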

> OrgQueue for easy CapacityScheduler queue configuration management
> ------------------------------------------------------------------
>
>                 Key: YARN-5734
>                 URL: https://issues.apache.org/jira/browse/YARN-5734
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Min Shen
>            Assignee: Min Shen
>         Attachments: OrgQueue_API-Based_Config_Management_v1.pdf, 
> OrgQueue_Design_v0.pdf
>
>
> The current xml-based configuration mechanism in CapacityScheduler makes it 
> very inconvenient to apply any changes to the queue configurations. We saw 2 
> main drawbacks in the file-based configuration mechanism:
> # It is very inconvenient to automate queue configuration updates. 
> For example, in our cluster setup, we leverage the queue mapping feature from 
> YARN-2411 to route users to their dedicated organization queues. It could be 
> extremely cumbersome to keep updating the config file to manage the very 
> dynamic mapping between users and organizations.
> # Even if a user has the admin permission on one specific queue, that user is 
> unable to make any queue configuration changes, such as resizing the 
> subqueues, changing queue ACLs, or creating new queues. All these operations 
> need to be performed in a centralized manner by the cluster administrators.
> With these current limitations, we realized the need for a more flexible 
> configuration mechanism that allows queue configurations to be stored and 
> managed more dynamically. We developed a feature internally at LinkedIn 
> that introduces the concept of MutableConfigurationProvider. It provides a 
> set of configuration mutation APIs that allow queue configurations to be 
> updated externally through a set of REST APIs. When queue configuration 
> changes are performed, the queue ACLs are honored, which means only queue 
> administrators can make configuration changes to a given queue. 
> MutableConfigurationProvider is implemented as a pluggable interface, and we 
> have one implementation of this interface, which is based on the Derby 
> embedded database.
> This feature has been deployed on LinkedIn's Hadoop cluster for a year now 
> and has gone through several iterations of gathering feedback from users 
> and improving accordingly. With this feature, cluster administrators are able 
> to automate many of the queue configuration management tasks, such as setting 
> the queue capacities to adjust cluster resources between queues based on 
> established resource consumption patterns, or updating the user-to-queue 
> mappings. We have attached our design documentation to this ticket and would 
> like to receive feedback from the community on how best to integrate it with 
> the latest version of YARN.


