[ 
https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832692#comment-15832692
 ] 

Jonathan Hung commented on YARN-5734:
-------------------------------------

Uploaded an initial patch containing some basic end-to-end functionality.
Here are the yarn-site.xml configurations needed to get this working:
* {{yarn.scheduler.capacity.config.path}} should be set to a directory inside 
which the database will be stored (the resource manager user must be able to 
create subdirectories there)
* {{yarn.scheduler.mutable-queue-config.enabled}} should be {{true}}
* {{yarn.resourcemanager.configuration.provider-class}} should be set to 
{{org.apache.hadoop.yarn.server.resourcemanager.conf.MutableConfigurationManager}}
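
As a rough sketch, the three settings above might look like this in 
yarn-site.xml. The property names and the provider class are the ones listed 
above; the directory {{/var/lib/hadoop-yarn/sched-conf}} is only a 
hypothetical placeholder and should be replaced with a path the resource 
manager user can write to:
{noformat}
<!-- Directory for the configuration database (hypothetical example path) -->
<property>
  <name>yarn.scheduler.capacity.config.path</name>
  <value>/var/lib/hadoop-yarn/sched-conf</value>
</property>
<!-- Enable mutable queue configuration -->
<property>
  <name>yarn.scheduler.mutable-queue-config.enabled</name>
  <value>true</value>
</property>
<!-- Use the mutable configuration manager as the configuration provider -->
<property>
  <name>yarn.resourcemanager.configuration.provider-class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.conf.MutableConfigurationManager</value>
</property>
{noformat}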

Here are some working examples which can be run in series, assuming a starting 
configuration of two queues, {{root.default}} (with 100 capacity) and 
{{root.test}} (with 0 capacity):
{noformat}curl -X PUT -H 'Content-Type: application/xml' -d '<schedConf>
  <update>
    <name>root.test</name>
    <params>
      <entry>
        <key>state</key>
        <value>STOPPED</value>
      </entry>
      <entry>
        <key>maximum-applications</key>
        <value>33</value>
      </entry>
    </params>
  </update>
</schedConf>' --negotiate -u : \
  "http://<rmHost>:8088/ws/v1/cluster/conf/scheduler/mutate"{noformat}
Sets the {{root.test}} queue's state to STOPPED and its maximum-applications to 
33.

{noformat}curl -X PUT -H 'Content-Type: application/xml' -d '<schedConf>
  <remove>
    <name>root.test</name>
  </remove>
</schedConf>' --negotiate -u : \
  "http://<rmHost>:8088/ws/v1/cluster/conf/scheduler/mutate"{noformat}
Removes the {{root.test}} queue, which is possible because it is now STOPPED 
(this leverages YARN-5556).

{noformat}curl -X PUT -H 'Content-Type: application/xml' -d '<schedConf>
  <add>
    <name>root.test2</name>
    <params>
      <entry>
        <key>maximum-applications</key>
        <value>34</value>
      </entry>
    </params>
  </add>
</schedConf>' --negotiate -u : \
  "http://<rmHost>:8088/ws/v1/cluster/conf/scheduler/mutate"{noformat}
Adds a {{root.test2}} queue. Also sets its maximum-applications to 34.

This is just a first version, so there are some details that are not yet 
implemented/tested (e.g. specifying a hierarchical conf update). [~xgong] and 
[~wangda], do you mind taking a look to make sure our ideas/interfaces are in 
alignment?

> OrgQueue for easy CapacityScheduler queue configuration management
> ------------------------------------------------------------------
>
>                 Key: YARN-5734
>                 URL: https://issues.apache.org/jira/browse/YARN-5734
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Min Shen
>            Assignee: Min Shen
>         Attachments: OrgQueue_API-Based_Config_Management_v1.pdf, 
> OrgQueueAPI-BasedSchedulerConfigurationManagement_v2.pdf, 
> OrgQueue_Design_v0.pdf, YARN-5734-YARN-5734.001.patch
>
>
> The current XML-based configuration mechanism in CapacityScheduler makes it 
> very inconvenient to apply any changes to the queue configurations. We saw 2 
> main drawbacks in the file-based configuration mechanism:
> # It is very inconvenient to automate queue configuration updates. 
> For example, in our cluster setup, we leverage the queue mapping feature from 
> YARN-2411 to route users to their dedicated organization queues. It is 
> extremely cumbersome to keep updating the config file to manage the very 
> dynamic mapping between users and organizations.
> # Even if a user has admin permission on a specific queue, that user is 
> unable to make any queue configuration changes, such as resizing the 
> subqueues, changing queue ACLs, or creating new queues. All these operations 
> need to be performed in a centralized manner by the cluster administrators.
> Given these limitations, we saw the need for a more flexible 
> configuration mechanism that allows queue configurations to be stored and 
> managed more dynamically. We developed a feature internally at LinkedIn 
> which introduces the concept of MutableConfigurationProvider. It provides 
> a set of configuration mutation APIs that allow queue configurations to be 
> updated externally through a set of REST APIs. 
> When queue configuration changes are performed, the queue ACLs are 
> honored, which means only queue administrators can make configuration changes 
> to a given queue. MutableConfigurationProvider is implemented as a pluggable 
> interface, and we have one implementation of this interface which is based on 
> the Derby embedded database.
> This feature has been deployed on LinkedIn's Hadoop cluster for a year now, 
> and has gone through several iterations of gathering feedback from users 
> and improving accordingly. With this feature, cluster administrators are able 
> to automate many of the queue configuration management tasks, such as setting 
> the queue capacities to adjust cluster resources between queues based on 
> established resource consumption patterns, or updating the user-to-queue 
> mappings. We have attached our design documentation to this ticket 
> and would like to receive feedback from the community on how best to 
> integrate it with the latest version of YARN.


