[ 
https://issues.apache.org/jira/browse/YARN-7004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257454#comment-16257454
 ] 

Wangda Tan commented on YARN-7004:
----------------------------------

[~Tao Yang], thanks for filing this ticket. I haven't reviewed the 
implementation in much detail.

There are two parallel efforts you might be interested in: 
1) YARN-5734: this JIRA stores queue config in a backing store instead of 
capacity-scheduler.xml. I'm not sure whether this can solve your problem. cc: 
[~jhung].
2) It might be bad to maintain all queues inside capacity-scheduler.xml. I 
assume some queues (like per-user queues) can be auto-created and managed by 
policies. We're working on YARN-7117, targeted for the 3.1.0 release (Feb 2018). 
Part of the code is already merged to trunk. We'd appreciate it if you could 
share your use cases and feedback on the feature. cc: [~suma.shivaprasad]

> Add configs cache to optimize refreshQueues performance for large scale of 
> queues
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-7004
>                 URL: https://issues.apache.org/jira/browse/YARN-7004
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>    Affects Versions: 2.9.0, 3.0.0-alpha4
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>         Attachments: YARN-7004.001.patch
>
>
> We need to support a large number of queues in our production environment to 
> serve many projects. In tests with more than 5000 queues, we found that the 
> refreshQueues process took more than 1 minute. It spends most of that time 
> iterating over all configurations to get the accessible-node-labels and 
> ordering-policy configs for every queue.  
> Loading queue configs from a cache should reduce this cost (from 1 minute to 
> 3 seconds for 5000 queues in our test) when initializing or reinitializing 
> queues. So I propose loading queue configs into a cache in 
> CapacityScheduler#initializeQueues and CapacityScheduler#reinitializeQueues. 
> If the cache has not been loaded in other scenarios, such as test cases, 
> queue configs can still be obtained by iterating over all configurations.
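
To illustrate the idea in the description above, here is a minimal,
self-contained sketch. It is not the content of YARN-7004.001.patch. It only
contrasts a per-queue scan over all properties with a cache built in one pass.
The class and method names are hypothetical, and the key layout
(yarn.scheduler.capacity.<queue-path>.accessible-node-labels.<label>.<property>)
is a simplifying assumption for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the patch.
class QueueConfigCacheSketch {
    static final String PREFIX = "yarn.scheduler.capacity.";
    static final String LABELS = ".accessible-node-labels.";

    // Uncached path: one full scan of all properties per queue, so
    // refreshing Q queues over N properties costs roughly Q * N.
    static Set<String> labelsByScan(Map<String, String> conf, String queue) {
        String keyPrefix = PREFIX + queue + LABELS;
        Set<String> labels = new HashSet<>();
        for (String key : conf.keySet()) {
            if (key.startsWith(keyPrefix)) {
                String rest = key.substring(keyPrefix.length());
                int dot = rest.indexOf('.');
                if (dot > 0) {
                    labels.add(rest.substring(0, dot));
                }
            }
        }
        return labels;
    }

    // Cached path: a single pass over all properties groups the
    // configured labels by queue path.
    static Map<String, Set<String>> buildLabelCache(Map<String, String> conf) {
        Map<String, Set<String>> cache = new HashMap<>();
        for (String key : conf.keySet()) {
            if (!key.startsWith(PREFIX)) {
                continue;
            }
            int marker = key.indexOf(LABELS, PREFIX.length());
            if (marker < 0) {
                continue;
            }
            String queue = key.substring(PREFIX.length(), marker);
            String rest = key.substring(marker + LABELS.length());
            int dot = rest.indexOf('.');
            if (dot > 0) {
                cache.computeIfAbsent(queue, q -> new HashSet<>())
                     .add(rest.substring(0, dot));
            }
        }
        return cache;
    }

    // Use the cache when it has been loaded; otherwise fall back to
    // scanning, as the description proposes for cases such as tests.
    static Set<String> labels(Map<String, Set<String>> cache,
                              Map<String, String> conf, String queue) {
        if (cache == null) {
            return labelsByScan(conf, queue);
        }
        return cache.getOrDefault(queue, Collections.<String>emptySet());
    }
}
```

In this sketch the cache would be built once at the start of an initialize or
reinitialize pass and dropped afterwards, so a later lookup with a null cache
falls back to the scan. The same grouping could be applied to ordering-policy
keys.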



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

