[ 
https://issues.apache.org/jira/browse/YARN-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17367259#comment-17367259
 ] 

Peter Bacsko commented on YARN-10780:
-------------------------------------

[~gandras] looks good, could you take care of the checkstyle problems?

> Optimise retrieval of configured node labels in CS queues
> ---------------------------------------------------------
>
>                 Key: YARN-10780
>                 URL: https://issues.apache.org/jira/browse/YARN-10780
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Andras Gyori
>            Assignee: Andras Gyori
>            Priority: Major
>         Attachments: YARN-10780.001.patch, YARN-10780.002.patch, 
> YARN-10780.003.patch, YARN-10780.004.patch
>
>
> CapacitySchedulerConfiguration#getConfiguredNodeLabels scales poorly with 
> the number of queues (a single call is O(n*m), where n is the number of 
> queues and m is the number of properties set by each queue). During CS 
> reinit the node labels are queried often, yet the lookup is implemented as 
> follows:
> {code:java}
> for (Entry<String, String> stringStringEntry : this) {
>       e = stringStringEntry;
>       String key = e.getKey();
>       if (key.startsWith(getQueuePrefix(queuePath) + ACCESSIBLE_NODE_LABELS
>           + DOT)) {
>         // Find <label-name> in
>         // <queue-path>.accessible-node-labels.<label-name>.property
>         int labelStartIdx =
>             key.indexOf(ACCESSIBLE_NODE_LABELS)
>                 + ACCESSIBLE_NODE_LABELS.length() + 1;
>         int labelEndIndx = key.indexOf('.', labelStartIdx);
>         String labelName = key.substring(labelStartIdx, labelEndIndx);
>         configuredNodeLabels.add(labelName);
>       }
>     }
> {code}
>  This method iterates through ALL properties set in the configuration. For 
> example, when initialising 2500 queues, each having at least 2 properties 
> (at least 5000 queue properties in total):
> 2500 * 5000 = 12.5 million iterations, plus any additional properties
> There are some ways to resolve this issue while keeping backward 
> compatibility:
>  # Create a property like the original accessible-node-labels, which contains 
> predefined labels. If it is set, getConfiguredNodeLabels returns the value 
> of this property; otherwise it falls back to the old logic. I think 
> accessible-node-labels is not used for this purpose (though I have a feeling 
> that it should have been).
>  # Collect node labels for all queues at the beginning of parseQueue and only 
> iterate through the properties once. This increases the space complexity 
> in exchange for requiring no intervention from the user.
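
The second option can be sketched roughly as below. This is only an illustration of the single-pass idea, not the code from the attached patches: the class NodeLabelIndex and the method collectLabels are hypothetical names, and the two string constants are assumed to mirror the values used by CapacitySchedulerConfiguration. Unlike the loop quoted above, the sketch skips keys that have no property name after the label instead of failing on them.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: groups configured node labels
// by queue path in one pass over all configuration properties.
final class NodeLabelIndex {
  // Assumed to match the constants in CapacitySchedulerConfiguration.
  static final String PREFIX = "yarn.scheduler.capacity.";
  static final String ACCESSIBLE_NODE_LABELS = "accessible-node-labels";
  static final String MARKER = "." + ACCESSIBLE_NODE_LABELS + ".";

  static Map<String, Set<String>> collectLabels(
      Iterable<Map.Entry<String, String>> props) {
    Map<String, Set<String>> labelsByQueue = new HashMap<>();
    for (Map.Entry<String, String> e : props) {
      String key = e.getKey();
      if (!key.startsWith(PREFIX)) {
        continue;
      }
      // Expected shape:
      // <prefix><queue-path>.accessible-node-labels.<label-name>.<property>
      int markerIdx = key.indexOf(MARKER, PREFIX.length());
      if (markerIdx < 0) {
        continue;
      }
      int labelStartIdx = markerIdx + MARKER.length();
      int labelEndIdx = key.indexOf('.', labelStartIdx);
      if (labelEndIdx < 0) {
        continue; // no property after the label name, nothing to record
      }
      String queuePath = key.substring(PREFIX.length(), markerIdx);
      String labelName = key.substring(labelStartIdx, labelEndIdx);
      labelsByQueue
          .computeIfAbsent(queuePath, q -> new HashSet<>())
          .add(labelName);
    }
    return labelsByQueue;
  }
}
```

Since a Hadoop Configuration is iterable over its Map.Entry<String, String> properties, the map could be built once at the start of reinitialisation, after which each queue would look up its labels by queue path. The cost is one pass over all properties plus the memory for the map, which is the space trade-off mentioned in the second option.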



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

