[ https://issues.apache.org/jira/browse/YARN-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589719#comment-16589719 ]
Tao Yang edited comment on YARN-8692 at 8/23/18 5:48 AM:
---------------------------------------------------------

{quote}
I am curious how node memory/cpu is calculated here? Is it based on the allocated memory/cpu?
{quote}
Yes, it's based on allocated memory/cpu. The detailed calculation is as follows:
{noformat}
container-utilization = $container-allocated-resource * $task-utilization-ratio
node-utilization = sum($container-utilization)
{noformat}
{{$task-utilization-ratio}} can be configured with an average and a standard deviation, so that we can generate different task-utilization-ratio samples for containers as needed.
For example, we can configure "memory_utilization_ratio":{ "val": 0.5, "std": 0.01} for map tasks, so that the memory utilization of map containers is calculated as below:
{noformat}
allocated-memory = 1000
memory-utilization-ratio-sample is a random double value from 0.49 to 0.51
memory-utilization-of-map-container = $allocated-memory * $memory-utilization-ratio-sample
{noformat}
As a result, the utilization of a map container can be 490, 491, 492, ..., 508, 509 or 510.

was (Author: tao yang):
{quote}
I am curious how node memory/cpu is calculated here? Is it based on the allocated memory/cpu?
{quote}
Yes, it's based on allocated memory/cpu. The detailed calculation is as follows:
{noformat}
node-utilization = sum(container-utilization)
container-utilization = container-allocated-resource * task-utilization-ratio
{noformat}
{{task-utilization-ratio}} can be configured with an average and a standard deviation, so that we can generate different task-utilization-ratio samples for containers as needed.
For example, we can configure {{"memory_utilization_ratio":{ "val": 0.5, "std": 0.01}}} for map tasks, so that we can calculate the memory utilization of map containers as below:
{noformat}
allocated-memory = 1000
memory-utilization-ratio-sample is a random double value from 0.49 to 0.51
memory-utilization-of-map-container = $allocated-memory * $memory-utilization-ratio-sample
{noformat}
so that the utilization of a map container can be 490, 491, 492, ..., 508, 509 or 510.

> Support node utilization metrics for SLS
> ----------------------------------------
>
>                 Key: YARN-8692
>                 URL: https://issues.apache.org/jira/browse/YARN-8692
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler-load-simulator
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>         Attachments: image-2018-08-21-18-04-22-749.png
>
>
> The distribution of node utilization is an important health factor for the
> YARN cluster. Related metrics in SLS can be used to evaluate the scheduling
> effects and optimize related configurations.
> To implement this improvement, we need to do the following:
> (1) Add input configurations (containing avg and stddev for the cpu/memory
> utilization ratio) and generate utilization samples for tasks, excluding the
> AM container because I think it's negligible.
> (2) Simulate container and node utilization within the node status.
> (3) Calculate and generate the distribution metrics, and use the standard
> deviation metric (stddev for short) to evaluate the effects (smaller is
> better).
> (4) Show these metrics on the SLS simulator page like this:
> !image-2018-08-21-18-04-22-749.png!
> For the node memory/CPU utilization distribution graphs, the Y-axis is the
> number of nodes, P0 represents a 0%~9% utilization ratio
> (containers-utilization / node-total-resource), P1 represents a 10%~19%
> utilization ratio, P2 represents a 20%~29% utilization ratio, ..., and
> finally P9 represents a 90%~100% utilization ratio.
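
The calculation in the comment and the P0..P9 bucketing in the issue description can be illustrated with a rough Python sketch. This is not the SLS implementation (SLS is written in Java), and all function names here are hypothetical. It assumes the ratio sample is drawn uniformly from [val - std, val + std], which matches the 0.49 to 0.51 range in the example, that container utilization is rounded to an integer, and that the stddev metric is the population standard deviation of the per-node utilization ratios. None of these details are confirmed by the issue.

```python
import random
import statistics


def sample_ratio(val, std, rng):
    # Assumption: uniform draw in [val - std, val + std], chosen only to
    # match the "0.49 to 0.51" range quoted in the comment.
    return rng.uniform(val - std, val + std)


def container_utilization(allocated, val, std, rng):
    # container-utilization = allocated-resource * task-utilization-ratio
    # Assumption: the result is rounded to the nearest integer.
    return round(allocated * sample_ratio(val, std, rng))


def node_utilization(container_utils):
    # node-utilization = sum(container-utilization)
    return sum(container_utils)


def bucket_index(utilization, node_total):
    # P0 covers 0%~9%, P1 covers 10%~19%, ..., P9 covers 90%~100%.
    # A fully utilized node (100%) is clamped into P9.
    return min(int(utilization * 10 // node_total), 9)


def utilization_distribution(node_utils, node_total):
    # Count how many nodes fall into each of the ten buckets.
    counts = [0] * 10
    for utilization in node_utils:
        counts[bucket_index(utilization, node_total)] += 1
    return counts


def utilization_stddev(node_utils, node_total):
    # Assumption: stddev is taken over per-node utilization ratios.
    ratios = [utilization / node_total for utilization in node_utils]
    return statistics.pstdev(ratios)
```

For instance, `container_utilization(1000, 0.5, 0.01, random.Random())` yields an integer between 490 and 510, and `utilization_distribution` returns the ten node counts that would be plotted on the Y-axis for P0 through P9.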