[ 
https://issues.apache.org/jira/browse/YARN-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949330#comment-16949330
 ] 

Tao Yang commented on YARN-9838:
--------------------------------

Thanks [~jiulongZhu] for fixing this issue. 
The patch LGTM in general. Some minor suggestions:
* The checkstyle warnings need to be fixed; after that, you can run 
"dev-support/bin/test-patch /path/to/my.patch" to confirm.
* The indentation of the updated log statement needs to be adjusted, and the 
unnecessary deletion of a blank line in LeafQueue should be reverted.
* The comment "sync ResourceUsageByLabel ResourceUsageByUser and 
numContainer" can be removed, since that level of detail seems unnecessary here.
* As for the UT, you can remove the before-fix block and keep only the correct 
verification. Moreover, I think it's better to remove "//YARN-9838", since the 
source is easy to find via git. Also, the "/** */" comment style is usually used 
for classes or methods; inside a method it's better to use "//" or "/* */".

> Using the CapacityScheduler,Apply "movetoqueue" on the application which CS 
> reserved containers for,will cause "Num Container" and "Used Resource" in 
> ResourceUsage metrics error 
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-9838
>                 URL: https://issues.apache.org/jira/browse/YARN-9838
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler
>    Affects Versions: 2.7.3
>            Reporter: jiulongzhu
>            Priority: Critical
>              Labels: patch
>             Fix For: 2.7.3
>
>         Attachments: RM_UI_metric_negative.png, RM_UI_metric_positive.png, 
> YARN-9838.0001.patch
>
>
>       In some of our clusters, we are seeing "Used Resource", "Used 
> Capacity", "Absolute Used Capacity" and "Num Container" show nonzero 
> (positive or negative) values when the queue is completely idle (no RUNNING 
> apps, no NEW apps, ...). In extreme cases, apps cannot be submitted to a queue 
> that is actually idle but whose "Used Resource" is far above zero, much like 
> a "container leak".
>       First, I found that "Used Resource", "Used Capacity" and "Absolute Used 
> Capacity" use the "Used" value of ResourceUsage kept by AbstractCSQueue, and 
> "Num Container" uses the "numContainer" value kept by LeafQueue. 
> AbstractCSQueue#allocateResource and AbstractCSQueue#releaseResource change 
> the values of "numContainer" and "Used". Second, by comparing how 
> numContainer, ResourceUsageByLabel and QueueMetrics change 
> (#allocateContainer and #releaseContainer) for applications with and without 
> "movetoqueue", I found that moving reserved containers does not modify the 
> "numContainer" value in AbstractCSQueue or the "used" value in ResourceUsage 
> when the application is moved from one queue to another.
>       The table below shows how the metric values change when reserved 
> containers are allocated, moved from the $FROM queue to the $TO queue, and 
> released. The increases and decreases do not balance: the resource is 
> allocated from the $FROM queue but released to the $TO queue.
> ||move reservedContainer||allocate||movetoqueue||release||
> |numContainer|increase in $FROM queue|{color:#FF0000}$FROM queue stays the same, $TO queue stays the same{color}|decrease in $TO queue|
> |ResourceUsageByLabel(USED)|increase in $FROM queue|{color:#FF0000}$FROM queue stays the same, $TO queue stays the same{color}|decrease in $TO queue|
> |QueueMetrics|increase in $FROM queue|decrease in $FROM queue, increase in $TO queue|decrease in $TO queue|
>       For allocated containers (allocated, acquired, running), the metric 
> changes across allocate, movetoqueue and release are fully balanced.
>    
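
The imbalance described in the table can be reproduced with a toy model. The 
sketch below is only an illustration, not the code in YARN-9838.0001.patch and 
not the real CapacityScheduler classes: Queue, runLifecycle and the 1024 MB 
reservation size are hypothetical names and values chosen for the example. 
Each queue tracks only a container count and a used-memory counter, and a flag 
selects whether the move step transfers the reserved container's accounting.

```java
final class QueueAccountingSketch {

    /** Hypothetical stand-in for a leaf queue; tracks only the two counters discussed above. */
    static final class Queue {
        final String name;
        int numContainers;   // stands in for "Num Container"
        long usedMemoryMb;   // stands in for the USED value of ResourceUsage

        Queue(String name) {
            this.name = name;
        }

        void allocate(long mb) {
            numContainers++;
            usedMemoryMb += mb;
        }

        void release(long mb) {
            numContainers--;
            usedMemoryMb -= mb;
        }

        @Override
        public String toString() {
            return name + ": numContainers=" + numContainers + ", usedMemoryMb=" + usedMemoryMb;
        }
    }

    /**
     * Runs reserve -> movetoqueue -> release for one reserved container.
     * If transferOnMove is false, the move step leaves both queues untouched,
     * which is the behaviour the report describes for reserved containers.
     * Returns {fromQueue, toQueue}.
     */
    static Queue[] runLifecycle(boolean transferOnMove) {
        Queue from = new Queue("FROM");
        Queue to = new Queue("TO");
        long reservedMb = 1024; // assumed reservation size, for illustration only

        // 1. The reservation is accounted against the source queue.
        from.allocate(reservedMb);

        // 2. The application is moved to the target queue.
        if (transferOnMove) {
            from.release(reservedMb);
            to.allocate(reservedMb);
        }

        // 3. The reservation is released against the queue the app now lives in.
        to.release(reservedMb);

        return new Queue[] {from, to};
    }

    public static void main(String[] args) {
        for (boolean transfer : new boolean[] {false, true}) {
            System.out.println("transferOnMove=" + transfer);
            for (Queue q : runLifecycle(transfer)) {
                System.out.println("  " + q);
            }
        }
    }
}
```

Without the transfer, the idle FROM queue is left at one container and 1024 MB 
used while the TO queue ends at minus one container and -1024 MB, which matches 
the positive and negative values seen on idle queues. With the transfer, both 
queues return to zero.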



