jiulongzhu created YARN-9838:
--------------------------------

             Summary: Using the CapacityScheduler, applying "movetoqueue" to an 
application that CS has reserved containers for causes incorrect "Num Container" 
and "Used Resource" values in ResourceUsage metrics 
                 Key: YARN-9838
                 URL: https://issues.apache.org/jira/browse/YARN-9838
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: capacity scheduler
    Affects Versions: 2.7.3
            Reporter: jiulongzhu
             Fix For: 2.7.3
         Attachments: RM_UI_metric_negative.png, RM_UI_metric_positive.png, 
bug_fix_capacityScheduler_moveApplication.patch

      In some of our clusters, "Used Resource", "Used Capacity", "Absolute Used 
Capacity" and "Num Container" show nonzero values (positive or negative) even 
when the queue is completely idle (no RUNNING apps, no NEW apps, ...). In extreme 
cases, apps cannot be submitted to a queue that is actually idle because its 
"Used Resource" is far above zero, much like a "container leak".

      First, I found that "Used Resource", "Used Capacity" and "Absolute Used 
Capacity" use the "Used" value of the ResourceUsage kept by AbstractCSQueue, and 
"Num Container" uses the "numContainer" value kept by LeafQueue. 
AbstractCSQueue#allocateResource and AbstractCSQueue#releaseResource change 
both "numContainer" and "Used". Second, by comparing how numContainer, 
ResourceUsageByLabel and QueueMetrics change (in #allocateContainer and 
#releaseContainer) for applications with and without "movetoqueue", I found 
that moving reserved containers does not modify the "numContainer" value in 
AbstractCSQueue or the "used" value in ResourceUsage when the application is 
moved from one queue to another.

      The table below shows how the metric values change when reserved 
containers are allocated, moved from the $FROM queue to the $TO queue, and 
released. The increases and decreases do not balance: the resource is added to 
the $FROM queue on allocation but subtracted from the $TO queue on release.
||move reservedContainer||allocate||movetoqueue||release||
|numContainer|increase in $FROM queue|{color:#660000}$FROM queue stays the same, $TO queue stays the same{color}|decrease in $TO queue|
|ResourceUsageByLabel(USED)|increase in $FROM queue|{color:#660000}$FROM queue stays the same, $TO queue stays the same{color}|decrease in $TO queue|
|QueueMetrics|increase in $FROM queue|decrease in $FROM queue, increase in $TO queue|decrease in $TO queue|
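
      The imbalance in the numContainer and ResourceUsageByLabel(USED) rows can 
be reproduced with a toy model. The sketch below is only an illustration under 
stated assumptions: ToyQueue, simulate and the transferCounters flag are 
hypothetical names, not CapacityScheduler classes, and the "balanced" branch is 
one possible way to keep the counters consistent, not necessarily what the 
attached patch does.

```java
import java.util.Arrays;

/** Toy model of per-queue counters; NOT the real CapacityScheduler classes. */
class ReservedMoveDemo {

    /** Hypothetical stand-in for the counters a queue keeps. */
    static final class ToyQueue {
        int numContainers; // models "Num Container"
        long usedMb;       // models the "Used" value of ResourceUsage

        void allocate(long mb) { numContainers++; usedMb += mb; }
        void release(long mb)  { numContainers--; usedMb -= mb; }
    }

    /**
     * Runs allocate -> movetoqueue -> release for one reserved container.
     * With transferCounters == false the move leaves both queues untouched,
     * which is the behaviour described in the table.
     * Returns {from.numContainers, from.usedMb, to.numContainers, to.usedMb}.
     */
    static long[] simulate(boolean transferCounters, long mb) {
        ToyQueue from = new ToyQueue();
        ToyQueue to = new ToyQueue();

        from.allocate(mb);          // reservation is counted against $FROM

        if (transferCounters) {     // assumed balanced move: shift the counters
            from.release(mb);
            to.allocate(mb);
        }

        to.release(mb);             // release is charged to $TO

        return new long[] {from.numContainers, from.usedMb,
                           to.numContainers, to.usedMb};
    }

    public static void main(String[] args) {
        // Move without counter transfer: $FROM leaks, $TO goes negative.
        System.out.println(Arrays.toString(simulate(false, 1024)));
        // Move with counter transfer: everything returns to zero.
        System.out.println(Arrays.toString(simulate(true, 1024)));
    }
}
```

      Without the transfer, the $FROM queue is left with one container and 1024 
MB that are never released, and the $TO queue ends at minus one container and 
-1024 MB, which matches the positive and negative values seen on idle queues.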

      For allocated containers (allocated, acquired, running), the metric 
changes across allocation, movetoqueue and release balance exactly.

   



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

