[ 
https://issues.apache.org/jira/browse/YARN-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065181#comment-15065181
 ] 

gu-chi commented on YARN-4481:
------------------------------

I added some extra logging to trace this. Do you have any idea how it could be 
reproduced?

> negative pending resource of queues lead to applications in accepted status 
> infinitely
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-4481
>                 URL: https://issues.apache.org/jira/browse/YARN-4481
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.7.2
>            Reporter: gu-chi
>            Priority: Critical
>         Attachments: jmx.txt
>
>
> I encountered negative pending resources with the capacity scheduler. In JMX, 
> it shows:
> {noformat}
>     "PendingMB" : -4096,
>     "PendingVCores" : -1,
>     "PendingContainers" : -1,
> {noformat}
> The full JMX information is attached.
> This is not just a JMX UI issue; the actual pending resource of the queue is also 
> negative, as I see in this debug log:
> bq. DEBUG | ResourceManager Event Processor | Skip this queue=root, because 
> it doesn't need more resource, schedulingMode=RESPECT_PARTITION_EXCLUSIVITY 
> node-partition= | ParentQueue.java
> This leads to the {{NULL_ASSIGNMENT}}.
> The background: hundreds of applications were submitted, consuming all cluster 
> resources, and reservations occurred. While they were running, network faults 
> were injected by a tool; the injection types were delay, jitter, repeat, 
> packet loss, and disorder. Most of the submitted applications were then 
> killed.
> Is anyone else seeing negative pending resources, or does anyone have an idea 
> of how this happens?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
