Varun Saxena commented on YARN-4481:

[~sunilg], we do not have AM debug logs. The RM debug logs start after the 
event, so all we get from them is that pending resources are negative, which 
leads to the log gu-chi mentioned above. Let us see if we can get something 
more from the code.
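
To make the symptom concrete, here is a minimal sketch of how a negative 
pending counter could keep a queue skipped. It is not the ParentQueue or 
CapacityScheduler code: the class, its fields and methods are hypothetical, 
and the duplicate decrement is only an assumed way of reaching -4096 MB / -1 
vcore, not a confirmed root cause.

```java
// Hypothetical sketch; none of these names are the real CapacityScheduler API.
public class PendingSketch {

    /** Minimal stand-in for a queue's pending-resource counters. */
    static final class Pending {
        long memoryMB;
        int vCores;

        Pending(long memoryMB, int vCores) {
            this.memoryMB = memoryMB;
            this.vCores = vCores;
        }

        void add(long mb, int cores) {
            memoryMB += mb;
            vCores += cores;
        }

        void subtract(long mb, int cores) {
            memoryMB -= mb;
            vCores -= cores;
        }
    }

    /** Assumed check: a queue is only scheduled if something is pending. */
    static boolean needsMoreResource(Pending p) {
        return p.memoryMB > 0 || p.vCores > 0;
    }

    /** Returns a label for the outcome instead of a real assignment object. */
    static String tryAssign(Pending p) {
        return needsMoreResource(p) ? "ASSIGNED" : "NULL_ASSIGNMENT";
    }

    public static void main(String[] args) {
        Pending root = new Pending(0, 0);
        root.add(4096, 1);      // one outstanding container request
        root.subtract(4096, 1); // request removed once
        root.subtract(4096, 1); // assumed duplicate decrement

        System.out.println(root.memoryMB + " MB, " + root.vCores + " vcores");
        System.out.println(tryAssign(root));

        // A new request only brings the counter back to zero,
        // so the queue is still skipped.
        root.add(4096, 1);
        System.out.println(tryAssign(root));
    }
}
```

Under these assumptions, a later request is absorbed by the negative balance 
instead of showing up as pending, which would match applications staying in 
accepted state.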

> negative pending resource of queues lead to applications in accepted status 
> infinitely
> -------------------------------------------------------------------------------------
>                 Key: YARN-4481
>                 URL: https://issues.apache.org/jira/browse/YARN-4481
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.7.2
>            Reporter: gu-chi
>            Priority: Critical
>         Attachments: jmx.txt
> I hit a scenario of negative pending resources with the capacity scheduler. 
> In JMX, it shows:
> {noformat}
>     "PendingMB" : -4096,
>     "PendingVCores" : -1,
>     "PendingContainers" : -1,
> {noformat}
> The full JMX information is attached.
> This is not just a JMX UI issue; the actual pending resource of the queue is 
> also negative, as I see in the debug log:
> bq. DEBUG | ResourceManager Event Processor | Skip this queue=root, because 
> it doesn't need more resource, schedulingMode=RESPECT_PARTITION_EXCLUSIVITY 
> node-partition= | ParentQueue.java
> This leads to the {{NULL_ASSIGNMENT}}.
> The background: hundreds of applications were submitted, consuming all 
> cluster resources, and reservations happened. While they were running, 
> network faults were injected with a tool; the injection types were delay, 
> jitter, repeat, packet loss, and disorder. Most of the submitted 
> applications were then killed.
> Is anyone else seeing negative pending resources, or does anyone have an 
> idea of how this happens?

This message was sent by Atlassian JIRA
