[
https://issues.apache.org/jira/browse/YARN-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065181#comment-15065181
]
gu-chi commented on YARN-4481:
------------------------------
I added some extra logging to trace this. Do you have any idea how this
might be reproduced?
> negative pending resource of queues lead to applications in accepted status
> infinitely
> -------------------------------------------------------------------------------------
>
> Key: YARN-4481
> URL: https://issues.apache.org/jira/browse/YARN-4481
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 2.7.2
> Reporter: gu-chi
> Priority: Critical
> Attachments: jmx.txt
>
>
> I hit a scenario where the capacity scheduler reports negative pending
> resources. JMX shows:
> {noformat}
> "PendingMB" : -4096,
> "PendingVCores" : -1,
> "PendingContainers" : -1,
> {noformat}
> Full JMX information is attached.
> This is not just a JMX UI issue: the actual pending resource of the queue is
> also negative, as shown by this debug log:
> bq. DEBUG | ResourceManager Event Processor | Skip this queue=root, because
> it doesn't need more resource, schedulingMode=RESPECT_PARTITION_EXCLUSIVITY
> node-partition= | ParentQueue.java
> This leads to the {{NULL_ASSIGNMENT}}.
> Background: hundreds of applications were submitted, consuming all cluster
> resources, and reservations occurred. While they were running, network faults
> were injected with a tool; the injection types were delay, jitter, repeat,
> packet loss, and disorder. Most of the submitted applications were then killed.
> Is anyone else seeing negative pending resources, or does anyone have an idea
> of how this happens?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)