gu-chi created YARN-4481:

             Summary: Negative pending resource of queues leads to applications 
staying in ACCEPTED state indefinitely
                 Key: YARN-4481
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacity scheduler
    Affects Versions: 2.7.2
            Reporter: gu-chi
            Priority: Critical

I hit a scenario of negative pending resources with the capacity scheduler. In JMX, it shows:
    "PendingMB" : -4096,
    "PendingVCores" : -1,
    "PendingContainers" : -1,
The full JMX information is attached.
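
For anyone who wants to check their own cluster for the same symptom, here is a minimal sketch that scans a JMX JSON dump for negative Pending* values. It assumes the dump has the usual top-level "beans" list; the function name, the sample bean name, and the AllocatedMB value are made up for illustration and are not taken from the attached dump.

```python
import json


def negative_pending_metrics(jmx_json: str) -> dict:
    """Return {bean name: {metric: value}} for every Pending* metric below zero."""
    report = {}
    for bean in json.loads(jmx_json).get("beans", []):
        negatives = {
            key: value
            for key, value in bean.items()
            if key.startswith("Pending")
            and isinstance(value, (int, float))
            and not isinstance(value, bool)
            and value < 0
        }
        if negatives:
            report[bean.get("name", "<unnamed>")] = negatives
    return report


# Hypothetical sample: only the three Pending* values come from this report.
sample = json.dumps({"beans": [{
    "name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
    "PendingMB": -4096,
    "PendingVCores": -1,
    "PendingContainers": -1,
    "AllocatedMB": 8192,
}]})

for bean_name, metrics in negative_pending_metrics(sample).items():
    print(bean_name, metrics)
```

An empty result means no queue bean in the dump reports a negative pending value.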
This is not just a JMX UI issue: the actual pending resource of the queue is also 
negative, as I can see from this debug log:
bq. DEBUG | ResourceManager Event Processor | Skip this queue=root, because it 
doesn't need more resource, schedulingMode=RESPECT_PARTITION_EXCLUSIVITY 
node-partition= |
This leads to the {{NULL_ASSIGNMENT}}.
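
To illustrate why a negative pending value keeps applications from being scheduled, here is a toy model of the skip decision. This is not the actual CapacityScheduler code; the Resource class, has_pending_request, and assign_containers are hypothetical names, and the check (pending must be strictly positive in at least one dimension) is an assumption based on the debug message above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int


def has_pending_request(pending: Resource) -> bool:
    # Assumed rule: a queue "needs more resource" only if something is
    # strictly pending. Zero and negative values both fail this check.
    return pending.memory_mb > 0 or pending.vcores > 0


def assign_containers(queue_name: str, pending: Resource) -> str:
    if not has_pending_request(pending):
        # Mirrors the "Skip this queue=..." debug message: nothing is assigned.
        return "NULL_ASSIGNMENT"
    return f"ASSIGNED:{queue_name}"


# Values from the JMX output in this report.
print(assign_containers("root", Resource(memory_mb=-4096, vcores=-1)))
# A healthy queue with outstanding requests, for comparison.
print(assign_containers("root", Resource(memory_mb=4096, vcores=1)))
```

Under this assumption, once the root queue's pending resource goes negative, every scheduling pass skips it, so newly accepted applications never get containers.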
Background: hundreds of applications were submitted, consuming all cluster 
resources, and reservations happened. While they were running, network faults were 
injected with a tool; the injection types were delay, jitter, repeat, packet loss, 
and disorder. Most of the applications were then killed.
Is anyone else seeing negative pending resources, or does anyone have an idea of how this happens?

