[
https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16678606#comment-16678606
]
Botong Huang commented on YARN-8984:
------------------------------------
Took a quick look. AMRMClient is expected to re-send all pending requests
after an RM failover. Whenever a container is allocated, we should remove the
matching request from the pending list, which is exactly what
_removeFromOutstandingSchedulingRequests()_ is doing here. If we are not
cleaning it up properly, is it most likely because the RM is not setting the
proper allocationTags on the allocated Container? If so, should we fix that
instead of removing the null check here?
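
To make the concern concrete, here is a minimal sketch of the bookkeeping as
I understand it. It is a simplified model, not the actual AMRMClientImpl
code: the class PendingRequestsSketch and its methods are hypothetical, and
it assumes pending requests are keyed by their allocation tags.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the client-side pending list.
// Assumption: pending requests are looked up by their allocation tags.
class PendingRequestsSketch {
  // Pending request ids, grouped by the (non-null) tags they were sent with.
  private final Map<Set<String>, List<Long>> outstanding = new HashMap<>();

  void addRequest(Set<String> tags, long requestId) {
    outstanding.computeIfAbsent(new HashSet<>(tags), k -> new ArrayList<>())
        .add(requestId);
  }

  // Called once per allocated container with the tags reported for it.
  void onContainerAllocated(Set<String> containerTags) {
    if (containerTags == null) {
      // Null check: without tags there is no key to look up, so the
      // matching pending entry is left in place.
      return;
    }
    List<Long> pending = outstanding.get(containerTags);
    if (pending == null || pending.isEmpty()) {
      return;
    }
    pending.remove(0); // one allocation satisfies one pending request
    if (pending.isEmpty()) {
      outstanding.remove(containerTags);
    }
  }

  // Number of requests that would be re-sent after a failover.
  int pendingCount() {
    return outstanding.values().stream().mapToInt(List::size).sum();
  }
}
```

In this model, a container reported without tags leaves its request in the
pending list, while the same container reported with the tags of the
original request clears it. That is why fixing the tags at the source looks
like the cleaner option to me.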
> AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty
> ------------------------------------------------------------------------------
>
> Key: YARN-8984
> URL: https://issues.apache.org/jira/browse/YARN-8984
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Yang Wang
> Assignee: Yang Wang
> Priority: Critical
> Attachments: YARN-8984-001.patch, YARN-8984-002.patch,
> YARN-8984-003.patch
>
>
> In AMRMClient, outstandingSchedRequests should be removed or decreased when
> a container is allocated. However, this does not work when the allocation
> tags are null or empty.