[
https://issues.apache.org/jira/browse/YARN-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859205#comment-15859205
]
Varun Saxena commented on YARN-6157:
------------------------------------
[~Naganarasimha], the check for system max apps is made in
CapacityScheduler#addApplication, while the flow for recovered apps goes via
CapacityScheduler#addApplicationOnRecovery. So I think recovered apps will be
excluded.
The other point is right. We probably need a counter in LeafQueue which will be
incremented when we call LeafQueue#submitApplication. We have a similar counter
in ParentQueue too.
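
A rough sketch of the counter idea, only to illustrate the point above. This is
not the Hadoop LeafQueue code: the class LeafQueueSketch, the isRecovery flag,
and the finishApplication and getNumApplications methods are hypothetical names
made up for the example. It assumes the limit check and the increment happen
together at submit time, so there is no window between the check and the count,
and that recovered apps skip the limit check.

```java
// Hypothetical, simplified stand-in for a leaf queue; NOT the Hadoop LeafQueue.
class LeafQueueSketch {
    private final int maxApplications; // queue-level limit (assumed to come from config)
    private int numApplications = 0;   // apps accepted at submit time, pending or running

    LeafQueueSketch(int maxApplications) {
        this.maxApplications = maxApplications;
    }

    // Check and increment under one lock, so two concurrent submissions
    // cannot both pass the check before either of them is counted.
    // isRecovery is an assumed flag: recovered apps bypass the limit check
    // but are still counted.
    synchronized boolean submitApplication(boolean isRecovery) {
        if (!isRecovery && numApplications >= maxApplications) {
            return false; // rejected: queue already holds max applications
        }
        numApplications++;
        return true;
    }

    // Release the slot when an app finishes or is removed.
    synchronized void finishApplication() {
        if (numApplications > 0) {
            numApplications--;
        }
    }

    synchronized int getNumApplications() {
        return numApplications;
    }
}
```

With a limit of 2, two new submissions are accepted and a third is rejected,
while a recovered app is still admitted and counted even if the limit was
lowered before the restart.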
> Inconsistencies in verifying Max Applications
> ---------------------------------------------
>
> Key: YARN-6157
> URL: https://issues.apache.org/jira/browse/YARN-6157
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Naganarasimha G R
> Assignee: Naganarasimha G R
>
> Inconsistencies in verifying Max Applications when the max apps is reduced
> and either HA is done or work preserving restart is done.
> # The Max applications check across the cluster should not be done for
> recovered apps. It seems that we are currently doing it.
> # The Max applications check for a queue is done @ CapacityScheduler.addApplication,
> which considers the sum of pending and running applications, but we add to pending
> applications in {{CapacityScheduler.addApplicationAttempt ->
> LeafQueue.addApplicationAttempt}}, so between these 2 steps we can accept
> more apps than the queue limit allows.
> # During recovery of a RMApp, if applicationAttempts are not found then we
> recover it without recovery false @ {{RMAppImpl.RMAppRecoveredTransition}},
> this can lead to failure of apps which were accepted earlier but attempt was
> not yet created and HA happens when MAX app configuration (for cluster/queue)
> is modified.