[ https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771710#comment-16771710 ]

Shengyang Sha commented on YARN-9195:
-------------------------------------

{quote}
Just read the patch, I am trying to understand 
refreshContainersFromPreviousAttempts(), if a container from previous attempt 
is completed, then you are not removing it from outstanding requests. Why are 
you doing this?
{quote}
The refreshContainersFromPreviousAttempts method is used to maintain running 
containers that were originally obtained by previous app attempts, not 
outstanding requests.
You probably meant the removePreviousContainersFromOutstandingSchedulingRequests 
method. In that method, I filter out (1) containers obtained by the current app 
attempt and (2) containers already known from previous app attempts.
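Roughly, the filtering looks like the sketch below (a simplified illustration, 
not the exact patch code; the helper name and the set of known previous 
containers are just assumptions for the example):
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerId;

// Sketch only: decide which previous-attempt containers may be removed from
// the outstanding scheduling requests.
List<Container> filterPreviousAttemptContainers(
    List<Container> containersFromPreviousAttempts,
    ApplicationAttemptId currentAttemptId,
    Set<ContainerId> knownPreviousContainers) {
  List<Container> removable = new ArrayList<>();
  for (Container container : containersFromPreviousAttempts) {
    ContainerId containerId = container.getId();
    // (1) skip containers that actually belong to the current app attempt
    if (containerId.getApplicationAttemptId().equals(currentAttemptId)) {
      continue;
    }
    // (2) skip containers we already know about from previous app attempts
    if (knownPreviousContainers.contains(containerId)) {
      continue;
    }
    removable.add(container);
  }
  return removable;
}
{code}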

{quote}
I am also not sure why you need to initApplicationAttempt(), this is retrieving 
current app attempt id from AM RM token. Since in the protocol, we have 
getContainersFromPreviousAttempts() already, what's the attempt id is used for 
here?
{quote}
I think the current app attempt id is needed because the RM might return all of 
the application's running containers as previous-attempt containers 
(RegisterApplicationMasterResponse#getNMTokensFromPreviousAttempts). If we 
don't filter out such containers, the outstanding requests will be decreased 
unexpectedly. And if an outstanding request is already zero, it will then be 
decreased below zero.
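To make the failure concrete, the decrement is essentially the following (an 
illustrative sketch, not the actual AMRMClientImpl code; the method name and 
the matchedContainers parameter are made up for the example):
{code:java}
import org.apache.hadoop.yarn.api.records.ResourceSizing;
import org.apache.hadoop.yarn.api.records.SchedulingRequest;

// Sketch only: without the attempt-id filter, every previous-attempt container
// that matches an outstanding SchedulingRequest decrements numAllocations,
// even when it is already 0, so the value can go negative.
void decrementOutstanding(SchedulingRequest request, int matchedContainers) {
  ResourceSizing sizing = request.getResourceSizing();
  // e.g. numAllocations == 0 and matchedContainers == 2 -> numAllocations == -2
  sizing.setNumAllocations(sizing.getNumAllocations() - matchedContainers);
}
{code}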

{quote}
Another thing is, why this issue would cause pending container/resource in RM's 
queue become negative? Can you add some more info?
{quote}
As described above, the outstanding requests on the AM side could turn 
negative. Since the RM has no sanity check, the pending requests in the RM will 
then become negative as well. Btw, the description of this issue also provides 
a more detailed explanation.
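The sanity check I have in mind on the RM side could be as simple as the sketch 
below (an assumption for discussion, not part of the current patch): clamp or 
reject any request whose numAllocations is negative.
{code:java}
// Sketch only: never let a pending allocation count from an AM go below zero.
static int sanitizeNumAllocations(int requestedNumAllocations) {
  // a real check would probably also log or reject the malformed request
  return Math.max(0, requestedNumAllocations);
}
{code}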


> RM Queue's pending container number might get decreased unexpectedly or even 
> become negative once RM failover
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-9195
>                 URL: https://issues.apache.org/jira/browse/YARN-9195
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 3.1.0
>            Reporter: Shengyang Sha
>            Assignee: Shengyang Sha
>            Priority: Critical
>         Attachments: YARN-9195.001.patch, YARN-9195.002.patch, 
> cases_to_recreate_negative_pending_requests_scenario.diff
>
>
> Hi, all:
> We have encountered a serious problem in the ResourceManager: the pending 
> container number of one RM queue became negative after the RM failed over. 
> Since queues in the RM are managed in a hierarchical structure, the root 
> queue's pending containers eventually became negative as well, so the 
> scheduling of the whole cluster was affected.
> Both our RM server and the AMRM client in our application are based on YARN 
> 3.1, and we use the AMRMClientAsync#addSchedulingRequests() method in our 
> application to request resources from the RM.
> After investigation, we found that the direct cause was that the 
> numAllocations of some AMs' requests became negative after the RM failed 
> over. There are at least three necessary conditions:
> (1) The application uses schedulingRequests in the AMRM client and sets the 
> numAllocations of a schedulingRequest to zero. In our batch job scenario, the 
> numAllocations of a schedulingRequest can drop to zero because, 
> theoretically, a full batch job can be run with only one container.
> (2) The RM fails over.
> (3) Before the AM re-registers itself to the RM after the RM restarts, the 
> RM has already recovered some of the application's previously assigned 
> containers.
> Here are some more details about the implementation:
> (1) After the RM recovers, it will send all alive containers to the AM once 
> the AM re-registers itself, through 
> RegisterApplicationMasterResponse#getContainersFromPreviousAttempts.
> (2) During registerApplicationMaster, AMRMClientImpl will call 
> removeFromOutstandingSchedulingRequests on the 
> ContainersFromPreviousAttempts it receives, without checking whether these 
> containers had already been assigned before. As a consequence, the AM's 
> outstanding requests may be decreased unexpectedly, even if they do not 
> become negative.
> (3) There is no sanity check in the RM to validate requests from AMs.
> To better illustrate this case, I've written test cases based on the latest 
> Hadoop trunk, posted in the attachment. You may try 
> testAMRMClientWithNegativePendingRequestsOnRMRestart and 
> testAMRMClientOnUnexpectedlyDecreasedPendingRequestsOnRMRestart.
> To solve this issue, I propose filtering already-allocated containers before 
> removeFromOutstandingSchedulingRequests in AMRMClientImpl during 
> registerApplicationMaster; some sanity checks in the RM are also needed to 
> prevent things from getting worse.
> More comments and suggestions are welcome.


