[ https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776341#comment-16776341 ]
Weiwei Yang commented on YARN-9195:
-----------------------------------

Thanks [~ssy], [~Tao Yang]. Apologies for the delay in reviewing this, I need to catch up on a lot of changes. To better illustrate the problem, let's use an example.

1) Initial app state (container size is irrelevant)
{code:java}
total: 5, outstanding: 1, currentAttemptAllocated: 2, previousAttemptsAllocated: 2
{code}
2) RM fails over and containers are recovered; the current logic during register does
{code:java}
outstanding -= previousAttemptsAllocated
outstanding = -1 // WRONG!
{code}
Is this correct? If this is the case, I am confused why it even removes the previous attempts' containers in the first place.

Back to the patch: the method {{removePreviousContainersFromOutstandingSchedulingRequests}} scans all containers from previous attempts and, if a container's ID is not the same as the current attempt ID, removes it from the outstanding requests. Since {{prevContainers}} is the result of {{response.getContainersFromPreviousAttempts()}}, those containers' IDs should never match the current attempt ID, so why do you need to compare the IDs again? Thanks
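To make the arithmetic in the example above concrete, here is a minimal, self-contained sketch of the faulty accounting (plain local variables rather than the real AMRMClientImpl state, so the names are only illustrative): subtracting the recovered previous-attempt containers from an already-small outstanding count drives it below zero.

{code:java}
// Minimal illustration of the accounting problem in the example above.
// These are plain local variables, not the real AMRMClientImpl fields.
public class OutstandingAccountingSketch {
  public static void main(String[] args) {
    int total = 5;                      // containers the app wants overall
    int outstanding = 1;                // still being requested
    int currentAttemptAllocated = 2;    // allocated to the current attempt
    int previousAttemptsAllocated = 2;  // recovered from previous attempts

    // What the register path effectively does today: subtract every
    // recovered previous-attempt container from the outstanding count,
    // without checking whether it was already accounted for.
    outstanding -= previousAttemptsAllocated;

    System.out.println("total=" + total
        + ", currentAttemptAllocated=" + currentAttemptAllocated
        + ", outstanding=" + outstanding); // outstanding=-1, i.e. WRONG
  }
}
{code}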
> RM Queue's pending container number might get decreased unexpectedly or even
> become negative once RM failover
> -------------------------------------------------------------------------------------------------------------
>
> Key: YARN-9195
> URL: https://issues.apache.org/jira/browse/YARN-9195
> Project: Hadoop YARN
> Issue Type: Bug
> Components: client
> Affects Versions: 3.1.0
> Reporter: MalcolmSanders
> Assignee: MalcolmSanders
> Priority: Critical
> Attachments: YARN-9195.001.patch, YARN-9195.002.patch, YARN-9195.003.patch, cases_to_recreate_negative_pending_requests_scenario.diff
>
> Hi, all:
> Previously we encountered a serious problem in ResourceManager: we found that the pending container number of one RM queue became negative after RM failed over. Since queues in RM are managed in a hierarchical structure, the root queue's pending containers eventually became negative as well, so the scheduling process of the whole cluster was affected.
> Both our RM server and the AMRM client in our application are based on YARN 3.1, and we use the AMRMClientAsync#addSchedulingRequests() method in our application to request resources from RM.
> After investigation, we found that the direct cause was that the numAllocations of some AMs' requests became negative after RM failed over. There are at least three necessary conditions:
> (1) The application uses schedulingRequests in the AMRM client and sets the numAllocations of a schedulingRequest to zero. In our batch job scenario, the numAllocations of a schedulingRequest can drop to zero because, theoretically, we can run a full batch job using only one container.
> (2) RM fails over.
> (3) Before the AM re-registers itself to RM after the RM restart, RM has already recovered some of the containers previously assigned to the application.
> Here are some more details about the implementation:
> (1) After RM recovers, it sends all alive containers to the AM once the AM re-registers itself, through RegisterApplicationMasterResponse#getContainersFromPreviousAttempts.
> (2) During registerApplicationMaster, AMRMClientImpl calls removeFromOutstandingSchedulingRequests once the AM gets the ContainersFromPreviousAttempts, without checking whether these containers had been assigned before. As a consequence, its outstanding requests might be decreased unexpectedly even if they do not become negative.
> (3) There is no sanity check in RM to validate requests from AMs.
> To better illustrate this case, I've written a test case based on the latest Hadoop trunk, posted in the attachment. You may try the cases testAMRMClientWithNegativePendingRequestsOnRMRestart and testAMRMClientOnUnexpectedlyDecreasedPendingRequestsOnRMRestart.
> To solve this issue, I propose filtering allocated containers before removeFromOutstandingSchedulingRequests in AMRMClientImpl during registerApplicationMaster; some sanity checks are also needed to prevent things from getting worse.
> More comments and suggestions are welcome.
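A rough sketch of the filtering and sanity check proposed in the description above, under simplified assumptions (plain container IDs instead of Container/SchedulingRequest objects, and a hypothetical helper name rather than the actual patch): only recovered containers that the client has not already accounted for are removed from the outstanding count, and the count is clamped so it can never go negative.

{code:java}
import java.util.List;
import java.util.Set;

// Hypothetical, simplified sketch; not the actual AMRMClientImpl change.
public final class OutstandingRequestFixSketch {

  /**
   * Removes only the recovered containers that have not been accounted for
   * yet from the outstanding count, and never lets the count drop below zero.
   */
  static int removeRecoveredFromOutstanding(int outstanding,
                                            List<Long> recoveredContainerIds,
                                            Set<Long> alreadyAllocatedIds) {
    for (long containerId : recoveredContainerIds) {
      if (alreadyAllocatedIds.contains(containerId)) {
        continue; // already counted when it was first allocated
      }
      alreadyAllocatedIds.add(containerId);
      outstanding--;
    }
    // Sanity check: pending/outstanding numbers must never become negative.
    return Math.max(outstanding, 0);
  }
}
{code}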