[ 
https://issues.apache.org/jira/browse/YARN-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957707#comment-16957707
 ] 

Prabhu Joseph commented on YARN-9921:
-------------------------------------

[~tangzhankun] The patch looks good. +1 

Thanks [~tarunparimi] for the patch.

> Issue in PlacementConstraint when YARN Service AM retries allocation on 
> component failure.
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-9921
>                 URL: https://issues.apache.org/jira/browse/YARN-9921
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Tarun Parimi
>            Assignee: Tarun Parimi
>            Priority: Major
>         Attachments: YARN-9921.001.patch, differenceProtobuf.png
>
>
> When the YARN Service AM tries to relaunch a container after a failure, we 
> encounter the following error in PlacementConstraints.
> {code:java}
> ERROR impl.AMRMClientAsyncImpl: Exception on heartbeat
> org.apache.hadoop.yarn.exceptions.YarnException: 
> org.apache.hadoop.yarn.exceptions.SchedulerInvalidResoureRequestException: 
> Invalid updated SchedulingRequest added to scheduler, we only allows changing 
> numAllocations for the updated SchedulingRequest. 
> Old=SchedulingRequestPBImpl{priority=0, allocationReqId=0, 
> executionType={Execution Type: GUARANTEED, Enforce Execution Type: true}, 
> allocationTags=[component], 
> resourceSizing=ResourceSizingPBImpl{numAllocations=0, 
> resources=<memory:557056, vCores:1>}, 
> placementConstraint=notin,node,llap:notin,node,yarn_node_partition/=[label]} 
> new=SchedulingRequestPBImpl{priority=0, allocationReqId=0, 
> executionType={Execution Type: GUARANTEED, Enforce Execution Type: true}, 
> allocationTags=[component], 
> resourceSizing=ResourceSizingPBImpl{numAllocations=1, 
> resources=<memory:557056, vCores:1>}, 
> placementConstraint=notin,node,component:notin,node,yarn_node_partition/=[label]},
>  if any fields need to be updated, please cancel the old request (by setting 
> numAllocations to 0) and send a SchedulingRequest with different combination 
> of priority/allocationId
> {code}
> However, we can see from the message that the SchedulingRequest is valid: 
> every field is the same except numAllocations, as expected. Even so, the 
> equals check below in SingleConstraintAppPlacementAllocator fails.
> {code:java}
> // Compare two objects
>       if (!schedulingRequest.equals(newSchedulingRequest)) {
>         // Rollback #numAllocations
>         sizing.setNumAllocations(newNumAllocations);
>         throw new SchedulerInvalidResoureRequestException(
>             "Invalid updated SchedulingRequest added to scheduler, "
>                 + " we only allows changing numAllocations for the updated "
>                 + "SchedulingRequest. Old=" + schedulingRequest.toString()
>                 + " new=" + newSchedulingRequest.toString()
>                 + ", if any fields need to be updated, please cancel the "
>                 + "old request (by setting numAllocations to 0) and send a "
>                 + "SchedulingRequest with different combination of "
>                 + "priority/allocationId");
>       }
> {code}
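
For readers unfamiliar with this kind of check, here is a minimal sketch of the general idea: an update is accepted only if the old and new requests are equal once numAllocations is ignored. It assumes a simplified, hypothetical `Request` class and helper `differsOnlyInNumAllocations`; neither is part of the YARN API, and the sketch neither mirrors the exact control flow of SingleConstraintAppPlacementAllocator nor reproduces the failure reported in this issue.

```java
import java.util.Objects;

public class UpdateCheckSketch {

    // Hypothetical stand-in for a scheduling request (NOT the real YARN SchedulingRequest).
    static final class Request {
        final int priority;
        final long allocationRequestId;
        final String placementConstraint;
        int numAllocations; // the only field an update is allowed to change

        Request(int priority, long allocationRequestId,
                String placementConstraint, int numAllocations) {
            this.priority = priority;
            this.allocationRequestId = allocationRequestId;
            this.placementConstraint = placementConstraint;
            this.numAllocations = numAllocations;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Request)) {
                return false;
            }
            Request other = (Request) o;
            return priority == other.priority
                && allocationRequestId == other.allocationRequestId
                && numAllocations == other.numAllocations
                && Objects.equals(placementConstraint, other.placementConstraint);
        }

        @Override
        public int hashCode() {
            return Objects.hash(priority, allocationRequestId,
                placementConstraint, numAllocations);
        }
    }

    // Returns true when the two requests differ at most in numAllocations.
    // The new request's numAllocations is aligned temporarily and always restored.
    static boolean differsOnlyInNumAllocations(Request oldReq, Request newReq) {
        int saved = newReq.numAllocations;
        newReq.numAllocations = oldReq.numAllocations;
        try {
            return oldReq.equals(newReq);
        } finally {
            newReq.numAllocations = saved;
        }
    }

    public static void main(String[] args) {
        Request oldReq = new Request(0, 0L, "notin,node,component", 0);
        Request sameFields = new Request(0, 0L, "notin,node,component", 1);
        Request otherConstraint = new Request(0, 0L, "notin,node,other", 1);

        System.out.println(differsOnlyInNumAllocations(oldReq, sameFields));
        System.out.println(differsOnlyInNumAllocations(oldReq, otherConstraint));
    }
}
```

In a check of this shape, any field whose equals implementation reports a difference causes the update to be rejected, even when the two requests print identically.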


