[ https://issues.apache.org/jira/browse/YARN-8355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LongGang Chen updated YARN-8355:
--------------------------------
    Description: 
First, a quick walk through the update logic, using Increase as an example:
 * 1: normal processing in ApplicationMasterService and DefaultAMSProcessor.
 * 2: CapacityScheduler.allocate calls
AbstractYarnScheduler.handleContainerUpdates.
 * 3: AbstractYarnScheduler.handleContainerUpdates calls
handleIncreaseRequests, which in turn calls
ContainerUpdateContext.checkAndAddToOutstandingIncreases.
 * 4: cancel old && add new: checkAndAddToOutstandingIncreases checks this
increase update against the container. If there is an outstanding increase,
it cancels it by calling appSchedulingInfo.allocate(...) to allocate a dummy
container; if the update is a fresh one, it calls
appSchedulingInfo.updateResourceRequests to add a new request. The capacity
of this new request is the gap between the existing container's capacity and
the target capacity of the updateRequest. For example, if the original
capacity is <memory:10GB> and the target capacity of the UpdateRequest is
<memory:20GB>, the gap (the capacity of the new request added to
appSchedulingInfo) is <memory:10GB>.
 * 5: swap the temp container with the existing container:
CapacityScheduler.allocate calls FiCaSchedulerApp.getAllocation(...), which
calls SchedulerApplicationAttempt.pullNewlyIncreasedContainers and then
ContainerUpdateContext.swapContainer. swapContainer swaps the newly
allocated temp increase container with the existing container. For example:
original capacity <memory:10GB>, temp increase container's capacity
<memory:10GB>, so the updated existing container has capacity
<memory:10+10=20GB> and the increase update is done. A minimal sketch of
this flow follows the list.
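
To make steps 4 and 5 concrete, here is a minimal, self-contained sketch of
the gap computation and the swap. Plain longs stand in for Resource objects,
and the class and method names are illustrative, not the real YARN code:

{code:java}
// Simplified model of the increase flow described in steps 4 and 5.
public class IncreaseFlowSketch {

  // Step 4: the new request only asks for the delta between the container's
  // current capacity and the update target.
  static long gapGb(long existingGb, long targetGb) {
    return targetGb - existingGb;
  }

  // Step 5: swapContainer adds the temp container's capacity to the
  // existing container.
  static long swap(long existingGb, long tempGb) {
    return existingGb + tempGb;
  }

  public static void main(String[] args) {
    long existing = 10;                  // <memory:10GB> container
    long target = 20;                    // UpdateRequest target <memory:20GB>
    long temp = gapGb(existing, target); // new request for <memory:10GB>
    // After the temp container is allocated and swapped: 10 + 10 = 20GB.
    System.out.println("updated = <memory:" + swap(existing, temp) + "GB>");
  }
}
{code}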

The problem is:
 if we send an increase update twice for the same container, for example:
send an increase from <memory:10> to <memory:12>, then send an increase
with the new target <memory:14>, the final updated capacity is
nondeterministic.

Scenario one:
 * 1: send an increase update from <memory:10> to <memory:12>
 * 2: the scheduler approves and commits it, so app.liveContainers contains
this temp increase container with capacity <memory:2>.
 * 3: send an increase with the new target <memory:14>; a new
resourceRequest with capacity <memory:4> is added to appSchedulingInfo, and
the first temp container <memory:2> is swapped, after which the existing
container has the new capacity <memory:12>.
 * 4: the scheduler approves the second temp resourceRequest and allocates a
second temp container with capacity <memory:4>.
 * 5: the second temp increase container is swapped, so the updated capacity
of the existing container is <memory:4+12> = <memory:16>, but the wanted
capacity is <memory:14>. A sketch of this overshoot follows the list.
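
A hypothetical replay of this sequence (plain longs standing in for
Resource objects) shows how two blind swaps overshoot the target:

{code:java}
// Scenario one replayed: swapContainer blindly adds each temp container's
// capacity, so two outstanding increases overshoot the final target.
public class ScenarioOneSketch {
  public static void main(String[] args) {
    long capacity = 10;     // original <memory:10>
    long firstTemp = 2;     // gap for target <memory:12>
    long secondTemp = 4;    // gap for target <memory:14>, computed while the
                            // capacity was still <memory:10>

    capacity += firstTemp;  // swap #1: 10 + 2 = 12
    capacity += secondTemp; // swap #2: 12 + 4 = 16, but wanted is 14
    System.out.println("final = <memory:" + capacity + ">, wanted = <memory:14>");
  }
}
{code}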

Scenario two:
 * 1: send an increase update from <memory:10> to <memory:12>
 * 2: the scheduler approves it, but the temp container with capacity
<memory:2> is queued in commitService, waiting to commit.
 * 3: send an increase with the new target <memory:14>; a new
resourceRequest is added to appSchedulingInfo, but with the same
SchedulerRequestKey.
 * 4: the first temp container commits; app.apply calls
appSchedulingInfo.allocate to reduce the pending count, which in this
situation cancels the second increase request.
 * 5: the first temp increase container is swapped. The updated existing
container's capacity is <memory:12>, but the wanted capacity is
<memory:14>. A sketch of this race follows the list.
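
The race is really about pending-count bookkeeping under a single key. Below
is a hypothetical model of it; the map is just a stand-in for
appSchedulingInfo's pending accounting, not its actual data structure:

{code:java}
// Scenario-two race: both increase requests sit under the same
// SchedulerRequestKey, so the commit of the first temp container decrements
// the pending count that actually belongs to the second request.
import java.util.HashMap;
import java.util.Map;

public class ScenarioTwoSketch {
  public static void main(String[] args) {
    Map<String, Integer> pending = new HashMap<>();
    String key = "schedulerRequestKey"; // same key for both updates

    pending.merge(key, 1, Integer::sum);  // file first inc request (<memory:2>)
    pending.merge(key, -1, Integer::sum); // checkAndAddToOutstandingIncreases
                                          // cancels it via a dummy allocate
    pending.merge(key, 1, Integer::sum);  // file second inc request (<memory:4>)

    // The first temp container now commits: app.apply calls
    // appSchedulingInfo.allocate on the same key, consuming the pending
    // count of the *second* request.
    pending.merge(key, -1, Integer::sum);

    System.out.println("pending = " + pending.get(key)); // 0: second inc lost
  }
}
{code}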

Two key points:
 * 1: when ContainerUpdateContext.checkAndAddToOutstandingIncreases cancels
the previous increase request and adds the current one, it uses the same
SchedulerRequestKey. This action races with app.apply: as in scenario two,
app.apply can cancel the second increase update's request.
 * 2: ContainerUpdateContext.swapContainer does not check whether the update
target has changed.

How to fix:
 * 1: after ContainerUpdateContext.checkAndAddToOutstandingIncreases cancels
the previous increase update request, use a new SchedulerRequestKey for the
current increase update request. We can add a new field, createTime, to
distinguish them; the default value of createTime is 0.
 * 2: change ContainerUpdateContext.swapContainer to checkAndSwapContainer,
which checks whether the update target has changed; if it has, just ignore
this temp container and release it. As in scenario one: when we swap the
first temp increase container, we find that the swap would give the updated
capacity <memory:12> while the new target capacity is <memory:14>, so we
just ignore this swap and release the temp container <memory:2>. A sketch
of both fixes follows the list.
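
Here is a minimal sketch of both fixes together. The SchedulerRequestKey
fields and the checkAndSwapContainer shape are simplified assumptions for
illustration, not the actual patch:

{code:java}
import java.util.Objects;

public class UpdateFixSketch {

  // Fix 1: add a createTime field (default 0) to the key so a re-issued
  // increase request never collides with the one it replaces.
  static final class SchedulerRequestKey {
    final int priority;            // simplified stand-in fields
    final long allocationRequestId;
    final long createTime;         // new field, default 0

    SchedulerRequestKey(int priority, long allocationRequestId, long createTime) {
      this.priority = priority;
      this.allocationRequestId = allocationRequestId;
      this.createTime = createTime;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof SchedulerRequestKey)) {
        return false;
      }
      SchedulerRequestKey k = (SchedulerRequestKey) o;
      return priority == k.priority
          && allocationRequestId == k.allocationRequestId
          && createTime == k.createTime;
    }

    @Override
    public int hashCode() {
      return Objects.hash(priority, allocationRequestId, createTime);
    }
  }

  // Fix 2: only apply the temp container if the result still matches the
  // current target; otherwise release it and leave the capacity alone.
  static long checkAndSwapContainer(long existing, long tempInc, long currentTarget) {
    if (existing + tempInc != currentTarget) {
      return existing; // target changed: release the temp container
    }
    return existing + tempInc;
  }

  public static void main(String[] args) {
    // Scenario one replayed with the check, target now <memory:14>:
    long c = checkAndSwapContainer(10, 2, 14); // 10+2 != 14 -> released
    c = checkAndSwapContainer(c, 4, 14);       // 10+4 == 14 -> swapped
    System.out.println("final = <memory:" + c + ">"); // <memory:14>, as wanted
  }
}
{code}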

 


> container update error because of competition
> ---------------------------------------------
>
>                 Key: YARN-8355
>                 URL: https://issues.apache.org/jira/browse/YARN-8355
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: RM
>    Affects Versions: 3.0.x
>            Reporter: LongGang Chen
>            Priority: Major


