[
https://issues.apache.org/jira/browse/YARN-8355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
LongGang Chen updated YARN-8355:
--------------------------------
Description:
First, a quick walk through the update logic, using an increase as the example:
* 1: The request is handled as usual in ApplicationMasterService and
DefaultAMSProcessor.
* 2: CapacityScheduler.allocate calls
AbstractYarnScheduler.handleContainerUpdates.
* 3: AbstractYarnScheduler.handleContainerUpdates calls
handleIncreaseRequests, which in turn calls
ContainerUpdateContext.checkAndAddToOutstandingIncreases.
* 4: Cancel and re-add: checkAndAddToOutstandingIncreases checks whether an
increase update is already outstanding for this container. If there is one, it
cancels it by calling appSchedulingInfo.allocate(...) to allocate a dummy
container; if the update is a fresh one, it calls
appSchedulingInfo.updateResourceRequests to add a new request. The capacity of
this new request is the gap between the existing container's capacity and the
target capacity of the update request. For example, if the original capacity is
<memory:10GB> and the target capacity of the UpdateRequest is <memory:20GB>,
the gap (the capacity of the new request added to appSchedulingInfo) is
<memory:10GB>.
* 5: Swap the temp container and the existing container:
CapacityScheduler.allocate calls FiCaSchedulerApp.getAllocation(...), which
calls SchedulerApplicationAttempt.pullNewlyIncreasedContainers, which then
calls ContainerUpdateContext.swapContainer. swapContainer swaps the newly
allocated temp increase container with the existing container. For example:
with an original capacity of <memory:10GB> and a temp increase container of
capacity <memory:10GB>, the updated existing container has capacity
<memory:10+10=20GB>, and the increase update is done.
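
The arithmetic in steps 4 and 5 can be sketched as follows. This is only an
illustration: IncreaseGapSketch and its methods are hypothetical stand-ins that
use plain GB values, not the real YARN Resource or ContainerUpdateContext
classes.

```java
// Hypothetical sketch; these are not the real YARN classes or signatures.
final class IncreaseGapSketch {

    /** Step 4: capacity of the temp increase request = target minus current. */
    static long gapMemoryGb(long currentGb, long targetGb) {
        if (targetGb <= currentGb) {
            throw new IllegalArgumentException("an increase needs target > current");
        }
        return targetGb - currentGb;
    }

    /** Step 5: capacity of the existing container after the temp container is swapped in. */
    static long swapMemoryGb(long currentGb, long tempGb) {
        return currentGb + tempGb;
    }

    public static void main(String[] args) {
        long gap = gapMemoryGb(10, 20);       // 10
        long updated = swapMemoryGb(10, gap); // 20
        System.out.println("gap=" + gap + "GB updated=" + updated + "GB");
    }
}
```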
The problem is:
If we send an increase update twice for the same container, for example from
<memory:10> to <memory:12> and then again with a new target of <memory:14>,
the final updated capacity is not deterministic.
Scenario one:
* 1: Send an increase update from <memory:10> to <memory:12>.
* 2: The scheduler approves and commits it, so app.liveContainers now holds
the temp increase container with capacity <memory:2>.
* 3: Send an increase with the new target <memory:14>. A new resourceRequest
with capacity <memory:4> is added to appSchedulingInfo, and the first temp
container <memory:2> is swapped in; after that, the existing container has the
new capacity <memory:12>.
* 4: The scheduler approves the second temp resourceRequest and allocates a
second temp container with capacity <memory:4>.
* 5: The second temp increase container is swapped in, so the updated capacity
of the existing container is <memory:4+12> = <memory:16>, but the wanted
capacity is <memory:14>.
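
Scenario one can be replayed with plain numbers. The sketch below is an
assumption-laden model, not scheduler code: ScenarioOneSketch and replay are
hypothetical names, and the only behaviour modelled is that each gap is computed
against the capacity seen when the request arrives and that every swap adds the
temp container unconditionally.

```java
// Hypothetical replay of scenario one; not the real scheduler code path.
final class ScenarioOneSketch {

    /** Returns the final capacity after two overlapping increase updates. */
    static long replay(long initial, long firstTarget, long secondTarget) {
        long capacity = initial;

        // Steps 1-2: first update is approved and committed as a temp container.
        long firstTemp = firstTarget - capacity;    // 12 - 10 = 2

        // Step 3: second update arrives before the first temp container is
        // swapped, so its gap is still measured from the old capacity.
        long secondTemp = secondTarget - capacity;  // 14 - 10 = 4
        capacity += firstTemp;                      // unchecked swap: 12

        // Steps 4-5: second temp container is allocated and swapped as well.
        capacity += secondTemp;                     // unchecked swap: 16
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println("final=" + replay(10, 12, 14) + " wanted=14");
    }
}
```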
Scenario two:
* 1: Send an increase update from <memory:10> to <memory:12>.
* 2: The scheduler approves it, but the temp container with capacity
<memory:2> is queued in commitService, waiting to be committed.
* 3: Send an increase with the new target <memory:14>. This adds a new
resourceRequest to appSchedulingInfo, but with the same SchedulerRequestKey.
* 4: The first temp container is committed. app.apply calls
appSchedulingInfo.allocate to reduce the pending count, which in this situation
cancels the second increase request.
* 5: The first temp increase container is swapped in. The existing container's
updated capacity is <memory:12>, but the wanted capacity is <memory:14>.
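
A minimal model of scenario two, assuming pending requests are simply counted
per key. ScenarioTwoSketch, its methods, and the key string are hypothetical;
the real AppSchedulingInfo bookkeeping is more involved. The point shown is only
that a late commit of the first temp container consumes the pending entry that
belongs to the second request when both share one key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of pending-request counting; not the real AppSchedulingInfo.
final class ScenarioTwoSketch {
    private final Map<String, Integer> pending = new HashMap<>();

    void addRequest(String key) {
        pending.merge(key, 1, Integer::sum);
    }

    /** Decrements the pending count for the key, dropping the entry at zero. */
    void allocate(String key) {
        pending.computeIfPresent(key, (k, n) -> n > 1 ? n - 1 : null);
    }

    int pendingFor(String key) {
        return pending.getOrDefault(key, 0);
    }

    /** Returns {final capacity, pending requests left for the key}. */
    static int[] replay() {
        ScenarioTwoSketch info = new ScenarioTwoSketch();
        String key = "container_1-increase"; // hypothetical key shared by both updates
        int capacity = 10;

        info.addRequest(key);    // step 1: 10 -> 12, gap 2
        int queuedTemp = 2;      // step 2: approved, still waiting to be committed
        info.allocate(key);      // step 3: dummy allocation cancels the old request...
        info.addRequest(key);    //         ...and the new one (gap 4) reuses the key
        info.allocate(key);      // step 4: late commit removes the NEW request
        capacity += queuedTemp;  // step 5: first temp container is swapped in
        return new int[] {capacity, info.pendingFor(key)};
    }

    public static void main(String[] args) {
        int[] result = replay();
        System.out.println("capacity=" + result[0] + " pending=" + result[1]);
    }
}
```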
Two key points:
* 1: When ContainerUpdateContext.checkAndAddToOutstandingIncreases cancels the
previous increase request and adds the current one, it uses the same
SchedulerRequestKey. This races with app.apply: as in scenario two, app.apply
cancels the second increase update's request.
* 2: ContainerUpdateContext.swapContainer does not check whether the update
target has changed.
How to fix:
* 1: After ContainerUpdateContext.checkAndAddToOutstandingIncreases cancels
the previous increase update request, use a new SchedulerRequestKey for the
current increase update request. We can add a new field, createTime, to
distinguish them; the default value of createTime is 0.
* 2: Change ContainerUpdateContext.swapContainer to checkAndSwapContainer,
which checks whether the update target has changed. If it has, ignore this temp
container and release it. As in scenario one: when we swap the first temp
increase container, we find that the swap would yield an updated capacity of
<memory:12>, but the new target's capacity is <memory:14>, so we skip the swap
and release the temp container <memory:2>.
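
One possible shape for the two fixes, as a sketch under stated assumptions.
UpdateFixSketch, UpdateKey, and checkAndSwap are hypothetical names; only the
createTime field and the checkAndSwapContainer idea come from the proposal
above, and the real SchedulerRequestKey has other fields that are not modelled
here.

```java
import java.util.Objects;

// Hypothetical sketch of the proposed fixes; not the actual patch.
final class UpdateFixSketch {

    /** Fix 1: a key that also carries createTime, so successive updates differ. */
    static final class UpdateKey {
        final String containerId;
        final long createTime; // 0 by default, per the proposal

        UpdateKey(String containerId, long createTime) {
            this.containerId = containerId;
            this.createTime = createTime;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof UpdateKey)) {
                return false;
            }
            UpdateKey other = (UpdateKey) o;
            return containerId.equals(other.containerId) && createTime == other.createTime;
        }

        @Override
        public int hashCode() {
            return Objects.hash(containerId, createTime);
        }
    }

    /**
     * Fix 2: swap only when the result matches the latest target. Returning the
     * unchanged capacity stands for "ignore and release the temp container".
     */
    static long checkAndSwap(long current, long temp, long latestTarget) {
        return current + temp == latestTarget ? current + temp : current;
    }

    public static void main(String[] args) {
        long capacity = 10;
        capacity = checkAndSwap(capacity, 2, 14); // stale temp container: released
        capacity = checkAndSwap(capacity, 4, 14); // matches the new target: swapped
        System.out.println("capacity=" + capacity);
        System.out.println(new UpdateKey("c1", 0).equals(new UpdateKey("c1", 1)));
    }
}
```

Replaying scenario one through checkAndSwap leaves the container at
<memory:14>, and the two keys no longer compare equal, so a late commit of the
first request cannot cancel the second.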
> container update error because of competition
> ---------------------------------------------
>
> Key: YARN-8355
> URL: https://issues.apache.org/jira/browse/YARN-8355
> Project: Hadoop YARN
> Issue Type: Bug
> Components: RM
> Affects Versions: 3.0.x
> Reporter: LongGang Chen
> Priority: Major
>