LongGang Chen created YARN-8355:
-----------------------------------
Summary: container update error because of competition
Key: YARN-8355
URL: https://issues.apache.org/jira/browse/YARN-8355
Project: Hadoop YARN
Issue Type: Bug
Components: RM
Affects Versions: 3.0.x
Reporter: LongGang Chen
First, a quick walk through the update logic, using an increase as the example:
step 1: the usual processing in ApplicationMasterService and
DefaultAMSProcessor.
step 2: CapacityScheduler.allocate calls
AbstractYarnScheduler.handleContainerUpdates.
step 3: AbstractYarnScheduler.handleContainerUpdates calls
handleIncreaseRequests, which then calls
ContainerUpdateContext.checkAndAddToOutstandingIncreases.
step 4: cancel and add: checkAndAddToOutstandingIncreases checks whether this
container already has an outstanding increase. If it does, the outstanding
increase is cancelled by calling appSchedulingInfo.allocate(...) to allocate a
dummy container; if the update is a fresh one, it calls
appSchedulingInfo.updateResourceRequests to add a new request. The capacity of
this new request is the gap between the capacity of the existing rmContainer
and the target capacity of the UpdateRequest. For example, if the original
capacity is <memory:10GB> and the target capacity of the UpdateRequest is
<memory:20GB>, the gap (the capacity of the new request added to
appSchedulingInfo) is <memory:10GB>.
step 5: swap the temp container and the existing container:
CapacityScheduler.allocate calls FiCaSchedulerApp.getAllocation(...), which
calls SchedulerApplicationAttempt.pullNewlyIncreasedContainers, which in turn
calls ContainerUpdateContext.swapContainer. swapContainer swaps the newly
allocated temp increase container with the existing container. For example,
with an original capacity of <memory:10GB> and a temp increase container of
<memory:10GB>, the updated existing container has capacity
<memory:10+10=20GB>, and the increase update is done.
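The arithmetic in steps 4 and 5 can be sketched with a toy model. This is not
Hadoop code: increase_gap and swap_container are hypothetical helper names, and
capacities are reduced to plain integers in GB.

```python
# Toy model of steps 4 and 5; all names here are illustrative, not YARN APIs.

def increase_gap(existing_gb: int, target_gb: int) -> int:
    """Capacity of the temporary request added in step 4."""
    if target_gb <= existing_gb:
        raise ValueError("an increase must raise the capacity")
    return target_gb - existing_gb


def swap_container(existing_gb: int, temp_gb: int) -> int:
    """Step 5: fold the temp increase container into the existing one."""
    return existing_gb + temp_gb


gap = increase_gap(10, 20)         # request added to appSchedulingInfo
updated = swap_container(10, gap)  # capacity after the swap
```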
The problem:
If we send an increase update twice for the same container, for example an
increase from <memory:10> to <memory:12> followed by another increase with the
new target <memory:14>, the final updated capacity is nondeterministic.
Scenario one:
1: send an increase update from <memory:10> to <memory:12>.
2: the scheduler approves it and commits it, so app.liveContainers now holds
the temp increase container with capacity <memory:2>.
3: send an increase with the new target <memory:14>. A new resourceRequest with
capacity <memory:4> is added to appSchedulingInfo, and the first temp container
<memory:2> is swapped; after that, the existing container has the new capacity
<memory:12>.
4: the scheduler approves the second temp resourceRequest and allocates a
second temp container with capacity <memory:4>.
5: the second temp increase container is swapped, so the updated capacity of
the existing container is <memory:4+12> = <memory:16>, but the wanted capacity
is <memory:14>.
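The sequence above can be replayed with a small simulation. It is a sketch
under simplifying assumptions: ToyContainer, request_increase and swap are
hypothetical names, and the gap is computed against the existing container's
capacity at the time the request arrives, as described in the steps.

```python
# Toy replay of the first scenario; names are illustrative, not YARN APIs.

class ToyContainer:
    def __init__(self, memory: int) -> None:
        self.memory = memory


def request_increase(container: ToyContainer, target: int) -> int:
    # The gap is taken against the capacity the container has right now.
    return target - container.memory


def swap(container: ToyContainer, temp_memory: int) -> None:
    # The swap blindly adds the temp container, with no target check.
    container.memory += temp_memory


c = ToyContainer(10)
first_temp = request_increase(c, 12)   # approved and committed, not yet swapped
second_temp = request_increase(c, 14)  # still computed against memory 10
swap(c, first_temp)                    # existing container is now 12
swap(c, second_temp)                   # existing container overshoots the target
```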
Scenario two:
1: send an increase update from <memory:10> to <memory:12>.
2: the scheduler approves it, but the temp container with capacity <memory:2>
is still queued in commitService, waiting to be committed.
3: send an increase with the new target <memory:14>. This adds a new
resourceRequest to appSchedulingInfo, but with the same SchedulerRequestKey.
4: the first temp container is committed. app.apply calls
appSchedulingInfo.allocate to reduce the pending count, which in this situation
cancels the second increase request.
5: the first temp increase container is swapped. The updated existing
container's capacity is <memory:12>, but the wanted capacity is <memory:14>.
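The race on the shared key can be illustrated with a deliberately simplified
model. Here the pending requests are a plain dict from key to outstanding gap,
which is an assumption made for illustration and not how appSchedulingInfo
stores them; add_request and apply_commit are hypothetical names.

```python
# Toy model of the second scenario; names are illustrative, not YARN APIs.

pending = {}  # hypothetical store: request key -> outstanding gap


def add_request(key, gap):
    # A request with the same key replaces whatever was pending under it.
    pending[key] = gap


def apply_commit(key):
    # Committing an allocation clears the pending entry for that key,
    # regardless of which update put it there.
    pending.pop(key, None)


key = ("container_1", "increase")  # same key reused for both updates
memory = 10
add_request(key, 2)   # first update, 10 -> 12; approved, commit still queued
add_request(key, 4)   # second update, target 14, same key
apply_commit(key)     # late commit of the first temp container
memory += 2           # only the first temp container is ever swapped
```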
Two key points:
1: when ContainerUpdateContext.checkAndAddToOutstandingIncreases cancels the
previous increase and adds the current increase request, it uses the same
SchedulerRequestKey as before. This races with app.apply: as in scenario two,
app.apply cancels the second increase update's request.
2: ContainerUpdateContext.swapContainer does not check whether the update
target has changed.
How to fix:
1: after ContainerUpdateContext.checkAndAddToOutstandingIncreases cancels the
previous increase update, use a new SchedulerRequestKey for the current
increase update. We can add a new field, createTime, to distinguish them; the
default value of createTime is 0.
2: change ContainerUpdateContext.swapContainer to checkAndSwapContainer, which
checks whether the update target has changed and, if it has, ignores the temp
container and releases it. In scenario one, when we are about to swap the first
temp increase container, we find that the swap would give an updated capacity
of <memory:12> while the new target's capacity is <memory:14>, so we skip the
swap and release the temp container <memory:2>.
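A minimal sketch of both proposed changes, as a toy model rather than a patch:
ToyRequestKey and check_and_swap are hypothetical names, capacities are plain
integers, and "release" is reduced to returning the unchanged capacity.

```python
# Toy model of the proposed fix; names are illustrative, not YARN APIs.
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRequestKey:
    container_id: str
    create_time: int = 0  # default 0, as suggested in the proposal


def check_and_swap(existing: int, temp: int, target: int):
    """Swap only if the result matches the current target.

    Returns (capacity, swapped). A stale temp container is ignored,
    which stands in for releasing it.
    """
    if existing + temp != target:
        return existing, False
    return target, True


# The first scenario replayed with the check in place, target already 14.
after_first = check_and_swap(10, 2, 14)   # stale temp container is skipped
after_second = check_and_swap(10, 4, 14)  # matching temp container is swapped

# Two updates for one container no longer share a key.
old_key = ToyRequestKey("container_1", create_time=1)
new_key = ToyRequestKey("container_1", create_time=2)
```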