Jian He commented on YARN-1651:

Thanks Wangda, some comments on my side:
- Remove ApplicationMasterService#checkDuplicatedIncreaseDecreaseRequest
- revert ClientRMService change
- AbstractYarnScheduler#checkAndNormalizeContainerChangeRequests
 -- attempt parameter is not used
 -- this will never be null ?
if (null == sr) {
 -- pass the exception as a parameter, instead of string concatenation ? (a sketch follows below)
LOG.warn("Error happens when checking increase request:" + e
    + ". Ignoring this request");
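Something along these lines (just a sketch; checkIncreaseRequest below is a made-up placeholder, but commons-logging's Log#warn does accept the Throwable as a second argument):

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class IncreaseRequestCheckExample {
  private static final Log LOG =
      LogFactory.getLog(IncreaseRequestCheckExample.class);

  // Made-up stand-in for the patch's per-request validation.
  private static void checkIncreaseRequest(Object request) throws Exception {
    // ... validation that may throw ...
  }

  static void checkAndWarn(Object request) {
    try {
      checkIncreaseRequest(request);
    } catch (Exception e) {
      // Pass the exception as the second argument so the logger keeps the
      // stack trace, instead of concatenating it into the message string.
      LOG.warn("Error happens when checking increase request."
          + " Ignoring this request", e);
    }
  }
}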
 -- the invalid resize requests are simply ignored. How will the AM know that
its resize requests were ignored and are now lost ? The same concern applies to
the toBeRemovedRequests inside IncreaseContainerAllocator. (see the sketch below)
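One possible shape, purely a sketch with made-up names (ChangeRequest stands in for the real request type, it is not the actual API): collect the rejected requests during normalization and return them, so the caller can attach them to the AM's allocate response instead of dropping them silently.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Made-up stand-in for a container resize request.
class ChangeRequest {
  // containerId, target capability, ...
}

class ChangeRequestNormalizer {
  /**
   * Removes invalid requests from the input list and returns them, so the
   * caller can report them back to the AM rather than losing them silently.
   */
  List<ChangeRequest> normalize(List<ChangeRequest> requests) {
    List<ChangeRequest> rejected = new ArrayList<ChangeRequest>();
    for (Iterator<ChangeRequest> it = requests.iterator(); it.hasNext();) {
      ChangeRequest r = it.next();
      if (!isValid(r)) {
        it.remove();
        rejected.add(r);  // remembered, not silently ignored
      }
    }
    return rejected;
  }

  private boolean isValid(ChangeRequest r) {
    return true;  // placeholder for the real checks
  }
}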
- updateIncreaseRequests: originalResource-> prevChangeRequest; similarly for 
the logging {{original capacity = }}
- use decreaseRequest#getSchedulerNode instead of
csContext.getNode(decreaseRequest.getNodeId()) in places like:
unreserveContainerIncreaseRequest(clusterResource, app,
    csContext.getNode(decreaseRequest.getNodeId()), rmContainer);
.decreaseContainer(decreaseRequest.getContainerId(), absDelta);
- remove SchedulerApplicationAttempt#hasIncreaseRequestForContainer
- newlyIncreasedContainers will never be null, so the null check is unnecessary:
List<Container> newlyIncreasedContainers =
if (null != newlyIncreasedContainers) {
- why is the decision made to give increase requests the highest priority ? I
think we need to augment the ContainerResourceChangeRequest API to carry the
priority too, so that it can be compared against regular resource requests.
(a rough sketch follows below)
 * Try to allocate increase container first, and if we failed to allocate
 * anything, we will try to allocate regular container
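Roughly the kind of API extension I mean; getPriority below is hypothetical and not part of the current ContainerResourceChangeRequest:

import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;

/**
 * Hypothetical variant of ContainerResourceChangeRequest that also carries a
 * Priority, so the scheduler could order increase requests against regular
 * ResourceRequests instead of always trying increases first.
 */
abstract class PrioritizedContainerChangeRequest {
  public abstract ContainerId getContainerId();
  public abstract Resource getCapability();
  // Hypothetical addition: priority comparable with ResourceRequest#getPriority.
  public abstract Priority getPriority();
}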

- SchedulerContainerResourceChangeRequest: this name is so long that it hurts
code readability, since it is used in so many places. How about
SchedContainerChangeRequest, or any other concise name you can think of ?
- It is true that the node resource can change. But a node resource change is
such a rare event that the probability of hitting it here is much lower than
hitting it in the allocate call. I suggest we should also fail the allocate
call in this case. (see the sketch below)
// The reason of doing check here instead of adding increase request
// to scheduler because node's resource could be updated after
// request added.
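The kind of up-front check I have in mind, as a sketch (validateIncrease and its placement are made up; the real check would sit in the allocate path's request validation):

import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException;
import org.apache.hadoop.yarn.util.resource.Resources;

class IncreaseRequestValidation {
  /**
   * Hypothetical validation: fail the allocate call if the requested target
   * capacity does not fit within the node's total resource.
   */
  static void validateIncrease(Resource targetCapacity, Resource nodeTotal)
      throws InvalidResourceRequestException {
    if (!Resources.fitsIn(targetCapacity, nodeTotal)) {
      throw new InvalidResourceRequestException("Target capacity "
          + targetCapacity + " exceeds node resource " + nodeTotal);
    }
  }
}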
- why equal ?
if (null == request
    || reservedContainer.getContainerState() != request.getDeltaCapacity())) {

- allocateIncreaseRequestFromReservedContainer: remove the unused parameter
- I think killing a container in this case is too harsh. Nothing is wrong with
the container itself; it can keep running fine. I think we should revert the
increased resource and have some way to signal back that the increaseRequest
failed, maybe in the allocate response ? (a rough sketch follows below)
// When the container expired, and it has a pending increased request, we
// will kill the container.

new KillTransition().transition(container, event);
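A rough sketch of the alternative I am suggesting; the interface and methods below are made-up stand-ins, not the patch's RMContainerImpl code: on expiry, roll the container back to its last confirmed resource and record the failure so it can be reported to the AM, instead of killing a healthy container.

import org.apache.hadoop.yarn.api.records.Resource;

// Made-up view of the container state an expiry handler would need.
interface ResizableContainer {
  Resource getLastConfirmedResource();   // resource before the pending increase
  void setAllocatedResource(Resource resource);
  void markIncreaseExpired();            // surfaced later, e.g. in the allocate response
}

/** Sketch of an alternative to KillTransition when an increase token expires. */
class RevertIncreaseOnExpiry {
  void onIncreaseExpired(ResizableContainer container) {
    // The container itself is fine; only the unacknowledged increase is lost.
    container.setAllocatedResource(container.getLastConfirmedResource());
    container.markIncreaseExpired();
  }
}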
- I think the container.containerIncreasedAndAcquired boolean flag is not needed.
- unreserveContainerIncreaseRequest-> unreserveIncreasedContainer
- internalReleaseResource: remove unused application parameter
- fix the format 
 RMAppAttempt appAttempt =

> CapacityScheduler side changes to support increase/decrease container 
> resource.
> -------------------------------------------------------------------------------
>                 Key: YARN-1651
>                 URL: https://issues.apache.org/jira/browse/YARN-1651
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager, scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-1651-1.YARN-1197.patch, 
> YARN-1651-2.YARN-1197.patch, YARN-1651-3.YARN-1197.patch, 
> YARN-1651-4.YARN-1197.patch
