MENG DING commented on YARN-1651:

Hi, [~leftnoteasy]

I am OK with most of the replies to my comments. Thanks.

bq. It seems no need to do the null check here. When it becomes null? I prefer 
to keep it as-is and it will throw NPE if any fatal issue happens.
{{updateContainerAndNMToken}} may return null:
{code}
Container updatedContainer =
    updateContainerAndNMToken(rmContainer, false, increase);
{code}

I only mention this because {{pullNewlyAllocatedContainers()}} performs a null check for the same logic, so we may want to make the two consistent.
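For illustration, here is a minimal self-contained sketch of the consistent pattern. The class and helper below are hypothetical stand-ins for the actual YARN types, not the real code:
{code}
import java.util.ArrayList;
import java.util.List;

public class NullCheckSketch {
    // Hypothetical stand-in for the updated Container returned to the AM.
    static class Container {
        final String id;
        Container(String id) { this.id = id; }
    }

    // Stand-in for updateContainerAndNMToken(): returns null when the NM
    // token cannot be created, which callers must be prepared to handle.
    static Container updateContainerAndNMToken(String id, boolean tokenOk) {
        return tokenOk ? new Container(id) : null;
    }

    // Mirrors the pattern in pullNewlyAllocatedContainers(): skip null
    // results instead of letting them propagate to the caller.
    static List<Container> pullUpdatedContainers(List<String> ids, boolean tokenOk) {
        List<Container> updated = new ArrayList<>();
        for (String id : ids) {
            Container c = updateContainerAndNMToken(id, tokenOk);
            if (c == null) {
                continue; // same null check as pullNewlyAllocatedContainers()
            }
            updated.add(c);
        }
        return updated;
    }

    public static void main(String[] args) {
        System.out.println(pullUpdatedContainers(List.of("c1", "c2"), true).size());
        System.out.println(pullUpdatedContainers(List.of("c1", "c2"), false).size());
    }
}
{code}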

Some remaining comments:
* As you mention in the code, reserved resource-increase requests currently do not participate in the continuous-reservation-looking logic. My understanding is that if an application has reserved resources on a node for a container resource-increase request, that amount should never be unreserved in order for the application to allocate a regular container on some other node. But that doesn't seem to be the case right now; can you confirm? If so, a simple solution would be to *exclude* resources reserved for container increases when searching for a container to unreserve for a regular allocation.
{code}
unreservedContainer =
    application.findNodeToUnreserve(clusterResource, node, priority,
        <===== Don't consider resources reserved for container increase request
{code}
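A hedged sketch of that exclusion, with a hypothetical {{Reservation}} record standing in for the scheduler's reservation bookkeeping (all names here are illustrative, not the actual CapacityScheduler types):
{code}
import java.util.LinkedHashMap;
import java.util.Map;

public class UnreserveSketch {
    // Hypothetical reservation record: tracks whether the reservation was
    // made for a container-increase request or for a new container.
    static class Reservation {
        final int memoryMB;
        final boolean forIncrease;
        Reservation(int memoryMB, boolean forIncrease) {
            this.memoryMB = memoryMB;
            this.forIncrease = forIncrease;
        }
    }

    // Sketch of the proposed findNodeToUnreserve() behavior: reservations
    // held for increase requests are skipped, so they can never be
    // cannibalized to satisfy a regular container allocation.
    static String findNodeToUnreserve(Map<String, Reservation> reservedByNode) {
        for (Map.Entry<String, Reservation> e : reservedByNode.entrySet()) {
            if (e.getValue().forIncrease) {
                continue; // exclude resources reserved for container increase
            }
            return e.getKey(); // first node whose reservation is safe to release
        }
        return null; // nothing eligible to unreserve
    }

    public static void main(String[] args) {
        Map<String, Reservation> reserved = new LinkedHashMap<>();
        reserved.put("node1", new Reservation(2048, true));  // increase request
        reserved.put("node2", new Reservation(1024, false)); // regular reservation
        System.out.println(findNodeToUnreserve(reserved));
    }
}
{code}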
* I think it would be desirable to implement a {{pendingDecrease}} set in {{SchedulerApplicationAttempt}}, plus the corresponding logic, analogous to {{SchedulerApplicationAttempt.pendingRelease}}. This guards against the situation *where decrease requests are received while the RM is in the middle of recovery and has not yet received all container statuses from the NMs*.
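A rough self-contained sketch of that guard; the field name {{pendingDecrease}} follows the suggestion above, while the rest of the class is a hypothetical simplification, not the real {{SchedulerApplicationAttempt}}:
{code}
import java.util.HashSet;
import java.util.Set;

public class PendingDecreaseSketch {
    // Analogous to SchedulerApplicationAttempt.pendingRelease: decrease
    // requests received during RM recovery are parked here until the NMs
    // have reported all container statuses.
    private final Set<String> pendingDecrease = new HashSet<>();
    private boolean recovering = true;

    // Returns true if the decrease can be applied immediately; otherwise
    // the request is deferred until recovery completes.
    boolean decreaseContainer(String containerId) {
        if (recovering) {
            pendingDecrease.add(containerId);
            return false;
        }
        return true;
    }

    // Invoked once all NM container statuses have been received; returns
    // the deferred decrease requests that should now be replayed.
    Set<String> onRecoveryCompleted() {
        recovering = false;
        Set<String> deferred = new HashSet<>(pendingDecrease);
        pendingDecrease.clear();
        return deferred;
    }

    public static void main(String[] args) {
        PendingDecreaseSketch attempt = new PendingDecreaseSketch();
        System.out.println(attempt.decreaseContainer("c1")); // deferred
        System.out.println(attempt.onRecoveryCompleted());
        System.out.println(attempt.decreaseContainer("c2")); // applied now
    }
}
{code}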

* Some nits
** The comments in {{NMReportedContainerChangeIsDoneTransition}} don't seem right.
** IncreaseContainerAllocator: {{LOG.debug("  Headroom is satisifed, 
skip..");}} --> {{LOG.debug("  Headroom is not satisfied, skip..");}}

> CapacityScheduler side changes to support increase/decrease container 
> resource.
> -------------------------------------------------------------------------------
>                 Key: YARN-1651
>                 URL: https://issues.apache.org/jira/browse/YARN-1651
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager, scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-1651-1.YARN-1197.patch, 
> YARN-1651-2.YARN-1197.patch, YARN-1651-3.YARN-1197.patch
