[
https://issues.apache.org/jira/browse/YARN-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593292#comment-15593292
]
Wangda Tan commented on YARN-5716:
----------------------------------
Thanks [~jianhe] for the thorough review!
bq. how about checking whether "getNode(node.getNodeID())" equals null? I feel
that's easier to reason about for a removed node
It is also possible that the reference to the node has changed, for example when
the node's resource was updated. In that case we need to skip such a node for
safety.
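To make that concrete, here is a minimal sketch (hypothetical code, not the actual patch) of why a reference comparison catches more than a null check:
{code}
// Hypothetical sketch: a reference comparison catches both a removed
// node (current == null) and a node whose object was replaced, e.g.
// after a resource update, while a null check only catches the former.
FiCaSchedulerNode current = scheduler.getNode(node.getNodeID());
if (current != node) {
  return; // node removed or replaced in between; skip it for safety
}
{code}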
bq. This if condition can be merged into previous "if (reservedContainer !=
null) {" condition, as they are the same.
No, we cannot merge them: inside the previous {{reservedContainer != null}}
block we may reserve a new container, so the earlier check is no longer valid
by the time we reach the second one.
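A simplified sketch of the flow (helper names hypothetical) shows why the two checks must stay separate:
{code}
// Sketch: the first block can itself create a new reservation, so the
// second condition must re-read node state and cannot be merged.
RMContainer reserved = node.getReservedContainer();
if (reserved != null) {
  // May allocate from the reservation, or reserve a NEW container.
  allocateFromReservedContainer(node, reserved);
}
if (node.getReservedContainer() != null) {
  // Still (or newly) reserved after the block above; stop here.
  return;
}
{code}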
bq. Looks like one behavior change is that previously on node heartbeat, we
always satisfied the reservedContainer first; now in async scheduling, it's not
the case anymore?
It is still the same: if you look at {{allocateContainerOnSingleNode}}, we try
to satisfy the reserved container first.
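Roughly like this (a hypothetical outline, not the exact patch code; the helper names are illustrative):
{code}
// Sketch of the ordering in allocateContainerOnSingleNode: the reserved
// container is tried first, matching the old heartbeat-driven behavior.
private void allocateContainerOnSingleNode(FiCaSchedulerNode node) {
  RMContainer reserved = node.getReservedContainer();
  if (reserved != null) {
    allocateFromReservedContainer(node, reserved); // hypothetical helper
    return;
  }
  allocateOrReserveNewContainers(node); // hypothetical helper
}
{code}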
bq. PlacementSet, placement is an abstract name, how about NodeSet to be more
concrete?
I would prefer to keep "PlacementSet": since it is about "placement", we could
add more information to it later, for example racks.
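For illustration, a hypothetical sketch of the abstraction (not the committed interface):
{code}
// Sketch: "PlacementSet" is broader than "NodeSet" because it can carry
// placement information beyond plain nodes, e.g. racks in the future.
interface PlacementSet<N extends SchedulerNode> {
  Map<NodeId, N> getAllNodes(); // candidate nodes for placement
  String getPartition();        // node partition (label) of the set
  // Future: rack-level info could be added here, e.g. getRacks()
}
{code}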
bq. PlacementSetUtils.getSingleNode -> hasSingleNode
But the method needs to return the node itself when it is a
single-node placement set, so I think the current name fits better.
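Something like this hypothetical sketch:
{code}
// Sketch: the method returns the node itself (or null), not a boolean,
// so "getSingleNode" describes it better than "hasSingleNode".
public static <N extends SchedulerNode> N getSingleNode(PlacementSet<N> ps) {
  Map<NodeId, N> nodes = ps.getAllNodes();
  if (nodes != null && nodes.size() == 1) {
    return nodes.values().iterator().next();
  }
  return null; // not a single-node placement set
}
{code}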
bq. nodePartition parameter is not needed, it can be inferred from 'node'
parameter
The original purpose of adding the partition is that the node's partition could
be updated between the time a proposal is made and the time it is applied; the
parameter will be used to decide whether to reject the proposal when the node's
partition has changed. I have a separate "TODO" in FiCaSchedulerApp:
{code}
// TODO, make sure all node labels are not changed
{code}
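The check the TODO refers to would look roughly like this (a hypothetical sketch; {{ContainerProposal}} is an illustrative name, not a class from the patch):
{code}
// Sketch: reject a proposal at apply time if the node's partition
// changed between proposal creation and application.
boolean acceptProposal(ContainerProposal proposal, SchedulerNode node) {
  if (!proposal.getNodePartition().equals(node.getPartition())) {
    return false; // partition changed in between; reject for safety
  }
  return true;
}
{code}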
bq. In LeafQueue#updateCurrentResourceLimits, multiple threads will update
cachedResourceLimitsForHeadroom without synchronization
This is intentional: we want {{cachedResourceLimitsForHeadroom}} to be
up-to-date. One thread may briefly write inconsistent data, but it will soon be
corrected by other threads. Since the value is only used to give headroom hints
to applications, that is fine. The {{ResourceLimits}} field is also volatile,
so the unsynchronized update itself is safe.
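In other words (an illustrative sketch, not the actual class):
{code}
// Sketch: last-writer-wins update of a volatile reference. A briefly
// stale value is tolerated because later updates overwrite it, and it
// only feeds headroom hints back to applications.
private volatile ResourceLimits cachedResourceLimitsForHeadroom;

void updateCurrentResourceLimits(ResourceLimits limits) {
  // No lock needed: volatile guarantees visibility of the latest ref.
  this.cachedResourceLimitsForHeadroom = limits;
}
{code}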
bq. SchedulerApplicationAttempt#incNumAllocatedContainers, all the locality
statistics functionality is removed?
Oh, I missed that; will add it back in the next iteration.
Addressed all the other comments and uploaded patch ver.5.
> Add global scheduler interface definition and update CapacityScheduler to use
> it.
> ---------------------------------------------------------------------------------
>
> Key: YARN-5716
> URL: https://issues.apache.org/jira/browse/YARN-5716
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: YARN-5716.001.patch, YARN-5716.002.patch,
> YARN-5716.003.patch, YARN-5716.004.patch, YARN-5716.005.patch
>
>
> Target of this JIRA:
> - Definition of interfaces / objects which will be used by global scheduling,
> this will be shared by different schedulers.
> - Modify CapacityScheduler to use it.