Wangda Tan updated YARN-3361:
    Attachment: YARN-3361.3.patch

Thanks for your comments, [~vinodkv]/[~jianhe]:

*Main code comments from Vinod:*

bq. checkNodeLabelExpression: NPEs on labelExpression can happen?
No, and I removed the checks.

bq. FiCaSchedulerNode: exclusive, setters, getters -> exclusivePartition
They're not used by anybody; removed them.

bq. ExclusiveType renames

bq. AbstractCSQueue:
1. Change to nodePartitionToLookAt: Done
2. Now all queues check needResource
3. Renamed to hasPendingResourceRequest as suggested by Jian
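The queue-level check in items 2/3 can be sketched roughly as follows. This is a simplified, hypothetical illustration (the real logic lives in AbstractCSQueue and works on Resource/ResourceUsage objects), not the actual patch code:

```java
import java.util.*;

// Hypothetical sketch of hasPendingResourceRequest: a queue has pending work
// for a partition if any application under it still has unsatisfied pending
// resource (modeled here as memory in MB) accounted against that partition.
class QueuePendingSketch {
  static boolean hasPendingResourceRequest(
      List<Map<String, Long>> pendingPerApp, String partition) {
    for (Map<String, Long> pending : pendingPerApp) {
      if (pending.getOrDefault(partition, 0L) > 0L) {
        return true; // at least one app still needs resources on this partition
      }
    }
    return false;
  }
}
```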

bq. checkResourceRequestMatchingNodeLabel can be moved into the application?
Moved to SchedulerUtils

bq. checkResourceRequestMatchingNodeLabel nodeLabelToLookAt arg is not used 
anywhere else.
Done (merged it in SchedulerUtils.checkResourceRequestMatchingNodePartition)
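As a rough sketch of the matching rule (simplified and hypothetical; the real SchedulerUtils.checkResourceRequestMatchingNodePartition operates on the request's label expression and the scheduling mode): a request matches a node's partition when it asks for that partition, and under non-exclusive allocation an unlabeled request may also match a labeled node:

```java
// Hypothetical simplification of the node-partition matching check; names and
// the boolean flag are illustrative, not the patch's actual signature.
class PartitionMatchSketch {
  static final String NO_LABEL = ""; // default (empty) partition

  static boolean matches(String requestedPartition, String nodePartition,
      boolean respectPartitionExclusivity) {
    if (requestedPartition.equals(nodePartition)) {
      return true; // exact partition match is always allowed
    }
    // Non-exclusive mode: an unlabeled request may borrow labeled capacity.
    return !respectPartitionExclusivity && requestedPartition.equals(NO_LABEL);
  }
}
```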

bq. addNonExclusiveSchedulingOpportunity
Renamed to reset/addMissedNonPartitionedRequestSchedulingOpportunity
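The counter behind reset/addMissedNonPartitionedRequestSchedulingOpportunity (and the setCount -> add(Priority) point further below) can be sketched as a per-priority miss counter. Names and types here are illustrative; the real code keys on Priority objects per application:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative per-priority counter: each time a non-partitioned request at a
// priority misses a scheduling opportunity we increment its count, and we
// reset once the request is satisfied.
class MissedOpportunitySketch {
  private final Map<Integer, Integer> missedByPriority = new HashMap<>();

  int addMissedNonPartitionedRequestSchedulingOpportunity(int priority) {
    return missedByPriority.merge(priority, 1, Integer::sum);
  }

  void resetMissedNonPartitionedRequestSchedulingOpportunity(int priority) {
    missedByPriority.remove(priority);
  }

  int get(int priority) {
    return missedByPriority.getOrDefault(priority, 0);
  }
}
```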

bq. It seems like we are not putting absolute max-capacities on the individual 
queues when not-respecting-partitions. Describe why? Similarly, describe as to 
why user-limit-factor is ignored in the not-respecting-paritions mode.

*Test code comments from Vinod:*
bq. testNonExclusiveNodeLabelsAllocationIgnoreAppSubmitOrder

bq. testNonExclusiveNodeLabelsAllocationIgnorePriority
Rename to testPreferenceOfNeedyPrioritiesUnderSameAppTowardsNodePartitions
bq. Actually, now that I rename it that way, this may not be the right 
behavior. Not respecting priorities within an app can result in scheduling 
This will not lead to a deadlock, because we count resource usage separately 
under each partition: priority=1 goes first on partition=y before priority=0 
is fully satisfied only because priority=1 is the lowest priority that asks 
for partition=y.
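The no-deadlock argument rests on usage being accounted per partition, so consumption on partition=y never counts against the default partition's headroom. A minimal, hypothetical sketch of that bookkeeping (the real code uses ResourceUsage/Resource objects):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: resource usage tracked separately per partition, so a
// low-priority request consuming partition=y does not affect what a
// higher-priority request can still get from the default partition.
class PerPartitionUsageSketch {
  private final Map<String, Long> usedMb = new HashMap<>();

  void allocate(String partition, long mb) {
    usedMb.merge(partition, mb, Long::sum);
  }

  long getUsed(String partition) {
    return usedMb.getOrDefault(partition, 0L);
  }
}
```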

bq. testLabeledResourceRequestsGetPreferrenceInHierarchyOfQueue
Renamed to testPreferenceOfQueuesTowardsNodePartitions

bq. testNonLabeledQueueUsesLabeledResource

bq. Let's move all these node-label related tests into their own test-case.
Moved to TestNodeLabelContainerAllocation

Added more tests:
1. Added testAMContainerAllocationWillAlwaysBeExclusive to make sure the AM 
container will always be allocated exclusively.
2. Added testQueueMaxCapacitiesWillNotBeHonoredWhenNotRespectingExclusivity to 
make sure max-capacities on individual queues are ignored when allocating 
while ignoring exclusivity.

*Main code comments from Jian:*
bq. Merge queue#needResource and application#needResource
Done, now moved common implementation to 

bq. Some methods like canAssignToThisQueue where both nodeLabels and 
exclusiveType are passed, it may be simplified by passing the current 
partitionToAllocate to simplify the internal if/else check.
Actually, it will not simplify the logic much; I checked, and there are only a 
few places that can leverage nodePartitionToLookAt. I prefer to keep the 
semantics of 

bq. The following may be incorrect, as the current request may be not the AM 
container request, though null == rmAppAttempt.getMasterContainer()
I understand masterContainer could be initialized asynchronously in RMApp, but 
the interval can be ignored; doing the null check here makes sure the AM 
container doesn't get allocated non-exclusively.
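The null check described above can be sketched as a guard (names here are hypothetical; the real check reads rmAppAttempt.getMasterContainer()): while the attempt has no master container yet, the outstanding request may still be the AM container request, so non-exclusive allocation is disallowed for it:

```java
// Hypothetical guard: AM containers must always be allocated exclusively, so
// while the master container is still null we conservatively refuse to ignore
// partition exclusivity for this attempt's requests.
class AmExclusivityGuardSketch {
  static boolean canIgnorePartitionExclusivity(Object masterContainer) {
    return masterContainer != null;
  }
}
```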

bq. below if/else can be avoided if passing the nodePartition into 

bq. the second limit won’t be hit?
Yeah, it will not be hit, but setting it to "maxUserLimit" will enhance 

bq. nonExclusiveSchedulingOpportunities#setCount -> add(Priority)

Attached new patch (ver.3)

> CapacityScheduler side changes to support non-exclusive node labels
> -------------------------------------------------------------------
>                 Key: YARN-3361
>                 URL: https://issues.apache.org/jira/browse/YARN-3361
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-3361.1.patch, YARN-3361.2.patch, YARN-3361.3.patch
> According to the design doc attached in YARN-3214, we need to implement the 
> following logic in CapacityScheduler:
> 1) When allocating a resource request with no node-label specified, it should 
> get preferentially allocated to nodes without labels.
> 2) When there are available resources on a node with a label, they can be 
> used by applications in the following order:
> - Applications under queues which can access the label and ask for the same 
> labeled resource. 
> - Applications under queues which can access the label and ask for 
> non-labeled resource.
> - Applications under queues which cannot access the label and ask for 
> non-labeled resource.
> 3) Expose necessary information that can be used by preemption policy to make 
> preemption decisions.

This message was sent by Atlassian JIRA
