[ 
https://issues.apache.org/jira/browse/YARN-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216641#comment-15216641
 ] 

Sunil G commented on YARN-4822:
-------------------------------

Thanks [~leftnoteasy].
Yes, that makes sense.
1. In that case, could we mark the interface visibility as Unstable/Evolving for the 
{{CapacitySchedulerPreemptionUtils}} class, since it's public?

2. I am also wondering whether we can make {{CapacitySchedulerPreemptionContext}} 
visible with a limited scope. I could then query some useful read-only interfaces 
from the Scheduler for metrics purposes, e.g. {{getKillableContainers}}. I am 
still thinking about how much information we can fetch from the context rather 
than poking PCPP directly, from a metrics/logging or debuggability perspective. 
Thoughts?
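A rough sketch of what I mean (the interface and method names here are just placeholders for illustration, not the actual YARN API):

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical read-only facade over the preemption context: expose only
// metric/debug-friendly getters, no mutators, so callers outside PCPP
// cannot disturb scheduler state.
interface ReadOnlyPreemptionContext {
  Collection<String> getKillableContainers(String partition);
}

class SimplePreemptionContext implements ReadOnlyPreemptionContext {
  private final List<String> killable;

  SimplePreemptionContext(List<String> killable) {
    this.killable = killable;
  }

  @Override
  public Collection<String> getKillableContainers(String partition) {
    // Return an unmodifiable view so callers cannot mutate the list.
    return Collections.unmodifiableList(killable);
  }
}
```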

3. In {{selectCandidates}}, we skip containers which are already selected, and we 
also check whether the container is already present in {{preemptMap}}. This check 
now happens for every container; maybe some optimization can be planned for the 
next version?
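One possible direction for that optimization (a stand-in sketch, not the patch's code): keep every already-chosen container in a single hash-based set, so the per-container "already picked?" test is one O(1) lookup instead of two separate checks:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: track containers already chosen (whether selected earlier in this
// round or already present in preemptMap) in one set, so each container is
// tested exactly once with a single O(1) membership check.
class SelectedTracker {
  private final Set<String> alreadyPicked = new HashSet<>();

  // Returns true if the container was newly marked, false if it was
  // picked before (Set.add() reports prior membership for us).
  boolean tryMark(String containerId) {
    return alreadyPicked.add(containerId);
  }
}
```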

4. If {{tryPreemptContainerAndDeductResToObtain}} returns false because the 
container is already present in {{preemptMap}}, we can continue trying other 
containers of the same app, correct? Currently we break. Maybe I am wrong; 
please share if I missed some other factor.
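In other words (a minimal illustrative loop with placeholder types, not the patch's code), a container already in the map could be skipped with {{continue}} so the rest of that app's containers still get considered:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class PreemptLoopSketch {
  // Returns containers newly chosen for preemption. A container already in
  // 'preemptMap' is skipped with 'continue' rather than 'break', so the
  // same app's remaining containers are still examined.
  static List<String> selectFrom(List<String> appContainers, Set<String> preemptMap) {
    List<String> newlySelected = new ArrayList<>();
    for (String c : appContainers) {
      if (preemptMap.contains(c)) {
        continue; // skip this one, keep trying the app's other containers
      }
      newlySelected.add(c);
    }
    return newlySelected;
  }
}
```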

5.
{code}
Resource maxAMCapacityForThisQueue = Resources.multiply(
    Resources.multiply(clusterResource,
        leafQueue.getAbsoluteCapacity()),
    leafQueue.getMaxAMResourcePerQueuePercent());
{code}

I think we need to consider the AM resource percentage per partition as well. I 
will raise another ticket if we plan to handle it separately.
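For illustration, the per-partition variant would just multiply by the partition-specific values instead of the queue-wide ones (plain doubles here; the per-partition accessors are assumptions, not actual LeafQueue API):

```java
// Sketch: max AM capacity for one partition of a leaf queue, i.e.
//   partitionResource * absCapacity(partition) * maxAMPercent(partition)
// mirroring the queue-wide formula above but with per-partition inputs.
class AmCapacitySketch {
  static double maxAMCapacityForPartition(double partitionResource,
      double absCapacityForPartition, double maxAMPercentForPartition) {
    return partitionResource * absCapacityForPartition * maxAMPercentForPartition;
  }
}
```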

Some minor nits:
In {{CapacitySchedulerPreemptionUtils#getResToObtainByPartitionForLeafQueue}}:
1. {{resToObtain}} ==> {{resToObtainByPartition}}
2. The comment here could be improved:
{{// Only add resToObtain when it >= 0}} ==> {{// Only add resToObtain when actuallyToBePreempted resource >= 0}}

3. In {{tryPreemptContainerAndDeductResToObtain}}, could you please add the 
queue name to the debug log?





> Refactor existing Preemption Policy of CS for easier adding new approach to 
> select preemption candidates
> --------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4822
>                 URL: https://issues.apache.org/jira/browse/YARN-4822
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-4822.1.patch, YARN-4822.2.patch, YARN-4822.3.patch, 
> YARN-4822.4.patch, YARN-4822.5.patch, YARN-4822.6.patch
>
>
> Currently, ProportionalCapacityPreemptionPolicy has hard-coded logic to 
> select candidates to be preempted (based on FIFO order of 
> applications/containers). It's not simple to add new candidate-selection 
> logics, such as preemption for large containers, intra-queue fairness/policy, 
> etc.
> In this JIRA, I propose to do following changes:
> 1) Cleanup code bases, consolidate current logic into 3 stages:
> - Compute ideal sharing of queues
> - Select to-be-preempt candidates
> - Send preemption/kill events to scheduler
> 2) Add a new interface: {{PreemptionCandidatesSelectionPolicy}} for the above 
> "select to-be-preempt candidates" part. Move the existing candidate-selection 
> logic to {{FifoPreemptionCandidatesSelectionPolicy}}. 
> 3) Allow multiple PreemptionCandidatesSelectionPolicies to work together in a 
> chain. A preceding PreemptionCandidatesSelectionPolicy has higher priority to 
> select candidates, and a later PreemptionCandidatesSelectionPolicy can make 
> decisions according to already-selected candidates and pre-computed queue 
> ideal shares of resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
