[ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102135#comment-15102135
 ] 

Sunil G commented on YARN-4108:
-------------------------------

Thanks [~leftnoteasy] for the clarifications and thanks [~eepayne] for the 
inputs. I skimmed through the patch and it looks fine.

bq. It would be interesting to know what your thoughts are on making further 
modifications to PCPP to make more informed choices about which containers to 
kill
As Wangda mentioned, I think we can discuss this point in the new JIRA. Off the 
top of my head, there are a few use cases.
We currently select containers based on priority/submission time. But some 
containers might be *costly* for the AM, and the AM could try to save those 
containers as far as possible (they will still be preempted if the AM cannot 
spare any other container for the demand from PCPP). *Time remaining / % of 
completion* are also good criteria. Not all of these can be plugged in 
together, so we can choose the set of policies needed for a given use case. 
This is a very rough/raw idea for now; we need to discuss and refine it further 
(because proto changes may be needed for AM-RM communication).
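
To make the idea a bit more concrete, here is a minimal sketch of pluggable 
selection policies. Everything in it is hypothetical: the Candidate fields, the 
policy names, and the assumption that a larger priority number means a less 
important container are illustrative only and are not taken from the patch or 
from the existing PCPP code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these types exist in YARN.
final class PreemptionSelectionSketch {

    /** Minimal stand-in for a running container; all fields are illustrative. */
    static final class Candidate {
        final String id;
        final int priority;        // assumption: larger number = less important
        final long submitTimeMs;   // submission time of the owning application
        final double amCost;       // assumed AM-reported cost of losing it, 0..1
        final double fractionDone; // assumed completed fraction of work, 0..1
        final int memoryMb;

        Candidate(String id, int priority, long submitTimeMs,
                  double amCost, double fractionDone, int memoryMb) {
            this.id = id;
            this.priority = priority;
            this.submitTimeMs = submitTimeMs;
            this.amCost = amCost;
            this.fractionDone = fractionDone;
            this.memoryMb = memoryMb;
        }
    }

    // Least important first; ties broken by most recently submitted first.
    static final Comparator<Candidate> BY_PRIORITY_THEN_SUBMISSION = (a, b) -> {
        int byPriority = Integer.compare(b.priority, a.priority);
        return byPriority != 0
            ? byPriority
            : Long.compare(b.submitTimeMs, a.submitTimeMs);
    };

    // Containers that are cheap for the AM to lose are preempted first.
    static final Comparator<Candidate> CHEAPEST_FOR_AM_FIRST =
        (a, b) -> Double.compare(a.amCost, b.amCost);

    // Containers with the least completed work are preempted first.
    static final Comparator<Candidate> LEAST_COMPLETE_FIRST =
        (a, b) -> Double.compare(a.fractionDone, b.fractionDone);

    /**
     * Orders candidates by the given policy and picks them until the demand
     * is covered. Costly containers are only deferred, never exempted: they
     * are still picked if the demand cannot be met otherwise.
     */
    static List<Candidate> select(List<Candidate> candidates,
                                  Comparator<Candidate> policy,
                                  int demandMb) {
        List<Candidate> ordered = new ArrayList<>(candidates);
        ordered.sort(policy);
        List<Candidate> chosen = new ArrayList<>();
        int freedMb = 0;
        for (Candidate c : ordered) {
            if (freedMb >= demandMb) {
                break;
            }
            chosen.add(c);
            freedMb += c.memoryMb;
        }
        return chosen;
    }
}
```

The policy is passed in as a Comparator so that one policy can be chosen per 
use case without touching the selection loop. How the AM would report cost or 
completion to the RM is left open here, since that is where the proto changes 
would come in.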

bq.I understand this could lead to unnecessary add-container-to-preempt-list 
event send to AM, but I think it's better than excessive killing containers
I think this relaxation can be taken. Looks fine.

bq.From PreemptionManager,
{code}private Set<ContainerId> killableContainers = new HashSet<>();{code}
I think we can avoid this set. Rather, we can verify this from PreemptionEntity 
itself.
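
A rough sketch of what I mean, under the assumption that each tracked entity 
can carry its own killable flag. EntitySketch below is a hypothetical stand-in 
and is not the PreemptionEntity class from the patch; the real class may expose 
this state differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; EntitySketch is NOT the PreemptionEntity from the patch.
final class KillableLookupSketch {

    static final class EntitySketch {
        final String containerId;
        boolean killable; // assumed flag kept on the entity itself

        EntitySketch(String containerId) {
            this.containerId = containerId;
        }
    }

    // Single source of truth: no separate Set of killable container ids.
    private final Map<String, EntitySketch> entities = new HashMap<>();

    void track(String containerId) {
        entities.putIfAbsent(containerId, new EntitySketch(containerId));
    }

    void markKillable(String containerId) {
        EntitySketch e = entities.get(containerId);
        if (e != null) {
            e.killable = true;
        }
    }

    boolean isKillable(String containerId) {
        EntitySketch e = entities.get(containerId);
        return e != null && e.killable;
    }

    void remove(String containerId) {
        entities.remove(containerId);
    }
}
```

With the flag on the entity there is only one structure to update when a 
container is removed, so the two cannot get out of sync.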

bq.A queue who is using more than max-capacity, and it has killable container, 
we will try to kill containers for such queues to make sure it doesn't violate 
max-capacity
Now, along with PCPP, preemption will also happen in the above case. I think we 
can add more detailed diagnostics here that give the reason for preemption 
(for example, that max-capacity is violated). It will be helpful while debugging.
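
As an illustration only, the diagnostics could carry an explicit reason. The 
Reason values, the method name, and the message wording below are all 
hypothetical and are not part of the patch.

```java
import java.util.Locale;

// Hypothetical sketch; the enum values and message text are assumptions.
final class PreemptionDiagnosticsSketch {

    enum Reason { INTER_QUEUE_DEMAND, MAX_CAPACITY_VIOLATED }

    /** Builds a human-readable explanation of why a container was preempted. */
    static String diagnostic(Reason reason, String containerId, String queue,
                             float usedPct, float maxCapacityPct) {
        switch (reason) {
            case MAX_CAPACITY_VIOLATED:
                return String.format(Locale.ROOT,
                    "Container %s preempted: queue %s uses %.1f%%, "
                        + "above its max-capacity of %.1f%%",
                    containerId, queue, usedPct, maxCapacityPct);
            case INTER_QUEUE_DEMAND:
            default:
                return String.format(Locale.ROOT,
                    "Container %s preempted: queue %s must release resources "
                        + "for demand from another queue",
                    containerId, queue);
        }
    }
}
```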

Although it is outside the scope of this JIRA, it would help to add a log 
summary after each PCPP round, such as:
>Demand is 8024 for queueA
>> Unreserved 2GB resource of application <appId> from node <nodeId>
>> Killed 3 nos of 2GB container of application <appId> from queueB
>> Killed 2 nos of 3GB container of application <appId> from queueB
A log like this would make each preemption round much easier to follow. If that 
is fine, we can add it separately after this ticket.
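
A minimal sketch of a per-round summary that produces lines in the format 
above. The class and method names are hypothetical, and where the scheduler 
would actually call them is not decided here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of the patch or of the existing PCPP code.
final class PreemptionRoundSummarySketch {
    private final String demandingQueue;
    private final long demand;
    private final List<String> actions = new ArrayList<>();

    PreemptionRoundSummarySketch(String demandingQueue, long demand) {
        this.demandingQueue = demandingQueue;
        this.demand = demand;
    }

    /** Records that a reservation was released on a node. */
    void unreserved(int sizeGb, String appId, String nodeId) {
        actions.add(String.format(
            "Unreserved %dGB resource of application %s from node %s",
            sizeGb, appId, nodeId));
    }

    /** Records that containers of one size were killed for an application. */
    void killed(int count, int sizeGb, String appId, String fromQueue) {
        actions.add(String.format(
            "Killed %d nos of %dGB container of application %s from %s",
            count, sizeGb, appId, fromQueue));
    }

    /** Renders the demand line followed by one line per recorded action. */
    String render() {
        StringBuilder sb = new StringBuilder();
        sb.append(">Demand is ").append(demand)
          .append(" for ").append(demandingQueue).append('\n');
        for (String action : actions) {
            sb.append(">> ").append(action).append('\n');
        }
        return sb.toString();
    }
}
```

One summary object would be created per round and logged once at the end, so 
the demand and the actions taken for it appear together in the log.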

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4108
>                 URL: https://issues.apache.org/jira/browse/YARN-4108
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, 
> YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch
>
>
> This is sibling JIRA for YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*:
> 1) Can handle case of user-limit preemption
> 2) Can handle case of resource placement requirements, such as: hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross user preemption (YARN-2113), 
> cross application preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
