[
https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257292#comment-15257292
]
Wangda Tan commented on YARN-4390:
----------------------------------
[~sunilg],
Thanks for comments:
bq. 1. In
CapacitySchedulerPreemptionUtils#deductPreemptableResourcesBasedSelectedCandidates,
partition is retrieved from SchedulerNode...
Retriving it from SchedulerNode is because scheduler handles node label changes
asynchronizely, and preemption policy should make decision based on scheduler's
state. And it doesn't have bad performance issue as well.
bq. 2. context.getQueueByPartition has a code path flow to return NULL...
Modified patch, now throw YarnRuntimeException when null happens and handled it
outside. (skips the preemption run)
bq. 3 3. From FifoCandidatesSelector,...
This is a good point. However, since we have different ways to calculate ideal
resource allocation, which belongs to selector, such as:
{code}
preemptableAmountCalculator.computeIdealAllocation(clusterResource,
totalPreemptionAllowed);
{code}
Deduct resources should happen after ideal allocation as well.
bq. 4. One doubt about the newly changed PreemptableResourceCalculator ctor...
Since PreemptableResourceCalculator belongs to Selector, PreemptionContext
doesn't know what's the correct value of
considersReservedResourceWhenCalculateIdeal.
Added a simple Java doc to PreemptableResourceCalculator constructor for more
details.
bq. 5. getNodesForPreemption in ReservedContainerCandidatesSelector has to scan
all nodes in cluster
Jian has similar concern, will run benchmark tests for this.
bq. 6. In ReservedContainerCandidatesSelector, containers are sorted against
launch time.
Launch time may not be correct because we should respect FIFO. I think we
should consider more about this, for example, application priority, etc. In
latest patch I updated it to sort based on containerId (reverse order), we can
do more work for it in a separate patch
bq. 7. As I see RMContainer is read-only interface. I feel setQueueName can be
moved out from this interface and place in RMContainer...
Done
bq. Minor comments ..
All addressed.
> Do surgical preemption based on reserved container in CapacityScheduler
> -----------------------------------------------------------------------
>
> Key: YARN-4390
> URL: https://issues.apache.org/jira/browse/YARN-4390
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler
> Affects Versions: 3.0.0, 2.8.0, 2.7.3
> Reporter: Eric Payne
> Assignee: Wangda Tan
> Attachments: YARN-4390-design.1.pdf, YARN-4390-test-results.pdf,
> YARN-4390.1.patch, YARN-4390.2.patch, YARN-4390.3.branch-2.patch,
> YARN-4390.3.patch, YARN-4390.4.patch, YARN-4390.5.patch, YARN-4390.6.patch
>
>
> There are multiple reasons why preemption could unnecessarily preempt
> containers. One is that an app could be requesting a large container (say
> 8-GB), and the preemption monitor could conceivably preempt multiple
> containers (say 8, 1-GB containers) in order to fill the large container
> request. These smaller containers would then be rejected by the requesting AM
> and potentially given right back to the preempted app.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)