[
https://issues.apache.org/jira/browse/YARN-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963202#comment-15963202
]
Arun Suresh edited comment on YARN-6443 at 4/10/17 5:11 PM:
------------------------------------------------------------
bq. Ah, so this apparently is describing a problem that only can occur if
scheduler keys are being used?
So, if all requests are made with different allocIds, then yes, there is
performance degradation. On the flip side, if no allocIds were used, you might
improve performance, but at the cost of:
# No way of matching a returned container to a request
# No way of asking for different Resource sizes on the same Node.
So, I wouldn't say it is a bug per se, since it is always possible for users to
NOT specify allocationId, and it would revert to the old behavior.
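To make point 1 above concrete, here is a minimal sketch of how an AM can tie
returned containers back to requests via the allocationRequestId. Variable
names like {{outstanding}} and {{allocateResponse}} are illustrative, not part
of any API; this is a sketch, not the exact client code.
{noformat}
// Sketch only: ties an allocated container back to the request that
// produced it using the allocationRequestId carried on both sides.
Map<Long, ResourceRequest> outstanding = new HashMap<>();

ResourceRequest req = ResourceRequest.newInstance(
    Priority.newInstance(1), ResourceRequest.ANY,
    Resource.newInstance(2048, 2), 1);
req.setAllocationRequestId(42L);      // caller-chosen id
outstanding.put(42L, req);

// ... after an allocate() round trip, allocateResponse holds the RM's answer ...
for (Container c : allocateResponse.getAllocatedContainers()) {
  ResourceRequest matched = outstanding.get(c.getAllocationRequestId());
  // without an allocationRequestId this reverse mapping is ambiguous
}
{noformat}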
That said, I think we should change the heading of the JIRA. On further
analysis, we find that it is possible to improve performance without affecting
priority order. The confusion arose, as you pointed out, since we initially
included the allocId in the priority ordering, which we should not have. But we
still need to fix the performance problem for users who want Containers matched
to requests.
For context, when we were initially prototyping the SchedulerKey feature, we
had considered pushing the allocationId down into the
{{AppSchedulingInfo::requests}} data structure as follows:
1. Pre-allocationId state of the map:
{noformat}
requests = Map<Priority, Map<ResourceName, ResourceRequests>>
{noformat}
2. Initial Consideration:
{noformat}
requests = Map<Priority, Map<ResourceName, Map<AllocId, ResourceRequests>>>
{noformat}
3. Current Situation:
{noformat}
requests = Map<SchedulerKey, Map<ResourceName, ResourceRequests>>
where
SchedulerKey = struct{Priority, AllocId}
{noformat}
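For reference, option 3's composite key looks roughly like the sketch below.
Class and field names are simplified here, not the exact Hadoop classes, and
the tiebreak on allocId is just one possible choice.
{noformat}
// Simplified sketch of the composite key in option 3. Ordering is by
// Priority first; allocationRequestId only distinguishes requests that
// share the same Priority.
final class SchedulerKey implements Comparable<SchedulerKey> {
  final Priority priority;         // org.apache.hadoop.yarn.api.records.Priority
  final long allocationRequestId;

  SchedulerKey(Priority priority, long allocationRequestId) {
    this.priority = priority;
    this.allocationRequestId = allocationRequestId;
  }

  @Override
  public int compareTo(SchedulerKey other) {
    int cmp = priority.compareTo(other.priority);
    return cmp != 0 ? cmp
        : Long.compare(allocationRequestId, other.allocationRequestId);
  }
}

// requests = Map<SchedulerKey, Map<ResourceName, ResourceRequest>>
Map<SchedulerKey, Map<String, ResourceRequest>> requests = new TreeMap<>();
{noformat}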
2 and 3 above are actually somewhat equivalent in terms of iteration cost,
because you must still iterate over all the outstanding allocReqIds on a node
heartbeat for the ANY case. For relaxLocality = false, 2 has a slight edge over
3, since the Requests for that node will be considered first, but in both cases
all allocReqIds will be examined by the scheduler inner loop.
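To make the iteration cost concrete, the per-node-heartbeat inner loop under 2
or 3 looks roughly like the following. Method names such as
{{getSchedulerKeys}} and {{tryAssign}} are illustrative, not the exact
scheduler code.
{noformat}
// Sketch of why 2 and 3 cost roughly the same per heartbeat: for the ANY
// (off-switch) case, every outstanding key (i.e. every allocReqId) is
// still visited for the heartbeating node.
for (SchedulerKey key : app.getSchedulerKeys()) {   // one entry per allocReqId
  ResourceRequest nodeLocal = app.getResourceRequest(key, node.getNodeName());
  ResourceRequest rackLocal = app.getResourceRequest(key, node.getRackName());
  ResourceRequest offSwitch = app.getResourceRequest(key, ResourceRequest.ANY);
  // even with relaxLocality == false, the loop still touches every key
  // before it can decide there is nothing to assign on this node
  tryAssign(node, key, nodeLocal, rackLocal, offSwitch);
}
{noformat}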
My understanding ([~hrsharma] can correct me) is that we can prune the list of
examined SchedulerKeys to consider only the Requests for the node, and consider
the rack and ANY requests based on whether those SchedulerKeys have crossed the
missed opportunity threshold; see the sketch below.
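Roughly, the pruning would look something like this (purely illustrative, not
the actual patch; {{getSchedulingOpportunities}} and the threshold name are
assumptions):
{noformat}
// Sketch of the pruning idea: only keys that have a request on the
// heartbeating node are examined eagerly; rack/ANY requests for the
// remaining keys are considered only once those keys have crossed the
// missed-opportunity threshold.
for (SchedulerKey key : app.getSchedulerKeys()) {
  boolean hasNodeLocalRequest =
      app.getResourceRequest(key, node.getNodeName()) != null;
  boolean crossedThreshold =
      app.getSchedulingOpportunities(key) > missedOpportunityThreshold;
  if (!hasNodeLocalRequest && !crossedThreshold) {
    continue;   // prune: skip rack/ANY evaluation for this key for now
  }
  tryAssign(node, key);
}
{noformat}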
This will probably be easier to explain once we have a patch, which we will try
to attach here shortly.
> Allow for Priority order relaxing in favor of improved node/rack locality
> --------------------------------------------------------------------------
>
> Key: YARN-6443
> URL: https://issues.apache.org/jira/browse/YARN-6443
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler, fairscheduler
> Reporter: Arun Suresh
> Assignee: Hitesh Sharma
>
> Currently the Schedulers examine an application's pending Requests in Priority
> order. This JIRA proposes to introduce a flag (either via the
> ApplicationMasterService::registerApplication() or via some Scheduler
> configuration) to favor an ordering that is biased to the node that is
> currently heartbeating by relaxing the priority constraint.