[
https://issues.apache.org/jira/browse/YARN-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963202#comment-15963202
]
Arun Suresh edited comment on YARN-6443 at 4/10/17 5:11 PM:
------------------------------------------------------------
bq. Ah, so this apparently is describing a problem that only can occur if
scheduler keys are being used?
So, if all requests are made with different allocIds, then yes, there is
performance degradation. On the flip side, if no allocIds were used, you might
improve performance, but at the cost of:
# No way of matching a returned container to a request
# No way of asking for different Resource sizes on the same Node.
So, I wouldn't say it is a bug per se, since it is always possible for users to
NOT specify allocationId, and it would revert to the old behavior.
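To make point 1 above concrete, here is a minimal sketch of how an AM can tie
returned containers back to requests via the allocationRequestId. Variable
names like {{outstanding}} and {{allocateResponse}} are illustrative, not part
of any API; this is a sketch, not the exact client code.
{noformat}
// Sketch only: ties an allocated container back to the request that
// produced it using the allocationRequestId carried on both sides.
Map<Long, ResourceRequest> outstanding = new HashMap<>();

ResourceRequest req = ResourceRequest.newInstance(
    Priority.newInstance(1), ResourceRequest.ANY,
    Resource.newInstance(2048, 2), 1);
req.setAllocationRequestId(42L);      // caller-chosen id
outstanding.put(42L, req);

// ... after an allocate() round trip, allocateResponse holds the RM's answer ...
for (Container c : allocateResponse.getAllocatedContainers()) {
  ResourceRequest matched = outstanding.get(c.getAllocationRequestId());
  // without an allocationRequestId this reverse mapping is ambiguous
}
{noformat}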
That said, I think we should change the heading of the JIRA. On further
analysis, we find that it is possible to improve performance without affecting
priority order. The confusion arose, as you pointed out, since we initially
included the allocId in the priority ordering, which we should not have. But we
still need to fix the performance problem for users who want Containers matched
to requests.
For context, when we were initially prototyping the SchedulerKey feature, we
had considered pushing the allocationId down into the
{{AppSchedulingInfo::requests}} data structure as follows:
1. Pre-allocationId state of the map:
{noformat}
requests = Map<Priority, Map<ResourceName, ResourceRequests>>
{noformat}
2. Initial Consideration:
{noformat}
requests = Map<Priority, Map<ResourceName, Map<AllocId, ResourceRequests>>>
{noformat}
3. Current Situation:
{noformat}
requests = Map<SchedulerKey, Map<ResourceName, ResourceRequests>>
where
SchedulerKey = struct{Priority, AllocId}
{noformat}
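For reference, option 3's composite key looks roughly like the sketch below.
Class and field names are simplified here, not the exact Hadoop classes, and
the tiebreak on allocId is just one possible choice.
{noformat}
// Simplified sketch of the composite key in option 3. Ordering is by
// Priority first; allocationRequestId only distinguishes requests that
// share the same Priority.
final class SchedulerKey implements Comparable<SchedulerKey> {
  final Priority priority;         // org.apache.hadoop.yarn.api.records.Priority
  final long allocationRequestId;

  SchedulerKey(Priority priority, long allocationRequestId) {
    this.priority = priority;
    this.allocationRequestId = allocationRequestId;
  }

  @Override
  public int compareTo(SchedulerKey other) {
    int cmp = priority.compareTo(other.priority);
    return cmp != 0 ? cmp
        : Long.compare(allocationRequestId, other.allocationRequestId);
  }
}

// requests = Map<SchedulerKey, Map<ResourceName, ResourceRequest>>
Map<SchedulerKey, Map<String, ResourceRequest>> requests = new TreeMap<>();
{noformat}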
2 and 3 above are actually somewhat equivalent in terms of iteration cost,
because you must still iterate over all the outstanding allocReqIds on a node
heartbeat for the ANY case. For relaxLocality = false, 2 has a slight edge over
3, since the Requests for that node will be considered first, but in both cases
all allocReqIds will be examined by the scheduler inner loop.
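To make the iteration cost concrete, the per-node-heartbeat inner loop under 2
or 3 looks roughly like the following. Method names such as
{{getSchedulerKeys}} and {{tryAssign}} are illustrative, not the exact
scheduler code.
{noformat}
// Sketch of why 2 and 3 cost roughly the same per heartbeat: for the ANY
// (off-switch) case, every outstanding key (i.e. every allocReqId) is
// still visited for the heartbeating node.
for (SchedulerKey key : app.getSchedulerKeys()) {   // one entry per allocReqId
  ResourceRequest nodeLocal = app.getResourceRequest(key, node.getNodeName());
  ResourceRequest rackLocal = app.getResourceRequest(key, node.getRackName());
  ResourceRequest offSwitch = app.getResourceRequest(key, ResourceRequest.ANY);
  // even with relaxLocality == false, the loop still touches every key
  // before it can decide there is nothing to assign on this node
  tryAssign(node, key, nodeLocal, rackLocal, offSwitch);
}
{noformat}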
My understanding ([~hrsharma] can correct me) is that we can prune the list of
examined SchedulerKeys to consider only the Requests for the node, and consider
the rack and ANY requests based on whether those SchedulerKeys have crossed the
missed opportunity threshold; see the sketch below.
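Roughly, the pruning would look something like this (purely illustrative, not
the actual patch; {{getSchedulingOpportunities}} and the threshold name are
assumptions):
{noformat}
// Sketch of the pruning idea: only keys that have a request on the
// heartbeating node are examined eagerly; rack/ANY requests for the
// remaining keys are considered only once those keys have crossed the
// missed-opportunity threshold.
for (SchedulerKey key : app.getSchedulerKeys()) {
  boolean hasNodeLocalRequest =
      app.getResourceRequest(key, node.getNodeName()) != null;
  boolean crossedThreshold =
      app.getSchedulingOpportunities(key) > missedOpportunityThreshold;
  if (!hasNodeLocalRequest && !crossedThreshold) {
    continue;   // prune: skip rack/ANY evaluation for this key for now
  }
  tryAssign(node, key);
}
{noformat}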
This will probably be easier to explain once we have a patch, which we will try
to attach here shortly.
> Allow for Priority order relaxing in favor of improved node/rack locality
> --------------------------------------------------------------------------
>
> Key: YARN-6443
> URL: https://issues.apache.org/jira/browse/YARN-6443
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler, fairscheduler
> Reporter: Arun Suresh
> Assignee: Hitesh Sharma
>
> Currently the Schedulers examine an application's pending Requests in Priority
> order. This JIRA proposes to introduce a flag (either via the
> ApplicationMasterService::registerApplication() or via some Scheduler
> configuration) to favor an ordering that is biased to the node that is
> currently heartbeating by relaxing the priority constraint.