Nathan Roberts created YARN-4287:
------------------------------------
Summary: Capacity Scheduler: Rack Locality improvement
Key: YARN-4287
URL: https://issues.apache.org/jira/browse/YARN-4287
Project: Hadoop YARN
Issue Type: Improvement
Components: capacityscheduler
Affects Versions: 2.7.1
Reporter: Nathan Roberts
Assignee: Nathan Roberts
YARN-4189 does an excellent job describing the issues with the current delay
scheduling algorithms within the capacity scheduler. The design proposal also
seems like a good direction.
This jira proposes a simple interim solution to the key issue we've been
experiencing on a regular basis:
- rackLocal assignments trickle out due to nodeLocalityDelay. This can have a
significant impact on things like CombineFileInputFormat, which targets very
specific nodes in its split calculations.
I'm not sure when YARN-4189 will become a reality, so I thought an interim
patch might make sense. The basic idea is simple:
1) Separate delays for rackLocal and offSwitch (today there is only one)
2) When we're getting rackLocal assignments, subsequent rackLocal assignments
should not be delayed
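
To make the two points above concrete, here is a rough sketch of the gating
logic. This is not the patch: the class, the method names, and the
counter-reset policy are all hypothetical and only illustrate the idea of two
separate thresholds plus a "rack-local streak" flag.

```java
// Hypothetical sketch only. Names and reset policy are assumptions,
// not code from the YARN-4287 patch or from the capacity scheduler.
class LocalityDelaySketch {
    enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    private final int rackLocalDelay;   // missed opportunities before rack-local is allowed
    private final int offSwitchDelay;   // missed opportunities before off-switch is allowed
    private int missedOpportunities = 0;
    private boolean inRackLocalStreak = false;

    LocalityDelaySketch(int rackLocalDelay, int offSwitchDelay) {
        this.rackLocalDelay = rackLocalDelay;
        this.offSwitchDelay = offSwitchDelay;
    }

    // Called when a node was offered but nothing was assigned on it.
    void recordMissedOpportunity() {
        missedOpportunities++;
    }

    // Point 1: each locality level is checked against its own delay.
    // Point 2: while rack-local assignments are flowing, do not delay the next one.
    boolean canAssign(Locality type) {
        switch (type) {
            case NODE_LOCAL:
                return true;
            case RACK_LOCAL:
                return inRackLocalStreak || missedOpportunities >= rackLocalDelay;
            default:
                return missedOpportunities >= offSwitchDelay;
        }
    }

    // Assumed policy: node-local and rack-local assignments reset the counter;
    // only a rack-local assignment keeps the streak alive.
    void recordAssignment(Locality type) {
        if (type != Locality.OFF_SWITCH) {
            missedOpportunities = 0;
        }
        inRackLocalStreak = (type == Locality.RACK_LOCAL);
    }
}
```

With delays of 2 and 4, the first rackLocal assignment waits for two missed
opportunities, the following rackLocal assignments go out immediately, and
offSwitch still waits for its own, longer delay.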
Patch will be uploaded shortly. No big deal if the consensus is to go straight
to YARN-4189.