Nathan Roberts created YARN-4287:

             Summary: Capacity Scheduler: Rack Locality improvement
                 Key: YARN-4287
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: capacityscheduler
    Affects Versions: 2.7.1
            Reporter: Nathan Roberts
            Assignee: Nathan Roberts

YARN-4189 does an excellent job describing the issues with the current delay 
scheduling algorithms within the capacity scheduler. The design proposal also 
seems like a good direction.

This jira proposes a simple interim solution to the key issue we've been 
experiencing on a regular basis:
 - rackLocal assignments trickle out due to nodeLocalityDelay. This can have a 
significant impact on things like CombineFileInputFormat, which targets very 
specific nodes in its split calculations.

I'm not sure when YARN-4189 will become a reality, so I thought an interim 
patch might make sense. The basic idea is simple: 
1) Separate delays for rackLocal and OffSwitch (today there is only one)
2) When we're getting rackLocal assignments, subsequent rackLocal assignments 
should not be delayed
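
The two points above can be sketched roughly as follows. This is only an 
illustration of the idea, not the patch and not the capacity scheduler's actual 
code: the class LocalityDelaySketch, its fields (rackLocalityDelay, 
offSwitchDelay, rackLocalStreak) and its methods are hypothetical names, and 
counting delays in missed scheduling opportunities that reset on every 
assignment is an assumption made for the example.

```java
// Hypothetical sketch; none of these names come from the YARN code base.
class LocalityDelaySketch {
    enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    // Point 1: separate thresholds, counted in missed scheduling opportunities.
    private final int rackLocalityDelay;
    private final int offSwitchDelay;

    private int missedOpportunities = 0;
    // Point 2: true while the most recent assignment was rack-local.
    private boolean rackLocalStreak = false;

    LocalityDelaySketch(int rackLocalityDelay, int offSwitchDelay) {
        this.rackLocalityDelay = rackLocalityDelay;
        this.offSwitchDelay = offSwitchDelay;
    }

    // Decide whether an assignment at the given locality level is allowed now.
    boolean canAssign(Locality locality) {
        switch (locality) {
            case NODE_LOCAL:
                return true; // node-local assignments are never delayed
            case RACK_LOCAL:
                // Once rack-local assignments have started, do not delay the next ones.
                return rackLocalStreak || missedOpportunities >= rackLocalityDelay;
            default:
                return missedOpportunities >= offSwitchDelay;
        }
    }

    // Called when a node was offered but nothing was assigned on it.
    void recordMissedOpportunity() {
        missedOpportunities++;
    }

    // Called after a container is assigned (assumed to reset the counter).
    void recordAssignment(Locality locality) {
        missedOpportunities = 0;
        rackLocalStreak = (locality == Locality.RACK_LOCAL);
    }
}
```

In this sketch, without the rackLocalStreak flag every rack-local assignment 
would reset the counter and the next one would wait out the full delay again, 
which is the trickle described above. A node-local or off-switch assignment 
ends the streak, so the normal delay applies again afterwards.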

Patch will be uploaded shortly. No big deal if the consensus is to go straight 
to YARN-4189. 
