Junping Du commented on YARN-19:

Hi [~ashahab], thanks for your feedback on this! 
As I recall, the community decided a long time ago to go with a hierarchical 
approach instead of a pluggable one, so the patch here may not be suitable to 
move forward (please check the YARN-18 design doc for details). I haven't had 
the bandwidth to follow up on the new design with a new implementation, given 
other priorities. However, if you are interested, please feel free to take 
over YARN-18 and YARN-19 and move them forward (preferably conforming to the 
new design), and I will try to help with review.

> 4-layer topology (with NodeGroup layer) implementation of Container 
> Assignment and Task Scheduling (for YARN)
> -------------------------------------------------------------------------------------------------------------
>                 Key: YARN-19
>                 URL: https://issues.apache.org/jira/browse/YARN-19
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: 
> HADOOP-8475-ContainerAssignmentTaskScheduling-withNodeGroup.patch, 
> MAPREDUCE-4310-v1.patch, MAPREDUCE-4310.patch, YARN-19-v2.patch, 
> YARN-19-v3-alpha.patch, YARN-19-v4.patch, YARN-19.patch
> There are several classes in YARN’s container assignment and task scheduling 
> algorithms that relate to data locality; these were updated to give 
> preference to running a container on the same nodegroup. This section 
> summarizes the changes in the patch, which provides a new implementation to 
> support a four-layer hierarchy.
> When the ApplicationMaster makes a resource allocation request to the 
> ResourceManager's scheduler, it will add the nodegroup to the list of 
> attributes in the ResourceRequest. The parameters of the resource request 
> will change from <priority, (host, rack, *), memory, #containers> to 
> <priority, (host, nodegroup, rack, *), memory, #containers>.
> After receiving the ResourceRequest, the RM scheduler will assign containers 
> for requests in the sequence of data-local, nodegroup-local, rack-local and 
> off-switch. Then, the ApplicationMaster schedules tasks on allocated 
> containers in the same sequence: data-local, nodegroup-local, rack-local and 
> off-switch.
> In terms of code changes made to YARN task scheduling, we updated the class 
> ContainerRequestEvent so that applications' requests for containers can 
> include a nodegroup. In the RM schedulers, FifoScheduler and 
> CapacityScheduler were updated. For the FifoScheduler, the changes were in 
> the method assignContainers. For the CapacityScheduler, the method 
> assignContainersOnNode in the class LeafQueue was updated. In both cases, 
> a new method, assignNodeGroupLocalContainers(), was added between the 
> assignment of data-local and rack-local containers.
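
The four-level preference order described in the quoted summary can be 
illustrated with a minimal, self-contained sketch. This is not code from the 
attached patches and does not use the YARN API: the class 
NodeGroupLocalitySketch, the Location type, and the methods classify and 
pickBest are hypothetical names chosen only to show the ordering data-local, 
nodegroup-local, rack-local, off-switch.

```java
import java.util.List;

public class NodeGroupLocalitySketch {

    // Locality levels in the order given in the description, most local first.
    public enum Locality { DATA_LOCAL, NODEGROUP_LOCAL, RACK_LOCAL, OFF_SWITCH }

    // Hypothetical (host, nodegroup, rack) triple; not a YARN class.
    public static final class Location {
        final String host;
        final String nodeGroup;
        final String rack;

        public Location(String host, String nodeGroup, String rack) {
            this.host = host;
            this.nodeGroup = nodeGroup;
            this.rack = rack;
        }
    }

    // Classifies how local a node is to the requested location,
    // checking the narrowest layer first and widening step by step.
    public static Locality classify(Location requested, Location node) {
        if (requested.host.equals(node.host)) {
            return Locality.DATA_LOCAL;
        }
        if (requested.nodeGroup.equals(node.nodeGroup)) {
            return Locality.NODEGROUP_LOCAL;
        }
        if (requested.rack.equals(node.rack)) {
            return Locality.RACK_LOCAL;
        }
        return Locality.OFF_SWITCH;
    }

    // Returns the pending request that is most local to the given node,
    // or null if there are no pending requests.
    public static Location pickBest(List<Location> pending, Location node) {
        Location best = null;
        Locality bestLevel = null;
        for (Location request : pending) {
            Locality level = classify(request, node);
            // Enum declaration order doubles as the preference order.
            if (bestLevel == null || level.compareTo(bestLevel) < 0) {
                best = request;
                bestLevel = level;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Location node = new Location("host1", "ng1", "rack1");
        List<Location> pending = List.of(
            new Location("host9", "ng9", "rack9"),
            new Location("host2", "ng1", "rack1"),
            new Location("host3", "ng2", "rack1"));
        Location best = pickBest(pending, node);
        System.out.println(best.host + " -> " + classify(best, node));
    }
}
```

In this sketch the nodegroup check sits between the host check and the rack 
check, mirroring where the description says assignNodeGroupLocalContainers() 
was placed; the real schedulers track far more state (priorities, resource 
limits, queue capacity) than is shown here.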

This message was sent by Atlassian JIRA
