Srikanth Kandula commented on YARN-2745:

Just a brief update on this JIRA... 

1) [~chris.douglas] pushed the "collection" of network and disk usage into 
Hadoop common. See HADOOP-12210.

2) [~elgoiri] and [~kasha], in YARN-3534 and YARN-3980, collect CPU and memory 
info of containers, push that information from the NM to the RM, and make it 
available to the scheduler.

3) Packing requires the scheduler to look past the first "schedulable" task 
discovered by the capacity scheduler loop. Based on the feedback above, we have 
decoupled the architectural change needed from the actual packing policy. See 
YARN-4056, called bundling. Many different packing policies are allowed in the 

4) These changes are complementary and orthogonal to YARN-1011. That JIRA 
rightly recommends adapting RM allocation based on the dynamic resource usage 
of the allocated containers. This JIRA is more about packing containers. It 
currently does so based on expected resource usage as indicated in the ask. 
Indeed, packing based on dynamic usage information would be strictly better and 
is left for future work.
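
To make points 3 and 4 concrete, here is a minimal sketch of the difference 
between taking the first schedulable ask and looking across a bundle of 
candidate asks. It is not the code in YARN-4056. The names (Ask, fits, 
alignment, first_fit, best_packed) are hypothetical, resources are assumed to 
be normalized fractions of node capacity, and the dot-product score is just 
one possible packing policy, chosen here for illustration.

```python
from dataclasses import dataclass

# Resource dimensions named in the issue description.
RESOURCES = ("cpu", "memory", "disk", "network")


@dataclass
class Ask:
    """Hypothetical container request: expected usage per resource."""
    name: str
    demand: dict


def fits(demand, available):
    """True if the expected demand fits in every resource dimension."""
    return all(demand.get(r, 0.0) <= available.get(r, 0.0) for r in RESOURCES)


def alignment(demand, available):
    """Assumed policy: dot product of demand and free resources."""
    return sum(demand.get(r, 0.0) * available.get(r, 0.0) for r in RESOURCES)


def first_fit(bundle, available):
    """Baseline: stop at the first ask that is schedulable on the node."""
    for ask in bundle:
        if fits(ask.demand, available):
            return ask
    return None


def best_packed(bundle, available, score=alignment):
    """Look past the first schedulable ask; return the best-scoring one."""
    candidates = [a for a in bundle if fits(a.demand, available)]
    if not candidates:
        return None
    return max(candidates, key=lambda a: score(a.demand, available))


if __name__ == "__main__":
    node = {"cpu": 0.2, "memory": 0.8, "disk": 0.5, "network": 0.5}
    bundle = [
        Ask("a", {"cpu": 0.2, "memory": 0.1}),
        Ask("b", {"cpu": 0.1, "memory": 0.7}),
    ]
    print(first_fit(bundle, node).name, best_packed(bundle, node).name)
```

The score function is a parameter so that the mechanism (considering a bundle 
of asks) stays separate from the policy (how a candidate is ranked). The 
demand values come from the ask, i.e. expected rather than dynamic usage.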

> Extend YARN to support multi-resource packing of tasks
> ------------------------------------------------------
>                 Key: YARN-2745
>                 URL: https://issues.apache.org/jira/browse/YARN-2745
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, resourcemanager, scheduler
>            Reporter: Robert Grandl
>            Assignee: Robert Grandl
>         Attachments: sigcomm_14_tetris_talk.pptx, tetris_design_doc.docx, 
> tetris_paper.pdf
> In this umbrella JIRA we propose an extension to existing scheduling 
> techniques that accounts for all resources used by a task (CPU, memory, 
> disk, network) and is able to achieve three competing objectives: fairness, 
> improved cluster utilization, and reduced average job completion time.
