[ https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219844#comment-14219844 ]
Karthik Kambatla commented on YARN-2877:
----------------------------------------

+1 to the idea, particularly to reduce the allocation latency. I can definitely see Impala wanting to use this in the future.

Although it is not mentioned in the description, I believe scale is probably another big reason for distributed scheduling.

bq. Improve cluster utilization by opportunistically executing tasks on otherwise idle resources on individual machines.

Couldn't a centralized RM schedule tasks opportunistically too? Is the intention to adapt quickly to changing resource usage on the node, on the grounds that the latency of NM-RM-NM communication is too long and we would lose this window of opportunity?

> Extend YARN to support distributed scheduling
> ---------------------------------------------
>
>                 Key: YARN-2877
>                 URL: https://issues.apache.org/jira/browse/YARN-2877
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Sriram Rao
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed
> scheduling. Briefly, some of the motivations for distributed scheduling are
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates
> (i.e., task execution time is much shorter than the time required to
> obtain a container from the RM).
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)