Robert,

Thank you for your response. I've placed some questions and comments inline below.

Cheers,
Tim

----- Original Message -----
> From: "Robert Evans" <[email protected]>
> To: [email protected]
> Sent: Friday, April 19, 2013 12:34:52 PM
> Subject: Re: Omega vs. YARN
>
> Tim,
>
> They are very interesting points. From a scalability point I don't think
> we have really run into those situations yet but they are coming. YARN
> currently has some very "simplistic" scheduling for the RM. All of the
> complexity comes out in the AM. There have been a number of JIRA to make
> requests more complex, to help support more "picky" applications like the
> paper says. These would make YARN shift a bit more from a two-level
> scheduler towards a Monolithic one, and thereby reducing some of the
> scalability of the system, but making it support more complex scheduling
> patterns. The largest YARN cluster I know of right now is about 4000
> nodes. On it we are hitting some bottlenecks with the current scheduler.
> We have looked at some ways to speed it up with more conventional
> approaches like allowing the scheduler to be multithreaded. We expect to
> be able to easily support 4000-6000 nodes through YARN with a few
> optimizations. Going to tens of thousands of nodes would require some more
> significant changes.

If there are JIRA(s) which outline the limitations I would be interested in knowing more.

>
> As far as utilization is concerned the presented architecture does provide
> some very interesting points, but all of that can be addressed with a
> Monolithic scheduler so long as we don't have to scale very large. It also
> would probably require a complete redesign of YARN and the MR AM, which is
> not a small undertaking. There is also the question of trusted code. In
> a shared state system where all of the various schedulers are peers how
> would we enforce resource constraints?

I think the biggest open questions I have with a distributed approach are: priority, preemption policies, and fragmentation.

> Each of the schedulers would have
> to enforce them themselves, and as such would have to be trusted code.
> This makes adding in new application types on the fly difficult.
>
> I suppose we could do a hybrid approach, where the RM is a single type of
> scheduler among many. It would provide the same API that currently exists
> for YARN applications, but MR applications could have one or more
> "JobTracker" like schedulers that share state with the RM, and what other
> "schedulers" there are out. That would be something fun to try out, but
> sadly I really don't have time to even get started thinking about a proof
> of concept on something like that. At least that is until we hit a
> significant business use case that would drive it over the architecture we
> already have.
>
> For example needing 10s of thousands of nodes in a
> cluster, or a huge shift in different types of jobs on to YARN so that we
> are doing a lot more than just MR on the same cluster.

Something tells me it may come fast, if/when the YARN application space expands.

>
> --Bobby
>
> On 4/19/13 9:47 AM, "Tim St Clair" <[email protected]> wrote:
>
> >I recently read Googles Omega paper, and wondering if any of the YARN
> >developers were planning to address some of the items considered as key
> >points.
> >
> >http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf
> >
> >Cheers,
> >Tim
> >
