Tim,

Those are very interesting points.  From a scalability standpoint I don't
think we have really run into those situations yet, but they are coming.  YARN
currently has some very "simplistic" scheduling for the RM.  All of the
complexity comes out in the AM.  There have been a number of JIRAs to make
requests more complex, to help support more "picky" applications like the
ones the paper describes.  These would shift YARN a bit from a two-level
scheduler towards a monolithic one, thereby reducing some of the
scalability of the system, but making it support more complex scheduling
patterns.  The largest YARN cluster I know of right now is about 4000
nodes. On it we are hitting some bottlenecks with the current scheduler.
We have looked at some ways to speed it up with more conventional
approaches, like allowing the scheduler to be multithreaded.  We expect to
be able to easily support 4000-6000 nodes through YARN with a few
optimizations. Going to tens of thousands of nodes would require some more
significant changes.

As far as utilization is concerned, the presented architecture does make
some very interesting points, but all of that can be addressed with a
monolithic scheduler so long as we don't have to scale to very large
clusters.  Adopting it would also probably require a complete redesign of
YARN and the MR AM, which is not a small undertaking.  There is also the
question of trusted code.  In a shared-state system where all of the
various schedulers are peers, how would we enforce resource constraints?
Each of the schedulers would have to enforce them itself, and as such would
have to be trusted code.  This makes adding new application types on the
fly difficult.
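
A minimal sketch of that trust problem, assuming a toy shared-state cell
with optimistic commits.  Every name here (CellState, try_commit, QUOTA,
polite_claim, greedy_claim) is hypothetical and made up for illustration;
none of it is a YARN API or code from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class CellState:
    """Toy shared cluster state that every peer scheduler reads and writes."""
    free_slots: int
    version: int = 0
    used_by: dict = field(default_factory=dict)

    def try_commit(self, scheduler, slots, seen_version):
        # Optimistic concurrency: reject the claim if the state changed
        # since the scheduler took its snapshot, or if capacity ran out.
        if seen_version != self.version or slots > self.free_slots:
            return False
        self.free_slots -= slots
        self.used_by[scheduler] = self.used_by.get(scheduler, 0) + slots
        self.version += 1
        return True


QUOTA = {"mr": 60, "other": 40}  # hypothetical per-scheduler limits


def polite_claim(cell, name, slots):
    # A trusted scheduler checks its own quota before committing.
    if cell.used_by.get(name, 0) + slots > QUOTA[name]:
        return False
    return cell.try_commit(name, slots, cell.version)


def greedy_claim(cell, name, slots):
    # An untrusted scheduler skips the quota check.  The shared state
    # only validates version and capacity, so nothing stops it.
    return cell.try_commit(name, slots, cell.version)


cell = CellState(free_slots=100)
print(polite_claim(cell, "mr", 70))     # over quota: the scheduler refuses
print(greedy_claim(cell, "other", 90))  # over quota: the commit still succeeds
print(cell.free_slots)
```

In this toy model the quota lives only inside each scheduler, so a
scheduler that ignores it can take nearly the whole cell.  Enforcing the
limit inside try_commit instead would move the policy back into a single
trusted component.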

I suppose we could do a hybrid approach, where the RM is a single type of
scheduler among many.  It would provide the same API that currently exists
for YARN applications, but MR applications could have one or more
"JobTracker"-like schedulers that share state with the RM and whatever
other "schedulers" are out there.  That would be something fun to try out,
but sadly I really don't have time to even start thinking about a proof
of concept for something like that.  At least not until we hit a
significant business use case that would justify it over the architecture
we already have.  For example, needing tens of thousands of nodes in a
cluster, or a huge shift of different types of jobs onto YARN so that we
are doing a lot more than just MR on the same cluster.

--Bobby

On 4/19/13 9:47 AM, "Tim St Clair" <[email protected]> wrote:

>I recently read Google's Omega paper, and was wondering if any of the YARN
>developers were planning to address some of the items considered as key
>points.
>
>http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf
>
>Cheers,
>Tim
