On 27 September 2013 00:48, Alejandro Abdelnur <[email protected]> wrote:
> Earlier this week I've posted the following comment for tomorrow's Yarn
> meetup. I just realized that most folks may miss that post, thus sending it
> to the alias.
>
> We've been working on getting Impala running on YARN and wanted to share
> Llama, a system that mediates between Impala and YARN.
>
> Our design and some of the rationale behind it and the code are available
> at:
>
> Docs: http://cloudera.github.io/llama...
> Code: https://github.com/cloudera/llam...

I think it's an interesting strategy, even if the summary doc is a bit unduly negative about the alternate "Model a Service as a YARN Application" strategy, which, as you will be aware, means that YARN can actually enforce cgroup throttling of a service. And the YARN-896 work matters for all long-lifespan apps in YARN, Llama included.

A quick look at the code suggests that what you are doing in the AM is asking YARN for resources on a machine, but not actually running anything other than sleep in the NM process: effectively block-booking capacity on that box for other work. Yes?

If so, it's a bit like a setup someone I know had, with MRv1 co-existing with another grid scheduler: when the other scheduler (which nominally owned the boxes) ran more code, the number of slots reported by the TT was reduced. It worked, -ish, but without two-way comms it was limited.

It sounds like the NM hookup is there to let the existing "legacy" scheduler know that there's capacity it can currently use, with that scheduler asking YARN nicely for the resources, rather than just taking them and letting the TT sort it out for itself.

What's your failure-handling strategy going to be? Without YARN-1041, when the AM rolls over it loses all its containers. Is your NM plugin set up to pick that up and tell Impala it can't have them any more?
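If I've read the design right, the mediation contract boils down to: grant block-booked capacity to the external scheduler while the YARN container just sleeps, then revoke every grant when the AM goes down. A toy, Hadoop-free Java sketch of that contract follows; all class and method names here are made up for illustration and are not Llama's actual API:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy mediator: block-books YARN capacity on behalf of an external scheduler. */
class CapacityMediator {
    /** Listener the external scheduler (e.g. Impala) registers to hear about revocations. */
    interface RevocationListener {
        void onRevoked(List<String> containerIds);
    }

    private final List<String> granted = new ArrayList<>();
    private final RevocationListener listener;

    CapacityMediator(RevocationListener listener) {
        this.listener = listener;
    }

    /** YARN allocated us a container; the container itself only runs sleep. */
    void onContainerAllocated(String containerId) {
        granted.add(containerId);
        // Real code would launch a no-op process in the NM and hand the
        // capacity to the external scheduler out of band.
    }

    /**
     * Without AM-restart container retention (YARN-1041), an AM failover
     * loses every container: drop all grants and tell the client.
     */
    void onAmRestart() {
        List<String> lost = new ArrayList<>(granted);
        granted.clear();
        if (!lost.isEmpty()) {
            listener.onRevoked(lost);
        }
    }

    int grantedCount() {
        return granted.size();
    }
}
```

The point of the sketch is the second method: whatever the plugin plumbing looks like, something has to turn "AM restarted" into "Impala, you no longer have these".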
Two quick YARN-service code things:

-why didn't you make YarnRMLlamaAMConnector extend CompositeService and let the superclass handle the lifecycle of the various YARN services you are using yourself? It's what I've been doing, and so far I've not hit any problems. It's services all the way down.

-remember to make your serviceStop() calls best-effort, and to not NPE if they get called on a service that isn't actually started. I spent a lot of time going through all of Hadoop core's YARN services to make sure that was the case, and I'm not going to do the same for other projects.

-Steve
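For what it's worth, those two points amount to: let a parent service own its children's lifecycle, stop children in reverse start order, and make each stop best-effort so an unstarted or failing service never breaks teardown. A minimal Hadoop-free sketch of that pattern, where the classes only mimic the shape of Hadoop's AbstractService/CompositeService and are not the real ones:

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal service lifecycle: NOTINITED -> STARTED -> STOPPED. */
abstract class ToyService {
    enum State { NOTINITED, STARTED, STOPPED }
    private State state = State.NOTINITED;

    final void start() {
        serviceStart();
        state = State.STARTED;
    }

    /** Best-effort stop: only stop what actually started, never throw. */
    final void stop() {
        if (state == State.STARTED) {
            try {
                serviceStop();
            } catch (RuntimeException e) {
                // Log and continue: one service failing to stop must not
                // prevent the rest from being cleaned up.
            }
        }
        state = State.STOPPED;
    }

    protected void serviceStart() {}
    protected void serviceStop() {}

    State getState() { return state; }
}

/** Parent that owns its children's lifecycle, like CompositeService. */
class ToyCompositeService extends ToyService {
    private final List<ToyService> children = new ArrayList<>();

    void addService(ToyService s) { children.add(s); }

    @Override protected void serviceStart() {
        for (ToyService s : children) s.start();
    }

    @Override protected void serviceStop() {
        // Stop in reverse order of start; each child's stop() is already
        // best-effort, and a never-started child just moves to STOPPED.
        for (int i = children.size() - 1; i >= 0; i--) {
            children.get(i).stop();
        }
    }
}
```

With this shape, a connector class would just `addService(...)` its YARN clients once and inherit start/stop ordering instead of hand-rolling it.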
