On 27 September 2013 00:48, Alejandro Abdelnur <[email protected]> wrote:
> Earlier this week I've posted the following comment for tomorrow's Yarn
> meetup. I just realized that most folks may miss that post, thus sending it
> to the alias.
>
> We've been working on getting Impala running on YARN and wanted to share
> Llama, a system that mediates between Impala and YARN.
>
> Our design and some of the rationale behind it and the code are available
> at:
>
> Docs: http://cloudera.github.io/llama...
> Code: https://github.com/cloudera/llam...

I think it's an interesting strategy, even if the summary doc is a bit unduly negative about the alternate "Model a Service as a YARN Application" strategy, which, as you will be aware, means that YARN can actually enforce cgroup throttling of a service. And the YARN-896 work matters for all long-lifespan apps in YARN, Llama included.

A quick look at the code suggests that what you are doing in the AM is asking YARN for resources on a machine, but not actually running anything other than sleep in the NM process: effectively block-booking capacity on that box for other work. Yes?

If so, it's a bit like a setup someone I know had, with MRv1 co-existing with another grid scheduler: when the other scheduler (which nominally owned the boxes) ran more code, the number of slots reported by the TT was reduced. It worked, -ish, but without two-way comms it was limited.

It sounds like the NM hookup is there to let the existing "legacy" scheduler know that there's capacity it can currently use, with that scheduler asking YARN nicely for the resources, rather than just taking them and letting the TT sort it out for itself.

What's your failure-handling strategy going to be? Without YARN-1041, when the AM rolls over it loses all its containers. Is your NM plugin set up to pick that up and tell Impala it can't have them any more?
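If I've read the design right, the mediation contract boils down to: grant block-booked capacity to the external scheduler while the YARN container just sleeps, then revoke every grant when the AM goes down. A toy, Hadoop-free Java sketch of that contract follows; all class and method names here are made up for illustration and are not Llama's actual API:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy mediator: block-books YARN capacity on behalf of an external scheduler. */
class CapacityMediator {
    /** Listener the external scheduler (e.g. Impala) registers to hear about revocations. */
    interface RevocationListener {
        void onRevoked(List<String> containerIds);
    }

    private final List<String> granted = new ArrayList<>();
    private final RevocationListener listener;

    CapacityMediator(RevocationListener listener) {
        this.listener = listener;
    }

    /** YARN allocated us a container; the container itself only runs sleep. */
    void onContainerAllocated(String containerId) {
        granted.add(containerId);
        // Real code would launch a no-op process in the NM and hand the
        // capacity to the external scheduler out of band.
    }

    /**
     * Without AM-restart container retention (YARN-1041), an AM failover
     * loses every container: drop all grants and tell the client.
     */
    void onAmRestart() {
        List<String> lost = new ArrayList<>(granted);
        granted.clear();
        if (!lost.isEmpty()) {
            listener.onRevoked(lost);
        }
    }

    int grantedCount() {
        return granted.size();
    }
}
```

The point of the sketch is the second method: whatever the plugin plumbing looks like, something has to turn "AM restarted" into "Impala, you no longer have these".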
Two quick YARN-service code things:

-why didn't you make YarnRMLlamaAMConnector extend CompositeService and let the superclass handle the lifecycle of the various YARN services you are using yourself? It's what I've been doing, and so far I've not hit any problems. It's services all the way down.

-remember to make your serviceStop() calls best-effort, and to not NPE if they get called on a service that isn't actually started. I spent a lot of time going through all of Hadoop core's YARN services to make sure that was the case, and I'm not going to do the same for other projects.

-Steve
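For what it's worth, those two points amount to: let a parent service own its children's lifecycle, stop children in reverse start order, and make each stop best-effort so an unstarted or failing service never breaks teardown. A minimal Hadoop-free sketch of that pattern, where the classes only mimic the shape of Hadoop's AbstractService/CompositeService and are not the real ones:

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal service lifecycle: NOTINITED -> STARTED -> STOPPED. */
abstract class ToyService {
    enum State { NOTINITED, STARTED, STOPPED }
    private State state = State.NOTINITED;

    final void start() {
        serviceStart();
        state = State.STARTED;
    }

    /** Best-effort stop: only stop what actually started, never throw. */
    final void stop() {
        if (state == State.STARTED) {
            try {
                serviceStop();
            } catch (RuntimeException e) {
                // Log and continue: one service failing to stop must not
                // prevent the rest from being cleaned up.
            }
        }
        state = State.STOPPED;
    }

    protected void serviceStart() {}
    protected void serviceStop() {}

    State getState() { return state; }
}

/** Parent that owns its children's lifecycle, like CompositeService. */
class ToyCompositeService extends ToyService {
    private final List<ToyService> children = new ArrayList<>();

    void addService(ToyService s) { children.add(s); }

    @Override protected void serviceStart() {
        for (ToyService s : children) s.start();
    }

    @Override protected void serviceStop() {
        // Stop in reverse order of start; each child's stop() is already
        // best-effort, and a never-started child just moves to STOPPED.
        for (int i = children.size() - 1; i >= 0; i--) {
            children.get(i).stop();
        }
    }
}
```

With this shape, a connector class would just `addService(...)` its YARN clients once and inherit start/stop ordering instead of hand-rolling it.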
