On 27 September 2013 16:32, Sandy Ryza <[email protected]> wrote:
> Thanks for taking a look Steve. Some responses inline.
>
> On Fri, Sep 27, 2013 at 3:30 AM, Steve Loughran <[email protected]> wrote:
>
> > On 27 September 2013 00:48, Alejandro Abdelnur <[email protected]> wrote:
> >
> > > Earlier this week I've posted the following comment for tomorrow's Yarn
> >
> > I think it's an interesting strategy, even if the summary doc is a bit
> > unduly negative about the alternate "Model a Service as a Yarn Application"
> > strategy, which, as you will be aware, means that YARN can actually enforce
> > cgroup throttling of a service - and the YARN-896 stuff matters for all
> > long-lifespan apps in YARN, Llama included.
>
> We think important work is going on with YARN-896, some of which will
> definitely benefit Llama. We just think that, because of their different
> needs, the model used for frameworks like Storm and Hoya isn't a good fit
> for frameworks like Impala.

In the list of needs for something to work with Hoya, everything in the
MUST/MUST NOT categories pretty much applies to everything that YARN can
work with:
https://github.com/hortonworks/hoya/blob/master/src/site/md/app_needs.md
dynamically installed apps that use the DFS for all storage, can be killed
on a whim, and use dynamic binding mechanisms to locate peers rather than
having everything predefined in config files.

> > A quick look at the code hints to me that what you are doing in the AM is
> > asking YARN for resources on a machine, but not actually running anything
> > other than sleep in the NM process, instead effectively block-booking
> > capacity on that box for other work. Yes?
>
> That's right.
>
> > If so it's a bit like a setup someone I know had, where MRv1 co-existed
> > with another grid scheduler - when the other scheduler (which nominally
> > owned the boxes) ran more code, the number of slots reported by the TT
> > was reduced. It worked -ish, but without two-way comms it was limited. It
> > sounds like the NM hookup is to let the existing "legacy" scheduler know
> > that there's capacity it can currently use, with that scheduler asking
> > YARN nicely for the resources, rather than just taking them and letting
> > the TT sort it out for itself.
>
> To us, the bit about asking the scheduler nicely for resources is a
> pretty big difference. Impala asks for resources in the same way as
> MapReduce, allowing a central YARN scheduler to have the final say and the
> user to think in terms of queues instead of frameworks. Asking for
> resources based on which replicas have capacity is just an optimization
> motivated by Impala's need for strict locality.

I think once we add a "long-lived" bit to an App request, you could start to
think about schedulers making different placement decisions, knowing that the
resource will be retained for a while. Example: hold back a bit longer before
downgrading locality, on the basis that if a service is there for some weeks,
locality really matters.

> > What's your failure handling strategy going to be? Without YARN-1041,
> > when the AM rolls over it loses all its containers. Is your NM plugin set
> > to pick that up and tell Impala it can't have them any more?
>
> Right. As soon as the NM kills an Impala container the NM plugin passes
> that on to Impala, which releases the relevant resources.

I see. Without that, an AM failure would leak unknown resources, which would
be a disaster.
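To check I'm reading the pattern right, here is roughly what I understand the
AM side to be doing, expressed against the stock YARN async AM client: request
containers on the nodes the engine cares about, run only a sleep in each as a
placeholder, and when YARN reports a placeholder completed (e.g. preempted),
tell the engine to hand the capacity back. The class name, sizes and the
engine-facing notify calls below are placeholders of mine, not Llama's actual
code.

import java.util.Collections;
import java.util.HashMap;
import java.util.List;

import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

/**
 * Sketch of an AM that "block-books" capacity: it asks YARN for containers
 * on specific nodes, runs nothing but a sleep in them, and treats the granted
 * resources as a reservation for an external engine.
 */
public class CapacityReservingAM implements AMRMClientAsync.CallbackHandler {

  private AMRMClientAsync<ContainerRequest> amrm;
  private NMClient nm;

  public void run(String[] preferredNodes) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    amrm = AMRMClientAsync.createAMRMClientAsync(1000, this);
    amrm.init(conf);
    amrm.start();
    nm = NMClient.createNMClient();
    nm.init(conf);
    nm.start();
    amrm.registerApplicationMaster("", 0, "");

    // Ask for capacity on the nodes the engine cares about (strict
    // locality); the memory/vcore sizes here are made up.
    Resource capability = Resource.newInstance(4096, 2);
    amrm.addContainerRequest(
        new ContainerRequest(capability, preferredNodes, null, Priority.newInstance(1)));
  }

  @Override
  public void onContainersAllocated(List<Container> containers) {
    for (Container c : containers) {
      try {
        // Launch a placeholder process only; the real work happens in the
        // engine's long-running daemon on the same node.
        ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
            new HashMap<String, LocalResource>(),
            new HashMap<String, String>(),
            Collections.singletonList("sleep 315360000"),
            null, null, null);
        nm.startContainer(c, ctx);
        notifyEngineGranted(c);           // tell the engine it may use this capacity
      } catch (Exception e) {
        amrm.releaseAssignedContainer(c.getId());
      }
    }
  }

  @Override
  public void onContainersCompleted(List<ContainerStatus> statuses) {
    // A killed placeholder means the reservation is gone: tell the engine to
    // give the corresponding resources back immediately, otherwise capacity
    // would be used without the scheduler's consent.
    for (ContainerStatus s : statuses) {
      notifyEngineReleased(s.getContainerId());
    }
  }

  // Engine-facing calls are stand-ins for whatever RPC the NM plugin uses.
  private void notifyEngineGranted(Container c) { /* ... */ }
  private void notifyEngineReleased(ContainerId id) { /* ... */ }

  @Override public void onShutdownRequest() { }
  @Override public void onNodesUpdated(List<NodeReport> updated) { }
  @Override public void onError(Throwable e) { }
  @Override public float getProgress() { return 0.5f; }  // long-lived: no real progress
}

The point being: because the placeholders go through the normal
request/allocate cycle, the central scheduler keeps the final say over the
capacity, and a killed placeholder is an unambiguous signal to hand the
resources back.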
One more code question: Thrift RPC. Why that choice? I'm curious because
there is a bias in the Hadoop stack towards Hadoop IPC, and now Hadoop IPC
over protobuf, but you've chosen a different path. Why? Strengths?
Weaknesses?
