@Sandy: OK, will do.

@Hitesh: Correct me if I'm wrong, but I think the multi-resource proposal using DRF (Dominant Resource Fairness) discussed in that JIRA (YARN-326) only handles memory and CPU, not network bandwidth.
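To make the point concrete, here is a rough sketch of the dominant-share calculation as I understand the DRF idea: an application's dominant share is the largest fraction it holds of any single cluster resource. This is not code from YARN or from the YARN-326 patch; the class, the method names, and the resource keys (`DominantShareSketch`, `memoryMb`, `vcores`, `networkMbps`) are all made up for illustration, and network bandwidth appears only as a hypothetical third dimension.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not part of the YARN code base.
public class DominantShareSketch {

    /** Largest fraction of any single cluster resource held by the application. */
    static double dominantShare(Map<String, Long> usage, Map<String, Long> capacity) {
        double max = 0.0;
        for (Map.Entry<String, Long> e : capacity.entrySet()) {
            if (e.getValue() <= 0) {
                continue; // skip resources the cluster does not offer
            }
            long used = usage.getOrDefault(e.getKey(), 0L);
            max = Math.max(max, (double) used / e.getValue());
        }
        return max;
    }

    /** Name of the resource that produces the dominant share, or null if none. */
    static String dominantResource(Map<String, Long> usage, Map<String, Long> capacity) {
        String best = null;
        double max = -1.0;
        for (Map.Entry<String, Long> e : capacity.entrySet()) {
            if (e.getValue() <= 0) {
                continue;
            }
            double share = (double) usage.getOrDefault(e.getKey(), 0L) / e.getValue();
            if (share > max) {
                max = share;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two dimensions, as in the memory + CPU case discussed above.
        Map<String, Long> capacity = new LinkedHashMap<>();
        capacity.put("memoryMb", 100_000L);
        capacity.put("vcores", 100L);

        Map<String, Long> app = new LinkedHashMap<>();
        app.put("memoryMb", 20_000L); // 20% of cluster memory
        app.put("vcores", 50L);       // 50% of cluster CPU

        System.out.println(dominantResource(app, capacity) + " " + dominantShare(app, capacity));

        // Hypothetical third dimension: network bandwidth.
        capacity.put("networkMbps", 10_000L);
        app.put("networkMbps", 8_000L); // 80% of cluster bandwidth

        System.out.println(dominantResource(app, capacity) + " " + dominantShare(app, capacity));
    }
}
```

The calculation itself does not care how many resource types there are, so I assume the harder questions are the ones Sandy raised: what units to use and how a bandwidth limit would be enforced.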

On Fri, Sep 6, 2013 at 6:02 PM, Sandy Ryza <[email protected]> wrote:

> Janne,
> In 2.1.0, a ResourceRequest for node/rack/* is still required, even for
> strict locality requests. Using AMRMClient makes this a lot easier and is
> the preferred way of submitting resource requests. Yes, strict locality
> also works for just racks.
>
> Hilfi,
> I'm not aware of an existing JIRA for adding network bandwidth as a
> resource. Filing one would definitely be appreciated. If you're interested
> in contributing this to Hadoop, it would be helpful to start with a
> design/proposal discussing issues such as what units to use, how it would
> be enforced, any interesting risks, etc.
>
> Thanks!
> -Sandy
>
>
> On Sat, Sep 7, 2013 at 5:10 AM, Janne Valkealahti <[email protected]> wrote:
>
> > In terms of strict locality, how the actual in-house functionality differs
> > how it was "done" in 2.0.X. You needed to do request for node/rack/* and if
> > you got lucky you got the node you wanted. Do you still need to allocate
> > host for node/rack/* or is plain host just fine?
> >
> > Will strict locality also work for allocation for just racks?
> >
> >
> > On Fri, Sep 6, 2013 at 7:52 PM, Hitesh Shah <[email protected]> wrote:
> >
> > > Have you taken a look at https://issues.apache.org/jira/browse/YARN-326?
> > >
> > > -- Hitesh
> > >
> > > On Sep 6, 2013, at 11:02 AM, hilfi alkaff wrote:
> > >
> > > > Thanks for all the replies. I think I have found the relevant codes that I
> > > > would like to modify. That said, a project that I'm doing now requires
> > > > containers to have network bandwidth as one of its resources (In
> > > > Resource.java: it currently only models memory).
> > > >
> > > > Since I'm planning to implement it anyway, I hope to be able to help
> > > > Hadoop's development. However, I could not find the relevant JIRA for this.
> > > > If you know of an existing ticket that is relevant to the aforementioned
> > > > issue, let me know. If there is none, should I make my changes first (as
> > > > listed http://wiki.apache.org/hadoop/HowToContribute) and get back after
> > > > I'm done with my code?
> > > >
> > > > Thanks in advance.
> > > >
> > > >
> > > > On Fri, Sep 6, 2013 at 6:37 AM, Steve Loughran <[email protected]> wrote:
> > > >
> > > >> worth adding is that this can generate a bias towards affinitive
> > > >> assignment of an apps containers; for the YARN-896 service we've put
> > > >> anti-affinity as a subtask, along with having AM opt to get notifications
> > > >> if assignments can't be met in a bounded period (or it could just examine
> > > >> its queue of outstanding requests and reach the same conclusion based on
> > > >> when the requests were submitted)
> > > >>
> > > >>
> > > >> On 6 September 2013 07:43, Sandy Ryza <[email protected]> wrote:
> > > >>
> > > >>> That's right. Nodes keep checking in and, when they do, the
> > > >>> ResourceManager looks for outstanding requests. This means that assignment
> > > >>> of containers to nodes depends on the order that they heartbeat in. If
> > > >>> container requests come in for specific nodes locality is achieved through
> > > >>> delay scheduling - the ResourceManager will wait for a configurable number
> > > >>> of heartbeats before assigning a container to a non-local node. If strict
> > > >>> locality is turned on, the ResourceManager will wait indefinitely for a
> > > >>> local node.
> > > >>>
> > > >>> -Sandy
> > > >>>
> > > >>>
> > > >>> On Fri, Sep 6, 2013 at 3:33 PM, hilfi alkaff <[email protected]> wrote:
> > > >>>
> > > >>>> I see. What I'm wondering about is; when an application master tries to
> > > >>>> request a container from resource manager, which part of the code in the
> > > >>>> resource manager actually decide which node to fetch this container from.
> > > >>>> Is this step being done asynchronously (ie: Nodes keep checking if there
> > > >>>> are requests from the ResourceManager during the node update event?)
> > > >>>>
> > > >>>>
> > > >>>> On Fri, Sep 6, 2013 at 1:22 AM, Sandy Ryza <[email protected]> wrote:
> > > >>>>
> > > >>>>> Hi Hilfi,
> > > >>>>>
> > > >>>>> Nodes are constantly heartbeating to the ResourceManager. A node update
> > > >>>>> event is triggered each time this happens.
> > > >>>>>
> > > >>>>> -Sandy
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, Sep 6, 2013 at 3:20 PM, hilfi alkaff <[email protected]> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> I'm trying to trace the code flow on the scheduling done in YARN. I would
> > > >>>>>> like to know where the code that does which node to schedule for the jobs.
> > > >>>>>>
> > > >>>>>> I found the handle() function in the resource manager's scheduler (eg:
> > > >>>>>> CapacityScheduler.java) that handles node update event which then executes
> > > >>>>>> the assignment of containers for that particular node, but I do not
> > > >>>>>> understand how that node even get chosen.
> > > >>>>>>
> > > >>>>>> If anybody could tell me about a file, function or module name that does
> > > >>>>>> this, that would be extremely helpful.
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> ~Hilfi Alkaff~
> > > >>>>
> > > >>>> --
> > > >>>> ~Hilfi Alkaff~
> > > >>
> > > >> --
> > > >> CONFIDENTIALITY NOTICE
> > > >> NOTICE: This message is intended for the use of the individual or entity to
> > > >> which it is addressed and may contain information that is confidential,
> > > >> privileged and exempt from disclosure under applicable law. If the reader
> > > >> of this message is not the intended recipient, you are hereby notified that
> > > >> any printing, copying, dissemination, distribution, disclosure or
> > > >> forwarding of this communication is strictly prohibited. If you have
> > > >> received this communication in error, please contact the sender immediately
> > > >> and delete it from your system. Thank You.
> > > >
> > > > --
> > > > ~Hilfi Alkaff~

--
~Hilfi Alkaff~
