On Mon, Sep 9, 2013 at 5:08 AM, Steve Loughran <[email protected]> wrote:

> On 6 September 2013 19:02, hilfi alkaff <[email protected]> wrote:
>
> > Thanks for all the replies. I think I have found the relevant code that I
> > would like to modify. That said, a project that I'm doing now requires
> > containers to have network bandwidth as one of their resources (in
> > Resource.java: it currently only models memory).
> >
>
> This is something that's been discussed before, IO bandwidth being the
> other constraint.
>
> Limits like this would most benefit mixed workload clusters, where what you
> are trying to limit is not the net & IO bandwidth a low-latency service
> needs, but the load the batch jobs place on the machines - it's not so much
> restrictions on the service bandwidth you want, but the ability to
> (dynamically?) throttle back the bandwidth that other containers are using.
>
> However, you need to take into account that a lot of network traffic is
> generated by off-host HDFS IO; throttle that back and your remote file IO
> will also be restricted. Local HDFS operations will not be restricted - even
> if you cgroup-limit your process - because that goes through the local
> Datanode.
>

I agree. This gets especially tricky with network resources, since the
bottleneck between one pair of hosts can be affected by traffic between
another pair, depending on the network topology that Hadoop is running on.
I haven't yet found a good way of abstracting the requirements for this kind
of communication. One option I can think of is to let a user specify a
guaranteed amount of bandwidth during the shuffle phase, though that runs
into the same topology problem I mentioned above.

> > Since I'm planning to implement it anyway, I hope to be able to help
> > Hadoop's development. However, I could not find the relevant JIRA for this.
> > If you know of an existing ticket that is relevant to the aforementioned
> > issue, let me know.
> > If there is none, should I make my changes first (as
> > listed at http://wiki.apache.org/hadoop/HowToContribute) and get back after
> > I'm done with my code?
> >
>
> These are pretty big changes, and doing them off on your own and turning up
> with a big set of changes is unlikely to get them in, due to the intimacy of
> the changes across the codebase, and the fact that you don't yet have a
> track record of working in this area (to be fair, nobody would trust me to
> dabble in the scheduler either, even though I have the commit rights).
>
> This would have to be a collaborative development process, where even if you
> do most of the coding of the feature and its test suite, you need to do it
> visibly, get feedback & act on it - starting with the design.
>
> - We're obsessive about testing, so try and come up with a design for
> testing all this that would measure the effects of the throttling. You
> should also set up your own test infrastructure with Jenkins doing local
> tests of your branch, ideally with a pool of real/VM servers.
>
> - Having someone act as a mentor would help. I'm not going to volunteer, not
> only due to existing commitments, but because it's not an area of my
> expertise.
>
> - Before undertaking a big project, try to pick a few small (existing?)
> issues and go through the process of developing patches and nurturing them
> in.
>
> One thing I would like to see test-wise is something we can deploy on a
> YARN cluster to generate system load: net, IO, HDFS, CPU, etc. I'm doing
> something like this for Hoya [
> http://www.slideshare.net/steve_l/hoya-hbase-on-yarn-20130820-hbase-hug ],
> where I also want to simulate failures; if you want to help me with that
> it'd be appreciated.
> (If the net load can be generated between peer nodes,
> then you have a pretty good stress test of the network and a way of
> measuring its bisection bandwidth too, though for cluster standup it's best
> to use the standard unix tools for isolation of problems, and ease of
> comparison with other clusters.)
>

Thanks for the advice. I will work through the design first, see whether I
can pick off a few well-defined, existing issues in this area, and then get
back to the mailing list.

--
~Hilfi Alkaff~
