The tl;dr version of the below is that I'm keen for ZooKeeper to become a TLP in the future, but it's not (yet) clear that now is the right time.

1. Is there any way in which the current structure practically hinders ZooKeeper's active development? One concern that has been raised elsewhere is that the Hadoop PMC doesn't have community insight into its subprojects, and is therefore making decisions on matters about which it has relatively little knowledge. This doesn't seem to be true of, or to be affecting, ZK currently - to my knowledge there haven't been issues with, e.g., promoting committers. Our current problem is encouraging more people to get to a position where they can be considered for committership.

2. Are we likely to harm our eventual standing with Apache by not splitting out now, which they are hoping we'll do? I doubt it, especially if we tell the board that we fully intend to become a TLP when we are in a position to, but I can't speak for the Apache board.

3. ZooKeeper has, in the past one or two months, begun to see a lot more interest from a diverse set of contributors. This is extremely promising, and I'm keen that we don't run any risk of damaging this growth by undergoing a major project change, during which contributing might be made more complex as links get broken and so on. This growth also makes me confident that we might be able to revisit a decision to stay as a subproject in about six months or so. We still need to get more people working in the core, however.

4. We should certainly be a TLP eventually - we don't really have any kind of dependency on Hadoop, and once we revisit the RecordIO stuff we'll be pretty much completely separate. I don't think we'll suffer too much from a branding problem - Pig, HBase and Avro will all need visitors and traffic as well, and I'm sure that the Hadoop 'family' will continue to exist even though the bureaucratic structure will alter.

So let's figure out some strategies to get more people involved and regularly making significant contributions, and then I think most of the objections to becoming a TLP should disappear.

cheers,
Henry

On 22 March 2010 11:32, Patrick Hunt <ph...@apache.org> wrote:

> You have probably heard by now that there is a discussion going on in
> the Hadoop PMC as to whether a number of the subprojects (Hbase, Avro,
> Zookeeper, Hive, and Pig) should move out from under the Hadoop
> umbrella and become top level Apache projects (TLP). This discussion
> has picked up recently since the Apache board has clearly communicated
> to the Hadoop PMC that it is concerned that Hadoop is acting as an
> umbrella project with many disjoint subprojects underneath it. They
> are concerned that this gives Apache little insight into the health
> and happenings of the subproject communities which in turn means
> Apache cannot properly mentor those communities.
>
> The purpose of this email is to start a discussion within the
> ZooKeeper community about this topic. Let me cover first what becoming
> TLP would mean for ZooKeeper, and then I'll go into what options I
> think we as a community have.
>
> Becoming a TLP would mean that ZooKeeper would itself have a PMC that
> would report directly to the Apache board. Who would be on the PMC
> would be something we as a community would need to decide. Common
> options would be to say all active committers are on the PMC, or all
> active committers who have been a committer for at least a year. We
> would also need to elect a chair of the PMC. This lucky person would
> have no additional power, but would have the additional responsibility
> of writing quarterly reports on ZooKeeper's status for Apache board
> meetings, as well as coordinating with Apache to get accounts for new
> committers, etc. We currently submit these same reports, however they
> are forwarded to the board through the Hadoop PMC Chair. For more
> information see
> http://www.apache.org/foundation/how-it-works.html#roles
>
> Becoming a TLP would not mean that we are ostracized from the Hadoop
> community. We would continue to be invited to Hadoop Summits, HUGs,
> etc.
>
> I see three ways that we as a community can respond to this:
>
> 1) Say yes, we want to be a TLP now.
>
> 2) Say yes, we want to be a TLP, but not yet. We feel we need more
> time to mature. If we choose this option we need to be able to clearly
> articulate how much time we need and what we hope to see change in
> that time.
>
> 3) Say no, we feel the benefits for us staying with Hadoop outweigh
> the drawbacks of being a disjoint subproject. If we choose this, we
> need to be able to say exactly what those benefits are and why we feel
> they will be compromised by leaving the Hadoop project.
>
> There may other options that I haven't thought of. Please feel free to
> suggest any you think of.
>
> Here are the thoughts I've formed so far on the subject:
>
> Benefits of moving to TLP:
>
> a) Here's the boards view as communicated to me:
>
> "we're looking to ensure that proper and effective oversight is
> reached, and umbrellas can get in the way of that. If you *also* think
> that all of your communities have proper oversight, and that you're
> communicating enough about each/all of them to the Board, so that *it*
> can provide oversight, then that's just fine. Go do the review and
> come back and say, "we're all good. no changes are necessary.""
>
> b) setting our own course - we would have our own PMC and therefore
> have more latitude (within the apache rules of course) in setting
> direction. PMC members would be focused on ZooKeeper exclusively.
>
>
> Serious reservations I personally have with a move to TLP today:
>
> a) I do not think ZooKeeper currently has a sufficiently large and
> diverse enough community such that it can fend for itself as a
> TLP. Our community is working hard to establish a critical mass, given
> our maturity level, complexity of code, and the stakes involved (ZK is
> literally the linchpin of many of our user's computing
> infrastructures) it has been hard to attract/promote developers. We
> currently have 5 active committers, 4 from one company and 1 from
> a separate one (who only recently joined the committer ranks). The
> board has stated they are willing to break their own rules here (form
> a TLP with less than acceptable diversity) however I don't believe that
> would be prudent from our perspective.
>
> b) Loss of branding and discover-ability - "in the land of the cloud
> the elephant is king". IMO being associated with Hadoop is a huge win
> for us in terms of branding and discover-ability. This is similar to
> the benefits we get of being an Apache project. People who are serious
> about the cloud need to look at Hadoop. In the process they discover
> ZooKeeper.
>
> c) "if ain't broke don't fix it". I have frequent interactions with
> Hadoop PMC/Chair and an Apache board member. We are getting excellent
> representation through this process and I don't see how visibility
> "up" or support "down" could be improved.
>
> Questions? Thoughts? Rebuttal? Let the discussion begin.
>
> Patrick

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679