You have probably heard by now that there is a discussion going on in the Hadoop PMC as to whether a number of the subprojects (HBase, Avro, ZooKeeper, Hive, and Pig) should move out from under the Hadoop umbrella and become top-level Apache projects (TLPs). This discussion has picked up recently because the Apache board has clearly communicated to the Hadoop PMC that it is concerned that Hadoop is acting as an umbrella project with many disjoint subprojects underneath it. The board is concerned that this gives Apache little insight into the health and happenings of the subproject communities, which in turn means Apache cannot properly mentor those communities.

The purpose of this email is to start a discussion within the ZooKeeper community about this topic. Let me first cover what becoming a TLP would mean for ZooKeeper, and then I'll go into what options I think we as a community have.

Becoming a TLP would mean that ZooKeeper would itself have a PMC that would report directly to the Apache board. Who would be on the PMC is something we as a community would need to decide. Common options would be to say all active committers are on the PMC, or all active committers who have been committers for at least a year. We would also need to elect a chair of the PMC. This lucky person would have no additional power, but would have the additional responsibility of writing quarterly reports on ZooKeeper's status for Apache board meetings, as well as coordinating with Apache to get accounts for new committers, etc. We currently submit these same reports; however, they are forwarded to the board through the Hadoop PMC Chair. For more information see http://www.apache.org/foundation/how-it-works.html#roles

Becoming a TLP would not mean that we are ostracized from the Hadoop community. We would continue to be invited to Hadoop Summits, HUGs, etc.

I see three ways that we as a community can respond to this:

1) Say yes, we want to be a TLP now.

2) Say yes, we want to be a TLP, but not yet. We feel we need more time to mature. If we choose this option we need to be able to clearly articulate how much time we need and what we hope to see change in that time.

3) Say no, we feel the benefits for us of staying with Hadoop outweigh the drawbacks of being a disjoint subproject. If we choose this, we need to be able to say exactly what those benefits are and why we feel they will be compromised by leaving the Hadoop project.

There may be other options that I haven't thought of. Please feel free to suggest any you think of.

Here are the thoughts I've formed so far on the subject:

Benefits of moving to TLP:

a) Here's the board's view as communicated to me: "we're looking to ensure that proper and effective oversight is reached, and umbrellas can get in the way of that. If you *also* think that all of your communities have proper oversight, and that you're communicating enough about each/all of them to the Board, so that *it* can provide oversight, then that's just fine. Go do the review and come back and say, "we're all good. no changes are necessary.""

b) Setting our own course: we would have our own PMC and therefore have more latitude (within the Apache rules, of course) in setting direction. PMC members would be focused on ZooKeeper exclusively.

Serious reservations I personally have with a move to TLP today:

a) I do not think ZooKeeper currently has a sufficiently large and diverse community to fend for itself as a TLP. Our community is working hard to establish a critical mass, but given our maturity level, the complexity of the code, and the stakes involved (ZK is literally the linchpin of many of our users' computing infrastructures), it has been hard to attract/promote developers. We currently have 5 active committers, 4 from one company and 1 from a separate one (who only recently joined the committer ranks). The board has stated they are willing to break their own rules here (form a TLP with less than acceptable diversity); however, I don't believe that would be prudent from our perspective.

b) Loss of branding and discoverability: "in the land of the cloud the elephant is king". IMO being associated with Hadoop is a huge win for us in terms of branding and discoverability. This is similar to the benefits we get from being an Apache project. People who are serious about the cloud need to look at Hadoop. In the process they discover ZooKeeper.

c) "If it ain't broke, don't fix it". I have frequent interactions with the Hadoop PMC/Chair and an Apache board member. We are getting excellent representation through this process and I don't see how visibility "up" or support "down" could be improved.

Questions? Thoughts? Rebuttal? Let the discussion begin.

Patrick