On Jul 24, 2012, at 9:09 PM, Sumana Harihareswara wrote:

> On 07/17/2012 08:41 PM, Rob Lanphier wrote:
>> It would appear from reading this page that the only alternative to
>> Gerrit that has a serious following is GitHub.  Is that the case?

        There's some irony, and yet it's so apropos, that now that Gerrit is 
finally stabilizing we're discussing alternatives. :-)

        Oh well… Here's my 2c on GitHub…

        In an "ignore reality" world, I suppose my personal choices would be 1) 
GitHub; 2) Phabricator; 3) everything else. But let's cross GitHub off that 
list (for WMF).

        Maybe in some future when our development process more closely models a 
seat-of-the-pants startup universe of "code first, break often, recover fast" we 
could consider GitHub for hosting some of our public repositories, but I don't 
see that happening anytime soon (ever?).

        …

        The nonstarter is that while we could host the public repositories, we 
do have a lot of non-public stuff in Gerrit right now. That stuff can't go into 
the cloud.

        Well on to specifics…

> But I have a lot of reservations about using GitHub as our primary
> source control and code review platform.  There's the free-as-in-freedom
> issue, of course,

        Personally I think this ship sailed the day we used Google Apps for 
e-mail. :-)

> but I'm also concerned about flexibility, account
> management, fragmentation of community and duplication of tools, and
> their terms of service.
> 
> == Flexibility ==
> I see GitHub as kind of like a Mac.

        This trope is too facile. But I do agree with what you are alluding to, 
which is that while it's fine for some, that doesn't mean it's fine for us, 
especially given our current development process.

>  It has a nice UI for the use case
> that its creators envision.  It's fine for personal use.

        A great many very large open source projects are currently using or 
hosted at GitHub (including node, jQuery, and our Android/PhoneGap app ;-))

>  And if we try
> it, everything'll be great.... until we smack into an invisible brick
> wall.  We'll want to work around one little thing, the way that we sneak
> around various issues in Gerrit, with hacks and searches and upgrades,
> if it's not in GitHub's web UI or API [3], we'll be stuck.

        The API is simplistic but serviceable. However, the satellite tools 
that are built around it are either part of GitHub (their internal issue 
tracker, their own Ruby-based wiki Gollum, etc) or are mostly for commercial 
use/cloud-based. For instance, among tools that assist in deployment from a GitHub 
repository (even if that were feasible for us, which it isn't), most seem to 
have a hidden assumption that these are Web 2.0 companies deploying on AWS… not 
to mention that usage of those tools clearly violates our policy and values.

> Right now we have our primary Git repo on our own machines, which is the
> ultimate backdoor. The way we have been modifying our tools, automating
> certain kinds of commits (like with l10n-bot), troubleshooting by
> looking at our logfiles, and generally customizing things to suit our
> weird needs -- GitHub is closed source and won't let us do that.  We are
> not the typical use case for GitHub.  Since we have hundreds of
> extensions, each with their own repository, we would have way more
> repositories and members than almost any other organization on there.
> So, one example: arbitrary sortability of lists of repositories.  We
> could mod Gerrit to do it, but not GitHub.  How would we centralize and
> list the repositories so they're easy to browse, search, edit, follow,
> and watch them together?  It looks like GitHub's less suitable for that,
> but I'd welcome examples of orgs that create their own sub-GitHub hubs.

        Well, GitHub's modality doesn't prevent operating on the Git repository 
through the API. But I agree: where is the support on our end for writing these 
integrations when we already have something serviceable in Gerrit?

> == Accounts ==
> By using GitHub, we would no longer be managing the user accounts. This
> would make single sign-on with other Wikimedia services (especially
> Labs) completely impossible.

        Technically this integration could be done by users authorizing us to 
access their GitHub accounts via OAuth2. It's not the same thing as what you're 
saying though… it's kind of the opposite of what you're saying. What you want is 
what GitHub Enterprise is for.
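
        For what it's worth, the first step of that OAuth2 flow might look 
roughly like the sketch below. The client ID, redirect URI, and helper name are 
made up for illustration. The authorize endpoint and the `user:email` scope are 
GitHub's documented ones as far as I know, but check their OAuth docs before 
relying on this.

```python
import secrets
from urllib.parse import urlencode

# GitHub's web-flow authorization endpoint (verify against their docs).
GITHUB_AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def build_authorize_url(client_id, redirect_uri, scope="user:email", state=None):
    """Return (url, state) for sending a user to GitHub to authorize our app.

    client_id and redirect_uri are hypothetical values that would come from
    registering an OAuth application with GitHub.
    """
    # The state token lets the callback handler confirm the response
    # belongs to a request we actually started.
    if state is None:
        state = secrets.token_urlsafe(16)
    query = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{GITHUB_AUTHORIZE_URL}?{query}", state


# Fixed state only so the demo output is reproducible.
url, state = build_authorize_url(
    "hypothetical-client-id",
    "https://example.org/oauth/callback",
    state="fixed-for-demo",
)
print(url)
```

        Note the direction: the user grants us access to their GitHub 
account. We still would not own or manage the account itself.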

> I mentioned above that GitHub seems more meant for single FLOSS projects
> than for confederations of related repositories. GitHub does not have
> the concept of "groups," so granting access to collections of repos
> would be a time-consuming process. GitHub does not support branch-level
> permissions, either (it encourages "forking" and then merging back to
> master), and that does not seem as suitable for long-term collaborative
> branches.

        This isn't quite true. GitHub does have the concept of groups (you can 
create as many as you want and control access levels (read-only, read/write, 
admin) both per project and across projects). However, you cannot do it 
as robustly as Gerrit does.
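
        To make the group idea concrete, here is a toy model of it. This is 
not GitHub's API; the `Team` class, the level names, and the repository names are 
all invented for illustration. The point is only that a grant attaches to a 
whole repository, with nothing finer-grained such as a branch.

```python
from dataclasses import dataclass, field

# Access levels ordered from weakest to strongest.
LEVELS = {"read": 0, "write": 1, "admin": 2}


@dataclass
class Team:
    name: str
    members: set = field(default_factory=set)
    # Maps repository name -> access level granted to the whole team.
    grants: dict = field(default_factory=dict)


def effective_level(user, repo, teams):
    """Return the strongest level any of the user's teams grants on repo, or None."""
    levels = [
        team.grants[repo]
        for team in teams
        if user in team.members and repo in team.grants
    ]
    return max(levels, key=LEVELS.get) if levels else None


teams = [
    Team("extension-readers", {"alice", "bob"}, {"core": "read", "ext-foo": "read"}),
    Team("foo-maintainers", {"alice"}, {"ext-foo": "write"}),
]
print(effective_level("alice", "ext-foo", teams))
print(effective_level("bob", "ext-foo", teams))
print(effective_level("bob", "ext-bar", teams))
```

        With hundreds of extension repositories, every one of them needs an 
entry in some team's grants, which is where the bookkeeping gets tedious.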

        More troublesome, I think, is not how GitHub's fork-and-merge model 
handles permissions, but that GitHub's model is a fundamentally different 
modality from our gated-trunk code review model. GitHub effectively allows 
self-review because there is no concept of review.
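
        A toy sketch of the difference between the two modalities, as I 
understand them. The class and function names are invented, and the gated rule 
below is a simplification of how a gated trunk is typically configured, not a 
description of Gerrit's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    author: str
    # Names of people who have approved this change.
    approvals: set = field(default_factory=set)


def can_merge_gated(change):
    """Gated trunk: needs at least one approval from someone other than the author."""
    return any(reviewer != change.author for reviewer in change.approvals)


def can_merge_fork_model(merger, committers):
    """Fork-and-merge: anyone with push rights may merge, even their own change."""
    return merger in committers


committers = {"alice", "bob"}
self_approved = Change(author="alice", approvals={"alice"})
peer_approved = Change(author="alice", approvals={"bob"})

print(can_merge_gated(self_approved))
print(can_merge_gated(peer_approved))
print(can_merge_fork_model("alice", committers))
```

        In the second model the only check is who is pushing the button, so 
review becomes a social convention rather than something the tool enforces.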

        On the other hand, since this is handled through a Pull Request instead 
of a Gerrit ChangeId, it does mean the history of the code commits, etc. 
doesn't get lost or munged down like it does in Gerrit. Too bad because I'd 
like this and it's fairly transparent (not requiring Git voodoo to handle these 
things). It's not our workflow though.

> == Duplication of tools, fragmentation of community ==
> We don't want to fragment our communication EVEN MORE.  GitHub wikis and
> bug management aren't such a big deal since we can probably disable
> those.  But messaging and notification .... "oh, did you say that on
> GitHub? We didn't see that there."  That's already a big enough
> headache, with Bugzilla and all the mailing lists and IRC channels and
> talk pages and and and.  :-)

        GitHub's only collaboration features are the repository itself, Gollum 
(their wiki), their Issue tracker, and the commit comments.

        Assuming that Gollum and Issue tracker are turned off (pity, I don't 
care for Gollum, but their Issue tracker is nicely integrated), the commit 
comments and repository are no different than what currently is a feature in 
Gerrit. Dare I say it, but GitHub's commit comments are awesome. They leave 
Gerrit and every other review tool in the dust as far as I've seen.

        I should mention that Gerrit's actual review of a change is nicer than 
even Phabricator's. You can step through and it will mark them as reviewed as 
you go. Obviously GitHub has no such thing since it has no concept of a 
pre-commit review or gated review.

> 
> == The Terms of Service ==
> GitHub's ToS/Security/Privacy policies[2] pose a few problems for our needs.
> 
> One is that people under 13 can't sign up.  I do not want to limit our
> community that way.

        Makes sense. I didn't consider this.

> Another is: "You may not duplicate, copy, or reuse any portion of the
> HTML/CSS, Javascript, or visual design elements or concepts without
> express written permission from GitHub."  Do we really want to get into
> a possible situation where we have noticed a design concept or cool use
> of JS on GitHub but don't feel okay reusing it in our personal or
> professional projects?

        I think this is boilerplate. In any case, that part would apply even 
if we were only mirroring on GitHub. Besides, since we have no intention of 
building a GitHub competitor, who cares.

> And, considering our level of activity, check out this clause: "If your
> bandwidth usage significantly exceeds the average bandwidth usage (as
> determined solely by GitHub) of other GitHub customers, we reserve the
> right to immediately disable your account or throttle your file hosting
> until you can reduce your bandwidth consumption."  We simply cannot
> afford to have GitHub disable our access with no notice.

        We still have the code and I'm sure there will be busier repositories 
out there. This is to prevent abuse on their side so they don't have to 
guarantee service.

> 
> == A couple open questions ==
> * What's the FLOSS project on GitHub that's most like us, in terms of
> size, number of unique repositories, privacy concerns, robustness needs,
> and so on?  How are they dealing with these issues?

        I don't know. I believe that once the jQuery plugin repository 
migrates, it and jQuery itself will probably be among the larger ones in terms 
of number of users and size. But they don't manage access through a cascading 
web of trust.

> * What does GitHub Enterprise buy us?  Which of these issues would that fix?

        It's a self-hosted GitHub. It would allow us to have private 
repositories (good for deploys, ops, etc.) and manage our own user database (we 
could integrate with our own auth system) and probably waives the 13 and under 
rule above.

        The price is too steep since it's a per-seat license. A nonstarter if 
the WMF is going to have to pay for every potential developer who wants to 
attach.

> We do need a GitHub strategy -- to make our projects more discoverable,
> make use of more contributions, and participate in the GitHub
> reputational economy.  So we must figure out the right ways to mirror
> and sync.  But I doubt our own long-term needs would work well with
> using GitHub as our main platform.

        I'm 1000% with you on this.

        We should definitely at some point mirror our code in GitHub like the 
PHP project does <http://www.php.net/git.php>. Being able to publish and handle 
pull requests coming from GitHub would be a nice feature in Gerrit or any 
replacement.  It'd be nice if others can have their own MW extensions or 
versions of extensions and core on GitHub and pull from us (and us from them) 
esp. for extensions that may need some love or have changes that don't satisfy 
the WMF code quality bar.
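
        A one-way mirror could be as simple as the sketch below. It assumes 
the local copy is a bare repository created with `git clone --mirror`; the path 
and the GitHub URL are hypothetical. It only covers publishing, not handling 
pull requests coming back.

```python
import subprocess


def mirror_commands(local_repo, mirror_url):
    """Build the git commands for a one-way mirror push of local_repo.

    Assumes local_repo is a bare clone made with `git clone --mirror`.
    """
    return [
        # Refresh the local mirror from its origin, dropping deleted refs.
        ["git", "-C", local_repo, "fetch", "--prune", "origin"],
        # --mirror pushes every ref and deletes remote refs missing locally.
        ["git", "-C", local_repo, "push", "--mirror", mirror_url],
    ]


def run_mirror(local_repo, mirror_url, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    for cmd in mirror_commands(local_repo, mirror_url):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical path and URL; the dry run only prints what would be executed.
run_mirror("/srv/git/core.git", "git@github.com:example/core.git")
```

        Run from cron or a post-receive hook, something like this keeps the 
GitHub copy read-only and always downstream of our own repository.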

        …

        As for actually dealing with the pros and cons of Gerrit vs. 
others-than-GitHub, I suppose I'll sit down and add a matrix of them to the 
wiki page if I have some time. I haven't yet thought things through enough to 
bore you with an even longer e-mail. :-)


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
