Brion Vibber wrote:
> On Tue, May 17, 2011 at 3:56 AM, MZMcBride <[email protected]> wrote:
>> Mark A. Hershberger wrote:
>>> There is hope for every bug, no matter how old.
>> 
>> As long as people stop resolving old bugs as "wontfix" simply because
>> they're old. I went through and re-opened quite a few bugs that were
>> improperly marked as resolved over the weekend.
>> 
>> People take time to register an account on Bugzilla, describe their issue,
>> and then wait, often for _years_, for their issue to get looked at, much
>> less worked on and resolved. I can't really stress enough how completely
>> unacceptable it is to go around marking these bugs as "wontfix" without any
>> explanation or simply because it's been a long time since the bug was
>> filed.
> 
> MZ, if you're aware of *any* specific bugs where this is an issue, please
> let me or Mark know and we'll take a look at them.

Thank you for the polite and detailed response; I really appreciate it. I've
replied to this portion of the post off-list.

>> I actually took this to mean sister projects (non-Wikipedias), not foreign
>> language projects. If Wikisource has gotten 30 minutes of Wikimedia
>> Foundation development time _this year_, I'd be shocked.
> 
> Wikisource could definitely use more attention; we're still thinking about
> ways to keep some of the smaller projects more in the loop.
> 
> The primary method is by ensuring that folks can do site script & gadget
> JavaScript development without bottlenecking through Wikimedia which as a
> matter of mission has prioritized keeping servers running and Wikipedia
> functional.
> 
> Wiktionary and Commons have created a *lot* of additional UI functionality
> this way, without that bottleneck, and that's something I'm very happy
> about.
> 
> Wikisource has a number of extensions that were targeted specifically to it
> (classic DPL, labeled section transclusion, PagedTiffHandler, ProofreadPage)
> which at least get some review maintenance but are mostly created &
> maintained by volunteers. Making sure that they're up to date, modernized,
> user-friendly, and well-performing is something that we at Wikimedia should
> at least be supporting -- and I hope we'll be able to assign a little more
> explicit time to making that happen. To date, they generally get maintenance
> updates from volunteers but deployment doesn't get pushed, so the old
> versions stay in place until a major upgrade with our current deployment
> system.
> 
> But for the meantime, I can only say that it's important to accentuate the
> positive -- find specific projects that need some effort, and help round up
> people to help with them.

Yes, okay, specificity would help here. Let me lay out some examples of
problems I've hit.

If you want to work with Wikisource, one of its fundamental tools is the
ProofreadPage extension. There's no API for it. Seriously. So any outside
developers / third parties wanting to work with its content in place are out
of luck, unless they want to screen-scrape or try to hack something up.
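To make that concrete: lacking a ProofreadPage API, a third party has to pull the raw wikitext of a Page: page through the generic MediaWiki API and strip ProofreadPage's wrapping themselves. Here's a rough sketch of the unwrapping step; the <noinclude> header/footer layout is an assumption about how ProofreadPage stores pages and may not cover every case:

```python
import re

def extract_page_body(wikitext):
    """Strip the <noinclude> header and footer that ProofreadPage
    wraps around the transcludable body of a Page: page.

    Assumes the stored format is roughly:
      <noinclude>header / pagequality tag</noinclude>BODY<noinclude>footer</noinclude>
    """
    # Drop a leading <noinclude>...</noinclude> block (header, quality tag)
    text = re.sub(r'^<noinclude>.*?</noinclude>', '', wikitext, flags=re.S)
    # Drop a trailing <noinclude>...</noinclude> block (footer)
    text = re.sub(r'<noinclude>.*?</noinclude>$', '', text, flags=re.S)
    return text.strip()
```

The raw wikitext itself would come from something like action=query&prop=revisions&rvprop=content on the Page: title, which is exactly the kind of screen-scraping-adjacent hack a real ProofreadPage API would make unnecessary.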

If you want to work with / re-use Wiktionary's content, you'd generally want
to split the content up by language. People who want to re-use Wiktionary's
content generally want to create a free English dictionary or a free French
dictionary or a free Spanish dictionary. The Wiktionary dumps don't split
the words by language, meaning that anyone wanting to re-use this content
has to go through each page and try to parse it to find the English
definitions or the French definitions or the Spanish definitions. There have
been attempts by Toolserver users to make this situation less horrible, but
people are then relying on volunteers and their (questionable) code, rather
than being able to rely on Wikimedia producing useful dumps. Outside
developers / third parties once again get screwed.
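Since the dumps don't do it, every re-user ends up writing their own splitter keyed on Wiktionary's level-2 language headings. A minimal sketch of that parsing step (it deliberately ignores edge cases like templates inside headings, which is part of why ad-hoc volunteer code is a shaky foundation):

```python
import re

def split_by_language(wikitext):
    """Split a Wiktionary entry's wikitext into per-language sections,
    keyed by the level-2 heading (e.g. 'English', 'French').

    Level-3-and-deeper headings (===Noun=== etc.) stay inside the
    body of whatever language section they appear in.
    """
    sections = {}
    current = None
    for line in wikitext.splitlines():
        # A level-2 heading is ==Name==; the [^=] guard rejects ===...===
        m = re.match(r'^==\s*([^=].*?)\s*==\s*$', line)
        if m:
            current = m.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {lang: '\n'.join(body).strip() for lang, body in sections.items()}
```

Run over every page in a pages-articles dump, this yields the "free English dictionary" subset; but that's work Wikimedia could do once, centrally, instead of every re-user redoing it badly.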

If you want to work with any Wikimedia wiki's content on a large scale, you
have to deal with database dumps. For reasons unknown, there are no longer
HTML dumps, only wikitext dumps. And there's only one fully working parser
(on Wikimedia's servers). So if you want to get the rendered page contents
of a lot of articles, your options are to use ?action=parse via the API
(this will only take about two months for every article on the English
Wikipedia) or you can try to grab the rendered HTML from the Squid servers.
This makes outside development incredibly painful, if not practically
impossible, for most people.
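The two-month figure is roughly what serial API crawling implies. A back-of-the-envelope sketch, where the article count and request rate are assumptions (English Wikipedia was around 3.6 million articles at the time), not measurements:

```python
def estimated_crawl_days(num_articles=3_600_000, requests_per_second=1.0):
    """Rough time to fetch rendered HTML for every article via
    ?action=parse, one request at a time at a polite rate."""
    total_seconds = num_articles / requests_per_second
    return total_seconds / 86_400  # 86,400 seconds per day
```

At one parse request per second that's about 42 days of continuous fetching; add retries, rate limiting, and slow parses on long pages, and "about two months" is optimistic. HTML dumps would reduce this to a single download.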

These are examples of fundamental and underlying issues that prohibit
outside developers / tool builders / dreamers from taking content and
running wild with it. If there isn't a stable groundwork for others, it's
fairly difficult to build something on top of it. Laying that groundwork
seems like what Wikimedia should be doing. Otherwise, in my opinion,
Wikimedia should end the charade of supporting "sister projects" and be
upfront with developers, readers, editors, and other contributors that these
projects are simply not supported.

One idea for the future might be to make all Google Summer of Code projects
focus on sites other than Wikipedia. As you note (and as anyone paying any
kind of attention has realized), Wikimedia's priority is Wikipedia. There
are plenty of resources behind Wikipedia already, so directed volunteer
efforts (such as GSOC) could focus on all of the non-Wikipedias instead.

MZMcBride



_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
