Aryeh Gregor wrote:
> The same could be said of practically any user-visible change.  I
> mean, maybe if we add a new special page we'll break some script that
> was screen-scraping Special:SpecialPages.  We can either freeze
> MediaWiki and never change anything for fear that we'll break
> something, or we can evaluate each potential change on the basis of
> how likely it is to break anything.  I can't see anything breaking too
> badly if rev_len is reported in characters instead of bytes -- the
> only place it's likely to be useful is in heuristics, and by their
> nature, those won't break too badly if the numbers they're based on
> change somewhat.

This is problematic logic for a few reasons. I see a change to the rev_len logic
as being similar to a change in article count logic. The same arguments work in
both places, specifically the "step problem" that will cause nasty jumps in
graphs.[1]

In some cases, as you've noted, we're talking about a change by a factor of
three. Plenty of scripts rely on hard-coded values to determine size thresholds
for certain behaviors. While these scripts may not have the best
implementations, I don't think it's fair to say that they're worth breaking.
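
To make the threshold concern concrete, here is a rough sketch in Python. The
threshold value and the function names are made up for illustration; they are
not taken from any real script or from MediaWiki itself. It assumes revision
text is stored as UTF-8, where many non-Latin characters take three bytes each.

```python
# Hypothetical hard-coded threshold, written when rev_len meant bytes.
STUB_THRESHOLD = 1500

def byte_len(text: str) -> int:
    """Length in bytes when the text is encoded as UTF-8."""
    return len(text.encode("utf-8"))

def char_len(text: str) -> int:
    """Length in Unicode code points."""
    return len(text)

def is_stub(rev_len: int) -> bool:
    """Heuristic a script might use: anything under the threshold is a stub."""
    return rev_len < STUB_THRESHOLD

english = "a" * 1800   # 1 byte per character in UTF-8
japanese = "あ" * 600   # 3 bytes per character in UTF-8

for label, text in (("english", english), ("japanese", japanese)):
    print(label, byte_len(text), char_len(text),
          is_stub(byte_len(text)), is_stub(char_len(text)))
```

Both sample texts have the same size in bytes, so the heuristic treats them
the same today. If the reported length switched to characters, only the
second text would drop below the threshold, and the script would change
behavior for it without any change to the script or the page.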

The comparison to screen-scraping seems pretty spurious as well. The reason it's
acceptable to break screen-scraping scripts is that there's a functioning API
alternative that is designed for bots and scripts. One of that API's design
principles is consistency. Altering a metric by up to a factor of three (and
even worse, doing so in an unpredictable manner) breaks this consistency
needlessly.

Is it worth the cost of adding 300 million+ rows just to have an easy character count? I
don't know. Personally, I don't mind rev_len being in bytes; it makes more sense
from a database and technical perspective to me. Admittedly, though, I deal
mostly with English sites.

MZMcBride

[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=11868#c8


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
