On Fri, May 13, 2011 at 3:31 AM, M. Williamson <[email protected]> wrote:
> I still don't think page titles should be case sensitive. Last time I asked
> how useful this really was, back in 2005 or so, I got a tersely-worded
> response that we need it to disambiguate certain pages. OK, but how many
> cases does that actually apply to? I would think that the increased
> usability from removing case sensitivity would far outweigh the benefit of
> natural disambiguation that only applies to a tiny minority of pages, and
> which could easily be replaced with disambiguation pages.

From a software perspective, the way to do this would be to store a
canonicalized version of each page's title, and require that to be
unique instead of the title itself.  This would be nice because we
could allow underscores in page titles, for instance, in addition to
being able to do case-folding.
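
As a rough illustration of the idea (a sketch only, not MediaWiki
code), the uniqueness check could run against a derived key rather
than the title.  The names canonical_key and TitleIndex are made up
for this example, and the exact rules (NFC normalization, treating
underscores as spaces, collapsing whitespace) are assumptions:

```python
import unicodedata


def canonical_key(title: str) -> str:
    """Hypothetical canonical form used only for uniqueness checks."""
    # Assumption: normalize to NFC first so equivalent encodings compare equal.
    normalized = unicodedata.normalize("NFC", title)
    # Treat underscores and runs of whitespace as a single space.
    collapsed = " ".join(normalized.replace("_", " ").split())
    # Locale-independent Unicode case folding.
    return collapsed.casefold()


class TitleIndex:
    """Toy in-memory stand-in for a unique index on the canonical key."""

    def __init__(self) -> None:
        self._by_key: dict[str, str] = {}

    def add(self, title: str) -> None:
        key = canonical_key(title)
        if key in self._by_key:
            raise ValueError(
                f"{title!r} conflicts with existing page {self._by_key[key]!r}"
            )
        # The title is stored exactly as typed; only the key must be unique.
        self._by_key[key] = title

    def lookup(self, title: str) -> "str | None":
        return self._by_key.get(canonical_key(title))
```

The stored title keeps whatever the author typed, underscores
included, while lookups and conflict checks go through the key.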

Note that Unicode capitalization is locale-dependent, but case-folding
is not.  Thus we could use the same case-folding on all projects,
including international projects like Commons.  There's only one
exception -- Turkic languages such as Turkish, with their dotless and
dotted i's.  But that's minor enough that we should be able to work
around it without too much pain.
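
To make the dotted/dotless i issue concrete, here is a small sketch
using Python's built-in str.casefold(), which applies the default
(non-Turkic) Unicode folding.  The fold_title helper and its turkic
flag are hypothetical; the two substitutions follow the mappings
marked "T" in Unicode's CaseFolding.txt:

```python
def fold_title(title: str, turkic: bool = False) -> str:
    """Case-fold a title, optionally applying the Turkic i mappings first."""
    if turkic:
        # Turkic rule: capital I folds to dotless i (U+0131), and
        # capital I with dot above (U+0130) folds to plain i.
        title = title.replace("I", "\u0131").replace("\u0130", "i")
    # Default Unicode folding for everything else.
    return title.casefold()


# Default folding treats "I" as the capital of "i".
default_fold = fold_title("ISPARTA")
# Turkic folding maps the same "I" to dotless i instead.
turkic_fold = fold_title("ISPARTA", turkic=True)
```

With the default rules "ISPARTA" and "isparta" collide, while a
Turkish reader would expect "ISPARTA" to match the dotless spelling,
so a Turkish-language wiki would presumably need the flag turned on.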

Some projects, like probably all Wiktionaries, would doubtless not
want case-folding at all, so we should support different
canonicalization algorithms.  Even the ones that don't want
case-folding could still benefit from allowing underscores in titles.
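
A per-project choice of algorithm might look something like the
sketch below.  Everything here is invented for illustration: the
algorithm names, the PROJECT_SETTINGS table, and the choice of
"spaces-only" as the fallback are assumptions, not existing
configuration:

```python
from typing import Callable, Dict

Canonicalizer = Callable[[str], str]


def spaces_only(title: str) -> str:
    """Underscores and spaces are equivalent; case is still significant."""
    return " ".join(title.replace("_", " ").split())


def spaces_and_case(title: str) -> str:
    """As above, plus locale-independent case folding."""
    return spaces_only(title).casefold()


# Hypothetical registry of available algorithms.
CANONICALIZERS: Dict[str, Canonicalizer] = {
    "spaces-only": spaces_only,
    "spaces-and-case": spaces_and_case,
}

# Hypothetical per-project settings: a Wiktionary keeps case distinctions.
PROJECT_SETTINGS: Dict[str, str] = {
    "enwiktionary": "spaces-only",
    "commonswiki": "spaces-and-case",
}


def canonicalize(project: str, title: str) -> str:
    # Assumption: projects without a setting fall back to the conservative rule.
    name = PROJECT_SETTINGS.get(project, "spaces-only")
    return CANONICALIZERS[name](title)
```

On a Wiktionary "Polish" and "polish" would stay separate entries,
but "foo_bar" and "foo bar" would still resolve to the same key.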

But all this would require a very intrusive rewrite.  Assumptions like
"replace spaces by underscores to get dbkey" are hardwired into
MediaWiki all over the place, unfortunately.  It's not clear that it's
worth it, since there are downsides to case-folding too.  It might
make more sense to auto-generate redirects instead, which would be a
much easier project that wouldn't have the downsides.
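
A sketch of what redirect generation could do, under simplifying
assumptions: plan_redirects is a hypothetical helper, it only
considers the fully case-folded variant of each title, and it ignores
MediaWiki details such as namespaces and first-letter capitalization:

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def plan_redirects(titles: Iterable[str]) -> Dict[str, str]:
    """Map each missing case-folded variant to the one page it could mean."""
    title_list = list(titles)
    existing = set(title_list)

    # Group existing titles that differ only by case.
    groups: Dict[str, List[str]] = defaultdict(list)
    for title in title_list:
        groups[title.casefold()].append(title)

    redirects: Dict[str, str] = {}
    for folded, members in groups.items():
        if folded in existing:
            continue  # a real page already lives at the folded name
        if len(members) != 1:
            continue  # ambiguous, leave this for a disambiguation page
        redirects[folded] = members[0]
    return redirects


planned = plan_redirects(["Main Page", "US", "Us"])
```

Here only "main page" gets a redirect; "US" and "Us" fold to the same
name, so nothing is generated for them and the case distinction
between the two existing pages is preserved.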

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
