On 11-05-13 08:48 AM, Aryeh Gregor wrote:
> On Fri, May 13, 2011 at 3:31 AM, M. Williamson <[email protected]> wrote:
>> I still don't think page titles should be case sensitive. Last time I asked
>> how useful this really was, back in 2005 or so, I got a tersely-worded
>> response that we need it to disambiguate certain pages. OK, but how many
>> cases does that actually apply to? I would think that the increased
>> usability from removing case sensitivity would far outweigh the benefit of
>> natural disambiguation that only applies to a tiny minority of pages, and
>> which could easily be replaced with disambiguation pages.
> From a software perspective, the way to do this would be to store a
> canonicalized version of each page's title, and require that to be
> unique instead of the title itself. This would be nice because we
> could allow underscores in page titles, for instance, in addition to
> being able to do case-folding.
>
> Note that Unicode capitalization is locale-dependent, but case-folding
> is not. Thus we could use the same case-folding on all projects,
> including international projects like Commons. There's only one
> exception -- Turkish, with its dotless and dotted i's. But that's
> minor enough that we should be able to work around it without too much
> pain.
>
> Some projects, like probably all Wiktionaries, would doubtless not
> want case-folding at all, so we should support different
> canonicalization algorithms. Even the ones that don't want
> case-folding could still benefit from allowing underscores in titles.
>
> But all this would require a very intrusive rewrite. Assumptions like
> "replace spaces by underscores to get dbkey" are hardwired into
> MediaWiki all over the place, unfortunately. It's not clear that it's
> worth it, since there are downsides to case-folding too. It might
> make more sense to auto-generate redirects instead, which would be a
> much easier project that wouldn't have the downsides.

Fortunately, I think most of the space/underscore switching done by code is actually isolated to a subset of Title and perhaps a few other core classes (probably ones like User and the filerepo stuff); most code should be using the Title interface. When I tried that first rewrite, I had more of an issue with the wide use of $user->getName() to test whether two user objects were the same.
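
To make the "store a canonicalized title and require that to be unique" idea concrete, here is a rough sketch in Python rather than PHP. It is not MediaWiki code: TitleIndex, fold_case_key and keep_case_key are hypothetical names, and treating underscores as equivalent to spaces in the key is my own assumption about how the canonical form might work. The canonicalizer is pluggable so a project that doesn't want case-folding can swap it out.

```python
import unicodedata
from typing import Callable, Dict, Optional


def keep_case_key(title: str) -> str:
    """Hypothetical canonicalizer without case-folding.

    Assumption: underscores and runs of whitespace are all treated
    as a single space in the key; the stored title is left alone.
    """
    text = unicodedata.normalize("NFC", title)
    return " ".join(text.replace("_", " ").split())


def fold_case_key(title: str) -> str:
    """Hypothetical canonicalizer that also applies Unicode case-folding."""
    return keep_case_key(title).casefold()


class TitleIndex:
    """Toy stand-in for a page table with a unique canonical-key column."""

    def __init__(self, canonicalize: Callable[[str], str]) -> None:
        self._canonicalize = canonicalize
        self._by_key: Dict[str, str] = {}

    def add(self, title: str) -> None:
        key = self._canonicalize(title)
        if key in self._by_key:
            # Uniqueness is enforced on the key, not on the title itself.
            raise ValueError(
                f"{title!r} collides with existing page {self._by_key[key]!r}"
            )
        self._by_key[key] = title

    def lookup(self, title: str) -> Optional[str]:
        """Return the stored title whose key matches, or None."""
        return self._by_key.get(self._canonicalize(title))
```

With fold_case_key, a page stored as "Main_Page" is found by looking up "main page", and adding "MAIN PAGE" is rejected as a collision. With keep_case_key, "Polish" and "polish" can coexist, which is roughly what a Wiktionary would want.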
-- 
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
