https://bugzilla.wikimedia.org/show_bug.cgi?id=45282

       Web browser: ---
            Bug ID: 45282
           Summary: SiteLinkTable should normalize titles before saving
                    and lookup
           Product: MediaWiki extensions
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: WikidataRepo
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected],
                    [email protected]
    Classification: Unclassified
   Mobile Platform: ---

SiteLinkTable should apply lightweight normalization to page titles before
storing them and before looking them up. This would avoid issues with titles
being given with spaces or with underscores as parameters to API calls, etc.

The following normalization should be applied:

* strip leading and trailing whitespace
* apply Unicode normalization
* convert underscores to spaces (currently, the items_per_site table uses
spaces in the page titles, in violation of current practice elsewhere in the
database schema)

The following normalization should not be applied:
* namespace normalization (this requires knowledge of the target wiki's config)
* first letter capitalization (requires knowledge about the target wiki's
content language, but also about namespaces)
* redirect resolution (requires access to the target wiki's database)
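
The normalization proposed above could be sketched roughly as follows. This is
only an illustration in Python, not the SiteLinkTable code: the function name
normalize_site_link_title is hypothetical, the choice of NFC as the Unicode
normalization form is an assumption (the report does not name a form), and so
is the order of the steps. Namespaces, first-letter case and redirects are
deliberately left untouched.

```python
import unicodedata


def normalize_site_link_title(title: str) -> str:
    """Apply the lightweight normalization proposed in this report.

    Hypothetical helper for illustration only; not part of Wikibase.
    """
    # Unicode normalization. NFC is an assumption; the report only says
    # "unicode normalization" without naming a form.
    title = unicodedata.normalize("NFC", title)

    # Convert underscores to spaces, matching what items_per_site stores.
    title = title.replace("_", " ")

    # Strip leading and trailing whitespace. Doing this after the
    # underscore conversion also drops leading/trailing underscores,
    # which is an assumption about the intended behaviour.
    return title.strip()


if __name__ == "__main__":
    # Both spellings map to the same stored form.
    print(normalize_site_link_title("  New_York City "))
    print(normalize_site_link_title("New York City"))
```

Applying the same function both when saving and when looking up a title would
make the two spellings above resolve to the same row, without requiring any
knowledge of the target wiki's configuration.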

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are watching all bug changes.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
