Aleksey_WMDE added a comment.

Ways to build URL for external IDs

Here are some thoughts on this topic:

  • Replace $1 with id in URL template like http://domain.com/$1 (current way)
  • Use an external ID format that works with the current approach (ex.: for https://cricketarchive.com/Archive/Players/41/41464/41464.html the external ID should be 41/41464/41464 rather than 41464)
    • PRO No need to implement anything
    • CONS A bit more space in DB
    • CONS Decision is up to community
    • CONS Have to migrate data
  • Splitting the property into multiple properties
    • PRO The IMDB template on English Wikipedia has different templates for characters, companies, ...
    • CONS It is up to the community to decide (and it seems the proposal was declined)
    • CONS Doesn't cover all the cases (ex.: CricketArchive)
    • CONS Have to migrate templates that already use current property (ex.: IMDB template on German Wikipedia)
  • Wikibase provides a list of predefined generic transformations, which are then somehow set for each external identifier property individually (for example, as classifiers for the formatter URL property) and applied to the value on render.
    • PRO Easy to use
    • PRO Simple cases will look pretty elegant (like "strip spaces", "remove dashes")
    • CONS Have to be supported by developers
    • CONS Will have some super specific transformations (like "IMDB id transformation" and "CricketArchive ID transformation", because URL generation algorithms are very specific)
    • CONS A new data type would probably have to be added
  • Use full URLs as identifiers instead of some string IDs
    • PRO If we treat the URL as a URI, we don't have to do any formatting and get the URL for free
    • CONS More space in DB
    • CONS Decision is up to community
    • CONS Have to migrate data
  • Ask the external site to implement a more generic URL scheme
    • PRO We don't have to do anything except write emails
    • CONS Since we want stable URLs, our request to the site owner would amount to something like this (in simple words): "Could you please create and maintain one more public interface, because our code is not flexible enough and we don't want to change it?" That does not sound like something people would agree to do.
  • Use a regex with capturing groups as a qualifier and replace the $ variables in the formatter URL accordingly (ex.: (\d{2})-(\d{4}) and http://www.bls.gov/soc/2010/soc$2$3.htm)
    • PRO Not so hard to implement
    • PRO Easy to understand how to use
    • PRO Easy to handle simple cases
    • CONS Does not cover all the current cases (ex.: IMDB, HURDAT, ZVG, CricketArchive)
    • CONS Performance is unpredictable and depends heavily on user input
  • Use Lua code as a property of the property to define a formatter function (as raw code or as a module reference).
    • PRO Performance can be controlled. Since we run the code in a sandbox, we can(?) limit its memory consumption and CPU time
    • PRO Users are more involved in product development and can do much more without developers
    • PRO Some nice features, such as validation and filtering/preformatting on a per-property basis, can be easily implemented once this one is in place
    • CONS Wikibase would start to depend (maybe optionally) on the Lua PHP extension, which might be hard to install depending on the environment (goodbye, shared hosting).
    • CONS Might be relatively hard to implement
    • CONS Performance depends on user input
    • CONS A new data type would probably have to be added
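
To make the difference between the current $1 substitution and the regex option more concrete, here is a minimal Python sketch. It is an illustration only, not existing Wikibase code: the function names, the sample ID 11-3021, the .html-suffixed CricketArchive template, and the convention that placeholders are numbered from $1 for the first capturing group are all assumptions.

```python
import re
from typing import Optional


def format_simple(template: str, external_id: str) -> str:
    """Current approach: put the whole ID in place of $1."""
    return template.replace("$1", external_id)


def format_with_groups(pattern: str, template: str, external_id: str) -> Optional[str]:
    """Regex option (hypothetical helper): replace $N with capturing group N.

    Returns None when the ID does not match the pattern as a whole.
    Assumption: $1 is the first capturing group, $2 the second, and so on.
    """
    match = re.fullmatch(pattern, external_id)
    if match is None:
        return None

    def substitute(placeholder: "re.Match[str]") -> str:
        index = int(placeholder.group(1))
        if index > match.re.groups:
            # Unknown placeholder: leave it untouched in the output.
            return placeholder.group(0)
        # A group that did not participate in the match becomes an empty string.
        return match.group(index) or ""

    return re.sub(r"\$(\d+)", substitute, template)


# Current way: works when the stored ID already contains the full path part.
print(format_simple("https://cricketarchive.com/Archive/Players/$1.html",
                    "41/41464/41464"))

# Regex way: the dash is dropped by concatenating the two groups.
print(format_with_groups(r"(\d{2})-(\d{4})",
                         "http://www.bls.gov/soc/2010/soc$1$2.htm",
                         "11-3021"))
```

Note that the pattern here comes from user input, so a badly written regex could make matching slow; that is the performance concern listed for this option.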

TASK DETAIL
https://phabricator.wikimedia.org/T151329

To: Aleksey_WMDE
Cc: Aleksey_WMDE, thiemowmde, Pasleim, Esc3300, Aklapper, Stigmj, Nikki, hoo, Lydia_Pintscher, D3r1ck01, Izno, Wikidata-bugs, aude, Mbch331