Re: [Wikidata] Wikibase as a decentralized perspective for Wikidata

James Heald Fri, 28 Dec 2018 09:23:24 -0800

Coming back to the question of P's and Q's (sorry, it's been a busy few weeks).

I read people saying "Don't worry because prefixes", but with respect I don't agree.

IMO "Don't worry because prefixes" may make sense as a response if one interacts with Wikidata primarily via RDF dumps, or SPARQL, or perhaps writing system code -- environments where those prefixes may be generally present and used.

But for anyone actually working first-hand with the data, whose work involves any substantial checking and/or manual editing of data through the wikibase user interface, I think it fails to ring true. Extensive hand-editing in this way tends to be an unavoidable aspect when curating a dataset in wikibase -- eg investigating anomalies revealed by query reports, perhaps after a large data upload or data matching procedure, and then identifying and making appropriate edits to resolve them.

For people making a lot of hand-edits like that, a process which as I have said I think is inevitable when actively curating datasets, certain property identifiers are encountered, used and repeated so often that they become deeply ingrained and internalised, essentially second nature -- eg P18 for image, P373 for commonscat, P131 for located in administrative territorial entity, etc etc, the precise properties depending on the kind of data and items one is working with. Similarly also for a lot of item identifiers, eg Q5 human etc.

If one's doing a lot of editing and looking-up through the interface, these identifications become very very familiar - as internalised and unconscious and automatic as breathing.

So I do think that reusing the same identifiers for quite different meanings in a different wikibase (but with essentially exactly the same editing interface) is to create a cognitive dissonance which (IMO) is significant, unnecessary, unfortunate, and (I believe) ought to be avoidable.

A second issue is Daniel's scenarios 2 to 4, where external repos want to be using and referencing some or all of Wikidata's items and properties, with the same identifiers as Wikidata, plus some additional further properties and items of their own defined locally.

That's not straightforward, if they all have to be placed in the same shared numerical sequences following the same restricted set of initial letters.

I do take the point that it is useful to be able to use the initial letter to distinguish different kinds of Wikibase object -- ie Properties (P), Items (Q), Lexemes (L), MediaInfo items (M).

One solution might be to allow Wikibase instances to use additional characters in the identifier for the local properties, items etc specific to that Wikibase -- so that the Wikibase could have property identifiers like Px50 or Pz50 or Pm50 to distinguish them from Wikidata's P50, or identifiers like Qx5000 or Qz5000 or Qosm5000 to distinguish them from Wikidata's Q5000.

This would straightforwardly allow Wikidata and local items and properties to exist side by side, and avoid confusion and dissonance with internalised learnt identifier codes from the items and properties on Wikidata itself.
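
To make the idea concrete, here is a rough sketch in Python of how such identifiers might be told apart. It is purely illustrative: the pattern (an initial letter, an optional run of lowercase letters marking a local namespace, then a number) is only my reading of the examples above, and the names parse_entity_id and EntityId are made up for this sketch, not anything that exists in Wikibase.

```python
import re
from typing import NamedTuple

# Initial letters for the kinds of Wikibase object mentioned above.
ENTITY_KINDS = {"P": "property", "Q": "item", "L": "lexeme", "M": "mediainfo"}

# Assumed syntax: kind letter, optional lowercase local tag, positive number.
_ID_PATTERN = re.compile(r"([PQLM])([a-z]*)([1-9][0-9]*)")


class EntityId(NamedTuple):
    kind: str       # e.g. "property" or "item"
    local_tag: str  # "" for a plain Wikidata-style identifier
    number: int

    @property
    def is_local(self) -> bool:
        # A non-empty tag marks an entity defined by the local Wikibase.
        return bool(self.local_tag)


def parse_entity_id(text: str) -> EntityId:
    """Split an identifier such as 'P50' or 'Qosm5000' into its parts."""
    match = _ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised identifier: {text!r}")
    letter, tag, digits = match.groups()
    return EntityId(ENTITY_KINDS[letter], tag, int(digits))


if __name__ == "__main__":
    for raw in ("P50", "Px50", "Q5000", "Qosm5000"):
        print(raw, parse_entity_id(raw))
```

Under these assumptions P50 and Px50 parse to different values, so a shared Wikidata property and a local one with the same number can never be mistaken for each other, while the initial letter still says which kind of object it is.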



Best regards,

   James.
