daniel added a comment.
Quick summary of a discussion between Adrian, Thiemo, and me today:
Agreement:
- Keep the object structure for entityid values
- Add a new "id" key for the serialized entity id
- `{ "id":"Q184", "entity-type":"item", "numeric-id":184 }`
- For internal storage, we can drop the old fields right away, and go to
`{ "id":"Q184" }`
- `{ "id":"Q184" }` should also become the default for API output at some
point. We may continue to support "entity-type" and "numeric-id" as an option.
Note: we also want to support optional "url" and "uri" keys, but they should
probably be treated as derived values, and not be part of the entity id value
itself.
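
A minimal sketch of the agreed transition format, written in Python purely for illustration (it is not Wikibase code). The function names and the type-to-letter table are assumptions made for this example. It emits the new "id" key alongside the old fields, and reads either form:

```python
# Hypothetical helpers for illustration; not part of Wikibase.
# Assumed mapping from entity type to the letter used in the serialized ID.
TYPE_LETTERS = {"item": "Q", "property": "P"}


def serialize_entity_id(entity_type, numeric_id, legacy_fields=True):
    """Build the entityid value object, optionally with the old fields."""
    value = {"id": f"{TYPE_LETTERS[entity_type]}{numeric_id}"}
    if legacy_fields:
        value["entity-type"] = entity_type
        value["numeric-id"] = numeric_id
    return value


def read_entity_id(value):
    """Return the serialized ID from either the new or the old form."""
    if "id" in value:
        return value["id"]
    # Old form only: rebuild the ID from the type and the number.
    return f"{TYPE_LETTERS[value['entity-type']]}{value['numeric-id']}"
```

With `legacy_fields=False` this produces the bare `{ "id":"Q184" }` form proposed for internal storage.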
-----
There are some questions that remain open regarding the support of external
entity ids, for a federated setup. The changes outlined above can however be
made without deciding the questions below.
The main question regarding external identifiers is: should a prefix that
encodes the home repo of an entity be included in the id field? There are
basically two options:
1. `{ "id":"foo:Q184" }` optionally expanded to `{ "id":"foo:Q184",
"repo":"foo" }`. Local IDs would have the form `"Q184"` without a prefix.
2. `{ "id":"Q184", "repo":"foo" }` optionally augmented with `{ "id":"Q184",
"repo":"foo", "qname":"foo:Q184" }`. The qname for a local ID would have the
form `mywiki:Q184`, using a configurable prefix. An empty prefix could be
allowed here, which would lead to the form `:Q184` for local entities.
Arguments for (1):
- The "id" field in API output has the same form as the expected input for
API parameters and URLs.
- We want to use prefixed IDs internally (nearly) everywhere where we
currently have non-prefixed IDs. The notion of entity ID would be extended to
include entities in other repos.
- Clients that do not expect to see references to external entities would
fail early, since they would not be able to parse prefixed IDs.
- On a repo that is not federated with other repos, nothing changes, except
that IDs become available as strings.
Arguments against (1):
- The parser needs to detect whether the ID has a prefix or not.
- IDs may not contain a colon (or a colon would need encoding or escaping).
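
To illustrate the prefix detection that option (1) requires, here is a rough Python sketch. The function names are hypothetical, and rejecting a second colon or an empty prefix is an assumption of this example, not something that has been decided:

```python
# Hypothetical helpers for illustration; not part of Wikibase.
def split_prefixed_id(serialized):
    """Split "foo:Q184" into ("foo", "Q184"); local IDs get repo None."""
    prefix, colon, local_part = serialized.partition(":")
    if not colon:
        # No colon at all: a plain local ID such as "Q184".
        return None, serialized
    # Assumption: empty prefixes, empty IDs and further colons are invalid.
    if not prefix or not local_part or ":" in local_part:
        raise ValueError(f"malformed entity ID: {serialized!r}")
    return prefix, local_part


def expand_option_1(value):
    """Add the optional "repo" key when the ID carries a prefix."""
    repo, _ = split_prefixed_id(value["id"])
    expanded = dict(value)
    if repo is not None:
        expanded["repo"] = repo
    return expanded
```

A local `{ "id":"Q184" }` passes through unchanged, while `{ "id":"foo:Q184" }` gains `"repo":"foo"`.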
Arguments for (2):
- The "id" field never has a prefix, the "qname" field always has a prefix.
- Clear distinction between "IDs" and "references".
Arguments against (2):
- Clients that do not know about external entities may read the "id" field as
referring to a local entity, even if it does not.
- IDs with the local prefix need to be accepted as input everywhere,
including as parts of the URL.
- What's the canonical form of the ID: with or without prefix? Should the
canonical URI also contain the prefix?
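
For comparison, a rough Python sketch of how the "qname" of option (2) could be derived. The function name is hypothetical, and treating a missing "repo" key as "local" is an assumption of this example:

```python
# Hypothetical helper for illustration; not part of Wikibase.
def augment_option_2(value, local_prefix="mywiki"):
    """Add "qname" to an option (2) value object."""
    # Assumption: a value without a "repo" key refers to a local entity,
    # so the configurable local prefix is used instead.
    prefix = value.get("repo", local_prefix)
    augmented = dict(value)
    augmented["qname"] = f"{prefix}:{value['id']}"
    return augmented
```

Passing `local_prefix=""` yields the `:Q184` form for local entities mentioned above.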
TASK DETAIL
https://phabricator.wikimedia.org/T56085