[Wikidata-bugs] [Maniphest] [Created] T194884: Get a random set of entities from Wikidata

abian Thu, 17 May 2018 06:45:36 -0700

abian created this task.
abian added projects: Wikidata, Wikidata-Query-Service.
Herald added a subscriber: Aklapper.
Herald added a project: Discovery.

TASK DESCRIPTION

Wikidata contains many entities, and processing all of them is too expensive for certain purposes. However, for statistical purposes (for example, to estimate a proportion of completeness, consistency, etc.), it's not necessary to retrieve and process them all; a small subset can be enough if it is representative (random).

Currently, it's hard to retrieve a random data set from Wikidata because:

- the Wikidata Query Service doesn't retrieve entities randomly;
- Special:Random requires two requests for every retrieved entity (first, an HTTP GET to Special:Random; then, an HTTP GET to the suggested item), doesn't support filters, and offers no significant advantage over generating random integers directly and sending HTTP requests to the corresponding URIs.
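
For illustration, the "random integers" workaround mentioned above could look roughly like the sketch below. It is not an existing tool: the function name random_item_urls and the max_id parameter are hypothetical, and the caller has to supply an upper bound for item IDs. It only covers items (Q-IDs), and some IDs in the range belong to deleted or redirected items, so a real client would need to skip failed requests and draw replacements.

```python
import random

# Special:EntityData serves the JSON representation of a single entity.
ENTITY_DATA_URL = "https://www.wikidata.org/wiki/Special:EntityData/Q{}.json"


def random_item_urls(count, max_id, seed=None):
    """Return `count` distinct entity-data URLs for random item IDs in [1, max_id].

    `max_id` is an assumption supplied by the caller: an upper bound on
    the numeric part of existing Q-IDs. Not every ID in the range exists.
    """
    rng = random.Random(seed)
    # Sample without replacement so no item is requested twice.
    ids = rng.sample(range(1, max_id + 1), count)
    return [ENTITY_DATA_URL.format(n) for n in ids]


if __name__ == "__main__":
    # The bound of 1000 is an arbitrary value for demonstration only.
    for url in random_item_urls(5, 1000):
        print(url)
```

Each URL still costs one HTTP request, and filters are not possible, which is why this approach remains a workaround rather than a solution.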

It would be useful to have either:

- the possibility of randomly retrieving data through the Wikidata Query Service (best option), or
- a new tool to download an arbitrary number of random entities from Wikidata as a single file on demand.

TASK DETAIL

https://phabricator.wikimedia.org/T194884

To: abian
Cc: abian, Aklapper, Lahi, Gq86, Darkminds3113, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Avner, Gehel, Jonas, FloNight, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331

