Yurik created this task. Yurik added projects: Wikidata-Query-Service, Analytics. Herald added a subscriber: Aklapper. Herald added projects: Wikidata, Discovery.
At this point, the only way to rank Wikidata results is to order them by sitelink count. This is a fairly good indicator of how many different languages/cultures are interested in a topic, but it is not very accurate, especially when a topic is mostly related to a single language.
I propose we introduce a new type of entry to WDQS:
    # Naming is TBD
    <https://en.wikipedia.org/wiki/Albert_Einstein> prefix:total_page_views [integer] .
    <https://en.wikipedia.org/wiki/Albert_Einstein> prefix:last_24h_page_views [integer] .
A script would download the hourly files from the dumps and increment the counters once an hour. The updates should happen in bulk. Each file is about 5 million entries (<40MB gz).
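A rough sketch of the counting step, in Python since that is what I am prototyping in. It assumes each line of an hourly file is whitespace-separated as `domain_code page_title view_count ...`; that layout and the names `parse_pageviews`, `read_hourly_file` and `batches` are illustrative assumptions, not a settled design. The actual bulk upload to WDQS is left out on purpose.

```python
import gzip
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

Key = Tuple[str, str]  # (domain_code, page_title); assumed key shape


def parse_pageviews(lines: Iterable[str]) -> Counter:
    """Sum view counts per (domain_code, page_title) for one hourly file."""
    counts: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines
        try:
            views = int(parts[2])
        except ValueError:
            continue  # skip lines whose count is not an integer
        counts[(parts[0], parts[1])] += views
    return counts


def read_hourly_file(path: str) -> Counter:
    """Read one gzipped hourly file (hypothetical local path)."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return parse_pageviews(fh)


def batches(items: Iterable, size: int) -> Iterator[List]:
    """Yield lists of at most `size` items, so updates can be sent in bulk."""
    batch: List = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Each batch from `batches(counts.items(), n)` could then become one bulk update that increments the totals, with `n` tuned to whatever the endpoint handles comfortably.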
Additionally, we may want to keep the total for the last 24 hours - a bit trickier, but also very doable - e.g. by keeping the totals for the last 24 files in memory, and uploading the deltas.
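The rolling 24-hour total could be sketched roughly like this: keep the last 24 hourly counters in a queue, and for each new hour upload only the net change (new hour minus the hour that falls out of the window). The class name `RollingPageViews` and its interface are hypothetical, and how the deltas get applied in WDQS is not shown.

```python
from collections import Counter, deque
from typing import Deque, Dict, Hashable


class RollingPageViews:
    """Keeps the last `window_hours` hourly counters in memory."""

    def __init__(self, window_hours: int = 24) -> None:
        self.window_hours = window_hours
        self.window: Deque[Counter] = deque()

    def push(self, hourly: Counter) -> Dict[Hashable, int]:
        """Add one hour of counts; return the per-page change to the rolling total."""
        delta = Counter(hourly)  # copy, so the stored counter is not modified
        self.window.append(hourly)
        if len(self.window) > self.window_hours:
            expired = self.window.popleft()
            delta.subtract(expired)  # keeps negative values, unlike the - operator
        # Pages whose total did not change need no update.
        return {page: change for page, change in delta.items() if change != 0}
```

For example, with a 2-hour window, pushing `{"A": 5}`, then `{"A": 3, "B": 1}`, then `{"B": 4}` returns `{"A": -5, "B": 4}` on the third call, since the first hour has expired.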
P.S. I am hacking on it at the moment (in Python). I need naming suggestions for the predicates.
