Darcyisverycute claimed this task. Darcyisverycute added a comment.
{F35452779} {F35452776}

Sorry I didn't have time to write here yesterday; I worked on this as part of the hackathon. I gave a presentation (slides and data as an xlsx export are attached; the export doesn't render well, so I also published the spreadsheet anonymously online here <https://docs.google.com/spreadsheets/d/e/2PACX-1vTJKs0nFHxnvoBb6ztNZOQCntBk8KruWK5pZIf_8cxNedXexZ8Op12AOOCCQcEzSaWaZo0F4xj6u4HJ/pubhtml#>).

There is no fast way to test whether a given article about a Wikidata item is in mainspace, so to work around that I instead rely on inclusion in a large encyclopedia ID system (I chose Encyclopedia Britannica; details are in the slides). This is fast enough to run a comparison between two languages across the ~170k items in that ID system within the 1-minute query timeout on https://query.wikidata.org/.

To fill out the rest of the matrix, I just need to work out a way to either programmatically combine the queries into a table and run them against a database dump, or run queries of the form in my presentation sequentially (possibly also against a database dump). The full matrix is ~170 language wikis across 250+ languages, so about 28,900 (170 × 170) queries to run in total if we wanted the full table.

@Lydia_Pintscher do you have any advice on scaling up this approach?

(NB: my spreadsheet is the same as in the idea description, but transposed.)

TASK DETAIL
https://phabricator.wikimedia.org/T283466

To: Darcyisverycute
Cc: Darcyisverycute, amy_rc, WMDE-leszek, GoranSMilovanovic, EpicPupper, Manuel, Aklapper, Lydia_Pintscher, Astuthiodit_1, Alan_Ang-WMDE, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dinadineke, DannyS712, Nandana, tabish.shaikh91, Lahi, Gq86, Jayprakash12345, JakeTheDeveloper, QZanden, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Omar_sansi, Wikidata-bugs, aude, TheDJ, Mbch331
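
As a rough illustration of the "run the queries sequentially" option from the comment above, here is a minimal Python sketch that only generates the per-pair SPARQL text and enumerates the language pairs; it does not contact the query service. Everything in it is an assumption rather than the query from the slides: the function names `build_query` and `language_pairs` are hypothetical, `P1417` is assumed to be the Encyclopædia Britannica Online ID property, the sitelink pattern (`schema:about` / `schema:isPartOf`) is the usual Wikidata Query Service idiom, and the query counts items that have an article in one language but not in the other, which may differ from the comparison actually presented.

```python
from itertools import product
from typing import Iterable, Iterator, Tuple

# Assumption: P1417 is the Encyclopaedia Britannica Online ID property.
EB_PROPERTY = "P1417"


def build_query(lang_a: str, lang_b: str, id_property: str = EB_PROPERTY) -> str:
    """Return SPARQL counting items that carry the external ID and have an
    article on the lang_a Wikipedia but none on the lang_b Wikipedia.

    Assumption: the wiki lives at https://<code>.wikipedia.org/, which does
    not hold for every language code.
    """
    return f"""
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {{
  ?item wdt:{id_property} ?externalId .
  ?articleA schema:about ?item ;
            schema:isPartOf <https://{lang_a}.wikipedia.org/> .
  FILTER NOT EXISTS {{
    ?articleB schema:about ?item ;
              schema:isPartOf <https://{lang_b}.wikipedia.org/> .
  }}
}}
""".strip()


def language_pairs(langs: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield every ordered (row, column) pair, diagonal included."""
    return product(langs, repeat=2)


if __name__ == "__main__":
    # Print the query for a single cell of the matrix.
    print(build_query("en", "de"))
```

With 170 language codes, `language_pairs` yields 170 × 170 = 28,900 pairs, matching the estimate above; dropping the diagonal (a language compared with itself) would leave 170 × 169 = 28,730.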