Darcyisverycute claimed this task.
Darcyisverycute added a comment.

  {F35452779} {F35452776}
  Sorry I didn't have time to write here yesterday; I worked on this as part of 
the hackathon and gave a presentation (slides and data as an xlsx export are 
attached; the export doesn't render well, so I also published it anonymously 
online here 
<https://docs.google.com/spreadsheets/d/e/2PACX-1vTJKs0nFHxnvoBb6ztNZOQCntBk8KruWK5pZIf_8cxNedXexZ8Op12AOOCCQcEzSaWaZo0F4xj6u4HJ/pubhtml#>
). There is no fast way to test whether a given article about a Wikidata item 
is in mainspace, so to work around that I instead rely on inclusion in a large 
encyclopedia ID system (I chose Encyclopedia Britannica; details are in the 
slides). That makes it fast enough to run a comparison between two languages 
over the ~170k items in that ID system within the 1-minute query timeout 
window on https://query.wikidata.org/
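
  For readers without the slides, here is a rough sketch of what one such 
pairwise query could look like, generated from Python. This is not the query 
from the presentation: the exact comparison (items with an article in one 
language but none in the other), the helper names, and the use of P1417 as the 
Encyclopaedia Britannica Online ID property are all assumptions made for 
illustration.

```python
from textwrap import dedent

# Assumption: P1417 is the Wikidata property for the Encyclopaedia Britannica Online ID.
EB_ID_PROPERTY = "P1417"


def wiki_url(lang: str) -> str:
    """Return the base URL that schema:isPartOf uses for a language Wikipedia."""
    return f"https://{lang}.wikipedia.org/"


def build_gap_query(lang_a: str, lang_b: str) -> str:
    """Build SPARQL counting EB-listed items with an article in lang_a but none in lang_b.

    Hypothetical helper: the comparison used in the presentation may differ.
    """
    return dedent(f"""\
        SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {{
          ?item wdt:{EB_ID_PROPERTY} ?ebId .
          ?articleA schema:about ?item ;
                    schema:isPartOf <{wiki_url(lang_a)}> .
          FILTER NOT EXISTS {{
            ?articleB schema:about ?item ;
                      schema:isPartOf <{wiki_url(lang_b)}> .
          }}
        }}
        """)


if __name__ == "__main__":
    # Print the query for one language pair; paste it into the query service to try it.
    print(build_gap_query("en", "de"))
```

  Restricting to items that carry the encyclopedia ID is what keeps the result 
set small; the sitelink patterns then only need to be checked for those items.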
  
  So to fill out the rest of the matrix, I just need to work out a way to 
programmatically combine the queries into a table and run them against a 
database dump, or to run queries of the form shown in my presentation 
sequentially (possibly also against a database dump). The full matrix is ~170 
language wikis across 250+ languages, so about 170 × 170 = 28,900 queries in 
total if we wanted the full table. 
@Lydia_Pintscher do you have any advice on scaling up this approach?
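
  To make the size of the full table concrete, here is a small sketch that 
enumerates the ordered language pairs and estimates the worst-case sequential 
run time. The language codes are placeholders, the function names are 
hypothetical, and the 60-second figure simply assumes every query runs up to 
the timeout, which is an upper bound rather than a measurement.

```python
from itertools import product


def language_pairs(langs, include_diagonal=True):
    """Return every ordered (lang_a, lang_b) cell of the comparison matrix."""
    return [
        (a, b)
        for a, b in product(langs, repeat=2)
        if include_diagonal or a != b
    ]


def worst_case_days(n_queries, seconds_per_query=60):
    """Upper bound on sequential run time, assuming each query hits the timeout."""
    return n_queries * seconds_per_query / 86400  # 86400 seconds per day


# Placeholder codes standing in for the ~170 language wikis.
langs = [f"lang{i:03d}" for i in range(170)]

pairs = language_pairs(langs)
print(len(pairs))                                          # 28900
print(len(language_pairs(langs, include_diagonal=False)))  # 28730
print(round(worst_case_days(len(pairs)), 1))               # 20.1
```

  Skipping the diagonal (a language compared with itself) only saves 170 
queries, so the total stays close to 28,900 either way.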
  
  (NB: my spreadsheet is the same as the one in the idea description, but 
transposed.)

TASK DETAIL
  https://phabricator.wikimedia.org/T283466
