GoranSMilovanovic added a comment.
@Lydia_Pintscher @RazShuty @Halfak
Ok, here's what I've got:
   item     revision   timestamp              usage
1  Q36524   924799644  2019-04-26 06:29:25.0  6791020
2  Q54919   929383859  2019-04-30 21:14:14.0  4376000
3  Q423048  919180363  2019-04-19 18:57:17.0  4252235
4  Q36578   866859320  2019-02-25 08:49:00.0  3692702
5  Q193563  919018095  2019-04-19 15:07:51.0  3389081
6  Q131454  928584935  2019-04-30 04:31:54.0  3353011
The table contains: items (`item`), their latest revision IDs (`revision`),
the timestamp of the latest revision (`timestamp`), and their WDCM usage
statistic (`usage`; measures the item's usage across the WMF projects).
The table lives in PySpark. What I **need** now are the latest ORES scores,
keyed by revision ID, for all WD items, so that I can join them to this table.
Making millions of ORES API calls is obviously not feasible.
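To put a rough number on that, here is a minimal sketch of how the calls would add up even with batching. The endpoint layout, the `itemquality` model name, the `|`-separated `revids` parameter and the batch size of 50 are my assumptions and should be checked against the ORES documentation; `batch_urls` and `requests_needed` are hypothetical helpers, and the 50 million figure in the usage note is only an illustrative order of magnitude. Nothing here touches the network.

```python
from math import ceil
from urllib.parse import urlencode

# Assumed endpoint layout; verify against the ORES documentation.
ORES_URL = "https://ores.wikimedia.org/v3/scores/wikidatawiki/"


def batch_urls(revision_ids, batch_size=50, model="itemquality"):
    """Yield one request URL per batch of revision IDs (no request is sent)."""
    for start in range(0, len(revision_ids), batch_size):
        batch = revision_ids[start:start + batch_size]
        query = urlencode({
            "models": model,                        # assumed model name
            "revids": "|".join(map(str, batch)),    # assumed separator
        })
        yield f"{ORES_URL}?{query}"


def requests_needed(n_items, batch_size=50):
    """Number of API calls needed to score n_items latest revisions."""
    return ceil(n_items / batch_size)
```

For example, `requests_needed(50_000_000)` comes out at 1,000,000 calls, which is why a pre-computed dataset would be much preferable.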
@Halfak Any ideas? If you have such a dataset, could you please let me know
where it lives? Many thanks.
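For clarity, this is the kind of join I have in mind, sketched in plain Python so that it runs anywhere. The layout of the score records (`revision`, `score`) is an assumption about a dataset I have not seen, the `score` value is a placeholder rather than real ORES output, and `left_join_on_revision` is a hypothetical helper. In PySpark the equivalent would be along the lines of `items.join(scores, on="revision", how="left")`.

```python
# Two rows from the table above (timestamp omitted for brevity).
items = [
    {"item": "Q36524", "revision": 924799644, "usage": 6791020},
    {"item": "Q54919", "revision": 929383859, "usage": 4376000},
]

# Hypothetical score records: the value is a placeholder, NOT real ORES output.
scores = [
    {"revision": 924799644, "score": "PLACEHOLDER"},
]


def left_join_on_revision(items, scores):
    """Attach a score to each item by revision ID; None where no score exists."""
    by_revision = {s["revision"]: s["score"] for s in scores}
    return [{**row, "score": by_revision.get(row["revision"])} for row in items]


joined = left_join_on_revision(items, scores)
```

A left join keeps every item even when no score is available for its latest revision, so missing scores show up as `None` instead of silently dropping rows.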
TASK DETAIL
https://phabricator.wikimedia.org/T195702