GoranSMilovanovic added a comment.
@Manuel

> Maybe let's quickly talk about this in our 1:1?

Of course.

> What would you cluster by?

Well, I guess in the beginning it would only be a matrix of (1) Wikidata classes x (2) the counts of ORES A, B, C, D, E scored items per class. That would be the most straightforward exploration of the distribution of ORES quality scores across the classes, and it would help us pile at least some of those half million classes together into (hopefully) meaningful groups : )

> What additional information could we join in? (I was thinking about some user and/or edit data like last edited, number of unique users, number of edits, etc. that could give meaningful clusters.)

All that you are saying makes sense, except that I would not try to solve a more complicated problem (ORES scores + additional information on Wikidata classes --> clusters) before the already very complicated problem (ORES scores --> clusters) is solved. As I hope to be able to explain in our 1:1 today, clustering `472,035` Wikidata classes across five simple integer observations (A, B, C, D, E) already presents a challenge. So my suggestion would be to start small.

TASK DETAIL
  https://phabricator.wikimedia.org/T285458

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GoranSMilovanovic
Cc: Ladsgroup, Lydia_Pintscher, Tobi_WMDE_SW, Manuel, GoranSMilovanovic, Aklapper, Invadibot, maantietaja, Akuckartz, Nandana, Lahi, Gq86, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
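
For illustration only, the classes x ORES-grade count matrix described in the comment could be assembled roughly as follows. This is a minimal sketch, not the code used for the task: the input shape (one `(class_id, grade)` pair per scored item), the function name `grade_count_matrix`, and the sample class IDs are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

# ORES quality grades named in the comment, in a fixed column order.
GRADES = ("A", "B", "C", "D", "E")


def grade_count_matrix(scored_items):
    """Return {class_id: [count_A, count_B, count_C, count_D, count_E]}.

    `scored_items` is assumed to be an iterable of (class_id, grade) pairs,
    one pair per scored item (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for class_id, grade in scored_items:
        if grade not in GRADES:
            raise ValueError(f"unexpected grade: {grade!r}")
        counts[class_id][grade] += 1
    # A Counter returns 0 for grades that never occurred in a class.
    return {cls: [c[g] for g in GRADES] for cls, c in counts.items()}


# Hypothetical toy input; real data would cover all scored items.
sample = [
    ("Q5", "A"), ("Q5", "C"), ("Q5", "C"), ("Q5", "E"),
    ("Q16521", "D"), ("Q16521", "E"),
]
matrix = grade_count_matrix(sample)
```

Each row then holds the five integer observations per class, which is the input that any later clustering step would work from.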
