GoranSMilovanovic added a comment.
- Batch processing over sparse matrices (`dgCMatrix` class) is now used
to compute:
  - the co-occurrence data set: **success**, using roughly an order of
  magnitude fewer resources than the previous procedure, and
  - the Jaccard similarity and distance matrices: **testing**; the
  procedure is memory-efficient but slow (subsetting the `dgCMatrix`
  class matrix...).
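
For illustration only, the batch idea for the Jaccard step can be sketched as follows. This is not the code used in the task: it is a minimal Python sketch that assumes a binary matrix stored in compressed sparse column form (the layout `dgCMatrix` uses, with a column pointer array and a row index array). The function names `csc_columns` and `jaccard_batches`, the `batch_size` parameter, and the toy data are all hypothetical.

```python
def csc_columns(indptr, indices):
    """Return, for each column, the set of row indices holding a non-zero.

    indptr and indices follow the compressed sparse column layout:
    column j owns indices[indptr[j]:indptr[j + 1]].
    """
    return [set(indices[indptr[j]:indptr[j + 1]])
            for j in range(len(indptr) - 1)]


def jaccard_batches(indptr, indices, batch_size=2):
    """Yield (start, block) pairs, one block of similarity rows per batch.

    block[k][j] is the Jaccard similarity between column start + k and
    column j. Only one block is held in memory at a time, so the full
    dense similarity matrix is never materialised by this function.
    """
    cols = csc_columns(indptr, indices)
    for start in range(0, len(cols), batch_size):
        block = []
        for a in cols[start:start + batch_size]:
            row = []
            for b in cols:
                union = len(a | b)
                # Two empty columns have no defined similarity; use 0.0 here.
                row.append(len(a & b) / union if union else 0.0)
            block.append(row)
        yield start, block


# Toy 4 x 3 binary matrix (assumed data): column 0 has rows {0, 1},
# column 1 has rows {1, 2}, column 2 has rows {0, 1, 3}.
indptr = [0, 2, 4, 7]
indices = [0, 1, 1, 2, 0, 1, 3]

for start, block in jaccard_batches(indptr, indices, batch_size=2):
    # The Jaccard distance is one minus the similarity.
    distances = [[1.0 - s for s in row] for row in block]
    print(start, block, distances)
```

Each batch could be written to disk before the next one is computed, which keeps memory use bounded by the batch size. The inner loop still compares every column with every other column, so a sketch like this trades speed for memory.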
TASK DETAIL
https://phabricator.wikimedia.org/T223118