dcausse added a comment.
In T342123#9081490 <https://phabricator.wikimedia.org/T342123#9081490>, @AndrewTavis_WMDE wrote:

> Minor question on this, @dcausse: why aren't we caching `df_wikidata_rdf` and `sa_and_sasc_ids` above? My assumption is that we should, given that we're using them in multiple later calculations, but when I tried to cache them, a calculation that would normally finish lost resources and stalled with three separate stages running. Did you explicitly choose not to cache them, and if so, why not? :)

I don't remember having such problems, nor thinking too much about what to cache. Generally speaking, caching comes with an extra cost, and it's not always obvious that you'll get a net benefit. Here, though, I tend to agree that `sa_and_sasc_ids` sounds like a good candidate for caching (single column, relatively few rows), and I'm not sure I understand why it would fail... Have you tried multiple times? It might be unrelated to caching. If your notebook has had its kernel open for a long time (several days) and the Spark session stayed open during that time, I would not be surprised if Hadoop had tried to clean up some things in the meantime, making Spark unhappy... just making random guesses here.

If it still does not work after retrying on a fresh Spark session (by killing your kernel), please feel free to upload your code somewhere and I'll give it a try.

TASK DETAIL
https://phabricator.wikimedia.org/T342123