dcausse added a comment.

  In T342123#9081490 <https://phabricator.wikimedia.org/T342123#9081490>, 
@AndrewTavis_WMDE wrote:
  
  > Minor question on this, @dcausse: why aren't we caching `df_wikidata_rdf` 
and `sa_and_sasc_ids` above? My assumption is that we should given that we're 
using them in multiple later calculations, but then I just tried to cache them 
and then a calculation that normally would finish then lost resources and 
stalled with three separate stages running. Did you explicitly choose not to 
cache them, and if so why not? :)
  
  I don't remember running into such problems, nor giving much thought to what
to cache. Generally speaking, caching comes with an extra cost and it's not
always obvious that you'll get a net benefit. Here, though, I tend to agree
that `sa_and_sasc_ids` sounds like a good candidate for caching (single column,
relatively few rows), and I'm not sure why it would fail. Have you tried
multiple times? It might be unrelated to caching. If your notebook's kernel has
been open for a long time (several days) with the Spark session still open, I
would not be surprised if Hadoop had cleaned up some things in the meantime,
making Spark unhappy. I'm only guessing here. If it still fails after retrying
on a fresh Spark session (by killing your kernel), please feel free to upload
your code somewhere and I'll give it a try.
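
  For reference, a minimal sketch of one way to scope caching so that it is
released as soon as the dependent calculations finish. It is not the notebook's
actual code: the `cached` helper and its `materialize` flag are hypothetical
names introduced here. It only assumes the object passed in has the PySpark
DataFrame methods `persist()`, `count()` and `unpersist()`.

```python
from contextlib import contextmanager


@contextmanager
def cached(df, materialize=True):
    """Persist `df` for the duration of the block, then release it.

    `cached` is a hypothetical helper, not part of PySpark. `df` is assumed
    to behave like a pyspark.sql.DataFrame.
    """
    df.persist()  # lazy: only marks the DataFrame for caching
    try:
        if materialize:
            df.count()  # run one action so the cache is filled up front
        yield df
    finally:
        df.unpersist()  # free executor memory/disk for later stages


# Hypothetical usage with the names from this thread (the "id" join column
# is an assumption, not taken from the notebook):
#
# with cached(sa_and_sasc_ids) as ids:
#     n_ids = ids.count()
#     joined = df_wikidata_rdf.join(ids, "id")
```

  Releasing the cache explicitly keeps a small cached DataFrame from competing
for executor storage with the larger stages that follow, which is one of the
costs mentioned above.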

TASK DETAIL
  https://phabricator.wikimedia.org/T342123