brion added a comment.

  If I understand correctly, the case for passing more metadata to the blob 
store is to give it a hint for cross-blob data compression.
  
  For this I think we mainly want to pass the identifier of a related blob: the 
blob with the data from the same slot in the previous revision. If the related 
blob is in the same store, then the blob store can potentially optimize its 
actual backing storage (with diff-based storage, or by gzipping adjacent blob 
contents together with a window size larger than the blobs, etc).
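
  As a rough illustration of the related-blob hint (a sketch only, not how any 
existing blob store works): Python's zlib accepts a preset dictionary through 
the zdict argument, so the previous blob can prime the compressor for the next 
one. The function names and the sample data are made up for this example, and 
it assumes blobs smaller than zlib's 32 KiB window.

```python
import hashlib
import zlib
from typing import Optional


def compress_with_hint(data: bytes, related: Optional[bytes] = None) -> bytes:
    """Compress data, optionally priming zlib with a related blob's contents."""
    if related:
        compressor = zlib.compressobj(zdict=related)
    else:
        compressor = zlib.compressobj()
    return compressor.compress(data) + compressor.flush()


def decompress_with_hint(payload: bytes, related: Optional[bytes] = None) -> bytes:
    """Inverse of compress_with_hint; needs the same related blob, if one was used."""
    if related:
        decompressor = zlib.decompressobj(zdict=related)
    else:
        decompressor = zlib.decompressobj()
    return decompressor.decompress(payload) + decompressor.flush()


# Made-up sample data: a "previous revision" of poorly compressible text,
# and a "current revision" that appends a small edit to it.
prev = b" ".join(
    hashlib.sha256(str(i).encode()).hexdigest().encode() for i in range(100)
)
curr = prev + b" one more edit"

standalone = compress_with_hint(curr)
hinted = compress_with_hint(curr, related=prev)
print(len(curr), len(standalone), len(hinted))
```

  The hinted payload comes out much smaller than the standalone one because 
nearly all of curr is found in the dictionary. The cost is that reading the 
blob back requires the related blob's contents to still be available.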
  
  It might also be useful to specify a type for 'hey this is precompressed 
binary data, don't bother trying to recompress it or diff it'.
  
  But I would strongly recommend against being too clever. Revision metadata 
may change (yes, change -- revdel etc) and blobs are explicitly reused across 
multiple revisions. Revision histories can be rewritten (yes, rewritten -- 
import/export and delete/undelete can change ordering & adjacency, etc). And 
definitely don't include things like titles that are completely arbitrary and 
may change at any time.
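
  To make the "not too clever" version concrete, here is a minimal sketch of 
what such a hint might look like. Everything in it (BlobHints, 
InMemoryBlobStore, the "mem:" addresses) is hypothetical and not an existing 
MediaWiki interface. The hint carries only a related blob address and a 
precompressed flag, with no revision metadata or titles, and the store treats 
it as advisory at write time only.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class BlobHints:
    """Hypothetical hint object: storage-level facts only, no revision metadata."""
    related_blob: Optional[str] = None  # address of the same slot's previous blob
    precompressed: bool = False         # e.g. JPEG or gzip payloads


class InMemoryBlobStore:
    """Toy store; hints are advisory and consulted only when writing."""

    def __init__(self) -> None:
        # address -> (encoding, base address or None, stored payload)
        self._records: Dict[str, Tuple[str, Optional[str], bytes]] = {}

    def put(self, data: bytes, hints: BlobHints = BlobHints()) -> str:
        address = f"mem:{len(self._records)}"
        base = hints.related_blob
        # An unknown or missing related blob is simply ignored.
        base_data = self.get(base) if base in self._records else b""
        if hints.precompressed:
            record = ("raw", None, data)
        elif base_data:
            compressor = zlib.compressobj(zdict=base_data)
            record = ("zdict", base, compressor.compress(data) + compressor.flush())
        else:
            record = ("zlib", None, zlib.compress(data))
        self._records[address] = record
        return address

    def get(self, address: str) -> bytes:
        # Reads rely only on what the store recorded itself, never on hints.
        encoding, base, payload = self._records[address]
        if encoding == "raw":
            return payload
        if encoding == "zlib":
            return zlib.decompress(payload)
        decompressor = zlib.decompressobj(zdict=self.get(base))
        return decompressor.decompress(payload) + decompressor.flush()
```

  Because the store records the base address it actually used, later changes 
to revision ordering or adjacency do not affect reads. A real store would also 
have to bound the length of such dependency chains, which this toy does not.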

TASK DETAIL
  https://phabricator.wikimedia.org/T107595

To: daniel, brion
Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, 
intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, 
awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, 
Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, 
Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808
