brion added a comment.
If I understand, the case for passing more metadata to the blob store is as a hint for cross-blob data compression. For this I think we mainly want to pass the identifier of a related blob: the blob with the data from the same slot in the previous revision. If the related blob is in the same store, then the blob store can potentially optimize its actual backing storage (with diff-based storage, or by gzipping adjacent blob contents together with a window size larger than the blobs, etc).

It might also be useful to specify a type for 'hey this is precompressed binary data, don't bother trying to recompress it or diff it'.

But I would strongly recommend against being too clever. Revision metadata may change (yes, change -- revdel etc) and blobs are explicitly reused across multiple revisions. Revision histories can be rewritten (yes, rewritten -- import/export and delete/undelete can change ordering & adjacency, etc). And definitely don't include things like titles that are completely arbitrary and may change at any time.

TASK DETAIL
https://phabricator.wikimedia.org/T107595
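
A rough sketch of the hinting idea in the comment above, under stated assumptions: `HintedBlobStore`, `put`, `get`, `stored_size`, and the `related_blob` / `precompressed` parameters are all hypothetical names, not an existing MediaWiki interface. It uses zlib's preset-dictionary feature as a simple stand-in for diff-based storage. The store records for itself which base blob it compressed against, so reading a blob never depends on revision metadata that might later change.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class _Stored:
    payload: bytes           # bytes as kept in backing storage
    base_id: Optional[str]   # blob used as zlib preset dictionary, if any
    raw: bool                # True if the payload was stored untouched


class HintedBlobStore:
    """Hypothetical in-memory blob store that accepts optional hints."""

    def __init__(self) -> None:
        self._blobs: Dict[str, _Stored] = {}
        self._next_id = 0

    def put(self, data: bytes, related_blob: Optional[str] = None,
            precompressed: bool = False) -> str:
        blob_id = f"blob:{self._next_id}"
        self._next_id += 1

        if precompressed:
            # Hint: do not bother recompressing or diffing this data.
            self._blobs[blob_id] = _Stored(data, None, True)
            return blob_id

        # The related blob is only a hint: ignore it if it is not in
        # this store (or is empty) and fall back to plain compression.
        base = self.get(related_blob) if related_blob in self._blobs else b""
        if base:
            comp = zlib.compressobj(zdict=base)
            base_id = related_blob
        else:
            comp = zlib.compressobj()
            base_id = None

        payload = comp.compress(data) + comp.flush()
        self._blobs[blob_id] = _Stored(payload, base_id, False)
        return blob_id

    def get(self, blob_id: str) -> bytes:
        stored = self._blobs[blob_id]
        if stored.raw:
            return stored.payload
        if stored.base_id is not None:
            # Rebuild the base first; chains grow with each hinted write.
            decomp = zlib.decompressobj(zdict=self.get(stored.base_id))
        else:
            decomp = zlib.decompressobj()
        return decomp.decompress(stored.payload) + decomp.flush()

    def stored_size(self, blob_id: str) -> int:
        return len(self._blobs[blob_id].payload)
```

In this sketch the caller passes only a blob identifier and a type flag; nothing about titles, revision ordering, or other mutable metadata reaches the store. A real backend would also need to bound the length of base chains and handle deletion of a base blob, both of which are left out here.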
