Our pageview dumps were in the middle of a refactor when our team changed a lot, so we haven't been able to finish it. We do have a well-compressed version, though; we just haven't properly launched it as a new dataset. I'm working on prioritizing that.
On Sun, Sep 4, 2022 at 02:58 Gergő Tisza <[email protected]> wrote:
> I'd imagine the current format is optimized for being able to output
> hourly dumps (and thus reducing data latency and data processing costs),
> not so much for storage space
_______________________________________________
Wikitech-l mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
