bd808 added a comment.

See the comment at the bottom of https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/ApiAction:

-- NOTE: there are many params we do not want to count distinct values of
-- at all (eg maxlag, smaxage, maxage, requestid, origin, centralauthtoken,
-- titles, pageids). Rather than trying to make an extensive blacklist and
-- potentially allow new parameters to slip through which have high
-- cardinality or sensitive information, the ETL process will use a whitelist
-- approach to count params that have been deemed to be useful.
--
-- The initial whitelist is (query, prop), (query, list), (query, meta),
-- (flow, module), (*, generator). The prop, list and meta parameters will
-- additionally be split on '|' with each component counted separately.

So action = ""> isn't one of the things that my aggregation script computes rollup values for. The script is still running from cron as my user, using local scripts on stat1005. See T137321 (Run ETL for wmf_raw.ActionApi into wmf.action_* aggregate tables), the long-stalled task to make these rollup tables official and properly managed.
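To illustrate the whitelist rule from the note quoted above, here is a minimal Python sketch. It is not the actual aggregation script: the function name counted_values, the dict-of-params input, and the example action and parameter names are assumptions made for illustration. Only the whitelist pairs and the '|' splitting rule come from the note.

```python
# Hypothetical sketch of the whitelist rule; not the real ETL code.

# (action, param) pairs from the quoted note; "*" matches any action.
WHITELIST = {
    ("query", "prop"),
    ("query", "list"),
    ("query", "meta"),
    ("flow", "module"),
    ("*", "generator"),
}

# Per the note, these params are split on '|' and each part counted separately.
SPLIT_PARAMS = {"prop", "list", "meta"}


def counted_values(action, params):
    """Return the (action, param, value) tuples that would be counted."""
    counted = []
    for name, value in params.items():
        allowed = (action, name) in WHITELIST or ("*", name) in WHITELIST
        if not allowed:
            # Everything off the whitelist (maxlag, titles, ...) is ignored.
            continue
        parts = value.split("|") if name in SPLIT_PARAMS else [value]
        counted.extend((action, name, part) for part in parts if part)
    return counted


if __name__ == "__main__":
    # Whitelisted pair, split on '|'; titles and maxlag are dropped.
    print(counted_values("query", {"prop": "info|revisions",
                                   "titles": "Foo", "maxlag": "5"}))
    # An action with no whitelisted params yields nothing to roll up.
    print(counted_values("someaction", {"ids": "X"}))
```

Under this rule, any action whose parameters are not on the whitelist produces no rows at all, which is why no rollup values exist for it.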


To: bd808
Cc: Aklapper, Lucas_Werkmeister_WMDE, bd808, Ladsgroup, GoranSMilovanovic, QZanden, Izno, JAllemandou, Wikidata-bugs, aude, Mbch331, jeremyb