Lucas_Werkmeister_WMDE added a comment.
I’ve started putting together a script to do the purge in
`~lucaswerkmeister-wmde/purge-other-usages/` on Toolforge. (It’s a Git
repository, with some history too.) The script can be run on a Buster bastion
using the venv in that directory. It goes through the “other” usages in
entity_id, page_id order (both ASC). It takes the item and page IDs to resume
from as arguments and prints the last ones after each batch, so after
interrupting the script you can resume it with the last IDs it printed. It
queries the DB in batches of 500 and (so far) exits after the first batch, so
the first run purged 500 pages.
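For readers who want to picture the resume logic, here is a minimal sketch of that kind of batched, resumable query. It is not the actual script: the column names `eu_entity_id` and `eu_page_id`, the helper names `fetch_batch` and `demo`, and the use of an in-memory SQLite table in place of the wiki replica are all assumptions made for illustration.

```python
import sqlite3

BATCH_SIZE = 500  # batch size mentioned in the comment above


def fetch_batch(conn, after_entity, after_page, limit=BATCH_SIZE):
    """Return up to `limit` (entity_id, page_id) pairs after the resume point.

    Hypothetical helper. Entity IDs are compared as strings here, so
    'Q10' sorts before 'Q2'; the ordering only has to be stable.
    """
    return conn.execute(
        """
        SELECT DISTINCT eu_entity_id, eu_page_id
        FROM wbc_entity_usage
        WHERE eu_aspect LIKE 'O%'
          AND (eu_entity_id > ?
               OR (eu_entity_id = ? AND eu_page_id > ?))
        ORDER BY eu_entity_id ASC, eu_page_id ASC
        LIMIT ?
        """,
        (after_entity, after_entity, after_page, limit),
    ).fetchall()


def demo():
    """Walk a tiny made-up table in batches of 2, printing resume points."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wbc_entity_usage "
        "(eu_entity_id TEXT, eu_aspect TEXT, eu_page_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO wbc_entity_usage VALUES (?, ?, ?)",
        [
            ("Q1", "O", 10),
            ("Q1", "O", 20),
            ("Q1", "C.P31", 30),  # statement usage, skipped by the filter
            ("Q2", "O", 5),
            ("Q3", "S", 7),       # sitelink usage, skipped by the filter
        ],
    )
    resume = ("", 0)  # start before every entity ID
    seen = []
    while True:
        batch = fetch_batch(conn, resume[0], resume[1], limit=2)
        if not batch:
            break
        seen.extend(batch)  # the real script would purge these pages here
        resume = batch[-1]
        print(f"Processed up to {resume[0]}, {resume[1]}.")
    return seen


if __name__ == "__main__":
    demo()
```

Because each query starts strictly after the last (entity ID, page ID) pair that was printed, an interrupted run can pick up from that pair without rescanning earlier rows.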
> Processed up to Q10261989, 2244976.
Before:
MariaDB [warwiki]> SELECT NOW() AS asof, COUNT(*) AS total,
    SUM(eu_aspect LIKE 'C%') AS statements,
    SUM(eu_aspect LIKE 'O%') AS others
    FROM wbc_entity_usage;
+---------------------+---------+------------+---------+
| asof                | total   | statements | others  |
+---------------------+---------+------------+---------+
| 2022-02-28 15:26:52 | 3686466 |    1138696 | 1119080 |
+---------------------+---------+------------+---------+
1 row in set (1.242 sec)
After:
MariaDB [warwiki]> SELECT NOW() AS asof, COUNT(*) AS total,
    SUM(eu_aspect LIKE 'C%') AS statements,
    SUM(eu_aspect LIKE 'O%') AS others
    FROM wbc_entity_usage;
+---------------------+---------+------------+---------+
| asof                | total   | statements | others  |
+---------------------+---------+------------+---------+
| 2022-02-28 15:51:05 | 3686016 |    1138751 | 1118575 |
+---------------------+---------+------------+---------+
1 row in set (1.257 sec)
So this batch decreased the number of other usages by 505 and increased the
number of statement usages by 55. It also took over twenty minutes – purging is
rate limited (for good reason!), so the script only processes 30 pages per 75
seconds. (The actual rate limit is 30 pages per 60 seconds; I’m letting the
script sleep a bit longer to leave some “breathing space”.) At this rate, and
assuming roughly one other usage goes away per purged page, getting rid of the
remaining 1119163 other usages would take a bit over **32 days**, just over a
month.
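The arithmetic behind that estimate, plus a rough sketch of the throttling, could look like the following. This is only an illustration under stated assumptions: `estimated_days` and `purge_throttled` are hypothetical names, the `purge` callable stands in for whatever actually purges a page, and sleeping for the remainder of each 75-second window is a guess at how the real script paces itself.

```python
import time

PAGES_PER_CHUNK = 30    # pages purged per window (from the comment above)
SECONDS_PER_CHUNK = 75  # window length, 15 s longer than the 60 s limit


def estimated_days(remaining, pages_per_chunk=PAGES_PER_CHUNK,
                   seconds_per_chunk=SECONDS_PER_CHUNK):
    """Days needed for `remaining` items, assuming one item per purged page."""
    seconds = remaining / pages_per_chunk * seconds_per_chunk
    return seconds / 86400  # seconds per day


def purge_throttled(pages, purge, sleep=time.sleep,
                    pages_per_chunk=PAGES_PER_CHUNK,
                    seconds_per_chunk=SECONDS_PER_CHUNK):
    """Call `purge` on each page, at most `pages_per_chunk` per window.

    `purge` and `sleep` are injected so the pacing can be tested without
    touching a wiki or actually waiting.
    """
    for start in range(0, len(pages), pages_per_chunk):
        window_started = time.monotonic()
        for page in pages[start:start + pages_per_chunk]:
            purge(page)
        if start + pages_per_chunk < len(pages):  # no wait after last chunk
            elapsed = time.monotonic() - window_started
            sleep(max(0.0, seconds_per_chunk - elapsed))


if __name__ == "__main__":
    # 500 pages at this pace is 1250 seconds, matching "over twenty minutes".
    print(f"one batch of 500: {500 / PAGES_PER_CHUNK * SECONDS_PER_CHUNK:.0f} s")
    print(f"remaining usages: {estimated_days(1119163):.1f} days")
```

Passing `sleep` as a parameter keeps the pacing logic separate from the clock, which makes it easy to check the number of pauses with a fake `sleep` function.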
TASK DETAIL
https://phabricator.wikimedia.org/T296383