Lucas_Werkmeister_WMDE added a comment.

  I’ve started putting together a script to do the purge in 
`~lucaswerkmeister-wmde/purge-other-usages/` on Toolforge. (It’s a Git 
repository, with some history too.) The script can be run on a Buster bastion 
using the venv in that directory. It goes through the “other” usages in 
entity_id, page_id order (both ASC). It takes the item and page IDs to resume 
from as arguments and prints the last ones it processed after each batch, so 
after interrupting the script you can resume it with the last IDs it printed. 
It queries the DB in batches of 500 and (so far) exits after the first batch, 
so the first run purged 500 pages.
  
  > Processed up to Q10261989, 2244976.
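
The resume behaviour described above amounts to keyset pagination over (entity_id, page_id). The following is only a minimal sketch of that idea, not the actual script: the column names `eu_entity_id` and `eu_page_id`, the "strictly after" resume semantics, and the helper names `next_batch` and `demo` are assumptions, and an in-memory SQLite table with made-up rows stands in for the MariaDB replica.

```python
import sqlite3

BATCH_SIZE = 500  # batch size mentioned in the comment

# Keyset query: "other" usages strictly after the resume point, ascending.
# Column names other than eu_aspect are assumptions for this sketch.
QUERY = """
SELECT DISTINCT eu_entity_id, eu_page_id
FROM wbc_entity_usage
WHERE eu_aspect LIKE 'O%'
  AND (eu_entity_id > ? OR (eu_entity_id = ? AND eu_page_id > ?))
ORDER BY eu_entity_id ASC, eu_page_id ASC
LIMIT ?
"""


def next_batch(conn, entity_id, page_id, limit=BATCH_SIZE):
    """Return up to `limit` (entity_id, page_id) pairs after the resume point."""
    return conn.execute(QUERY, (entity_id, entity_id, page_id, limit)).fetchall()


def demo():
    """Walk a tiny made-up table in batches of 2 and return the last IDs seen."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wbc_entity_usage "
        "(eu_entity_id TEXT, eu_aspect TEXT, eu_page_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO wbc_entity_usage VALUES (?, ?, ?)",
        [("Q1", "O", 10), ("Q1", "O", 20), ("Q1", "C.P31", 30),
         ("Q2", "O", 5), ("Q2", "S", 7)],
    )
    resume = ("", 0)  # start before the first row
    while True:
        batch = next_batch(conn, *resume, limit=2)
        if not batch:
            break
        # The real script would purge the pages of this batch here.
        resume = batch[-1]
        print(f"Processed up to {resume[0]}, {resume[1]}.")
    return resume


if __name__ == "__main__":
    demo()
```

Because each query starts from the last printed pair rather than from an offset, an interrupted run can pick up where it left off without rescanning earlier rows.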
  
  Before:
  
    MariaDB [warwiki]> SELECT NOW() AS asof, COUNT(*) AS total, SUM(eu_aspect LIKE 'C%') AS statements, SUM(eu_aspect LIKE 'O%') AS others FROM wbc_entity_usage;
    +---------------------+---------+------------+---------+
    | asof                | total   | statements | others  |
    +---------------------+---------+------------+---------+
    | 2022-02-28 15:26:52 | 3686466 |    1138696 | 1119080 |
    +---------------------+---------+------------+---------+
    1 row in set (1.242 sec)
  
  After:
  
    MariaDB [warwiki]> SELECT NOW() AS asof, COUNT(*) AS total, SUM(eu_aspect LIKE 'C%') AS statements, SUM(eu_aspect LIKE 'O%') AS others FROM wbc_entity_usage;
    +---------------------+---------+------------+---------+
    | asof                | total   | statements | others  |
    +---------------------+---------+------------+---------+
    | 2022-02-28 15:51:05 | 3686016 |    1138751 | 1118575 |
    +---------------------+---------+------------+---------+
    1 row in set (1.257 sec)
  
  So this batch decreased the number of other usages by 505, and increased the 
number of statement usages by 55. It also took over twenty minutes – purging is 
rate limited (for good reason!), so the script only processes 30 pages per 75 
seconds. (30 pages per 60 seconds is the hard rate limit; I’m letting the 
script sleep a bit longer to leave some “breathing space”.) At this 
rate, getting rid of the remaining 1118575 other usages would take a bit over 
**32 days**, just over a month.
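
The estimate can be checked with a few lines of arithmetic. This is a rough sketch that assumes one purge per remaining “other” usage row (the table counts usage rows, not distinct pages) and uses the “others” count from the “After” table; the function names are hypothetical.

```python
import math

PAGES_PER_BATCH = 30     # pages purged per cycle, per the comment
SECONDS_PER_BATCH = 75   # cycle length, including the extra sleep


def estimate_seconds(pages: int) -> int:
    """Seconds needed to purge `pages` pages at the throttled rate."""
    return math.ceil(pages / PAGES_PER_BATCH) * SECONDS_PER_BATCH


def estimate_days(pages: int) -> float:
    """Same estimate, expressed in days."""
    return estimate_seconds(pages) / 86_400


if __name__ == "__main__":
    # One 500-page batch: 17 cycles of 75 seconds.
    print(f"{estimate_seconds(500) / 60:.1f} minutes for one 500-page batch")
    # Remaining "other" usages, taken from the "After" table.
    print(f"{estimate_days(1_118_575):.1f} days for the remaining usages")
```

The per-batch figure comes out slightly above twenty minutes, in line with the gap between the two query timestamps once DB time is included.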

TASK DETAIL
  https://phabricator.wikimedia.org/T296383
