| Anomie added a comment. |
Populate empty ar_rev_id fields:
- Determine how many rows in archive have ar_rev_id = NULL. Let's call that number m. (e.g. enwiki has 508811 such rows, out of ~87793416 rows)
- Reserve m (or m+k, for good measure) IDs in the revision table:
- Make a note of max( max( rev_id ), max( ar_rev_id ) ), let's call it b.
- Insert a row with rev_id = b+m+k into the revision table, and delete it again, to bump the auto-increment counter.
- For any row in archive that has ar_rev_id = NULL, set ar_rev_id to a unique id between b+1 and b+m+k. This could be done via a temporary table, or programmatically.
For running as a maintenance script, I'm not so sure about this plan. If something goes wrong the list of "reserved" rows is likely to be lost and a new list would have to be reserved. Instead, I'd fetch a batch of archive rows needing IDs, reserve that many revision rows and assign them immediately, then repeat until there are no more batches.
Also, "Insert a row with rev_id = b+m+k into the revision table, and delete it again, to bump the auto-increment counter." won't work for all databases. It'll work for MySQL/MariaDB (at least with InnoDB), SQLite, and MSSQL, but for PostgreSQL and Oracle you'd instead have to update the sequence that generates the incrementing IDs.
Cc: Anomie, aude, Aklapper, Lahi, PDrouin-WMF, Gq86, E1presidente, Ramsey-WMF, SandraF_WMF, GoranSMilovanovic, QZanden, Tramullas, Acer, Susannaanas, Aschroet, Jane023, Wikidata-bugs, PKM, Base, matthiasmullie, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
