https://bugzilla.wikimedia.org/show_bug.cgi?id=18333
--- Comment #1 from [email protected] 2009-04-04 00:35:37 UTC ---

The above will stop new duplication, and it would be a shame not to implement. But what about all the years and years of duplication already existing in one's text table? Should there be a program in maintenance/ available to squeeze it out? Should it also be run by update.php? Or just once in a wiki's lifetime? Or just by interested parties who feel the need?

That program would squeeze out duplicates by: {for each page {go down its list of revisions, making duplicate pointers point to their first}}, then run purgeOldText.php. One needn't go to "SHA1 mapping to unbloat the text table" ( http://lists.wikimedia.org/pipermail/wikitech-l/2009-March/042373.html ) extremes.

However, perhaps we needn't restrict our thinking to a per-article paradigm, but instead just consider the whole revision->text table mapping. Maybe that would be a simpler and smarter way to do this. We would thus involve only two tables... (wait, we must consider all tables that have any mapping to the text table! Also, all this must probably be done with the wiki locked, though it would likely only take a few seconds for a small wiki.)

E.g., running our shell/perl scripts above, we find 279 separate pointers to blank (0 byte, vandalism) article revisions. These could all be made to point to a single text row, even though they are not of the same article.
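The whole-table dedup described above can be sketched as follows. This is a minimal model in Python, not a real maintenance script (MediaWiki's are PHP): the two dicts stand in for a simplified revision->text pointer column and the text table's content column, and the function name is made up for illustration. It repoints every revision sharing identical content at the first text row holding that content, then reports the orphaned rows a purgeOldText.php-style cleanup could delete.

```python
def deduplicate_text_table(revisions, texts):
    """Repoint revisions with identical text at one canonical text row.

    revisions: dict rev_id -> text_id (simplified rev/text pointer)
    texts:     dict text_id -> content (simplified text table)
    Mutates `revisions` in place; returns the set of now-orphaned
    text_ids that a purge pass could delete.
    """
    # Map each distinct content string to the first text row holding it.
    canonical = {}
    for text_id in sorted(texts):
        canonical.setdefault(texts[text_id], text_id)

    # Repoint every revision at the canonical row for its content.
    for rev_id, text_id in revisions.items():
        revisions[rev_id] = canonical[texts[text_id]]

    # Anything no longer referenced is safe to purge.
    referenced = set(revisions.values())
    return {t for t in texts if t not in referenced}
```

For instance, three blank (0 byte) revisions from different articles would all end up pointing at a single empty text row, with the other two blank rows reported as purgeable. In a real run, every table with a pointer into the text table would have to be repointed the same way before purging, and the wiki would presumably be locked for the duration.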
