https://bugzilla.wikimedia.org/show_bug.cgi?id=37225
--- Comment #48 from Richard Guk <[email protected]> 2012-06-25 09:23:38 UTC ---
Thanks for running those queries and making the data available.

Apologies for the typo. Also, I shouldn't have included "rc_new_len>0" within the WHERE condition of trcmin (it defeats the HAVING condition which should have excluded previously-blanked pages), so the above results contain "background noise" of a few spurious unblanking edits.

For reference, the following SQL should exclude these cases, and is reformatted to exclude some irrelevant columns and revdeleted rows (simplifying where null handling is not needed):

--
SELECT trc.rc_timestamp, trc.rc_id, trc.rc_user, trc.rc_user_text,
       trc.rc_namespace, trc.rc_title, trc.rc_minor, trc.rc_bot,
       trc.rc_cur_id, trc.rc_this_oldid, trc.rc_last_oldid,
       trc.rc_new_len, trc.rc_deleted, trc.rc_logid, trc.rc_log_type,
       trc.rc_comment
FROM recentchanges trc,
     ( SELECT rc_cur_id, MIN(rc_id) AS rc_id_min, MIN(rc_new_len)
       FROM recentchanges
       WHERE rc_type<=1
       GROUP BY 1
       HAVING MIN(rc_new_len)>0 ) trcmin
WHERE trc.rc_old_len=0
  AND trc.rc_type<=1
  AND trc.rc_cur_id=trcmin.rc_cur_id
  AND trc.rc_id>trcmin.rc_id_min
  AND trc.rc_deleted=0
ORDER BY 1
--

But it's probably not necessary to re-run a query, because the onset timing is clear if we filter out all non-mainspace edits and obvious mainspace blanking reverts.

Oddly, the rc_last_oldid values seem to be correct, even though rc_old_len is always incorrect.
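To illustrate why moving "rc_new_len>0" out of the trcmin WHERE clause matters, here is a minimal sketch using Python's sqlite3 module. It rests on several assumptions: the table is a cut-down, hypothetical version of recentchanges holding only the columns the filter touches, the nine sample rows are invented for illustration (they are not taken from the real results), GROUP BY is spelled out by column name rather than by position, and the names SCHEMA, ROWS, QUERY and find_discrepancies are made up for this example. The "flawed" variant is a reconstruction of the earlier mistake as described above, not a copy of the earlier query.

```python
import sqlite3

# Hypothetical, cut-down recentchanges table: only the columns the filter uses.
SCHEMA = """
CREATE TABLE recentchanges (
    rc_id      INTEGER PRIMARY KEY,
    rc_cur_id  INTEGER,
    rc_type    INTEGER,
    rc_old_len INTEGER,
    rc_new_len INTEGER,
    rc_deleted INTEGER
)
"""

# Invented sample rows: rc_id, rc_cur_id, rc_type, rc_old_len, rc_new_len, rc_deleted
ROWS = [
    (1, 1, 1, 0, 100, 0),      # page 1 created
    (2, 1, 0, 0, 120, 0),      # page 1 edited, old length wrongly 0 (the bug)
    (3, 2, 1, 0, 50, 0),       # page 2 created
    (4, 2, 0, 50, 0, 0),       # page 2 blanked
    (5, 2, 0, 0, 50, 0),       # page 2 unblanked: old length 0 is genuine
    (6, 3, 1, 0, 10, 0),       # page 3 created
    (7, 3, 0, 10, 20, 0),      # page 3 ordinary edit
    (8, 1, 0, 0, 130, 1),      # page 1 edit, revdeleted
    (9, 1, 3, None, None, 0),  # log entry (rc_type 3)
]

# Same shape as the query above, reduced to one output column.
# {extra} is where the earlier, mistaken "rc_new_len>0" condition sat.
QUERY = """
SELECT trc.rc_id
FROM recentchanges trc,
     (SELECT rc_cur_id, MIN(rc_id) AS rc_id_min
      FROM recentchanges
      WHERE rc_type <= 1 {extra}
      GROUP BY rc_cur_id
      HAVING MIN(rc_new_len) > 0) trcmin
WHERE trc.rc_old_len = 0
  AND trc.rc_type <= 1
  AND trc.rc_cur_id = trcmin.rc_cur_id
  AND trc.rc_id > trcmin.rc_id_min
  AND trc.rc_deleted = 0
ORDER BY trc.rc_id
"""


def find_discrepancies(conn, flawed=False):
    """Return rc_id values flagged as suspicious zero-old-length edits."""
    extra = "AND rc_new_len > 0" if flawed else ""
    return [row[0] for row in conn.execute(QUERY.format(extra=extra))]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany("INSERT INTO recentchanges VALUES (?, ?, ?, ?, ?, ?)", ROWS)

print(find_discrepancies(conn, flawed=True))  # [2, 5]
print(find_discrepancies(conn))               # [2]
```

With the extra condition in place, the blanking row never reaches the HAVING test, so page 2 passes and its unblanking edit (rc_id 5) shows up as noise. Without it, MIN(rc_new_len) for page 2 is 0, the page is dropped, and only the genuinely suspicious edit (rc_id 2) remains.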
The earliest enwiki mainspace delta discrepancy is at 2012-05-29T08:50:30 UTC:
http://en.wikipedia.org/w/index.php?title=Impeachment_of_Renato_Corona&diff=494919562&oldid=494919551

The first 15 enwiki mainspace discrepancies are:
* 2012-05-29T08:50:30 Impeachment_of_Renato_Corona [non-null delta-discrepancy edit]
* 2012-05-29T08:50:39 Johannes_Brahms
* 2012-05-29T08:50:41 Impeachment_of_Renato_Corona
* 2012-05-29T08:50:43 Johannes_Brahms
* 2012-05-29T08:50:54 Impeachment_of_Renato_Corona
* 2012-05-29T08:51:05 Impeachment_of_Renato_Corona
* 2012-05-29T09:41:51 Uzi
* 2012-05-29T09:41:53 Tuba
* 2012-05-29T09:41:57 Tuba
* 2012-05-29T09:53:09 UEFA_Euro_2012_squads
* 2012-05-29T10:08:57 Sydney_Opera_House
* 2012-05-29T10:17:34 Nicole_Kidman
* 2012-05-29T10:20:04 Confederation_of_African_Football
* 2012-05-29T10:26:51 Diamond_Jubilee_of_Elizabeth_II
* 2012-05-29T10:27:19 Diamond_Jubilee_of_Elizabeth_II

The error rate seems to have stabilised at around 15 an hour, other than a significantly higher rate on 2012-06-20 between approximately 21:00 and 22:00 UTC.

The earliest plwiki mainspace discrepancy is at 2012-05-30T06:35:26 UTC:
http://pl.wikipedia.org/w/index.php?title=Ben_10:_Tajemnica_Omnitrixa&diff=31467920&oldid=31467918

The plwiki discrepancies recur at approximately 3-hour intervals during waking hours, so there may be some significance to the 22-hour difference between enwiki onset and plwiki onset.

Cross-checking against the server admin log, could the 2012-05-29 disk space error be relevant? (Perhaps it was not logged until a few hours after data corruption: "16:10: hashar: srv187 and srv188 are out of disk space".)

Hope that's helpful - I don't know much about the innards of MediaWiki so I'm guessing what might be relevant!
