https://bugzilla.wikimedia.org/show_bug.cgi?id=67439
Bug ID: 67439
Summary: Deal with logging query spam on crawler 404 floods
Product: MediaWiki
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: General/Unknown
Assignee: [email protected]
Reporter: [email protected]
Web browser: ---
Mobile Platform: ---
As seen at https://tendril.wikimedia.org/report/, crawlers of various types
are hitting non-existent pages. We run a move/delete log query on each such
page view, which is fine except when lots of these queries come in at once.
The queries then end up taking 16s to 18s.
A possible solution is to skip the LogEventsList call in showMissingArticle
based on a Bloom filter in Redis, which would be updated on the fly. I am not
sure how to estimate the set size needed to keep the false-positive rate down.
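
On the sizing question, the standard Bloom filter formulas give the bit count
m and hash count k for an expected n items and a target false-positive rate p:
m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2. Below is a minimal sketch in
Python, not MediaWiki code. It assumes the filter would hold the titles that
have at least one move/delete log entry, so that a "not present" answer lets
the log query be skipped. That design, the function and class names, and the
in-memory byte array standing in for Redis are all assumptions made for
illustration.

```python
import hashlib
import math


def bloom_parameters(expected_items, false_positive_rate):
    """Return (bits, hash_count) from the standard Bloom filter formulas."""
    bits = math.ceil(
        -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)
    )
    hash_count = max(1, round(bits / expected_items * math.log(2)))
    return bits, hash_count


class BloomFilter:
    """In-memory stand-in for a Redis-backed bit array (assumption)."""

    def __init__(self, bits, hash_count):
        self.bits = bits
        self.hash_count = hash_count
        self.array = bytearray((bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive hash_count bit positions from one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.bits for i in range(self.hash_count)]

    def add(self, key):
        for pos in self._positions(key):
            self.array[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.array[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def should_query_logs(bloom, title):
    """Hypothetical gate: query the logs only if the title may have entries."""
    return bloom.might_contain(title)


# Illustrative numbers only: 1,000,000 logged titles at a 1% false-positive rate.
bits, hash_count = bloom_parameters(1_000_000, 0.01)
print(bits, hash_count, bits / 8 / 1024 / 1024)
```

With these illustrative inputs the filter needs a little over 1 MiB. A Bloom
filter never returns a false negative, so a title that does have log entries
would always still get its query; a false positive only costs one unnecessary
query. In Redis the same bit operations could be done with SETBIT and GETBIT
on a single string key.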
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l