aude added a comment.

In addition to serializing the diff, the way ChangeDispatcher::getPendingChanges works is not very efficient.
This could be improved by checking the wb_subscriptions table for wikis that use it, and filtering that way: https://phabricator.wikimedia.org/T110528

For wikis that don't use subscriptions yet:

1. We should go ahead and enable usage tracking everywhere so that they all do. Even if a wiki only uses site links for now, it can still have usage tracking.
2. Otherwise, check against the items per site table, although a nice join is not possible because items per site uses numeric ids.

Also, dispatching via delayed jobs would obviously help.

Another option might be to process changes one by one, group them into buckets / batches per site, and then add change notification jobs. When the client gets notified, update chd_seen. (Currently, we pick a site and then go find changes.)

To control batching, we could make it so that, say, every third edit (or some percentage, depending on how much batching we want) results in a change dispatcher job.

TASK DETAIL
https://phabricator.wikimedia.org/T109088

To: aude
Cc: daniel, aude, Aklapper, Wikidata-bugs, Malyacko
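A rough sketch of the change-by-change batching idea from the comment, in Python for illustration only. Everything here is an assumption rather than existing Wikibase code: the Change and Dispatcher classes, the in-memory jobs list standing in for the job queue, and the reading of "every third edit" as "every third change per client site". The subscribers list stands for whatever a wb_subscriptions lookup would return.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Change:
    change_id: int
    # Hypothetical: client wikis subscribed to the edited entity,
    # e.g. as found via a wb_subscriptions lookup.
    subscribers: list


@dataclass
class Dispatcher:
    batch_every: int = 3  # "every third edit"; tune for more or less batching
    pending: dict = field(default_factory=lambda: defaultdict(list))
    jobs: list = field(default_factory=list)      # stand-in for a job queue
    chd_seen: dict = field(default_factory=dict)  # site -> last change id seen

    def on_change(self, change):
        """Process one change: bucket it per site, enqueue a job when a bucket fills."""
        for site in change.subscribers:
            bucket = self.pending[site]
            bucket.append(change.change_id)
            if len(bucket) >= self.batch_every:
                # One notification job per site, carrying the whole batch.
                self.jobs.append((site, list(bucket)))
                bucket.clear()

    def on_client_notified(self, site, change_ids):
        """Advance chd_seen only after the client was actually notified."""
        previous = self.chd_seen.get(site, 0)
        self.chd_seen[site] = max(previous, max(change_ids))
```

With batch_every set to 3, a site receives one job per three changes that affect it, and changes for unaffected sites never enter a bucket, which is the opposite of picking a site first and then searching for its changes.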
