aude added a comment.

Apart from the cost of serializing the diff, the way
ChangeDispatcher::getPendingChanges works is inefficient.

This could be improved by checking the wb_subscriptions table, for wikis that
use it, and filtering changes that way: https://phabricator.wikimedia.org/T110528
For wikis that don't use subscriptions yet:

1. We should go ahead and enable usage tracking everywhere so that they all do.
   Even if a wiki only uses site links for now, it can still have usage
   tracking.

2. Otherwise, check against the items per site table, although a clean join is
   not possible because items per site uses numeric IDs.
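
A rough sketch of the two filtering paths above, in Python for brevity (the real code is PHP). The function names, the `uses_subscriptions` flag, and the in-memory sets standing in for wb_subscriptions and the items per site table are all assumptions made for illustration. It mainly shows why the fallback needs an ID conversion step instead of a plain join.

```python
def numeric_item_id(entity_id):
    """Return 42 for 'Q42'; return None for anything that is not an item ID."""
    if entity_id[:1] == "Q" and entity_id[1:].isdigit():
        return int(entity_id[1:])
    return None


def filter_changes(changes, site, uses_subscriptions, subscriptions, items_per_site):
    """Keep only the changes that are relevant to one client site.

    subscriptions:  set of (entity_id, site) pairs (hypothetical stand-in
                    for the wb_subscriptions table)
    items_per_site: set of (numeric_item_id, site) pairs (hypothetical
                    stand-in for the items per site table)
    """
    wanted = []
    for change in changes:
        entity_id = change["entity_id"]
        if uses_subscriptions:
            # Subscriptions are keyed by the full entity ID.
            relevant = (entity_id, site) in subscriptions
        else:
            # Items per site is keyed by numeric ID, so convert first.
            relevant = (numeric_item_id(entity_id), site) in items_per_site
        if relevant:
            wanted.append(change)
    return wanted
```

In a real implementation the membership checks would be database queries, and the conversion step is what prevents a direct join between the changes and the items per site table.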

Dispatching via delayed jobs would also help.

Another option might be to process changes one by one, group them into
buckets / batches per site, and then add change notification jobs.
When the client gets notified, update chd_seen. (Currently, we pick a
site and then go find changes for it.)
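
The change-by-change approach could look roughly like the following Python sketch. Everything here is hypothetical: `subscribers_of` and `send_job` are placeholder callables, changes are plain dicts, and `chd_seen` is modelled as a dict that maps each site to the highest change ID it has been notified of.

```python
from collections import defaultdict


def bucket_changes(changes, subscribers_of):
    """Walk the changes once and group their IDs per interested client site."""
    buckets = defaultdict(list)
    for change in changes:
        for site in subscribers_of(change["entity_id"]):
            buckets[site].append(change["id"])
    return dict(buckets)


def notify(buckets, chd_seen, send_job):
    """Queue one notification job per site, then advance that site's chd_seen."""
    for site, change_ids in buckets.items():
        send_job(site, change_ids)
        # Never move the marker backwards if the site has already seen more.
        chd_seen[site] = max(chd_seen.get(site, 0), max(change_ids))
```

The difference from the current flow is the direction of the lookup: the changes drive which sites get a job, instead of a site being picked first and its changes searched for afterwards.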

To control batching, we could make it so that, say, every third edit (or some
percentage, depending on how much batching we want) results in a change
dispatcher job.
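
As a minimal illustration of the "every third edit" idea, again in Python with made-up names: `edit_count` is assumed to be a running count of edits, and `every_nth` is the batching knob.

```python
def should_dispatch(edit_count, every_nth=3):
    """Return True when this edit should result in a change dispatcher job."""
    if every_nth < 1:
        raise ValueError("every_nth must be at least 1")
    return edit_count % every_nth == 0
```

With `every_nth=1` every edit triggers a job. Larger values mean fewer jobs and therefore bigger batches per job.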


TASK DETAIL
  https://phabricator.wikimedia.org/T109088

To: aude
Cc: daniel, aude, Aklapper, Wikidata-bugs, Malyacko


