https://bugzilla.wikimedia.org/show_bug.cgi?id=67117
--- Comment #1 from Aaron Schulz ---

A few things:

* This seems to be missing an ArticleRevisionVisibilitySet handler.
* http_post() needs to handle $wgHTTPProxy.
* The maximum number of job attempts for the queue (maxTries) will have to be set very high to avoid losing updates.
* NewRevisionFromEditComplete and other hooks trigger before COMMIT, so the jobs should probably be delayed (using 'jobReleaseTimestamp'). Moving them post-COMMIT is not an option, since a network partition could cause nothing to be enqueued (the reverse, a job and no COMMIT, is wasteful but harmless).
* We tend to run lots of jobs for one wiki at a time. http_post() could benefit from some sort of singleton for the curl handle instead of closing it each time. See http://stackoverflow.com/questions/972925/persistent-keepalive-http-with-the-php-curl-library.
* createLastChangesOutput() should use LIMIT+DISTINCT instead of a "break" statement. Also, I'm not sure how well that works. There can only be one job in the queue for hitting the URL that returns this result, but it only covers the last 60 seconds of changes. Also, it selects rc_timestamp but does not use it now. Is it OK if the hub misses a bunch of changes because of this (e.g. are the per-Title jobs good enough)?
* It's curious that the hub is supposed to talk back to a special page; why not an API page instead?
* The Link headers also go there. What is the use of these? Also, since they'd take 30 days to apply to all pages (the varnish cache TTL), it would be a pain to change them. They definitely need to be stable.

Come to think of it, it seems like we need to send the following events to the hub:

* New revisions
* Page (un)deletions
* Revision (un)deletions
* Page moves

All of the above leave either edit or log entries in recent changes. Page moves only leave one at the old title...though rc_params can be inspected to get the new title.
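To illustrate the point about keeping the connection open between jobs, here is a rough sketch of the idea. It is written in Python rather than the extension's PHP, purely for illustration, and every name in it (KeepAlivePoster, connection_factory, and so on) is hypothetical and not part of the patch under review. It keeps one connection per scheme/host/port and reuses it for each POST instead of opening and closing one every time. Proxy handling ($wgHTTPProxy) is deliberately left out.

```python
import http.client
from urllib.parse import urlsplit


def _default_factory(scheme, host, port, timeout):
    """Open a standard-library HTTP(S) connection (connects lazily)."""
    if scheme == "https":
        return http.client.HTTPSConnection(host, port, timeout=timeout)
    return http.client.HTTPConnection(host, port, timeout=timeout)


class KeepAlivePoster:
    """Hypothetical helper: reuses one connection per (scheme, host, port)."""

    def __init__(self, timeout=10.0, connection_factory=None):
        self._timeout = timeout
        self._factory = connection_factory or _default_factory
        self._connections = {}

    def _connection_for(self, parts):
        key = (parts.scheme, parts.hostname, parts.port)
        if key not in self._connections:
            self._connections[key] = self._factory(
                parts.scheme, parts.hostname, parts.port, self._timeout
            )
        return self._connections[key]

    def post(self, url, body, content_type="application/x-www-form-urlencoded"):
        """POST bytes to url and return (status, response_body)."""
        parts = urlsplit(url)
        conn = self._connection_for(parts)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        headers = {"Content-Type": content_type, "Connection": "keep-alive"}
        try:
            conn.request("POST", path, body=body, headers=headers)
            response = conn.getresponse()
        except (http.client.HTTPException, OSError):
            # The kept-alive socket may have gone stale. Reset it and retry
            # once; note that this can deliver the same POST twice.
            conn.close()
            conn.request("POST", path, body=body, headers=headers)
            response = conn.getresponse()
        # The body must be read fully before the connection can be reused.
        return response.status, response.read()

    def close(self):
        """Close and forget all cached connections."""
        for conn in self._connections.values():
            conn.close()
        self._connections.clear()
```

A job runner would create one such object per process and call post() for every job, so consecutive jobs that target the same hub share a single connection.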
I wonder if, instead of a job per title, there could be a single job that sends all changes since the "last update time" and updates the "last update time" on success. The advantages would be:

a) Far fewer jobs needed
b) All updates would be batched
c) Supporting more hubs is easier, since only another job and time position is needed (rather than N jobs for each hub for each title)

Of course I may have missed something.
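The single-job idea can be sketched roughly as follows. This is Python for illustration only, not MediaWiki code, and all names (Change, run_batch_job, the cursors dict, the send callback) are hypothetical. One assumption goes beyond the text above: the position is a (timestamp, change id) pair rather than a bare timestamp, so that several changes sharing the same second are not skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical stand-in for one recent-changes row."""
    change_id: int   # assumed to increase monotonically
    timestamp: int   # integer timestamp of the change
    title: str
    kind: str        # e.g. "edit", "move", "delete"

    @property
    def position(self):
        return (self.timestamp, self.change_id)


def run_batch_job(changes, cursors, hub, send):
    """Send every change after the hub's stored position in one batch.

    cursors maps hub -> last position successfully sent. It is advanced
    only if send(hub, batch) returns True, so a failed attempt is simply
    retried from the same position the next time the job runs.
    Returns True when there was nothing to send or the send succeeded.
    """
    cursor = cursors.get(hub, (0, 0))
    pending = sorted(
        (c for c in changes if c.position > cursor),
        key=lambda c: c.position,
    )
    if not pending:
        return True
    if not send(hub, pending):
        return False
    cursors[hub] = pending[-1].position
    return True
```

Adding another hub then only means another key in cursors and another run of the same job, which is the point of (c).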
