Michael Dale <md...@wikimedia.org> writes:

> I recommended that the image daemon run semi-synchronously since the
> changes needed to maintain multiple states and return non-cached
> place-holder images while managing updates and page purges for when the
> updated images are available within the wikimedia server architecture
> probably won't be completed in the summer of code time-line. But if the
> student is up for it the concept would be useful for other components
> like video transformation / transcoding, sequence flattening etc. But
> its not what I would recommend for the summer of code time-line.

I may have problems understanding the concept "semi-synchronously": does
it mean that when MW parses a page containing thumbnail images, the
parser sends requests to the daemon, which would reply twice for each
request, once immediately with a best fit or a placeholder
(synchronously), and once later when the thumbnail is ready
(asynchronously)?
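If so, the flow might look roughly like this (a Python sketch of my
understanding; ThumbDaemon and all names here are made up for
illustration, not the proposed API):

```python
import threading
import time

class ThumbDaemon:
    """Toy model of the semi-synchronous daemon as I understand it."""

    def __init__(self):
        self.ready = {}  # (image, size) -> finished thumbnail path

    def request_thumb(self, image, size, callback):
        # First reply, synchronous: a best fit or a placeholder.
        placeholder = self.ready.get((image, size), "placeholder.png")
        # Second reply, asynchronous: render in the background and
        # notify the caller (e.g. trigger a page purge) when done.
        threading.Thread(
            target=self._render, args=(image, size, callback)
        ).start()
        return placeholder

    def _render(self, image, size, callback):
        time.sleep(0.01)  # stand-in for the expensive resize
        thumb = "%s.%dpx.png" % (image, size)
        self.ready[(image, size)] = thumb
        callback(thumb)
```

The second (asynchronous) reply is where the hard part lives: the
callback has to manage the state updates and page purges mentioned
above.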

> === what would probably be better for the image resize efforts should
> focus on ===
>
> (1) making the existing system "more robust" and (2) better taking
> advantage of multi-threaded servers.
>
> (1) right now the system chokes on large images we should deploy
> support for an in-place image resize maybe something like vips (?)
> (http://www.vips.ecs.soton.ac.uk/index.php?title=Speed_and_Memory_Use)
> The system should intelligently call vips to transform the image to a
> reasonable size at time of upload then use those derivative for just
> in time thumbs for articles. ( If vips is unavailable we don't
> transform and we don't crash the apache node.)

Wow, vips sounds great; I'm still reading its documentation. How is its
performance on relatively small images (not huge, a few hundred pixels
in width/height) compared with traditional single-threaded resizing
programs?

> (2) maybe spinning out the image transform process early on in the
> parsing of the page with a place-holder and callback so by the time
> all the templates and links have been looked up the image is ready for
> output. (maybe another function wfShellBackgroundExec($cmd,
> $callback_function), maybe using pcntl_fork, then normal wfShellExec,
> then pcntl_waitpid, then the callback function ... which sets some var
> in the parent process so that pageOutput knows it's good to go)
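In outline, that idea could look something like this (Python standing in
for the PHP pcntl_* calls; wf_shell_background_exec is a hypothetical
analogue of the proposed wfShellBackgroundExec):

```python
import subprocess
import threading

def wf_shell_background_exec(cmd, callback):
    """Start cmd without blocking; run callback(returncode) on exit.

    Rough analogue of the proposed wfShellBackgroundExec(): the parser
    can keep resolving templates and links while the resize runs, and
    the callback flips a flag so page output knows the image is ready.
    """
    proc = subprocess.Popen(cmd, shell=True)  # like pcntl_fork + exec

    def _wait():
        proc.wait()                 # like pcntl_waitpid
        callback(proc.returncode)   # sets the "good to go" state

    threading.Thread(target=_wait, daemon=True).start()
    return proc

# Example: the parser sets image_ready once the transform finishes.
image_ready = threading.Event()
wf_shell_background_exec("true", lambda rc: image_ready.set())
# ... template and link lookups would happen here ...
image_ready.wait(timeout=5)
```

The point is only that the expensive shell-out overlaps with the rest of
parsing instead of blocking it.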

An asynchronous daemon doesn't make much sense if the page purge occurs
on the server side, but what if we defer the purge to the browser? It
would work like this:

1. the MW parser sends a request to the daemon
2. the daemon finds the work non-trivial and replies *immediately* with
   a best fit or just a placeholder
3. the browser renders the page, finds it's not final, and sends a
   request to the daemon directly using AJAX
4. the daemon replies to the browser when the thumbnail is ready
5. the browser replaces the temporary best fit / placeholder with the
   new thumb using JavaScript
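A sketch of the daemon-side bookkeeping for steps 1-5 (the job-state
structure and method names are invented; real requests would of course
arrive over HTTP rather than as method calls):

```python
import threading
import time

class ThumbJobQueue:
    """Toy daemon state for the browser-side purge idea: the MW server
    gets an immediate placeholder, while the browser's AJAX request
    waits until the real thumbnail is ready."""

    def __init__(self):
        self.jobs = {}
        self.lock = threading.Lock()

    def handle_mw_request(self, image):
        # Steps 1-2: reply immediately with a placeholder and start
        # rendering in the background.
        with self.lock:
            if image not in self.jobs:
                self.jobs[image] = {"done": threading.Event(),
                                    "thumb": None}
                threading.Thread(target=self._render,
                                 args=(image,)).start()
        return "placeholder.png"

    def handle_browser_request(self, image, timeout=5.0):
        # Steps 3-4: the browser's request blocks until the thumb is
        # ready, then receives the final URL to swap in (step 5).
        job = self.jobs[image]
        job["done"].wait(timeout)
        return job["thumb"]

    def _render(self, image):
        time.sleep(0.01)  # stand-in for the actual resize
        job = self.jobs[image]
        job["thumb"] = image + ".thumb.png"
        job["done"].set()
```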

The daemon now has to deal with two kinds of clients: MW servers and
browsers.

Letting the browser wait instead of the MW server has the benefit of
reduced latency for users: they still have an acceptable page to read
before the image replacement takes place, and a perfect page after it.
For most users, the replacement will likely occur as soon as page
loading ends, since transferring the page takes some time, during which
the daemon will already have finished thumbnailing.

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l