Hello,

I understand the need for Cite, that's why it is still there :) But...

- We format the Cite references list on only every 100th request to the
backend, yet it takes 8.15% of backend response time (thanks to the
parser cache; without it Cite formatting would take 815% of cluster
time, though developers should understand I'm not exactly right with
this hyperbole ;-) )

- When parsing articles like one of the most popular today,
[[en:Rod_Blagojevich_corruption_charges]], it takes 20s to produce the
page, and 17s of that is spent in the Cite block, mostly executing
{{cite}}. That makes every editor wait for ages to get a page
displayed, and due to the cache stampede after invalidation it puts
considerable stress on the site (look at the numbers mentioned above).

- This 8% is in real time, which includes waiting for search,
databases, and plain CPU contention, which we end up having today.
CPU-time wise it is way higher, so it can actually have a 20% CPU time
impact on our application farm. That's at least $100k worth of hardware
(and rising), even if it is new/modern, just for citation formatting.
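
The back-of-envelope numbers in the points above can be reproduced with a few lines of Python. This is only a sketch: the names are made up for illustration, and the 815% figure is the same naive linear extrapolation (Cite formatting on every request instead of every 100th), so it is just as much hyperbole here.

```python
# Figures taken from the points above; all names are illustrative only.
CITE_SHARE_OF_RESPONSE_TIME = 8.15    # percent of backend response time
CITE_FORMAT_EVERY_NTH_REQUEST = 100   # Cite formatting runs ~1 in 100 requests


def cite_share_without_parser_cache(share_pct, every_nth):
    """Naive linear extrapolation: Cite formatting on every request."""
    return share_pct * every_nth


def cite_fraction_of_parse(cite_seconds, total_seconds):
    """Fraction of a single page parse spent in the Cite block."""
    return cite_seconds / total_seconds


no_cache_pct = cite_share_without_parser_cache(
    CITE_SHARE_OF_RESPONSE_TIME, CITE_FORMAT_EVERY_NTH_REQUEST)
# 17s of the 20s parse of the Rod Blagojevich article goes to Cite.
blagojevich_fraction = cite_fraction_of_parse(17, 20)

print(f"without parser cache: {no_cache_pct:.0f}% of cluster time")
print(f"Cite share of the 20s parse: {blagojevich_fraction:.0%}")
```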

So, a checklist of what can be done (simple to complex):

[  ] - Simplification of {{cite}}
[  ] - Separate cache for Cite, to avoid reparsing on minor edits
that don't involve citations. I have no idea how much this would win,
but there is a theoretical chance of stripping 1% or so. ;)
[  ] - Offload some templates like {{cite}} to actual PHP extensions
(can of worms, but, oh well, it can be a standardized process too)
[  ] - Implement a proper scripting engine like Lua for metatemplates
(http://pecl.php.net/package/lua - another can of worms, though yet
again, it can be managed via a trusted set of people, on the top 20
wikis or so).
[  ] - Frustrated operations guy adding something like ( return ""; )  
in some random extension, and syncing the live hack. Obviously there  
would be some "HAHA YOU THOUGHT I COULDN'T DO THIS" comments in there.

I for one can directly participate in at least two of these options. ;-)

Unfortunately, {{cite}} is the only template I can profile/account for
now; we don't have proper per-template profiling, but I wish to get
it some day. Then we'd have more "war on ..." topics ;-D

Generally, templates are a major part of our parsing, and that's over
50% of our current cluster CPU load.
As we actually managed to hit 100% last week, something that hasn't
happened for a while, some work has to be done here.

Of course, new hardware will help for a while, but I for one get huge
personal satisfaction from saving donation money. ;-)

CHEERS!
-- 
Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]


