Summary:
--------
XftGlyphCore can waste a lot of time if asked to write glyphs outside of its drawable.
There is one major application (Opera/QT) that hits this problem in Spades.
I propose a patch to fix it.

XftGlyphCore:
-------------

These remarks relate to the version of xftcore.c identified by
 * $Id: xftcore.c,v 1.4 2005/07/03 07:00:57 daniels Exp $
My system is (via uname -a)
SunOS clerew 5.10 Generic_118822-25 sun4u sparc SUNW,Ultra-2

Essentially, XftGlyphCore is provided with a Drawable (via XftDraw), a bunch of glyphs, and a point (x,y) at which they are to be drawn. When used in anti-aliasing mode (when the XftDraw specifies TrueColor), it can consume a lot of resources (including at least one round-trip to the Xserver from a call of XGetImage). Even if the area where the glyphs are to be drawn does not intersect with any area within the drawable, it still goes through the whole process of drawing the glyphs, right through to obtaining the (supposed) existing background, calculating how the glyphs are to be merged with it, and writing the results to a (supposed) XImage which is never used.

In particular, the call of XGetImage fails, and it reverts to using its 'use_pixmap' mode for the next few calls. This involves a call of XCreatePixmap to create a pixmap the size of the bunch of glyphs, a call of XCreateGC and of XCopyArea to populate it with the (supposed) background, and a further call of XGetImage to get that background in XImage form (that's three Xserver round-trips).

It then writes those glyphs to the XImage, and uses XPutImage to write them back to the Drawable (which then discovers there is no intersection, and so ignores them). No harm is done; nothing breaks; but if you do it often it consumes huge resources.

But why, you may ask, should anybody in his Right Mind call XftGlyphCore to write glyphs that are not even inside the Drawable? A Good Question, indeed, but sadly there is one major application that does it all the time :-( .

Opera/QT:
---------

The Opera Web Browser is written on top of the QT Toolkit, which in turn is written on top of LibXft. It includes a feature for reading and composing emails, and hence contains a text editor (also used when filling in Web Forms). I had long been aware that it had started to consume vast resources when composing large emails (or replying to large emails), and a long moan on opera.os.solaris had produced zilch response. So in desperation I set out to discover what was happening.

The first observation was that it only happened on one of the two screens on my machine (the one behind the fancy Creator Graphics card). After much poking around with truss and mdb, I discovered where the machine cycles were going and, after downloading the source code of LibXft, I saw that the problem was related to the use of 24bit color plus TrueColor (my other screen uses 8bit color plus PseudoColor). Note that, up until that time, I had never even heard of LibXft, or of the Render extensions, or of anti-aliasing (thanks to Wikipedia for explaining that). Though I must confess that, to those of us whose accommodation is long gone and who have to sit at a very precise distance from the screen to see it all in focus, anti-aliasing does indeed give quite an improvement. So I had a very steep learning curve to follow :-( .

Anyway, I eventually pieced together what Opera plus QT was actually doing, so here it is (it is not a pretty story, and I have yet to discover whether it is an Opera problem or a QT problem).

Opera keeps a record of all the "word"s written to the editing window (a "word" is essentially a sequence of alphanumeric characters - any other character seems to be treated as a word of its own). Such words are used in calls of XftDrawString16, which duly calls XftGlyphCore. Each time you type a character (or use an arrow key, or delete a character) it discovers which bit of the window it needs to redraw, and constructs a brand new Pixmap of that size and prefills it with the supposed background of the window at that place (which, in practice, is always just pure white pixels). So now it needs to copy the required glyphs to that Pixmap (XftDrawString16), and when that is done it copies the Pixmap back to the original Window using XCopyArea, and then it throws the Pixmap away. A bit long-winded you might think, but You Ain't Seen Nothin' Yet.

For, to do this, it needs to know which glyphs are to be written into this (usually small) Pixmap. You might think that was a straightforward task, but No! It systematically goes through the WHOLE WINDOW, rewriting All the "words" known to be in it to that small Pixmap, whether they belong there or not. Most of them don't, of course! So, if your window is full of text, and you type some characters in at a reasonable typing speed, you can then sit and watch for several seconds while they all gradually appear (cursor movements and backspaces included) one-by-one. Not a pleasant way to construct your emails :-( .

But there is worse to come! Being an editing window, it naturally contains a cursor (this is the point-of-insertion cursor, not the mouse cursor). And this cursor blinks - 1/2 second on, 1/2 second off. Now it has the good sense not to use XDrawString16 to draw the cursor, BUT it does regard the cursor as part of the background, and so whatever glyph there might be at that point has to be re-anti-aliased. You can see what is coming ...

Twice every second, it has to redraw every "word" in the window, on the offchance that it overlaps the 2x15 Pixmap where the cursor is ........

OK, time for some numbers. The worst case is when the window contains "words" of 1 character each, so I wrote a window containing alternate 'x' and SP - that's 1700 'x's altogether, and observed the CPU load involved just to keep that cursor blinking.

Now my machine has two processors of 300MHz each (there are faster machines around, but that is still quite some computing power), and of those two processors
  XSun  was using 32.7%  - call it 65% of one processor
  Opera was using 26.0%  - call it 52% of the other processor
just to keep the cursor blinking. After applying the Patch which I shall describe, that reduced to
  XSun  was using 0.3%
  Opera was using 3.5%
and now I can compose my emails in peace again.

But what an incredibly Stupid way to program an application! Yes, I shall be moaning again to the Opera (or QT) people, but in the meantime I think LibXft needs to be made proof against such stupidities, because stupid applications are still going to happen.

xftcore.patch:
--------------

I have attached my Patch. It essentially does three things:

The macro XftIntMult is modified to optimize the case where the background is pure white or the glyph color is opaque. This was an early mod I made, and though possibly useful is not essential.
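
As a rough illustration of that first change (the patch itself is attached, not quoted, so this is only a guess at its shape): an 8-bit channel multiply normally computes a rounded a*b/255, and when either operand is 0xff (pure white, or a fully opaque color) the answer is simply the other operand. The function names below are hypothetical, and the general-case arithmetic is the usual shift-and-add idiom rather than a copy of xftcore.c.

```c
#include <assert.h>

/* General case: rounded (a * b) / 255 for 8-bit channel values,
 * written as a function instead of a macro for readability. */
static unsigned int_mult(unsigned a, unsigned b)
{
    unsigned t = a * b + 0x80;
    return ((t >> 8) + t) >> 8;
}

/* Hypothetical fast path: skip the arithmetic when one operand is
 * 0xff (result is the other operand) or 0 (result is 0). */
static unsigned int_mult_fast(unsigned a, unsigned b)
{
    assert(a <= 0xff && b <= 0xff);  /* operands are 8-bit channels */
    if (a == 0xff)
        return b;
    if (b == 0xff)
        return a;
    if (a == 0 || b == 0)
        return 0;
    return int_mult(a, b);
}
```

Whether the extra branches pay for themselves depends on how often the white or opaque case occurs; in an editor window with black text on white it is the common case.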

_XftSmoothGlyphGray8888 is modified so that it only draws the part of the glyph(s) that intersect with the XImage of the Drawable (which is always a Pixmap in the Opera case). Without this, there is now a danger of writing over unallocated storage.
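
To show what "only draws the part that intersects" means in practice, here is a minimal sketch, assuming a simplified 8-bit destination buffer instead of the real 32-bit XImage. The function blend_mask_clipped and its parameters are invented for illustration and are not part of LibXft. The point is that the row and column ranges are trimmed before the loop, so no pixel outside the image is ever addressed.

```c
#include <assert.h>

/*
 * Composite an 8-bit coverage mask (mask_w x mask_h) onto an 8-bit
 * image (img_w x img_h), with the mask's top-left corner at (x, y)
 * in image coordinates. Only the part of the mask that lies inside
 * the image is touched. Returns the number of pixels written.
 */
static int blend_mask_clipped(unsigned char *img, int img_w, int img_h,
                              const unsigned char *mask, int mask_w, int mask_h,
                              int x, int y, unsigned char ink)
{
    /* Trim the mask rows and columns to the image bounds. */
    int row0 = y < 0 ? -y : 0;
    int col0 = x < 0 ? -x : 0;
    int row1 = (y + mask_h > img_h) ? img_h - y : mask_h;
    int col1 = (x + mask_w > img_w) ? img_w - x : mask_w;
    int r, c, written = 0;

    assert(img != 0 && mask != 0);

    /* If the mask is wholly outside, row1 <= row0 or col1 <= col0
     * and the loops simply do not run. */
    for (r = row0; r < row1; r++) {
        for (c = col0; c < col1; c++) {
            unsigned a = mask[r * mask_w + c];
            unsigned char *dst = &img[(y + r) * img_w + (x + c)];
            /* Rounded blend of ink over the existing pixel. */
            *dst = (unsigned char)((ink * a + *dst * (255 - a) + 127) / 255);
            written++;
        }
    }
    return written;
}
```

A glyph hanging over the bottom-right corner of a 4x4 image writes only the one pixel that is inside; a glyph placed entirely off the image writes nothing.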

XftGlyphCore now uses XGetGeometry to discover the size of the Drawable (caching it in a static variable to save Xserver round-trips). Then it determines the intersection with the glyphs to be drawn, bailing out if the intersection is empty. Finally, it draws whatever portion of the glyphs lies within the intersection. It also, for good measure, checks the intersection and bails out in the same way when sharp glyphs are used.
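
The intersection test itself is small. The sketch below is not taken from the patch; the Rect type and rect_intersect are hypothetical names, and the drawable's size is passed in as plain integers where the real code would use the width and height reported by XGetGeometry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle type; not part of LibXft. */
typedef struct {
    int x, y;           /* top-left corner */
    int width, height;  /* extent in pixels */
} Rect;

/*
 * Intersect the bounding box of the glyphs with the drawable's area.
 * Returns 1 and fills *out when they overlap; returns 0 when the
 * intersection is empty, in which case the caller can bail out
 * before doing any XGetImage / XPutImage work.
 */
static int rect_intersect(const Rect *glyphs, const Rect *drawable, Rect *out)
{
    int x1, y1, x2, y2;

    assert(glyphs != NULL && drawable != NULL && out != NULL);

    x1 = glyphs->x > drawable->x ? glyphs->x : drawable->x;
    y1 = glyphs->y > drawable->y ? glyphs->y : drawable->y;
    x2 = glyphs->x + glyphs->width < drawable->x + drawable->width
             ? glyphs->x + glyphs->width
             : drawable->x + drawable->width;
    y2 = glyphs->y + glyphs->height < drawable->y + drawable->height
             ? glyphs->y + glyphs->height
             : drawable->y + drawable->height;

    if (x2 <= x1 || y2 <= y1)
        return 0;       /* nothing to draw */

    out->x = x1;
    out->y = y1;
    out->width = x2 - x1;
    out->height = y2 - y1;
    return 1;
}
```

For the 2x15 cursor Pixmap described earlier, a "word" whose box starts at (40,100) is rejected by four comparisons, while one that straddles the Pixmap is reduced to the 2-pixel-wide strip that actually needs drawing.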

Of course, this all causes some extra overhead in cases where all the glyphs do lie within the Drawable, but not too much of it AFAICS.

Note that, if this patch gets adopted, it will probably be necessary to apply similar treatment to XftGlyphSpecCore and to the other _XftSmoothGlyph*, and I would be happy to work on that if needed (though I am not sure I could test them). But what I have done so far is sufficient for my present need, and for proof of concept.

--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131     Web: http://www.cs.man.ac.uk/~chl
Email:[EMAIL PROTECTED]: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

Attachment: xftcore.patch
Description: Binary data

