This is possibly a daft question, but...

In traditional TeX, character tokens are processed and put into boxes individually, with fairly primitive ligature tables. Obviously XeTeX doesn't do this; it uses HarfBuzz (or ICU or whatever) to do the shaping and layout.

My question is: if you're not "showing" individual characters to the shaping engine for it to consider, what determines how long a string of characters gets shaped at a time? Does XeTeX break at the "word" level and then shape a word, and if so, what defines a word? (Chinese has no word breaks!) Or does it shape an entire paragraph of text at a time (!) and then box up the glyphs individually? Or...?

(I've tried starting at layoutChars in XeTeXLayoutInterface.cpp and working backwards but I can't understand where I end up: measure_native_node shapes a node, but what's a node?)

