Re: [XeTeX] Contextual shaping

Jonathan Kew Wed, 27 Nov 2013 05:07:47 -0800

On 27/11/13 12:46, Khaled Hosny wrote:

On Wed, Nov 27, 2013 at 09:10:02PM +0900, Simon Cozens wrote:

This is possibly a daft question, but...


In traditional TeX, character tokens are processed and put into boxes
individually, with fairly primitive ligature tables. Obviously XeTeX doesn't
do this, using Harfbuzz (or ICU or whatever) to do the shaping and layout.

My question is, if you're not "showing" individual characters to the shaping
engine for it to consider, what defines how big a string of characters to
shape at a time? Does XeTeX break at the "word" level and then shape a word,
and if so what defines a word? (Chinese has no word breaks!) Or does it
shape an entire paragraph of text at a time (!) and then box up the glyphs
individually? Or...?


XeTeX shapes words one at a time; a word is basically any consecutive
sequence of character nodes (using the same font) after TeX has done its
macro expansion and is ready to typeset the material. The AAT code,
additionally, tries to merge word sequences separated by spaces into one
node.

In particular, in case it's not sufficiently clear from the above, note that <space>s, being glue nodes, are NOT part of such a "consecutive sequence of character nodes". And therefore a known limitation of xetex is that OpenType lookups that try to match the <space> glyph will not work. Shaping happens only within a run of non-space characters in a given font.

Most fonts are not affected by this, but it is an issue for certain fonts that want to do complex multi-word ligatures, or contextual forms that depend on the adjacent <space> glyph.
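
To make the run-splitting rule concrete, here is a rough Python sketch of the behaviour described above. It is not XeTeX's actual code: the Char and Glue classes and the shaping_runs and nodes_from_text functions are hypothetical names made up for illustration, and the node list is a much simplified stand-in for TeX's real horizontal list. It only shows how glue nodes and font changes end a run, so that the shaper never sees a <space>.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Char:
    """Hypothetical stand-in for a character node."""
    char: str
    font: str


@dataclass
class Glue:
    """Hypothetical stand-in for a glue node (e.g. an interword space)."""


Node = Union[Char, Glue]


def shaping_runs(nodes: List[Node]) -> List[Tuple[str, str]]:
    """Group consecutive same-font character nodes into (font, text) runs.

    Any glue node, or a change of font, ends the current run. Glue itself
    never becomes part of a run.
    """
    runs: List[Tuple[str, str]] = []
    current_font = None
    current_text: List[str] = []
    for node in nodes:
        if isinstance(node, Char) and node.font == current_font:
            # Same font as the run in progress: extend it.
            current_text.append(node.char)
            continue
        # Either glue or a font change: close the run in progress, if any.
        if current_text:
            runs.append((current_font, "".join(current_text)))
            current_text = []
        current_font = None
        if isinstance(node, Char):
            # Start a new run with this character.
            current_font = node.font
            current_text.append(node.char)
    if current_text:
        runs.append((current_font, "".join(current_text)))
    return runs


def nodes_from_text(text: str, font: str) -> List[Node]:
    """Turn a string into nodes, treating each space as glue (a simplification)."""
    return [Glue() if c == " " else Char(c, font) for c in text]


# Two words in one (hypothetical) font "F" become two separate runs,
# so a lookup spanning the space could never match.
print(shaping_runs(nodes_from_text("office fly", "F")))
# -> [('F', 'office'), ('F', 'fly')]
```

Under this model a multi-word ligature would need "office fly" to reach the shaper as one string, which is exactly what the glue node prevents.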


JK



--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex
