On 5/12/13 12:48, C. Scott Ananian wrote:
Can anyone point me to docs on XeT--TeX? A Google search the other day failed to turn up anything useful.
(TeX--XeT, not XeT--TeX.) This is part of e-TeX; see the e-TeX manual[1], section 4.1.

HTH,
JK

[1] http://tug.ctan.org/systems/e-tex/v2/doc/etex_man.pdf
Also: polyglossia appears to be doing some amount of LTR/RTL directionality switching based on the character block. Can anyone offer advice on how to avoid fighting with that, if I'm implementing my own bidi algorithm?

Finally: any advice on using CJK languages with polyglossia? Embedded CJK is quite common. Should I be writing gloss-ja etc. files to set the right directionality and font and get the appropriate CJK support packages loaded?

--scott

On Dec 5, 2013 5:42 AM, "Jonathan Kew" wrote:

On 4/12/13 13:24, C. Scott Ananian wrote:

The goal is to match the Unicode bidi algorithm, because that is how the web page displays and thus how the original author saw the text as they wrote.

This would be a nice enhancement, but would require a significant amount of work (or in other words, it's not likely to get implemented quickly, if at all).

Currently, typesetting bidi text with xetex requires correct use of the TeX--XeT bidi commands (\beginR, \endR, \beginL, \endL) to mark up the text direction. These could be used directly, or via higher-level markup that's tagging script and language, but you definitely need them to be present in some way.

Sorry, that's not what you want to hear, but it's how things are.

At this point, I think the most practical way forward in your situation is probably to implement this as part of whatever tool is taking the Wikipedia content and converting it to (Xe)LaTeX markup - that tool could inspect the content of each element it's processing, and add any necessary direction controls for XeTeX.

JK

Guessing the proper language tag to use is likely infeasible; note that the example given contains titles in Turkish as well as English. The safest option is probably to treat embedded LTR text in an RTL context as 'exotic' and not to attempt hyphenation.

I've heard it said that LuaTeX has "better bidi support". What does that mean, exactly? Should I be considering switching?
--scott

On Dec 4, 2013 4:08 AM, "Keith J. Schultz" wrote:

Hi Scott,

On 03.12.2013 at 19:42, C. Scott Ananian wrote:

> But in the XeLaTeX/polyglossia/bidi output, the "soft space" weak
> directionality of the Unicode BiDi algorithm doesn't seem to be
> honored (or implemented?) and so the English article titles appear
> with the individual words in RTL order, which is a mess. Manually
> tagging the language of the article title is probably the Right thing,
> but infeasible for the entire wikipedia.

Well, without proper tagging you cannot expect any system to work properly or as expected! For most entries a simple script should do the trick to add the language tags to the article titles.

Hope this helps

regards
Keith.

--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
http://tug.org/mailman/listinfo/xetex
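As a rough illustration of Jonathan Kew's suggestion above (have the conversion tool inspect each element and add the direction controls itself), here is a minimal Python sketch. It is not an implementation of the Unicode bidi algorithm. It assumes the surrounding paragraph is right-to-left and that TeX--XeT is enabled in the document (\TeXXeTstate=1), and it only wraps runs that begin and end on a strong left-to-right character in \beginL ... \endL. The function name mark_ltr_runs is made up for this example, and TeX special characters in the input are not escaped.

```python
import unicodedata

# Strong right-to-left bidi classes (Hebrew is "R", Arabic letters are "AL").
STRONG_RTL = {"R", "AL"}

# The empty group after each control word keeps TeX from swallowing
# the spaces that follow it.
BEGIN_L = "\\beginL{}"
END_L = "\\endL{}"


def mark_ltr_runs(text: str) -> str:
    """Wrap left-to-right runs in TeX--XeT direction controls.

    Assumption: the enclosing paragraph is right-to-left. A run starts at a
    strong LTR character and extends to the last strong LTR character seen
    before the next strong RTL character. Spaces, digits and punctuation
    between two strong LTR characters stay inside the run; those at the
    edges of a run stay outside it.
    """
    out = []
    i, n = 0, len(text)
    while i < n:
        if unicodedata.bidirectional(text[i]) != "L":
            # RTL, weak or neutral character: copy through unchanged.
            out.append(text[i])
            i += 1
            continue
        end = i  # index of the last strong LTR character in this run
        j = i + 1
        while j < n:
            cls = unicodedata.bidirectional(text[j])
            if cls in STRONG_RTL:
                break
            if cls == "L":
                end = j
            j += 1
        out.append(BEGIN_L + text[i:end + 1] + END_L)
        i = end + 1
    return "".join(out)
```

Because the whole title ends up inside one \beginL ... \endL pair, the spaces between English words no longer split it into separately reordered words, which is the symptom described at the start of the thread. Digits and punctuation at the edge of a run are left to the surrounding RTL context here, which is a simplification compared with the full algorithm.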
