Now that the practical side is settled, let's come back to the
philosophical part:
Should U+00A0 be active and treated as ~ by default? And likewise,
should U+202F and U+2009 be active and treated as \, and \,\hspace{0pt},
respectively?
Where would such a default take place:
- XeTeX engine
- XeLaTeX format
- some package (xunicode, fontspec, some new package)
- my own package/preamble template
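To make the preamble option concrete, here is a minimal, untested
sketch (it assumes XeLaTeX, where \catcode accepts code points above
255 and ^^^^ followed by four lowercase hex digits denotes a
character):

    \catcode"00A0=\active   % NO-BREAK SPACE
    \catcode"202F=\active   % NARROW NO-BREAK SPACE
    \catcode"2009=\active   % THIN SPACE
    \def^^^^00a0{~}               % U+00A0 -> tie (non-breaking space)
    \def^^^^202f{\,}              % U+202F -> thin non-breaking kern
    \def^^^^2009{\,\hspace{0pt}}  % U+2009 -> thin space, break allowed

A real package would probably want to make these definitions robust
(e.g. with \protected), but the core idea is just these six lines.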
As was discussed in the thread "Space characters and whitespace", using
these characters without any special treatment contradicts TeX's
spacing algorithms: an untreated space character is typically typeset
as a fixed-width glyph, so it neither stretches nor shrinks in
justification the way interword glue does. So it seems one must either
avoid these characters (and blame Unicode), or treat them specially.
bye
Toscho
On 13.11.2011 21:36, Mike Maxwell wrote:
On 11/13/2011 11:09 AM, Tobias Schoel wrote:
> How much text-flow control should be done by non-ASCII characters?
> Unicode has different code points for signs with the same meaning but
> different text-flow behavior (space vs. no-break space), so text flow
> could be controlled via Unicode code points. But should it? Or should
> text flow be controlled via commands and active characters?
> One opinion says that using (La)TeX is programming. Consequently, each
> character used should be visually well distinguishable. This is not
> the case with all the Unicode white-space characters.
> Another opinion says that using (La)TeX is transforming plain text
> (like .txt) into well-formatted text. Consequently, the plain text may
> carry as much (meta)information as possible, and this information
> should be used when transforming it into well-formatted text. So
> Unicode white-space characters are allowed and should be honored
> according to their specific meaning.
> And on the third hand, XeTeX could allow both.
> How would you visually differentiate between all
> the white-space characters (space vs. no-break space, thin space
> (U+2009) vs. narrow no-break space (U+202F), …) such that the text
> remains readable?
Of course, there's precedent for this kind of problem: tab characters.
For that matter, many text editors display Unicode combining diacritics
over or under the base character that they go with, which is already
getting away from a straightforward display of the underlying characters.
At any rate, there are lots of ways non-ASCII space characters could be
distinguished; Philip Taylor mentions color coding, which is certainly
possible. Another would be to display some kind of code for non-ASCII
spaces. There's one font which displays all characters as nothing but
their Unicode code points (in hex) inside some kind of box. A tex(t)
editor could certainly be programmed to display control characters
(which these space characters essentially are) differently from the
"regular" characters (which would continue to be displayed with an
ordinary font).
The editor I use, jEdit, provides yet another option: a command
(bindable to a keystroke) that tells me the Unicode code point of any
character, on the editor's status line.