2011/11/15 Ross Moore <[email protected]>:
> Hi Phil,
>
> On 16/11/2011, at 8:45 AM, Philip TAYLOR wrote:
>
>> Ross Moore wrote:
>>>
>>> On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
>>>
>>>> Given that TeX (and XeTeX too) deal with a non-breakable space already
>>>> (where we usually use the ~ to represent that space) it seems to me that
>>>> XeTeX should treat that the same way.
>>>
>>> No, I disagree completely.
>>>
>>> What if you really want the Ux00A0 character to be in the PDF?
>>> That is, when you copy/paste from the PDF, you want that character
>>> to come along for the ride.
>>
>> I'm not sure I entirely go along with this argument, Ross.
>> "What if you really want the \ character to be in the PDF",
>> or the "^" character, or the "$" character, or any character
>> that TeX currently treats specially ?
>
> TeX already provides \$ \_ \# etc. for (most of) the other special
> characters it uses, but does not for ^^A0 --- but it does not
> need to if you can generate it yourself on the keyboard.
> ^^^^00a0
>
>> Whilst I can agree
>> that there is considerable merit in extending XeTeX such
>> that it treats all of these "new", "special" characters
>> specially (by creating new catcodes, new node types and so
>> on), in the short term I can see no fundamental problem with
>> treating U+00A0 in such a way that it behaves indistinguishably
>> from the normal expansion of "~".
>
> How do you explain to somebody the need to do something really,
> really special to get a character that they can type, or copy/paste?
>
> There is no special role for this character in other vital aspects
> of how TeX works, such as there is for $ _ # etc.
>
>>>
>>> In TeX ~ *simulates* a non-breaking space visually, but there is
>>> no actual character inserted.
>>
>> And I don't agree that a space is a character, non-breaking or not !
>
> In this view you are against most of the rest of the world.

TeX NEVER outputs a space as a glyph.
Text extraction tools usually interpret horizontal spaces of sufficient size as U+0020.
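
To make the gap heuristic concrete, here is a rough sketch in Python. It is
only an illustration under stated assumptions: the `Glyph` record, the
`extract_text` function and the 0.2 em threshold are all made up for this
example and do not describe what any particular PDF viewer or extraction
tool really does.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str     # the character this glyph maps to
    x: float      # left edge on the baseline, in points (hypothetical layout data)
    width: float  # advance width, in points


def extract_text(glyphs, font_size=10.0, gap_ratio=0.2):
    """Rebuild one line of text from positioned glyphs.

    A U+0020 is inserted whenever the horizontal gap between two
    neighbouring glyphs exceeds gap_ratio * font_size. The threshold is an
    assumption chosen for the example, not a value taken from a real tool.
    """
    out = []
    prev_end = None
    for g in sorted(glyphs, key=lambda g: g.x):
        if prev_end is not None and g.x - prev_end > gap_ratio * font_size:
            out.append("\u0020")  # the gap is wide enough to count as a word space
        out.append(g.char)
        prev_end = g.x + g.width
    return "".join(out)


# No space glyph is present in the input; only a 3.5 pt gap after the "X".
line = [
    Glyph("T", 0.0, 6.0), Glyph("e", 6.0, 4.5), Glyph("X", 10.5, 7.0),
    Glyph("i", 21.0, 3.0), Glyph("s", 24.0, 4.0),
]
print(extract_text(line))  # prints "TeX is"
```

The output contains a space although the page itself never contained a
space glyph, only glue of variable width.
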
(The exception to the above-mentioned "never" is verbatim mode.)

> If the output is intended to be PDF, as it really has to be with
> XeTeX, then the specifications for the modern variants of PDF
> need to be consulted.
>
> With PDF/A and PDF/UA and anything based on ISO-32000 (PDF 1.7)
> there is a requirement that the included content should explicitly
> provide word boundaries. Having a space character inserted is by
> far the most natural way to meet this specification.

A space character is a fixed-width glyph. If you insist on it, you
will never be able to typeset justified paragraphs; you will move back
to the era of mechanical typewriters.

> (This does not mean that having such a character in the output
> need affect TeX's view of typesetting.)
>
> Before replying to anything in the above paragraph, please
> watch the video of my recent talk at TUG-2011.
>
> http://river-valley.tv/further-advances-toward-tagged-pdf-for-mathematics/
>
> or similar from earlier years where I also talk a bit about such things.
>
>>
>> ** Phil.
>
> Hope this helps,
>
> Ross
>
> ------------------------------------------------------------------------
> Ross Moore                                       [email protected]
> Mathematics Department                           office: E7A-419
> Macquarie University                             tel: +61 (0)2 9850 8955
> Sydney, Australia 2109                           fax: +61 (0)2 9850 8114
> ------------------------------------------------------------------------

-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz

--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
http://tug.org/mailman/listinfo/xetex
