2011/11/18 Philip TAYLOR <[email protected]>:
> Is it safe to assume that these "code listings"
> are restricted to the ASCII character set ? If
> so, yes, spaces are likely to be a problem, but
> if the code listing can also include ligature-
> digraphs, then these are likely to prove even
> more problematic.
>
If the code listing is typeset in a fixed-width font, it is usually no
problem. I have copied a few code samples from books in PDF, most of
which were typeset by TeX.

If I want to copy text in Devanagari, it is almost impossible. If I take
just a simple Hindi word, किताब, the best result I can get is िकताब (you
should see a dotted circle, which is not visible in the PDF). The reason
is that the first two letters are U+0915, U+093F, but visually the latter
is displayed first. After copying you get the reversed order, U+093F,
U+0915. This is just one of many problems with Devanagari. The ToUnicode
map does not help much with Indian scripts.

I have never tried to copy Arabic from PDF, nor a combination of LTR and
RTL within a paragraph.
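
To make the reordering concrete, here is a minimal Python sketch. It prints the code points of the word in its logical order and in the order obtained after copying, then applies a deliberately naive repair. The helper names `codepoints` and `fix_leading_i_matra` are hypothetical, and the repair assumes that the whole input is in visual order and contains no conjuncts, so it is an illustration only, not a general solution.

```python
import unicodedata

I_MATRA = "\u093F"  # DEVANAGARI VOWEL SIGN I, drawn to the left of its consonant


def codepoints(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def fix_leading_i_matra(text):
    """Naive repair (assumption: input is entirely in visual order).

    Swaps every U+093F with the letter that immediately follows it.
    Conjuncts (consonant + virama + consonant) are NOT handled, and
    text that is already in logical order would be corrupted.
    """
    out = list(text)
    i = 0
    while i < len(out) - 1:
        if out[i] == I_MATRA and unicodedata.category(out[i + 1]) == "Lo":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(out)


logical = "\u0915\u093F\u0924\u093E\u092C"  # the word as stored in the source
copied = "\u093F\u0915\u0924\u093E\u092C"   # the word as pasted from the PDF

print(codepoints(logical))
print(codepoints(copied))
print(fix_leading_i_matra(copied) == logical)
```

A real repair would need to know the syllable boundaries, which is why a character-by-character mapping alone cannot restore the logical order.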
> ** Phil.
> --------
> Ulrike Fischer wrote:
>
>> One question which pops up regularly in the TeX-groups is "how can I
>> insert a code listing in my pdf so that it can be copied and pasted
>> reliably".
>>
>> Currently this is not easy as the heuristics of the readers can
>> easily lose spaces, you can't encode tabs or a specific number of
>> spaces.
>>
>> Real space characters in the pdf (instead of only visible space)
>> would help here a lot.
>
>
> --------------------------------------------------
> Subscriptions, Archive, and List information, etc.:
> http://tug.org/mailman/listinfo/xetex
>

-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz
