If you need an example of the problem, see the XeTeX manual at http://tug.ctan.org/tex-archive/info/xetexref/XeTeX-reference.pdf
On page 4 there are two examples of small caps usage. On my computer, at
least, the first one (Warnock Pro in italic + small caps) cannot be copied
correctly. The second example (Hoefler Text in bold + small caps), however,
does work. I suspect Hoefler Text uses a separate font file for the small
caps rather than feature tags in a font with normal minuscules.

--Bogdan Butnaru

On Thu, Oct 28, 2010 at 16:18, Bogdan Butnaru <[email protected]> wrote:
> Hello!
>
> I’m having a problem with the way the advanced font features of XeTeX
> interact with PDF reader programs. I’m not sure exactly where the
> culprit is, so I apologize if this is not the right place to ask
> for help; (re-)directions are welcome if such is the case.
>
> I’ve been writing my CV (I think the more correct US term is resume)
> in LaTeX, using xelatex to compile it to PDF. I managed to get it to
> look pretty much exactly as I wanted. (I’m not quite a typography
> expert, but I’m quite pleased with the result if I may say so.)
>
> The document uses a nice font with many OpenType features like small
> and titling capitals, lining and old-style numerals, and superscripts
> and the like. (Those are the ones I use; there are others.) Therein
> lies the problem: as far as I can tell, “variant” characters, like
> small-caps or superscript letters, are represented as additional
> (private) code-points within the font, rather than as separate fonts.
> For display and printing, this is not a problem: the font is embedded
> in the PDF, and everywhere I tried it, it seems to look as it should.
>
> However, copying and pasting the contents into another program is a
> big failure. Everything that isn’t displayed in the “normal” variant
> is copied to the clipboard as a set of (what I believe to be) private
> codepoints rather than the “semantic” Unicode codepoints it
> represents.
>
> This is a big problem for this document, as I expect a potential
> employer might try to copy and paste parts of it (e.g., the address)
> and fail unexpectedly (getting gibberish).
>
> I’ve tried searching for solutions or workarounds, with little
> success. If (as I assume) this is a well-known problem, don’t hesitate
> to just point me towards a document that explains it.
>
> I’ve seen PDF documents that seemed to have a kind of “text overlay”:
> these were all scanned documents with (I assume) some kind of OCR
> processing. For display and printing purposes, only the scanned image
> was used (i.e., the OCRed text was invisible). However, when selecting
> (and copy/pasting), a text layer was used.
>
> I’ve no idea what PDF feature this used or whether it’s accessible via
> LaTeX. I was hoping there was a way to add a “replacement” text for
> affected areas (and I searched the hyperref documentation for it, to
> no avail), such that on copy-paste the replacement is used rather than
> just private characters. Since it’s a one-page document it wouldn’t be
> a lot of work to add the replacements.
>
> The only alternative I could think of was to take FontForge and
> manually split the font in pieces (e.g., one for small caps, one for
> superscripts, etc.), such that each variant glyph is encoded in its
> “semantic” position. But it’s a big and complex font, so that would
> take a lot more work than just “hinting” the document. I also worry
> that messing around with it in FontForge will cause me to lose
> hinting and other features I (or it) may not be aware of.
>
> I welcome all ideas, and thank you in advance.
>
> --Bogdan Butnaru
>
> PS. What I’m using identifies itself as “XeTeX 3.1415926-2.2-0.9995.2
> (TeX Live 2009/Debian)” on Ubuntu. Fontspec reports itself as
> “2008/08/09 v1.18”. The problem manifests itself on every PDF viewer I
> tried (about one each for Linux, Windows and Mac OS X, and also Google
> Docs’ viewer).
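
The quoted message suspects, without confirming, that the pasted gibberish consists of private-use codepoints. One way to check that is sketched below in Python; it is only an illustration and was not part of the original thread. The function name `private_use_chars` and the sample string are made up for the example, and the sample codepoints are placeholders, not the ones any particular font actually uses.

```python
import unicodedata


def private_use_chars(text):
    """Return (index, codepoint) pairs for each private-use character in text.

    Unicode assigns the general category "Co" to private-use codepoints
    (U+E000..U+F8FF in the BMP, plus planes 15 and 16).
    """
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]


if __name__ == "__main__":
    # Hypothetical clipboard contents: ordinary letters followed by two
    # placeholder private-use codepoints standing in for small-cap glyphs.
    pasted = "Name: B\ue001\ue002"
    for index, codepoint in private_use_chars(pasted):
        print(f"private-use character at index {index}: {codepoint}")
```

If the list comes back empty for text copied from the PDF, the gibberish has some other cause than private-use codepoints, such as a wrong or missing glyph-to-Unicode mapping in the embedded font.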

--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex
