Re: [XeTeX] how to do (better) searchable PDFs in xelatex?

Peter Baker Mon, 15 Oct 2012 07:22:00 -0700

Here's an example file:

%&program=xelatex
%&encoding=UTF-8 Unicode
\documentclass{book}
\usepackage[silent]{fontspec}
\usepackage{xltxtra}
\setromanfont{Junicode}
\begin{document}
\noindent You can search for these:


\noindent first flat office afflict\\

\noindent But you cannot search for these:

\noindent after fifty front\\

\noindent You can search for these words because small caps have been moved out
of the PUA in recent versions of Junicode:

\noindent\textsc{first flat office afflict after fifty front}
\end{document}

Here's a link to an uncompressed (using pdftk) PDF:

https://dl.dropbox.com/u/35611549/test_uncompressed.pdf

I honestly have no idea what I'm looking at when I open that in Emacs.

Here is info about the Junicode ligatures that can't be searched:


glyph name f_t, encoding U+EECB
glyph name f_t_y, encoding U+EED0
glyph name f_r, encoding U+EECA
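
As a rough illustration of what sets these three apart, here is a minimal Python sketch. It is not part of the original test, and the names `LIGATURES`, `in_bmp_pua` and `expected_text` are made up for the example. It checks that each code point above lies in the BMP Private Use Area (U+E000 to U+F8FF) and derives the text a search would need to match from the glyph name, assuming FontForge-style names whose components are joined by underscores. For contrast, Unicode gives the fi ligature a standard code point, U+FB01. Whether Junicode encodes its searchable ligatures there is an assumption, not something verified here.

```python
# Code points reported above for the Junicode ligatures that cannot be searched.
LIGATURES = {
    "f_t": 0xEECB,
    "f_t_y": 0xEED0,
    "f_r": 0xEECA,
}


def in_bmp_pua(code_point):
    """Return True if code_point is in the BMP Private Use Area (U+E000..U+F8FF)."""
    return 0xE000 <= code_point <= 0xF8FF


def expected_text(glyph_name):
    """Derive the plain text from a ligature glyph name such as 'f_t_y'.

    Assumption: components are joined by underscores and each component
    is the literal character it stands for.
    """
    return "".join(glyph_name.split("_"))


for name, cp in LIGATURES.items():
    print(f"{name}: U+{cp:04X} in PUA: {in_bmp_pua(cp)}, text wanted: {expected_text(name)!r}")
```

A PUA code point carries no standard meaning, so a PDF viewer that only sees U+EECB has nothing to match against the typed string "ft" unless the PDF supplies that mapping in some other way.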

Small caps are named like "a.sc" and they are unencoded. The font is generated by FontForge. The PDF is generated by XeTeX (XeLaTeX actually). I don't know if another program (e.g. LuaTeX) would yield different results.


Peter

On 10/14/12 10:56 PM, Ross Moore wrote:

Any chance of providing example PDFs of this? (preferably using uncompressed streams, to more easily examine the raw PDF content) Do the documents also have CMap resources for the fonts, or is the sole means of identifying the meaning of the ligature characters coming from their names only? Have these difficulties been reported to Adobe recently? If not, would you mind me doing so?
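
One possible way to approach the CMap question without reading the raw file by eye is to scan the uncompressed PDF for `/ToUnicode` entries, the PDF key under which a font's CMap to Unicode is referenced. The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the sample bytes are invented for the demonstration and are not taken from the linked PDF, and a plain byte search is only meaningful when the font dictionaries are stored uncompressed.

```python
import re


def count_tounicode_refs(pdf_bytes):
    """Count occurrences of the /ToUnicode key in raw PDF bytes."""
    return len(re.findall(rb"/ToUnicode\b", pdf_bytes))


def base_font_names(pdf_bytes):
    """Return the sorted, de-duplicated /BaseFont names found in raw PDF bytes."""
    names = re.findall(rb"/BaseFont\s*/([^\s/<>\[\]()]+)", pdf_bytes)
    return sorted({n.decode("latin-1") for n in names})


# Invented font dictionary, used only to show the helpers at work.
sample = b"<< /Type /Font /BaseFont /ABCDEF+Junicode /ToUnicode 12 0 R >>"

print(base_font_names(sample))
print(count_tounicode_refs(sample))
```

To try it on a real file, read it in binary mode, for example `open("test_uncompressed.pdf", "rb").read()`, and pass the result to both functions. A count of zero would suggest that a viewer has only glyph names or code points to go on.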



