Hi Khaled and Michiel,

On 18/08/2010, at 6:58 AM, Khaled Hosny wrote:

> On Tue, Aug 17, 2010 at 01:16:02PM -0700, Michiel Kamermans wrote:
>> Khaled,
>>
>>> AFAIK, epub is just a subset of xhtml with a subset of css2, so IMO not a
>>> kind of output format that is very well suited for TeX (well, I hardly
>>> consider html an output format at all, the output is what the browser
>>> renders out of it).
>> For print media the epub format is, of course, nonsense. Hence the
>> desire for parallel format generation.
>
> I understand the benefits of EPUB, what I don't understand is the need
> for TeX at all.

To me the problem is not about using TeX for formatting, it is about
obtaining different output formats from the same (La)TeX sources ---
especially when math formulas, and other 2-dimensional layouts, are
involved.

Since ePub, and similar, are XML- or XHTML-based, you want the detailed
structure of the tagging to be produced automatically, without having
to make edits on each output result, to "get it right".
You want to enter your information in just one place, in a language
that the author already understands and can use effectively.
Software should then do the rest, modulo possible minor tweaking
at the end.

This is not simply a matter of redefining macros, because the
structure rules for the markup can be quite different for different
output formats. So some kind of knowledge about what macros are being
used for, and what kinds of things will follow after, is required
of any translation software.

Since LaTeX, with PDF as a major form of output, is likely to be the
author's comfortable input format, it is desirable for encoding the
author's work --- though some may say it ought to be in XML.
And since TeX already understands the expansion of macros and their
arguments, it is attractive to want to use it as a starting point
for generating other formats; but certainly it cannot be the whole
shebang.

For instance, in my work for Tagged PDF, an XML version can be
exported (using Adobe Acrobat Pro) from the complete PDF.
Mathematics will be fully tagged as MathML, in this view.
Some PDF readers may only show the rendered pages, but others may
be able to use the tagging to extract an alternative view suitable
to their own display screen.

> (X)HTML is dynamic by nature, you should be able to
> resize or change text size and the layout will re-flow, forcing a rigid,
> box based layout that is a direct translation of TeX output just does
> not make much sense to me.

I agree that it is not the TeX *output* that needs to be further
processed, but the input source --- or something intermediate that
can be generated and written to a file as a by-product of LaTeX
processing, with extra packages loaded to achieve this.

TeX4ht works by putting extra information into the .dvi file, to
encode the required tagging. An extra post-processor is required
to extract this information, producing HTML or XML or whatever.

That is very similar to what I do for Tagged PDF, where the extra
post-processor is Acrobat Pro. This is even more flexible than
TeX4ht, since Acrobat can export into a range of formats, whereas
TeX4ht only produces the format that was specified when the .dvi
was being created.

> I've the feeling that you are looking for the
> wrong solution to the problem. One of the strengths of TeX that I miss
> in almost all HTML renderers is decent line breaking and hyphenation
> algorithms. While I don't know any HTML engines, especially
> browsers, that have given much attention to this, there are JavaScript
> implementations of TeX's line breaking and hyphenation algorithms,
> assuming EPUB readers can execute JavaScript, I think this is a good
> compromise. See [1] for example (some interesting links near the end,
> too).
>
> [1] http://typophile.com/node/71247
>
> Regards,
> Khaled

Hope this helps,

	Ross

------------------------------------------------------------------------
Ross Moore                                       [email protected]
Mathematics Department                           office: E7A-419
Macquarie University                             tel: +61 (0)2 9850 8955
Sydney, Australia  2109                          fax: +61 (0)2 9850 8114
------------------------------------------------------------------------
