2012/3/9 Shiva Shankar <[email protected]>:
> Hi,
>
> I have a doubt regarding usage of hyphenation rules in LaTeX. Based on
> generic rules or patterns written in XeLaTeX for Kannada language I
> want to write them for Kanlel package. Kanlel package is not for UTF8
> data. It is like Velthuis devanagari package for typesetting Sanskrit.
> My question is after writing patterns for Kannada language how can I
> test them? and where should I need to specify lefthyphenminchar and
> righthyphenminchar? Should I need to follow the route of babel or is
> there any way that we can test them directly?
>
That is a somewhat difficult question to answer. Generally, hyphenation
patterns must be loaded when the LaTeX format is being generated. When
doing so, you have to assign a number to the \language register. Later,
in your document, you need to set \language to the same value in order
to use these patterns. This is also the place to set \lefthyphenmin and
\righthyphenmin. When loading a font, you have to set \hyphenchar unless
the hyphen is available in the standard slot.

The advantage of babel is that you can define a symbolic name for the
language, and the module will set the correct values of \lefthyphenmin
and \righthyphenmin automatically when the language is selected.
Remember that different users may have different languages installed,
so the \language value for a given language may vary, but the symbolic
name will be the same.
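
To make the steps above concrete, here is a rough sketch using only TeX
primitives and \showhyphens, without babel. The names \kannadalang and
kanlel-hyph.tex are hypothetical, and the numeric values are placeholders
only; I have not tried this with Kanlel itself.

```latex
%% Part 1: read while the format is being generated (iniTeX only).
%% \newlanguage allocates the next free \language number.
\newlanguage\kannadalang      % hypothetical name for the new language
\language=\kannadalang
\input kanlel-hyph.tex        % hypothetical file containing \patterns{...}

%% Part 2: later, in the document that uses the format.
\language=\kannadalang        % must be the same number as above
\lefthyphenmin=1              % placeholder value
\righthyphenmin=1             % placeholder value
%% Needed only if the hyphen is not in the standard slot of the font;
%% "2D is the usual default and stands in for the real slot here.
\hyphenchar\font="2D

%% Quick test: the hyphenation points TeX finds are written to the log.
\showhyphens{a few test words in the Kanlel input encoding}
```

\showhyphens reports the result on the terminal and in the log file, so
the patterns can be checked this way before any babel support is written.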

Now the problem of hyphenation. TeX hyphenates words. A word is defined
as a sequence of characters with \catcode=11 and nonzero \lccode. At
least in Velthuis Devanagari, some characters are built from pieces, and
conjuncts do not encode the virama, which means that patterns written
for UTF-8 would be unusable. Moreover, some matras are typeset by
macros, which create false word boundaries. Simply put, hyphenation
patterns are unusable with Velthuis Devanagari, and the situation with
Kanlel will most probably be exactly the same.

Hyphenation in Velthuis Devanagari can optionally be generated by the
preprocessor. This is achieved by inserting \- at all feasible
hyphenation points.

The last question is whether such a system is still needed now that we
have XeTeX. I understand that people may not have a UTF-8 keyboard for
non-Latin scripts, or may have old files written in some transliteration
that they have to process. I would recommend another method: it is
possible to write a TECkit map. TeX Live contains the xetex-devanagari
package with several such mappings; ArabXeTeX is another example of
such maps. The advantage is that you specify the mapping when loading
the font. The text is then converted automatically, and no preprocessing
is needed. This is the way I would go now.

> --
> Regards
> Shivashankar
> Srirangapatna
>
>
>
> --------------------------------------------------
> Subscriptions, Archive, and List information, etc.:
> http://tug.org/mailman/listinfo/xetex
>

--
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz
