On Mon, Dec 09, 2013 at 08:16:03AM -0600, msk...@ansuz.sooke.bc.ca wrote:
> On Mon, 9 Dec 2013, Philip Taylor wrote:
> > Keith -- could you possibly supply an example of
> > "a properly encoded utf-8 string" from which it
> > can be unambiguously determined whether the string
> > "sang" is an English word (the past tense of "sing")
> 
> I'll probably regret pointing this out, and the characters involved have
> been deprecated since Unicode 5, but:
> 
>    U+E0001 U+E0065 U+E006E U+0073 U+0061 U+006E U+0067

And that is itself a kind of tagging, so it falls outside the scope of
identifying the language of *untagged* text (which is the claim that
spurred all this discussion).
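
To make the point concrete, here is a minimal Python sketch of how a reader
could pull the language off the front of that sequence. The helper name
split_language_tag and the constants are made up for illustration; the only
things taken from the Unicode tag block are that U+E0001 opens a language tag
and that the tag characters U+E0020..U+E007E mirror ASCII at an offset of
0xE0000. It handles a leading tag only and is not a full implementation.

```python
# Sketch: read a (deprecated) Unicode language tag off the front of a string.
# All names here are hypothetical helpers, not part of any library.
LANGUAGE_TAG = 0xE0001                  # U+E0001 LANGUAGE TAG
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E  # tag characters mirroring ASCII
TAG_OFFSET = 0xE0000                    # tag character minus this = ASCII


def split_language_tag(text):
    """Return (language, rest); language is None if there is no leading tag."""
    if not text or ord(text[0]) != LANGUAGE_TAG:
        return None, text
    i = 1
    # Consume the run of tag characters that follows U+E0001.
    while i < len(text) and TAG_FIRST <= ord(text[i]) <= TAG_LAST:
        i += 1
    language = "".join(chr(ord(c) - TAG_OFFSET) for c in text[1:i])
    return language, text[i:]


# The sequence quoted above: U+E0001 U+E0065 U+E006E, then "sang".
tagged = "\U000E0001\U000E0065\U000E006Esang"
print(split_language_tag(tagged))  # ('en', 'sang')
print(split_language_tag("sang"))  # (None, 'sang')
```

The second call is the untagged case: the bare string "sang" carries nothing
from which the language can be read.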

Regards,
Khaled


--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex
