UTF-8 can represent any Unicode character... but it does so by encoding every character beyond the ASCII range as a multi-byte sequence, and to do that it has to reserve all byte values above 0x7F for those sequences. If your data uses those bytes as single-byte characters in their own right, decoding it as UTF-8 will generally fail. See the UTF-8 RFC (currently RFC 3629) for more detail; it's not hard to find with a web search.
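A quick way to see this for yourself is the Python sketch below. It is only an illustration: ISO-8859-1 is an assumed example of a single-byte encoding, not necessarily the one your files use, and `decodes_as_utf8` is a hypothetical helper name.

```python
# Illustrative sketch. ISO-8859-1 is an assumed legacy encoding chosen
# for the example; the actual encoding of a given file may differ.
text = "\u00e9"  # U+00E9, LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = text.encode("utf-8")          # two bytes: 0xC3 0xA9
latin1_bytes = text.encode("iso-8859-1")   # one byte:  0xE9


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The two-byte UTF-8 form decodes cleanly.
print(utf8_bytes.hex(), decodes_as_utf8(utf8_bytes))
# The lone 0xE9 byte is not valid UTF-8 by itself.
print(latin1_bytes.hex(), decodes_as_utf8(latin1_bytes))
# The same byte is fine once the right encoding is named explicitly.
print(latin1_bytes.decode("iso-8859-1") == text)
```

The lone 0xE9 byte is rejected because in UTF-8 it can only be the first byte of a longer sequence.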
There is probably an encoding that would work for your files -- but you'll have to determine what it is and explicitly specify it.
______________________________________
Joe Kesselman / IBM Research
