I do not have an immediate need for Unicode Char literals. I just want to nudge you in the right direction :). We write statistical software with a heavy UI component. We need to revisit our statistical algorithms to work in parallel computing environments. X10 is one good option for doing this. I don't see this happening quickly or without missteps. My hope is that our efforts mature along with those of the X10 working group.
Generally, the issue with Char literals is a codepage problem and has little to do with concurrency and parallel computing. So, it may not be part of your core mandate. However, it is an extremely important issue for production code, and I believe your intent is to encourage use of X10 in production environments. I cannot over-stress how important codepage problems become. My rough estimate is that a third of the bug-fixing for our venerable flagship application is related to codepage and transcoding issues.

First, X10 Char and String hide the internal representation of characters and strings, so programmers should not care whether it is UTF-8, UTF-16BE, etc. These implementations count characters and not bytes of storage. That is a good thing. Eventually, I expect X10 will be augmented to provide rudimentary transcoding between Unicode and other codepages. Or better, that task could be delegated to other components such as ICU, leaving X10 String as a simple container for Unicode.

The issue with Char literals is somewhat separate and easier to address. It depends on the source code parser. For Java, the source code parser can parse ANSI codepages, UTF-8, UTF-16BE, etc. There is a switch for the Java compiler that identifies the codepage of the source. Choosing UTF-8, for example, the source can be plain ASCII (because ASCII is a subset of UTF-8) or it can be Unicode, with Char literals that contain Unicode characters that are not escaped in any way. This is very useful when string literals are localized. The strings are readable in their native languages and are not a jumble of escape sequences. This makes it much easier to catch and fix mistakes.

With regard to C/C++, there seems to be little that can be done to fix them at this stage. We consider std::string to be a misnamed container for immutable byte arrays, with many equally misnamed and some dubious methods. (Ditto for std::wstring.)
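To make the characters-versus-bytes distinction concrete, here is a minimal Java sketch. The class and method names are hypothetical and chosen only for illustration; nothing here is X10 API. The literal uses a Unicode escape so that the file compiles the same way under any source encoding; with the compiler switch mentioned above (`javac -encoding UTF-8`), the accented character could instead be typed directly into the literal.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class CharsVersusBytes {

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes the same text occupies once encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the final letter (U+00E9),
        // written as an escape so the source file stays plain ASCII.
        String word = "caf\u00e9";

        // Prints: 4 characters, 5 UTF-8 bytes
        System.out.println(codePoints(word) + " characters, "
                + utf8Bytes(word) + " UTF-8 bytes");
    }
}
```

A character container reports the first number. A byte container holding the UTF-8 form reports the second, which is what std::string::size() returns.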
This is something we constantly reinforce, and yet it still causes problems when programmers forget it. We rely heavily on ICU for codepage support. With all that, we also interface with native file systems, consoles, and other applications on the ten or so platforms we support. As you can imagine, Unicode support varies widely, so we often transcode among various ANSI and Unicode codepages.

Jeff Sweeney

> It's both. We have not implemented Unicode support in C++, and the
> parser does not yet understand Unicode in literals and identifiers.
> There are plans to reimplement x10.lang.String in X10, which would
> let us pick an encoding for Strings and the representation of Chars
> that is independent of that in Java. We believe that UTF-8 is such
> an encoding for Strings, and restricting current Strings to ASCII
> will let us later add such support in a backward-compatible manner.
>
> Jeff, as Vijay asked, is there an immediate need for Unicode support,
> or are you simply curious?
> Igor
>
> Nate Nystrom <n...@nanocow.com> wrote on 08/24/2010 08:24:18 PM:
>
> > For compatibility with Java, wouldn't we support Unicode rather than
> > ASCII. I think maybe we don't support Unicode because of the C++
> > translation (representing Char as a C++ char). Or perhaps it's just
> > that the parser was never implemented to support Unicode.
> >
> > Nate
> >
> >
> > On Tue, Aug 24, 2010 at 19:18, Vijay Saraswat <vi...@saraswat.org> wrote:
> > > Indeed, currently Char is so restricted. The primary reason is
> > > compatibility with Java, so that x10.lang.String can essentially be
> > > implemented as java.lang.String.
> > >
> > > It does make sense to have a "RichString/RichChar" class as well which
> > > supports permits UTF-8. Is there some particular interest in getting
> > > this done sooner rather than later...?
> > >
> > > Best,
> > > Vijay
> > >
> > > Jeff Sweeney wrote:
> > >> I am reading the X10 Specification and it seems Char literals are
> > >> restricted to ASCII.
> > >> Is that correct and if so why?
> --
> Igor Peshansky (note the spelling change!)
> IBM T.J. Watson Research Center
> X10: Parallel Productivity and Performance (http://x10-lang.org/)
> XJ: No More Pain for XML's Gain (http://www.research.ibm.com/xj/)
> "I hear and I forget. I see and I remember. I do and I understand" --
> Confucius

_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users