Re: Can we print UTF-8 chars in Wx::TextCtrl fields?

Mark Dootson Wed, 01 May 2013 09:34:59 -0700

Hi,

On 01/05/2013 16:49, steveco.1...@gmail.com wrote:

Well all this just serves to deepen my confusion.

1) What is the difference between:

$line = decode( 'UTF-8', $orig );

and

$line = decode( 'utf8', $orig );


Always use
decode( 'UTF-8', $orig );

'UTF-8' means what it says.

In my opinion, 'utf8' means "something really quite like utf8 in all but a few respects but which isn't UTF-8 and is a left over from the dog's breakfast of Perl Unicode string handling, encoding and source code handling that took a decade to fix."

Perl's documents currently refer to UTF-8 as 'strict UTF-8'. There's no sanity to it. Why the docs don't just say "'utf8' is really a left over from an era of big mistakes", I don't know.
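
A minimal sketch of the strict/lax distinction, assuming the stock Encode module. The variable names and the test code point (one past the Unicode maximum) are chosen purely for illustration; this is not code from the thread.

```perl
use strict;
use warnings;
use Encode qw(find_encoding encode);

# Canonical encoding names that Encode resolves each label to.
my $strict_name = find_encoding('UTF-8')->name;
my $lax_name    = find_encoding('utf8')->name;
print "'UTF-8' resolves to: $strict_name\n";
print "'utf8'  resolves to: $lax_name\n";

# Illustrative input: a code point just above the Unicode maximum U+10FFFF.
my $beyond = chr(0x110000);

# LEAVE_SRC stops Encode from modifying $beyond in place when a check is set.
my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();

my ($lax_ok, $strict_ok);
{
    no warnings 'utf8';   # silence "not Unicode" warnings for this demo
    $lax_ok    = eval { encode('utf8',  $beyond, $check); 1 } ? 1 : 0;
    $strict_ok = eval { encode('UTF-8', $beyond, $check); 1 } ? 1 : 0;
}
print "lax 'utf8' accepted it:     $lax_ok\n";
print "strict 'UTF-8' accepted it: $strict_ok\n";
```

The Encode documentation describes 'UTF-8' as the strict encoding and 'utf8' as the lax one, so the strict call is expected to croak on input that is not valid Unicode while the lax call lets it through.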

Why did the latter not work for Octavian?

It isn't the difference between 'utf8' and 'UTF-8' that caused Octavian's code to fail.


2) Mark, your earlier logic seemed clear and unassailable, yet now you seem
to change your mind.

I got worn down. It is, after all, a community project. The logic seemed clear and unassailable to me too. When faced with an argument that simply ignores everything you say, you are left with the option of repeating yourself forever, ignoring the opposite argument, or giving up and agreeing. Life is short so I gave up and agreed. I always try to take the approach that even if the other fellow is wrong in principle, what exactly would be the downside to agreeing? It leaves you with the time and energy available to go on repeating yourself forever on the important stuff.


It won't break much I don't think.

I use:  $line = decode( 'utf8', $orig );

and I never have a problem, but according to this logic that is luck.

I accept this and I am happy to use utf8::upgrade($string);
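
For reference, a small sketch of what utf8::upgrade does and does not do, assuming a Latin-1 string as input (the sample value and variable names are illustrative). It changes Perl's internal representation only; it does not decode anything, so the logical string stays the same.

```perl
use strict;
use warnings;

# Illustrative input: "café" held as four Latin-1 characters.
my $string = "caf\xE9";
my $copy   = $string;

my $flag_before = utf8::is_utf8($string) ? 1 : 0;
utf8::upgrade($string);            # switch internal storage to UTF-8
my $flag_after  = utf8::is_utf8($string) ? 1 : 0;

# The characters themselves are unchanged by the upgrade.
my $same_text   = ($string eq $copy) ? 1 : 0;
my $same_length = (length($string) == length($copy)) ? 1 : 0;

print "flag before: $flag_before, flag after: $flag_after\n";
print "same text: $same_text, same length: $same_length\n";
```

If $string held undecoded UTF-8 octets instead, upgrading it would leave them as separate characters, which is why upgrade is not a substitute for decode.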

I think we should assume that in the general case there will always be some
Perl processing before wxWidgets sees the string.

The general case is:

1 - Retrieve data from file or database (this may be automatically decoded or
not, depending on the database and the driver);
2 - Do something to it (this may be a null operation);
3 - Pass to wxWidgets to display to user.

To preserve string lengths and string processing (e.g. a simple alphabetical
sort in UTF-8), any decoding must take place between steps 1 and 2
above.
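
The point about decoding between steps 1 and 2 can be sketched as follows, assuming the data arrives as raw UTF-8 octets (the sample value is illustrative). Before decoding, length and case mapping operate on octets; after decoding, they operate on characters.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Step 1: raw octets as they might arrive from a file or a non-decoding driver.
my $orig = "caf\xC3\xA9";              # the UTF-8 octets for "café"

# Decode once, before any processing in step 2.
my $line = decode('UTF-8', $orig);

my $octet_count = length($orig);       # counts octets
my $char_count  = length($line);       # counts characters

# Step 2: string operations now see whole characters.
my $upper     = uc($line);
my $last_char = ord(substr($upper, -1));

printf "octets: %d, characters: %d, last char of uc: U+%04X\n",
    $octet_count, $char_count, $last_char;
# prints "octets: 5, characters: 4, last char of uc: U+00C9"
```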

When you say:

So, my thinking is that I'll change it for builds against wxWidgets
2.9.x and above


What does "it" mean?  That you will include utf8::upgrade($string) in the
interface?

No, the code will just assume that the string passed is valid UTF-8 and attempt to convert it to a wxString accordingly. It will never call the libc option.


I can't see any harm in this.  Just setting a character bit to 1 before an
operation and again later at worst just seems redundant.

But if we have the position where decode is called twice, this will create
problems for me.  A doubly decoded value gets corrupted and becomes a
diamond with a question mark in it, or some such value.


I hope the above assures you this won't happen. (We won't be double decoding.)
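
For anyone wanting to see the double-decode symptom in isolation, here is a minimal sketch using only the Encode module (sample data and variable names are illustrative, and this is not the Wx code). Decoding an already decoded string treats its characters as octets, and the stray octet is replaced with U+FFFD, the "diamond with a question mark".

```perl
use strict;
use warnings;
use Encode qw(decode);

my $orig  = "caf\xC3\xA9s";            # UTF-8 octets for "cafés"
my $once  = decode('UTF-8', $orig);    # correct: five characters
my $twice = decode('UTF-8', $once);    # wrong: decodes text that is already decoded

# U+00E9 is not a valid UTF-8 sequence on its own, so it gets substituted.
my $has_replacement = (index($twice, "\x{FFFD}") >= 0) ? 1 : 0;

printf "decoded once: %d chars, replacement char after second decode: %d\n",
    length($once), $has_replacement;
```

With characters above U+00FF in the string, the second decode croaks instead ("Cannot decode string with wide characters"), so the failure mode depends on the data.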

Cheers

Mark
