Fri Jun 07 09:44:56 2013: Request 85943 was acted upon.
Transaction: Correspondence added by MDOOTSON
       Queue: Wx
     Subject: Re: [rt.cpan.org #85943] utf8 handling bug
   Broken in: (no value)
    Severity: (no value)
       Owner: Nobody
  Requestors: j...@pavlovsky.eu
      Status: new
 Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=85943 >


Hi,

Thanks for the report.
wxPli_copy_string is used in one place only and that is parsing command 
line arguments during wxWidgets initialisation.

So, unless you are passing in your values on the command line, this 
particular helper function cannot be your problem.

I recently looked at some of the wxPerl utf8 handling and, after some 
help on the wxPerl mailing list, implemented a change for Wx 0.9922. One 
thing that became clear during the process is that the Perl docs 
concerning utf8 handling are confused and in some places simply wrong.

That being the case, I think I need clear test cases to demonstrate any 
utf8 related bug that I can then test / fix.

For information, the code that converts your text for use by wxWidgets 
is the macro WXSTRING_INPUT, which is in cpp/helpers.h at line 78 or 109 
in the Wx 0.9922 source.

Note that this changed for version Wx 0.9922. In previous versions the 
code looked more like your example from wxPli_copy_string.

If I understand your report correctly, you are saying that if an array of 
values passed to a Wx::ComboBox constructor contains members with valid 
UTF-8 multi-byte characters, these characters are displayed incorrectly. 
Is that right?

If yes, I can construct my own test case for this. If not, I'll probably 
need some code from you that demonstrates the problem.

Regards

Mark

On 07/06/2013 10:09, Jiří Pavlovský via RT wrote:
> Fri Jun 07 05:09:50 2013: Request 85943 was acted upon.
> Transaction: Ticket created by j...@pavlovsky.eu
>         Queue: Wx
>       Subject: utf8 handling bug
>     Broken in: (no value)
>      Severity: (no value)
>         Owner: Nobody
>    Requestors: j...@pavlovsky.eu
>        Status: new
>   Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=85943 >
>
>
> Hi,
>
> I had the following problem:
>
> I pass an array of objects with overloaded stringification. The combobox
> displays the stringified values and returns the selected object. This works
> well unless a stringified value contains accented characters, in which case
> the displayed value is garbled.
>
> I thought I had traced the problem to a bug in stringification/utf8 and
> reported it to perl-bug.
>
> But I got a reply suggesting it's a bug in Wx::Perl. See below for details:
>
> I think this is a bug in Wx::Perl.
>
> I just downloaded Wx-0.9922 from CPAN and did a quick scan.
> cpp/helpers.cpp contains this, which I assume is a utility function used
> by various parts of Wx::Perl:
>
> #if wxUSE_UNICODE
> static wxChar* wxPli_copy_string( SV* scalar, wxChar** )
> {
>      dTHX;
>      STRLEN length;
>      wxWCharBuffer tmp = ( SvUTF8( scalar ) ) ?
>        wxConvUTF8.cMB2WX( SvPVutf8( scalar, length ) ) :
>        wxWCharBuffer( wxString( SvPV( scalar, length ),
>                                 wxConvLocal ).wc_str() );
>
>      wxChar* buffer = new wxChar[length + 1];
>      memcpy( buffer, tmp.data(), length * sizeof(wxChar) );
>      buffer[length] = wxT('\0');
>      return buffer;
> }
> #endif
>
> Checking SvUTF8(scalar) before any stringification is incorrect.  What
> it should be doing is something like this:
>
>      dTHX;
>      STRLEN length;
>      char * const s = SvPV( scalar, length );
>      wxWCharBuffer tmp = ( SvUTF8( scalar ) ) ?
>        wxConvUTF8.cMB2WX( s ) :
>        wxWCharBuffer( wxString( s,
>                                 wxConvLocal ).wc_str() );
>
> I don’t know what the wxConvLocal does, but if it does anything other
> than treat the string as Latin1, then that is also incorrect, and this
> would be better:
>
>      dTHX;
>      STRLEN length;
>      wxWCharBuffer tmp =
>        wxConvUTF8.cMB2WX( SvPVutf8( scalar, length ) );
>
>
> This aspect of SvUTF8 is nothing new, as has been documented since 2006
> (commit cd028baaa4):
>
>         SvUTF8  Returns a U32 value indicating the UTF-8 status of an SV.  If
>                 things are set-up properly, this indicates whether or not the
>                 SV contains UTF-8 encoded data.  You should use this after a
>                 call to SvPV() or one of its variants, in case any call to
>                 string overloading updates the internal flag.
>
> (The current wording is of recent provenance and comes from commit
> fd1423831.)
>
> I don’t know enough about Wx to write a test case, so could you report
> this to bug...@rt.cpan.org?
>
>
>
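
The ordering issue described in the quoted report (reading the UTF-8 flag 
before the scalar has been stringified) can be illustrated without Perl or 
wxWidgets. What follows is only a sketch under stated assumptions: 
MockScalar, mock_pv, mock_is_utf8 and the two choose_* functions are 
hypothetical stand-ins invented for this illustration. They are not the 
Perl API and not wxPerl code; they only model a value whose flag is 
refreshed when its overloaded stringification runs.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a Perl scalar; NOT the real Perl API.
struct MockScalar {
    std::string bytes;       // cached string buffer
    bool utf8_flag = false;  // models the flag that SvUTF8 would report
    // Models an overloaded "" operator: returns (bytes, is_utf8).
    std::function<std::pair<std::string, bool>()> stringify;
};

// Models SvPV: runs the overload if there is one, refreshing buffer and flag.
const std::string& mock_pv(MockScalar& sv) {
    if (sv.stringify) {
        auto result = sv.stringify();
        sv.bytes = result.first;
        sv.utf8_flag = result.second;
    }
    return sv.bytes;
}

// Models SvUTF8: reports the flag as it currently stands.
bool mock_is_utf8(const MockScalar& sv) { return sv.utf8_flag; }

enum class Decoder { Utf8, Local };

// Order used in the quoted helper: read the flag, then fetch the string.
Decoder choose_flag_first(MockScalar& sv) {
    bool is_utf8 = mock_is_utf8(sv);  // stale: overload has not run yet
    mock_pv(sv);
    return is_utf8 ? Decoder::Utf8 : Decoder::Local;
}

// Order suggested in the report: fetch the string, then read the flag.
Decoder choose_string_first(MockScalar& sv) {
    mock_pv(sv);                      // overload runs and updates the flag
    return mock_is_utf8(sv) ? Decoder::Utf8 : Decoder::Local;
}

// An object whose stringification yields UTF-8 encoded text.
MockScalar make_overloaded_object() {
    MockScalar sv;
    sv.stringify = [] {
        return std::make_pair(std::string("Ji\xC5\x99\xC3\xAD"), true);
    };
    return sv;
}

// Small self-check contrasting the two orderings.
void self_check() {
    MockScalar a = make_overloaded_object();
    assert(choose_flag_first(a) == Decoder::Local);   // wrong decoder chosen
    MockScalar b = make_overloaded_object();
    assert(choose_string_first(b) == Decoder::Utf8);  // correct decoder chosen
}
```

In this model the flag-first ordering picks the local-encoding path for 
UTF-8 bytes, which is the kind of mismatch that would garble accented 
characters. Whether this matches the actual ComboBox code path in wxPerl is 
not established in the thread.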

