On Mon, Jan 20, 2014 at 8:27 PM, Guy Harris <[email protected]> wrote:
>
> On Jan 20, 2014, at 1:49 PM, Martin Kaiser <[email protected]> wrote:
>
>> I committed the change to tvb_get_string() in r54864.
>
> I've changed that *not* to map bytes with the 8th bit set to REPLACEMENT
> CHARACTER for UTF-8 strings.  For UTF-8 strings, we need to do a more
> complicated check and map invalid octet sequences to REPLACEMENT CHARACTER.
> (We also need to do some more stuff for UCS-2, UTF-16, and UCS-4.)
>
> tvb_get_string() still treats the string as ASCII.

In which case, is a dumb search-and-replace of tvb_get_string() with
tvb_get_string_enc() and ENC_ASCII an easy way to make (part of) the API
transition? We'll still have to audit for dissectors that really meant
ENC_SOMETHING_ELSE (probably ENC_UTF8 in most cases), but it'll be easy
progress without any behavioural changes.

>> I'll have a look at tvb_get_stringz() tomorrow.
>
> I've added that (with the same change *not* to do it for UTF-8 strings).
> tvb_get_stringz() treats the string as ASCII.

___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <[email protected]>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:[email protected]?subject=unsubscribe
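
For anyone following along, here is a rough sketch of what "treats the
string as ASCII" amounts to as described above: bytes with the 8th bit set
are not valid ASCII, so they get mapped to REPLACEMENT CHARACTER (U+FFFD,
which is EF BF BD in UTF-8). The function name
ascii_to_utf8_with_replacement() is made up for illustration and plain
malloc() stands in for whatever allocator the real code uses; this is not
the actual Wireshark implementation, only a minimal standalone
approximation of the behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper, NOT a Wireshark function: copy `len` bytes that are
 * assumed to be ASCII into a newly allocated, NUL-terminated UTF-8 string.
 * Any byte with the 8th bit set is not valid ASCII, so it is replaced by
 * U+FFFD REPLACEMENT CHARACTER (EF BF BD in UTF-8).
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *ascii_to_utf8_with_replacement(const unsigned char *in, size_t len)
{
    assert(in != NULL || len == 0);

    /* Worst case: every input byte expands to the 3-byte U+FFFD sequence. */
    char *out = malloc(len * 3 + 1);
    if (out == NULL)
        return NULL;

    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] & 0x80) {
            /* 8th bit set: emit U+FFFD encoded as UTF-8. */
            out[o++] = (char)0xEF;
            out[o++] = (char)0xBF;
            out[o++] = (char)0xBD;
        } else {
            /* Plain 7-bit ASCII: copy unchanged. */
            out[o++] = (char)in[i];
        }
    }
    out[o] = '\0';
    return out;
}
```

Given the three input bytes 'a', 0xE9, 'b', this produces the five bytes
'a', EF, BF, BD, 'b' followed by the terminating NUL. A UTF-8-aware variant
would instead have to validate whole octet sequences, which is the more
complicated check mentioned above.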
