On Mon, Jan 20, 2014 at 2:52 PM, Jakub Zawadzki <[email protected]> wrote:
> Hi,
>
> On Mon, Jan 20, 2014 at 06:22:37PM +0100, Martin Kaiser wrote:
>> if I have a tvbuff that starts with 0x86 and I call
>>
>> a = tvb_get_string_enc(tvb, 0, ENC_ASCII)
>> proto_tree_add_string(..., a);
>>
>> I can trigger the DISSECTOR_ASSERT since a is not a valid unicode string.
>>
>> Comments in the code suggest that tvb_get_string() should replace
>> chars>=0x80 with the unicode replacement char, which is two bytes long.
>> This would look like
>> [...]
>>
>> The resulting string would still contain len+1 chars but not necessarily
>> len+1 bytes. Would that be a problem, i.e. is it ok to do sth like
>>
>> b = tvb_get_string(NULL, tvb, offset, len_b);
>> copy_of_b = g_malloc(len_b+1);
>> memcpy(copy_of_b, b, len_b+1);
>
> If you just want to duplicate string you should definitely use g_strdup() ;-)

As long as you can guarantee there won't be embedded nulls.

>> If that should work, we'd need a separate function for get string &
>> replace 8bit chars.
>
> I think we don't need, tvb_get_string_enc(, ENC_ASCII) should return valid
> UTF-8 string,
> and all callers assuming it's just 1:1 copy are buggy.
>
> Maybe we should add: ENC_STRING_DONT_CONVERT, if people want just to
> have NUL terminated string?
>
>
> btw. I really wonder if current way of using a replacement character is good
> one.
> Maybe we should escape it to some: \x86.
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <[email protected]>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:[email protected]?subject=unsubscribe
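
To make the two length issues above concrete, here is a minimal sketch in plain C. It deliberately uses no GLib or Wireshark calls; `ascii_to_utf8_lossy` and `dup_with_length` are made-up names for illustration, not existing functions, and this is not how tvb_get_string_enc() is implemented. Note that U+FFFD encodes as EF BF BD in UTF-8, so each replaced byte grows the output by two bytes, and any caller that copies `len+1` bytes afterwards would truncate the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Wireshark API): copy len bytes of supposedly
 * ASCII input into a newly allocated, NUL-terminated UTF-8 string, replacing
 * every byte >= 0x80 with U+FFFD (EF BF BD in UTF-8).
 * Returns NULL on allocation failure; the caller frees the result. */
static char *ascii_to_utf8_lossy(const unsigned char *src, size_t len)
{
    /* Worst case: every input byte expands to three output bytes. */
    char *dst = malloc(len * 3 + 1);
    size_t out = 0;

    if (dst == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        if (src[i] < 0x80) {
            dst[out++] = (char)src[i];
        } else {
            dst[out++] = (char)0xEF;
            dst[out++] = (char)0xBF;
            dst[out++] = (char)0xBD;
        }
    }
    dst[out] = '\0';
    return dst;
}

/* Hypothetical helper: duplicate exactly len bytes and append a NUL.
 * Unlike a strlen()-based duplicate, this keeps bytes that follow an
 * embedded NUL, but the caller must track len separately. */
static char *dup_with_length(const char *src, size_t len)
{
    char *dst = malloc(len + 1);

    if (dst == NULL)
        return NULL;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

With the three input bytes 0x86 'a' 'b', `ascii_to_utf8_lossy` returns a five-byte string, so the output length has to come from strlen() on the result rather than from the input length. Conversely, a strlen()-based duplicate of the bytes 'a' NUL 'b' would stop after the first byte, which is the embedded-null caveat; `dup_with_length` only avoids that because the length is passed explicitly.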
