On Fri, Apr 4, 2014 at 4:04 PM, Guy Harris <[email protected]> wrote:
>
> On Apr 4, 2014, at 7:30 AM, Hadriel Kaplan <[email protected]> wrote:
>
>> I might be overlooking something, but I don't see a tvb_get_* function to 
>> get a uint8/16/32/64 that was encoded as an ASCII or UTF-8 string in the 
>> packet. Is there such a thing?
>
> No.
>
> I've occasionally also thought there should be such a routine.
>
> Note, though, that, whilst tvb_get_guint8() and tvb_get_{n,le}tohXXX() can 
> never fail, because every possible sequence of octets is a valid 2's 
> complement integral value, routines to get a number encoded as a string *can* 
> fail, e.g. 0123xyzw is not a valid number in bases 8, 10, or 16.
>
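
To make the failure case concrete, here is a rough sketch of a parser that
reports failure instead of guessing. parse_ascii_uint32() is a hypothetical
helper, not an existing tvb_get_* routine, and it works on a plain byte
buffer rather than a tvbuff so that it stands alone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Wireshark API): parse `len` octets of ASCII
 * digits in `base` (2..16) into *out.  Returns false for an empty field,
 * an unsupported base, a character that is not a digit in that base, or
 * a value that does not fit in 32 bits.
 */
static bool
parse_ascii_uint32(const uint8_t *buf, size_t len, unsigned base, uint32_t *out)
{
    uint64_t acc = 0;

    if (len == 0 || base < 2 || base > 16)
        return false;

    for (size_t i = 0; i < len; i++) {
        uint8_t  c = buf[i];
        unsigned digit;

        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
        else
            return false;               /* not a digit in any base up to 16 */

        if (digit >= base)
            return false;               /* e.g. '8' in base 8 */

        /* acc <= UINT32_MAX here, so this cannot overflow 64 bits */
        acc = acc * base + digit;
        if (acc > UINT32_MAX)
            return false;               /* too large for the target type */
    }

    *out = (uint32_t)acc;
    return true;
}
```

With this, "0123xyzw" is rejected in bases 8, 10 and 16, and the caller has
to decide what to do about it, which is exactly the part the fixed-width
tvb_get_ routines never have to deal with.
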
> There are other cases where a tvb_get_ routine can return "you lose", e.g. 
> tvb_get_string_enc() can fail if there are invalid octet sequences (about the 
> only encodings I know of where *every* octet sequence is a valid string are 
> some of the ISO 8859-n encodings), and at least some floating-point formats 
> probably have invalid values (I guess an IEEE NaN is "valid", at least to the 
> extent that if we try to format it it'll show up as "NaN", but if we try to 
> do calculations with it we might get a floating-point exception).
>
>> Instead, it seems the dissectors that deal with string messages do a 
>> tvb_get_string_enc() or tvb_format_text(), and then a strtol() or atoi(). 
>> But in my way of thinking, the fact that it's in a string-encoded form in 
>> the tvb isn't that much different from it being encoded as little-endian vs. 
>> network-order.
>>
>> Likewise, it's not clear if there's a way to define a protocol field that is 
>> encoded as a string in the packet but is internally a uint8/16/32/64 (e.g., 
>> for filtering purposes, val_string lookup, etc.). For example such that 
>> proto_tree_add_item() would work. Instead, it seems some dissectors use the 
>> returned strtol/atoi to then add the field to the tree as a true uint type, 
>> or add it as a FT_STRING field type.
>
> One advantage of that is that, if the routine to fetch the value also adds an 
> item to the protocol tree, it could, in the cases where the value is invalid, 
> also add an expert item indicating that the value isn't valid.
>
> And I'd like to see proto_tree_add_XXX_item() routines that add an item with 
> a particular type *and* take a pointer argument and return the value for the 
> item through that pointer; that could replace
>
>         xxx = tvb_get_XXX();
>         proto_tree_add_XXX(..., xxx);
>
> combinations and
>
>         xxx = tvb_get_XXX();
>         proto_tree_add_item(...);       /* re-fetches the item value */
>
> with
>
>         proto_tree_add_XXX_item(..., &xxx);

That would be neat, though we would have to be careful with our
fast-path handling, since we should return the value regardless.
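
A toy sketch of what I mean, assuming "fast path" here means skipping item
creation when no tree is being built. The toy_* names and struct are made
up for illustration and are not the real proto.c code; the point is only
the ordering: fetch and return the value first, then decide whether to
bail out.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a protocol tree; a NULL pointer models "no tree wanted". */
struct toy_tree {
    unsigned n_items;       /* how many items have been added */
    uint32_t last_value;    /* value of the most recently added item */
};

/* Fetch a 32-bit big-endian (network-order) value from a byte buffer. */
static uint32_t
toy_get_ntohl(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/*
 * Hypothetical "add item and hand back its value" routine.  The value is
 * fetched and stored through `retval` before the fast-path check, so the
 * caller gets it whether or not an item is actually created.
 */
static void
toy_tree_add_uint_item(struct toy_tree *tree, const uint8_t *buf,
                       uint32_t *retval)
{
    uint32_t value = toy_get_ntohl(buf);

    if (retval != NULL)
        *retval = value;

    if (tree == NULL)
        return;             /* fast path: skip only the item creation */

    tree->n_items++;
    tree->last_value = value;
}
```

If the early return came before the fetch, callers that depend on the value
(to find a length or pick a subdissector, say) would see garbage whenever
the tree is not being built.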

>> And if we had common functions handle ASCII and UTF-8 string-encoded 
>> numbers, they could avoid creating temporary strings as well.
>
> The only real encoding issues are "ASCII superset" (so that "0123456789", for 
> example, is encoded the same as in ASCII) vs. "2 or more bytes per ASCII 
> character" (e.g., UCS-2, UTF-16, and UCS-4) vs. "one of those 7-bit GSM 
> character encodings" vs. "EBCDIC".
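
For what it's worth, a small sketch of how three of those classes differ
for a single decimal digit. The enum and function names are hypothetical,
UTF-16 is assumed to be big-endian, and the 7-bit GSM case is left out
because its septets would have to be unpacked first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding classes, named only for this illustration. */
enum toy_digit_enc {
    TOY_ASCII_SUPERSET,     /* ASCII, UTF-8, ISO 8859-n, ... */
    TOY_UTF16_BE,           /* two octets per digit, big-endian */
    TOY_EBCDIC              /* digits are 0xF0..0xF9 */
};

/*
 * Decode one decimal digit at the start of `buf`.  Returns 0..9 and sets
 * *used to the number of octets consumed, or returns -1 (leaving *used
 * untouched) if the data is truncated or is not a decimal digit.
 */
static int
toy_decode_digit(enum toy_digit_enc enc, const uint8_t *buf, size_t len,
                 size_t *used)
{
    switch (enc) {
    case TOY_ASCII_SUPERSET:
        if (len >= 1 && buf[0] >= 0x30 && buf[0] <= 0x39) {
            *used = 1;
            return buf[0] - 0x30;
        }
        break;
    case TOY_UTF16_BE:
        if (len >= 2 && buf[0] == 0x00 && buf[1] >= 0x30 && buf[1] <= 0x39) {
            *used = 2;
            return buf[1] - 0x30;
        }
        break;
    case TOY_EBCDIC:
        if (len >= 1 && buf[0] >= 0xF0 && buf[0] <= 0xF9) {
            *used = 1;
            return buf[0] - 0xF0;
        }
        break;
    }
    return -1;
}
```

So the per-digit step is the only part that needs to know the encoding; the
accumulate-and-check-overflow loop around it can be shared.
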
> ___________________________________________________________________________
> Sent via:    Wireshark-dev mailing list <[email protected]>
> Archives:    http://www.wireshark.org/lists/wireshark-dev
> Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
>              mailto:[email protected]?subject=unsubscribe