Hi,

2013/12/28 Guy Harris <[email protected]>

> On Dec 24, 2013, at 2:43 AM, Pascal Quantin <[email protected]>
> wrote:
>
> > r54428 introduced a ENC_3GPP_TS_23_038 encoding type so as to be able
> > to use proto_tree_add_item directly instead of manually decoding the
> > string with gsm_sms_char_7bit_unpack() / gsm_sms_chars_to_utf8()
> > functions.
> >
> > While it is a very good idea (much more easier to use) it raises an
> > interesting issue. With this 7 bits encoding a payload of 7 bytes will
> > hold either 7 or 8 characters. This is handled by
> > gsm_sms_char_7bit_unpack() function thanks to an extra parameter
> > specifying the number of characters.
>
> Presumably that's the out_length parameter (which doesn't appear to be
> checked before every character is written to the output string); the
> in_length parameter counts input octets, not output characters. However,
> out_length appears primarily to be used when extracting into a
> fixed-length buffer, with the buffer length passed as the out_length
> argument.

As you said, the purpose of out_length is to give the maximum number of
characters to be unpacked. In packet-gsm_sms.c, this parameter is set to
the udl value (with a protection in case the udl variable is bigger than
the output buffer). In packet-ansi_637.c, num_fields represents the number
of characters to be decoded.

> GSM MAP is encoded using ASN.1 BER, and USSD-String is an OCTET STRING,
> so BER gives its length in octets, not characters, and it's preceded by
> lengthInCharacters, giving its length in characters.

Yes.

> In that case, we need to make sure we don't process more than the
> specified number of bytes and don't process more than the specified
> number of characters. If ({number of characters}*7 + 7)/8 > {number of
> bytes}, there should probably be an expert info reporting an error; we
> might want to dissect all the characters we can extract from the
> specified number of bytes, at least. If {number of bytes} < ({number of
> characters}*7 + 7)/8, we might also want to warn that there are too many
> padding bytes, and dissect {number of characters} characters. In both
> those cases, a "number of characters" count is all that needs to be
> passed to the string-extractor or item-adder routine; if ({number of
> characters}*7 + 7)/8 > {number of bytes}, the "number of characters"
> count should be ({number of bytes}*8)/7 rather than {number of
> characters}.
>
> For the ETSI TS 102 223 v10.0.0/3GPP TS 11.14 v8.17.0/3GPP TS 31.111
> v9.7.0 smart card stuff, however, the text string appears to just be a
> TLV, so you only get a length in bytes; presumably padding should be
> ignored in that case, and we can just use proto_tree_add_item() or
> tvb_get_string_enc().

The specification defines a rule where the originator must explicitly add
a <CR> if needed to avoid the padding bits:

"If the total number of characters in the text string equals (8n-1) where
n = 1, 2, 3, etc. then there are 7 spare bits at the end of the message.
To avoid the situation where the receiving entity confuses 7 binary zero
pad bits as the @ character, the carriage return (i.e. <CR>) character
shall be used for padding in this situation, as defined in TS 123 038"

So proto_tree_add_item is fine here (this is probably the only such case).

> Are there cases where only the length in characters is given?

3GPP/3GPP2 SMS (packet-gsm_sms.c and packet-ansi_637.c).
The Network Name information element in packet-gsm_a_dtap.c gives the
number of padding bits in the last octet, so the number of characters can
easily be computed.
I did not check the GMR1 and SMS Cell Broadcast specs yet.

Pascal.
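
PS: to make the length arithmetic discussed above concrete, here is a rough sketch in C. It is not Wireshark code: septets_to_octets(), octets_to_max_septets(), check_7bit_lengths(), get_septet() and the length_check enum are hypothetical names made up for illustration. It only assumes the usual GSM 7-bit packing (septet i starts at bit 7*i, least significant bit first), and it treats "more octets than the character count needs" as the extra-padding case.

```c
#include <assert.h>
#include <stddef.h>

/* Octets needed to hold n_chars packed 7-bit characters (rounded up). */
static size_t septets_to_octets(size_t n_chars)
{
    return (n_chars * 7 + 7) / 8;
}

/* Largest number of whole septets that fit in n_octets (rounded down). */
static size_t octets_to_max_septets(size_t n_octets)
{
    return (n_octets * 8) / 7;
}

/* Hypothetical result codes; a dissector would map these to expert infos. */
enum length_check {
    LENGTH_OK,
    LENGTH_TOO_FEW_OCTETS,  /* character count does not fit: error */
    LENGTH_EXTRA_PADDING    /* more octets than needed: warning */
};

/* Compare the announced character count with the octet count and store
 * in *chars_to_decode how many characters can safely be unpacked. */
static enum length_check
check_7bit_lengths(size_t n_chars, size_t n_octets, size_t *chars_to_decode)
{
    size_t needed = septets_to_octets(n_chars);

    assert(chars_to_decode != NULL);
    if (needed > n_octets) {
        /* Decode only what the available octets can hold. */
        *chars_to_decode = octets_to_max_septets(n_octets);
        return LENGTH_TOO_FEW_OCTETS;
    }
    *chars_to_decode = n_chars;
    return (needed < n_octets) ? LENGTH_EXTRA_PADDING : LENGTH_OK;
}

/* Extract septet number i from a packed buffer. The caller must make sure
 * that i is smaller than octets_to_max_septets(buffer length). */
static unsigned char get_septet(const unsigned char *in, size_t i)
{
    size_t bit = i * 7;
    size_t byte = bit / 8;
    unsigned shift = (unsigned)(bit % 8);
    unsigned v = (unsigned)in[byte] >> shift;

    if (shift > 1) {
        /* The septet continues in the low bits of the next octet. */
        v |= (unsigned)in[byte + 1] << (8 - shift);
    }
    return (unsigned char)(v & 0x7F);
}
```

Both 7 and 8 characters need 7 octets here, which is why the octet count alone cannot tell whether the last septet is a character or padding, and why a character count (or the <CR> rule quoted above) is needed.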
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <[email protected]>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:[email protected]?subject=unsubscribe
