On Jan 7, 2017, at 9:39 AM, Jaap Keuter <[email protected]> wrote:

> There has been a steady stream of MPLS PW related comments and bugs over time,
> and things haven't improved enough, apparently. This text tries to give some
> insight in the issues so that possible solutions cover all cases involved.

I'll start here with a broader discussion of how a protocol specifies the 
protocol running above it, and how the dissector for the first protocol selects 
the dissector for the next protocol.

Some protocols have a "next protocol" field or fields (Ethernet, IEEE 802.2, 
SNAP, IPv4, IPv6, anything with an IANA media type string, ...).  For those, 
it's easy - use a dissector table, and it will handle 99.99999999999999999% of 
the cases correctly.  "Decode As" would be necessary only in cases where two 
protocols are using the same value, either because the old protocol's 
assignment was revoked and a new protocol given the assignment (which doesn't 
sound like good practice, unless you have a *really* small set of possible 
values) or because somebody's not playing by the rules.  Heuristics should only 
be necessary in cases where the field is optional, such as IANA media type 
strings.
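
To make the "next protocol field" case concrete, here is a minimal Python sketch of a dissector table keyed by the Ethernet type field.  It is not Wireshark code; the function names and the table are made up for illustration, and it ignores the 802.3 length-field case.

```python
# Hypothetical dissector table keyed by the 16-bit Ethernet type field.
# None of these names are Wireshark APIs; they only illustrate the lookup.

def dissect_ipv4(payload: bytes) -> str:
    return "IPv4"

def dissect_ipv6(payload: bytes) -> str:
    return "IPv6"

def dissect_mpls(payload: bytes) -> str:
    return "MPLS"

ETHERTYPE_TABLE = {
    0x0800: dissect_ipv4,
    0x86DD: dissect_ipv6,
    0x8847: dissect_mpls,  # MPLS unicast
}

def dissect_ethernet(frame: bytes) -> str:
    """Pick the next dissector from the type field at offset 12."""
    if len(frame) < 14:
        return "malformed"
    ethertype = int.from_bytes(frame[12:14], "big")
    handler = ETHERTYPE_TABLE.get(ethertype)
    if handler is None:
        # No registration for this value: fall back to raw data.
        return "data"
    return handler(frame[14:])
```

The field value alone decides the outcome, which is why no heuristics are needed as long as every value has a single owner.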

Some protocols have a field or fields that indicate a source or destination
port, or a circuit number, which might be a usable *hint* for identifying the
next protocol, but which is not sufficient to indicate it (TCP and UDP are the
canonical examples of this; ATM's VPI/VCI are another example).  For those, you
use a dissector table with dissectors that do checks and reject packets,
heuristics, mechanisms for other dissectors to make port-to-protocol
assignments if that can be done (e.g., SDP and RTCP setting up RTP sessions),
and, when all else fails - which could be fairly common - Decode As.
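
The combination of mechanisms described above can be sketched in Python as follows.  This is an illustration only, not Wireshark's implementation: every name here is hypothetical, the payload checks are deliberately crude, and the order in which the mechanisms are tried is an assumption of the sketch.

```python
# Hypothetical sketch: a port is only a hint, so table dissectors may
# reject a payload and hand control to the next mechanism.

def dissect_http(payload: bytes):
    """Port-table dissector that checks the payload and may reject it."""
    methods = (b"GET ", b"POST ", b"HEAD ", b"HTTP/")
    return "HTTP" if payload.startswith(methods) else None

def heur_rtp(payload: bytes):
    """Heuristic dissector: accept if the top two bits say RTP version 2."""
    if len(payload) >= 12 and payload[0] >> 6 == 2:
        return "RTP"
    return None

PORT_TABLE = {80: dissect_http}
HEURISTICS = [heur_rtp]
CONVERSATIONS = {}  # port -> dissector, set up by other dissectors (e.g. SDP)
DECODE_AS = {}      # port -> dissector, set explicitly by the user

def dissect_payload(port: int, payload: bytes) -> str:
    # Assumed order: user override, signalled binding, port table.
    for table in (DECODE_AS, CONVERSATIONS, PORT_TABLE):
        handler = table.get(port)
        if handler is not None:
            result = handler(payload)
            if result is not None:
                return result
    # Nothing bound to the port accepted the payload: try heuristics.
    for heur in HEURISTICS:
        result = heur(payload)
        if result is not None:
            return result
    return "data"
```

The fewer entries there are in the port table, the more traffic ends up in the heuristic loop or, failing that, needs an entry in the Decode As table.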

For the latter category of protocols, the fewer "well known" field values there
are, the more you depend on heuristics to avoid Decode As.  MPLS is a very good
example: it has *very* few "well known" label values, as RFC 3032 reserves only
16 of them (labels 0 through 15).  That's the problem here; with TCP, at least,
a lot of the well-known ports help.

The protocols we support atop MPLS are:

        I-TDM (Internal TDM)
        Y.1711 (has a reserved label)
        ATM pseudo-wires of various sorts (RFC 4717)
        CESoPSN pseudo-wire (RFC 5086)
        Ethernet pseudo-wire (RFC 4448)
        Frame Relay pseudo-wire (RFC 4619)
        PPP/HDLC pseudo-wires (RFC 4618)
        SAToP (RFC 4553)
        IPv4
        IPv6
        Pseudo-wire Associated Channel Header dissection (RFC 4385)

For most of these, we require Decode As.

The exceptions are:

        IPv4, IPv6, Associated Channel Header, Ethernet

for which, for frames with no explicit binding to a label, we use the 
first-nibble heuristic, possibly combined with other heuristics.
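
For reference, the first-nibble heuristic amounts to something like the Python sketch below.  The function name and return strings are hypothetical; the values 0 and 1 follow RFC 4385 (control word and Associated Channel Header respectively), and returning "data" for anything else is an assumption of the sketch rather than a description of what the MPLS dissector does.

```python
def classify_mpls_payload(payload: bytes) -> str:
    """Guess the payload type from the top four bits of the first byte."""
    if not payload:
        return "data"
    nibble = payload[0] >> 4
    if nibble == 4:
        return "IPv4"
    if nibble == 6:
        return "IPv6"
    if nibble == 1:
        return "PW Associated Channel Header"  # RFC 4385
    if nibble == 0:
        return "PW control word"  # RFC 4385; Ethernet PW is one candidate
    return "data"
```

Note that an Ethernet frame sent without a control word starts with the destination MAC address, so a frame whose destination address begins with 4 or 6 is classified as IPv4 or IPv6 by this check alone.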

In addition, the Ethernet pseudo-wire dissector also uses heuristics to
determine whether there's a control word or not.  It *looks* as if the ATM
pseudo-wire dissector always assumes a control word, even though RFC 4717 says

   The features that the control word provides may not be needed for a
   given ATM PW.  For example, ECMP may not be present or active on a
   given MPLS network, strict frame sequencing may not be required, etc.
   If this is the case, and the control word is not REQUIRED by the
   encapsulation mode for other functions (such as length or the
   transport of ATM protocol specific information), the control word
   provides little value and is therefore OPTIONAL.  Early ATM PW
   implementations have been deployed that do not include a control word
   or the ability to process one if present.  To aid in backwards
   compatibility, future implementations MUST be able to send and
   receive frames without a control word present.

If the control word were *always* present, we wouldn't be having these 
problems, and people wouldn't be filing bugs.  Thus, the bugs demonstrate that, 
at least for Ethernet, the control word isn't always present.

Bug 11849 was due to an "is this Ethernet?" heuristic being too strong, by 
accepting only a small number of Ethertypes; it was fixed by weakening the 
heuristic not to look at the type/length field at all.

Bug 13039 is due to the "is this Ethernet?" heuristic being too strong, by not 
accepting frames with local MAC addresses.

Bug 13295 is due to the "is this Ethernet?" heuristic being too weak, by 
accepting frames with unknown Ethernet types.

Bug 13301 is due to the "is this IPv4?" and "is this IPv6?" heuristics being too
weak, by accepting, respectively, every frame with 4 in the first nibble as
IPv4 and every frame with 6 in the first nibble as IPv6.

There are a number of ways to solve this:

        1) Make the Ethernet dissector like the other pseudo-wire dissectors, 
and require "Decode As".

           Presumably this was not done because Ethernet pseudo-wires are 
popular enough that this would require too much "Decode As".  (And, presumably, 
the other pseudo-wires are *not* popular enough for this to be an issue.)

        2) Fix the heuristics for Ethernet-without-control-word.

           This would address bugs 13039 and 13295, by weakening the heuristic 
where it needs to be weaker and strengthening where it needs to be stronger 
(the latter also makes the former less likely to break things).

           It doesn't address bug 13301, however.

        3) Fix the heuristics for Ethernet-without-control-word and hand even
frames with a first nibble of 4 or 6 to an "is this Ethernet without a control
word?" heuristic dissector - if that dissector says "no", dissect the packets
as IPv4 or IPv6.

           That would also fix 13301, but would run the risk of mis-dissecting 
some IPv4 or IPv6 frames as Ethernet-without-control-word.

        4) Fix the heuristics for Ethernet-without-control-word and strengthen 
the first-nibble checks for IPv4 and IPv6 to also check some other fields, such 
as the "protocol" and "next header" fields.

           That would also fix 13301, but would run the risk of mis-dissecting 
some IPv4 or IPv6 frames as Ethernet-without-control-word, although that risk 
might be lower than with 3).
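
As a rough idea of what the strengthened checks in 4) could look like, here is a Python sketch.  It is not existing Wireshark code; the function names, the particular fields checked beyond "protocol" and "next header", and the sets of accepted values are all assumptions picked for illustration.

```python
# Hypothetical stricter "is this IPv4?" / "is this IPv6?" checks.
# The accepted protocol numbers are an arbitrary illustrative subset:
# ICMP, TCP, UDP, GRE, ESP, ICMPv6, OSPF.
KNOWN_IP_PROTOCOLS = {1, 6, 17, 47, 50, 58, 89}
# IPv6 also allows extension headers: hop-by-hop, routing, fragment,
# destination options.
KNOWN_NEXT_HEADERS = KNOWN_IP_PROTOCOLS | {0, 43, 44, 60}

def looks_like_ipv4(payload: bytes) -> bool:
    if len(payload) < 20 or payload[0] >> 4 != 4:
        return False
    ihl = (payload[0] & 0x0F) * 4          # header length in bytes
    total_length = int.from_bytes(payload[2:4], "big")
    if ihl < 20 or total_length < ihl or total_length > len(payload):
        return False
    return payload[9] in KNOWN_IP_PROTOCOLS  # "protocol" field

def looks_like_ipv6(payload: bytes) -> bool:
    if len(payload) < 40 or payload[0] >> 4 != 6:
        return False
    payload_length = int.from_bytes(payload[4:6], "big")
    if 40 + payload_length > len(payload):
        return False
    return payload[6] in KNOWN_NEXT_HEADERS  # "next header" field
```

With checks like these, an Ethernet frame whose destination MAC address is 40:00:00:00:00:01 no longer passes as IPv4, because the low nibble of the first byte gives a header length of zero.  The length comparisons would reject packets cut short by a snapshot length, so a real implementation would probably have to compare against the reported length rather than the captured length.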

We might also want to have a preference to deal with the "first nibble of the 
MAC address is 4 or 6" issue.
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <[email protected]>
Archives:    https://www.wireshark.org/lists/wireshark-dev