Hey Toke, Thanks for fixing this. I wasn't aware there was a newer ECN RFC. A few comments below:
On Mon, Apr 27, 2020 at 8:47 AM Toke Høiland-Jørgensen <[email protected]> wrote: > RFC6040 also recommends dropping packets on certain combinations of > erroneous code points on the inner and outer packet headers which shouldn't > appear in normal operation. The helper signals this by a return value > 1, > so also add a handler for this case. This worries me. In the old implementation, we propagate some outer header data to the inner header, which is technically an authenticity violation, but minor enough that we let it slide. This patch here seems to make that violation a bit worse: namely, we're now changing the behavior based on a combination of outer header + inner header. An attacker can manipulate the outer header (set it to CE) in order to learn whether the inner header was CE or not, based on whether or not the packet gets dropped, which is often observable. That's some form of an oracle, which I'm not too keen on having in wireguard. On the other hand, we pretty much already _explicitly leak this bit_ on tx side -- in send.c: PACKET_CB(skb)->ds = ip_tunnel_ecn_encap(0, ip_hdr(skb), skb); // inner packet ... wg_socket_send_skb_to_peer(peer, skb, PACKET_CB(skb)->ds); // outer packet We considered that leak a-okay. But a decryption oracle seems slightly worse than an explicit and intentional leak. But maybe not that much worse. I wanted to check with you: is the analysis above correct? And can you somehow imagine the ==2 case leading to different behavior, in which the packet isn't dropped? Or would that ruin the "[de]congestion" part of ECN? I just want to make sure I understand the full picture before moving in one direction or another. > + if (INET_ECN_decapsulate(skb, PACKET_CB(skb)->ds, > + ip_hdr(skb)->tos) > 1) > + if (INET_ECN_decapsulate(skb, PACKET_CB(skb)->ds, > + ipv6_get_dsfield(ipv6_hdr(skb))) > 1) The documentation for the function says: * returns 0 on success * 1 if something is broken and should be logged (!!! above) * 2 if packet should be dropped Would it be more clear to explicitly check for ==2 then? > +ecn_decap_error: > + net_dbg_ratelimited("%s: Non-ECT packet from peer %llu (%pISpfsc)\n", > + dev->name, peer->internal_id, > &peer->endpoint.addr); All the other error messages in this block are in the form of: "Packet .* from peer %llu (%pISpfsc)\n", so better text here might be "Packet is non-ECT from peer %llu (%pISpfsc)\n". However, do you think we really need to log to the console for this? Does it add much in the way of wireguard internals debugging? Can't network congestion induce it in complicated scenarios? Maybe it'd be best just to increment those error counters instead. > + ++dev->stats.rx_errors; > + ++dev->stats.rx_length_errors; This should use stats.rx_frame_errors instead of length_errors, which is also what net/ipv6/sit.c and drivers/net/geneve.c do on ECN-related drops. Thanks, Jason
