Hey Wei,

On Sat, Sep 10, 2022 at 06:34:42AM -0500, Wei Chen wrote:
> Hi,
>
> Jason once suggested use a netfilter module for obfuscation. Here is one.
>
> https://github.com/infinet/xt_wgobfs
>
> It uses SipHash 1-2 to generate pseudo-random numbers in a reproducible way.
> Sender and receiver share a siphash secret key. Sender creates and receiver
> re-creates identical siphash output, if input is same. These siphash outputs
> are used for obfuscation.
>
> - The first 16 bytes of WG message is obfuscated.
> - The mac2 field is also obfuscated if it is all zeros.
> - Padding WG message with random bytes, which also has random length. They are
>   from kernel get_random_bytes_wait() though.
> - Drop 80% of keepalive message at random. Again randomness is from kernel.
> - Change the Diffserv field to zero.
>
> Tested working on Alpine linux kernel 5.15 and CentOS 7 kernel 3.10.
>
> Performance test in two Alpine VMs running on same host. Each VM has 1 CPU and
> 256 MB RAM. Iperf3 results 1.1Gbits/s without,vs 860Mbits/s with obfuscation.

This is super cool! I'm very glad to see that you've made this. A couple
considerations for improvement (take them or leave them):

- Instead of using siphash, if you can make use of 64 bytes of randomness at a
  time, you might be able to get away with chacha8 (or even lower). The input
  to chacha20 is typically a 256-bit key and a nonce, but because we don't
  care about the cryptographic security here -- wireguard handles that part --
  we can play fast and loose, and make our threat model, "would be too
  computationally complex to detect in real time". Things become quite fun
  when you don't need real crypto. To that end, we could perhaps get away with
  using chacha8 instead of chacha20, and doing so with a 128-bit key. This
  then provides lots of input to chacha:

    * 16 bytes, where the second half of that key was
    * 16 bytes nonce (since it doesn't look like you need more than one block)
    * If you really want to play fast and loose: 32-byte constant...

  Again, this is awful cryptographic advice, but from a traffic analysis point
  of view, I doubt it makes a difference. On the other hand, if all you need
  is 16 bytes output, then I guess siphash gets the job done.

- get_random_bytes() is slow if all you need is a byte at a time. That
  computes 96 bytes and then throws away 88 bytes of it. Instead, you can use
  get_random_u32(), which batches, and throw away 3 bytes. Or, I think I'll
  add to kernel 6.1 get_random_u8(), which will waste nothing.

  But actually, do you really need to do that? Can't you just run chacha or
  siphash or whatever super fast non-cryptographic thing you have, and just
  have an incrementing nonce? Or, better, since those keepalive messages
  already have a suitably random poly1305 tag, just run siphash on that, and
  discard if the resultant first byte is high/low/whatever.

- If this is to ever go upstream, you might want to add a `--obfs-type N`
  parameter to the XT userspace library and the IPC struct, and make it
  mandatory.
  To begin, everybody would use `--obfs-type 1`, since that's all there is.
  But maybe over time, you'll add a fake TCP mode or a fake QUIC mode or a
  fake HTML mode, and then the types will grow. This way, maintenance-wise,
  you only have to send updates to the netfilter module in the kernel, and
  don't need to update the libxt part.

Jason
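
To make the keepalive idea above concrete, here is a rough userspace sketch,
not kernel code and not what xt_wgobfs does today. It assumes plain
SipHash-2-4 (the module may well use a different round count), and the names
siphash24() and should_drop_keepalive() as well as the 205/256 threshold are
made up for illustration. The point is only that the drop decision can be
derived from the 16-byte poly1305 tag that a keepalive already carries,
without asking the kernel RNG for anything.

```c
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipHash round over the local state v0..v3. */
#define SIPROUND do { \
	v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32); \
	v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32); \
} while (0)

/* Read 8 bytes as a little-endian 64-bit integer. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* SipHash-2-4 with a 16-byte key; returns the 64-bit hash. */
uint64_t siphash24(const uint8_t key[16], const uint8_t *data, size_t len)
{
	uint64_t k0 = load_le64(key), k1 = load_le64(key + 8);
	uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;
	uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;
	uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;
	uint64_t v3 = k1 ^ 0x7465646279746573ULL;
	const uint8_t *end = data + (len & ~(size_t)7);
	uint64_t b = (uint64_t)len << 56;	/* length goes in the top byte */

	/* Full 8-byte words: two compression rounds each. */
	for (; data != end; data += 8) {
		uint64_t m = load_le64(data);

		v3 ^= m;
		SIPROUND; SIPROUND;
		v0 ^= m;
	}

	/* Remaining 0..7 bytes are folded into the final word. */
	for (size_t i = 0; i < (len & 7); i++)
		b |= (uint64_t)data[i] << (8 * i);
	v3 ^= b;
	SIPROUND; SIPROUND;
	v0 ^= b;

	/* Finalization: four rounds. */
	v2 ^= 0xff;
	SIPROUND; SIPROUND; SIPROUND; SIPROUND;
	return v0 ^ v1 ^ v2 ^ v3;
}

/*
 * Hypothetical helper: decide whether to drop a keepalive from its
 * poly1305 tag. Takes the low byte of the hash and drops when it is
 * below 205, i.e. roughly 80% of the time (205/256).
 */
int should_drop_keepalive(const uint8_t key[16], const uint8_t tag[16])
{
	uint8_t first = (uint8_t)siphash24(key, tag, 16);

	return first < 205;
}
```

The sender would call should_drop_keepalive() with the last 16 bytes of the
32-byte keepalive message. The same tag always gives the same answer, so no
per-packet random state is needed.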