Hi,
I was at least able to work around this issue (temporarily) by adding the
correct src IP to the routes, as shown in the example below.
What I still don't fully understand is what causes WireGuard to select the IP
from the wrong interface, or on what basis it selects the IP in the first place.
Even ping gives the warning: "ping: Warning: source address might be selected on
device other than wg0."
That warning goes away when the routes have the correct IP set as src.
-> So I can definitely say that WireGuard somehow selects the wrong IP for
outgoing packets.
I just don't know why this happens on only 5 out of more than 20 devices with
the same configuration.
ip route del <NET>
ip route add <NET> dev <ALIAS_DEV> src <SRC_IP>
Information about the interfaces:
root@zi1-router:~# ip a sh wg0
18: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
root@zi1-router:~# ip -4 a sh br0
11: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 78.41.x.y/32 scope global br0
       valid_lft forever preferred_lft forever
root@zi1-router:~# ip -4 a sh br1
12: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 10.34.0.100/24 brd 10.34.0.255 scope global br1
       valid_lft forever preferred_lft forever
root@zi1-router:~# ip r sh dev wg0
10.5.44.0/24 scope link
172.27.0.0/24 scope link
root@zi1-router:~# ip r d 172.27.0.0/24
root@zi1-router:~# ip r d 10.5.44.0/24
root@zi1-router:~# ip r a 172.27.0.0/24 dev wg0 src 10.34.0.100
root@zi1-router:~# ip r a 10.5.44.0/24 dev wg0 src 10.34.0.100
root@zi1-router:~# ip r sh dev wg0
10.5.44.0/24 scope link src 10.34.0.100
172.27.0.0/24 scope link src 10.34.0.100
root@zi1-router:~# ping 172.27.0.1
PING 172.27.0.1 (172.27.0.1) 56(84) bytes of data.
64 bytes from 172.27.0.1: icmp_seq=1 ttl=64 time=13.1 ms
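The manual steps above could be scripted so that the same workaround can be
applied to several routers. The following is only a sketch: gen_src_routes is a
hypothetical helper name, it assumes that the first field of each line of
"ip -4 route show dev <dev>" is the route prefix, and it uses
"ip route replace" instead of the del/add pair used above. It only prints the
commands and changes nothing on the system.

```shell
#!/bin/sh
# Print "ip route replace" commands that pin a source address on every
# route of one device. Reads `ip -4 route show dev <dev>` output on stdin.
# Usage: gen_src_routes <dev> <src-ip>   (hypothetical helper, dry run only)
gen_src_routes() {
    dev=$1
    src=$2
    while read -r prefix rest; do
        # Skip empty lines; the first field is assumed to be the prefix.
        [ -n "$prefix" ] || continue
        printf 'ip route replace %s dev %s src %s\n' "$prefix" "$dev" "$src"
    done
}

# Dry run with the two wg0 routes shown above:
printf '%s\n' '10.5.44.0/24 scope link' '172.27.0.0/24 scope link' |
    gen_src_routes wg0 10.34.0.100
```

On a router the input would come from "ip -4 route show dev wg0", and the
printed commands can be reviewed before running them as root.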
Kind regards,
Christoph
On 19.11.2021 at 01:11, Christoph Loesch wrote:
In case it is relevant, here are some more details about the interfaces and
routes from the good and the bad example, for comparison:
root@eng196-router:~# ip a sh wg0
46: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
root@eng196-router:~# ip r sh dev wg0
10.5.44.0/24 scope link
172.27.0.0/24 scope link
root@eng196-router:~# ip a sh br1
11: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 44:d9:e7:x:y:z brd ff:ff:ff:ff:ff:ff
    inet 10.29.85.100/24 brd 10.29.85.255 scope global br1
       valid_lft forever preferred_lft forever
    inet6 fe80::7c4c:1dff:fe84:fece/64 scope link
       valid_lft forever preferred_lft forever
root@zi1-router:~# ip a sh wg0
18: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
root@zi1-router:~# ip r sh dev wg0
10.5.44.0/24 scope link
172.27.0.0/24 scope link
root@zi1-router:~# ip a sh br1
12: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 74:83:c2:x:y:z brd ff:ff:ff:ff:ff:ff
    inet 10.34.0.100/24 brd 10.34.0.255 scope global br1
       valid_lft forever preferred_lft forever
    inet6 fe80::2c2e:76ff:fedc:d8e/64 scope link
       valid_lft forever preferred_lft forever
On 19.11.2021 at 00:40, Christoph Loesch wrote:
Hi,
I am using WireGuard on about 20 EdgeRouters (based on Debian stretch).
Each router has exactly the same configuration (apart from the router IP
addresses and the WireGuard keys/passphrases).
This works very well on most of them, but on five routers WireGuard uses the
wrong IP address for outgoing connections over the tunnel.
All routers use kernel 4.14.54-UBNT and wireguard-tools v1.0.20210914.
The WireGuard Debian package is from github/WireGuard/wireguard-vyatta-ubnt.
On the problematic routers the public IP address is used for the tunnel instead
of the private IP address.
Interestingly, even in the bad example the wg tunnel is up and the server
can reach the routers (= wg clients), but not the other way round.
In the following examples 172.27.0.1 is the internal IP address of the
WireGuard server.
The routers use IP addresses in the 10.0.0.0/8 range for the wg tunnel, which
are allowed on the server.
I have already debugged this with tcpdump, which showed that the wrong IP is
used.
A simple ping shows the same thing: the wrong IP appears after the word "from".
Good example:
eng196-router:~$ \ping -I wg0 -c1 172.27.0.1
ping: Warning: source address might be selected on device other than wg0.
PING 172.27.0.1 (172.27.0.1) from 10.29.85.100 wg0: 56(84) bytes of data.
64 bytes from 172.27.0.1: icmp_seq=1 ttl=64 time=6.82 ms
--- 172.27.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 6.826/6.826/6.826/0.000 ms
Bad example:
zi1-router:~$ \ping -I wg0 -c1 172.27.0.1
ping: Warning: source address might be selected on device other than wg0.
PING 172.27.0.1 (172.27.0.1) from 78.41.x.y wg0: 56(84) bytes of data.
--- 172.27.0.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
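To compare many routers quickly, the address after "from" can be pulled out of
ping's header line. A minimal sketch, with ping_from as a hypothetical helper
name, assuming the header format shown in the two examples above:

```shell
#!/bin/sh
# Print the source address that ping reports after the word "from"
# in its "PING ..." header line. Reads ping output on stdin.
# ping_from is a hypothetical helper name.
ping_from() {
    awk '$1 == "PING" {
        for (i = 2; i < NF; i++)
            if ($i == "from") { print $(i + 1); exit }
    }'
}

# Header lines taken from the good and the bad example above:
echo 'PING 172.27.0.1 (172.27.0.1) from 10.29.85.100 wg0: 56(84) bytes of data.' | ping_from
echo 'PING 172.27.0.1 (172.27.0.1) from 78.41.x.y wg0: 56(84) bytes of data.' | ping_from
```

On a router this would be used as "ping -I wg0 -c1 172.27.0.1 2>&1 | ping_from";
the warning line is ignored because it does not start with "PING".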
Configurations:
eng196-router:~# wg
interface: wg0
  public key: SoV2obcH0qWfCRY3gZbkLNeMa1QRcnhNDCeiI9weszA=
  private key: (hidden)
  listening port: 58205

peer: 1syRMYD1jIVFMUMm5hF/j0MzjMQmuC5mlcT1VVugIkU=
  preshared key: (hidden)
  endpoint: 86.59.x.y:1024
  allowed ips: 172.27.0.0/24, 10.5.44.0/24
  latest handshake: 53 seconds ago
  transfer: 24.57 MiB received, 26.48 MiB sent
  persistent keepalive: every 25 seconds

zi1-router:~# wg
interface: wg0
  public key: aYtVhblpR0XSsAb/dXF3zM9Hu+LxlvrR5RWFU2psF3M=
  private key: (hidden)
  listening port: 45514

peer: 1syRMYD1jIVFMUMm5hF/j0MzjMQmuC5mlcT1VVugIkU=
  preshared key: (hidden)
  endpoint: 86.59.x.y:51820
  allowed ips: 172.27.0.0/24, 10.5.44.0/24
  latest handshake: 13 seconds ago
  transfer: 1.79 MiB received, 6.26 MiB sent
  persistent keepalive: every 25 seconds

What could cause the wrong address to be selected?
Why does this work for most routers but not for some? I guess there must be
some difference, or something gets confused by specific IP addresses?
How could I debug this further to find the difference and/or the cause of this
problem?
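One possible starting point, as a sketch rather than an answer: "ip route get"
reports the source address the kernel would pick for a destination, so its src
field could be compared between a good and a bad router without sending any
traffic. route_src is a hypothetical helper name, and the sample line below is
made up in the usual "ip route get" output format, not captured from these
routers.

```shell
#!/bin/sh
# Extract the source address from `ip route get <dst>` output, e.g. a line
# like "172.27.0.1 dev wg0 src 10.34.0.100 uid 0". Reads stdin.
# route_src is a hypothetical helper name.
route_src() {
    # Walk the words of each line and print the one following "src".
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "src") { print $(i + 1); exit }
    }'
}

# On a router one would run something like:
#   ip -4 route get 172.27.0.1 | route_src
# Illustration with a made-up sample line (an assumption, not real output):
echo '172.27.0.1 dev wg0 src 10.34.0.100 uid 0' | route_src
```

If the printed address is the public one on the bad routers and the private one
on the good routers, the difference lies in the kernel's route lookup rather
than in the wg configuration itself.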
Thanks for any hints and kind regards,
Christoph