Hmm. I suspect DNS because some nodes occasionally time out for 20-30 s during ssh, and in my experience that has always turned out to be DNS. After a few consecutive queries the problem eventually resolves by itself.
When connecting with debug output, it hangs here:

debug1: Authentication succeeded (publickey).
Authenticated to cn001 ([172.28.0.1]:22).
debug1: channel 0: new [client-session]
debug1: Requesting [email protected]
debug1: Entering interactive session.
debug1: pledge: network

nslookup was just a way to test whether everything is fine.

Another issue with Confluent is that if I use an external name server in /etc/resolv.conf, internal names obviously no longer resolve. On xCAT, since it shipped bind9, both internal and external queries always worked without any issues.

What is the expectation with Confluent? Is my setup correct? To be honest, it seems pretty unreliable.

Thanks Jarrod,

On 22 Oct 2024, at 11:35, Jarrod Johnson <[email protected]> wrote:

Oh, I see.

[root@r3u20 ~]# nslookup -query=A r3u21.devcluster.net
Server: 127.0.0.1
Address: 127.0.0.1#53

Name: r3u21.devcluster.net
Address: 172.30.193.21

[root@r3u20 ~]# nslookup -query=AAAA r3u21.devcluster.net
Server: 127.0.0.1
Address: 127.0.0.1#53

** server can't find r3u21.devcluster.net: REFUSED

By default, nslookup does an A query and then an AAAA query, and dnsmasq is refusing the AAAA. If you don't want that, use -query=A.
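
As a rough way to spot which names would hit this, the sketch below (plain Python, not part of Confluent or dnsmasq) scans hosts-file text and lists names that have an IPv4 entry but no IPv6 entry. It assumes dnsmasq is answering only from /etc/hosts with no usable upstream server, which the dnsmasq log quoted later in the thread suggests ("ignoring nameserver 127.0.0.1 - local interface"). The function name and the two r3u21 sample lines are illustrative assumptions; the cn003 lines are the ones shown in the thread.

```python
import ipaddress


def names_missing_ipv6(hosts_text):
    """Return names that have an IPv4 entry but no IPv6 entry in hosts-file text."""
    v4_names, v6_names = set(), set()
    for raw in hosts_text.splitlines():
        # Drop trailing comments and skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        try:
            # Strip any IPv6 zone suffix (e.g. "%eth0") before parsing.
            version = ipaddress.ip_address(address.split("%", 1)[0]).version
        except ValueError:
            continue  # first field is not an IP address; ignore the line
        (v4_names if version == 4 else v6_names).update(names)
    return sorted(v4_names - v6_names)


# Sample input: the r3u21 lines are hypothetical, the cn003 lines come from the thread.
sample = """\
172.30.193.21 r3u21 r3u21.devcluster.net
fdec:46f7:9b7f:3001::3:21 r3u21 r3u21.devcluster.net
172.28.0.3 cn003 cn003.cluster.domain.com
172.27.0.3 cn003-ib0 cn003-ib0.cluster.domain.com
"""

print(names_missing_ipv6(sample))
```

To check a real management node, the same function can be called on the contents of /etc/hosts, for example `names_missing_ipv6(open("/etc/hosts").read())`. Names it reports are candidates for a refused AAAA lookup under the assumption above; it does not query dnsmasq itself.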

[root@r3u20 ~]# nslookup r3u21
Server: 127.0.0.1
Address: 127.0.0.1#53

Name: r3u21.devcluster.net
Address: 172.30.193.21
** server can't find r3u21.devcluster.net: REFUSED

[root@r3u20 ~]# fg
vim /etc/hosts
[root@r3u20 ~]# systemctl restart dnsmasq
[root@r3u20 ~]# nslookup r3u21
Server: 127.0.0.1
Address: 127.0.0.1#53

Name: r3u21.devcluster.net
Address: 172.30.193.21
Name: r3u21.devcluster.net
Address: fdec:46f7:9b7f:3001::3:21

________________________________
From: Vinícius Ferrão <[email protected]>
Sent: Tuesday, October 22, 2024 9:33 AM
To: Jarrod Johnson <[email protected]>
Cc: xCAT Users Mailing list <[email protected]>
Subject: Re: [External] [xcat-user] Confluent dnsmasq not resolving local names

Hi Jarrod,

[root@mmgt01 ~]# grep cn003 /etc/hosts
172.28.0.3 cn003 cn003.cluster.domain.com
172.27.0.3 cn003-ib0 cn003-ib0.cluster.domain.com

On 22 Oct 2024, at 08:44, Jarrod Johnson <[email protected]> wrote:

What does:

grep cn003 /etc/hosts

show?

________________________________
From: Vinícius Ferrão via xCAT-user <[email protected]>
Sent: Monday, October 21, 2024 11:40 PM
To: xCAT Users Mailing list <[email protected]>
Cc: Vinícius Ferrão <[email protected]>
Subject: [External] [xcat-user] Confluent dnsmasq not resolving local names

Hello,

I'm running Confluent on a cluster and it seems that dnsmasq does not work as expected:

[root@mmgt01 etc]# !nslookup
nslookup cn003
Server: 127.0.0.1
Address: 127.0.0.1#53

Name: cn003.cluster.domain.com
Address: 172.28.0.3
** server can't find cn003.cluster.domain.com: REFUSED

Any idea why it gets refused but still resolves? This issue often slows down name resolution.

Here's /var/log/messages:

Oct 22 00:29:02 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:29:02 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:29:02 mmgt01 dnsmasq[400928]: read /etc/hosts - 181 addresses
Oct 22 00:32:14 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:32:14 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:32:15 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:32:15 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:32:53 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:32:53 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:32:53 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:32:53 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:34:32 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:34:32 mmgt01 dnsmasq[400928]: ignoring nameserver 127.0.0.1 - local interface
Oct 22 00:34:32 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:34:32 mmgt01 dnsmasq[400928]: ignoring nameserver 127.0.0.1 - local interface
Oct 22 00:34:38 mmgt01 dnsmasq[400928]: exiting on receipt of SIGTERM
Oct 22 00:34:38 mmgt01 systemd[1]: dnsmasq.service: Succeeded.
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: started, version 2.79 cachesize 150
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth DNSSEC loop-detect inotify
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: reading /etc/resolv.conf
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: ignoring nameserver 127.0.0.1 - local interface
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: read /etc/hosts - 181 addresses

And here's /etc/resolv.conf:

[root@mmgt01 etc]# cat /etc/resolv.conf
# Generated by NetworkManager
search cluster.domain.com
nameserver 127.0.0.1

The dnsmasq settings are the defaults, per the Confluent instructions:

[root@mmgt01 etc]# cat /etc/dnsmasq.conf | grep -v \# |uniq
user=dnsmasq
group=dnsmasq
conf-dir=/etc/dnsmasq.d,.rpmnew,.rpmsave,.rpmorig

mmgt01 is the management node.

Ideas?

Regards,

_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user
