By default (at least with the packages from EL-based repos), dnsmasq reads 
/etc/resolv.conf and forwards external requests to the upstream DNS servers 
configured there. The behavior should then be quite similar to xCAT: you can 
resolve both local and external addresses via dnsmasq.
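
As a rough way to see which upstream servers dnsmasq would actually be left 
with, here is a small Python sketch. It parses resolv.conf content and drops 
loopback entries plus any addresses you tell it belong to the local machine 
(dnsmasq logs such entries as "ignoring nameserver ... - local interface"). 
The function name upstream_nameservers and the local_addresses parameter are 
made up for this illustration; this is not how dnsmasq itself implements it.

```python
import ipaddress


def upstream_nameservers(resolv_conf_text, local_addresses=()):
    """Return nameserver entries that are neither loopback nor listed as local.

    local_addresses is a hypothetical parameter: pass the addresses configured
    on this host's own interfaces, since this sketch does not discover them.
    """
    local = {ipaddress.ip_address(a) for a in local_addresses}
    upstream = []
    for line in resolv_conf_text.splitlines():
        # Drop comments ('#' or ';') and split the rest into fields.
        fields = line.split("#", 1)[0].split(";", 1)[0].split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        try:
            addr = ipaddress.ip_address(fields[1])
        except ValueError:
            continue  # skip entries this sketch cannot parse
        if addr.is_loopback or addr in local:
            continue
        upstream.append(str(addr))
    return upstream
```

If this returns an empty list for your /etc/resolv.conf, dnsmasq most likely 
has no upstream server to forward to and can only answer from /etc/hosts.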

Furthermore, you could also configure dnsmasq as an authoritative server, so 
that your internal DNS records are zone-transferred to your upstream DNS.

That would look something like this:

interface=ens123
auth-server=confluent.cluster.domain.com,ens123
auth-soa=0,hostmas...@cluster.domain.com,21600,3600,2419200
auth-ttl=1200
auth-sec-servers=dns1.cluster.domain.com,dns2.cluster.domain.com
auth-peer=10.10.0.1,10.10.0.2
auth-zone=cluster.domain.com,10.20.0.0/24,10.30.0.0/24


Kind regards,

Markus Hilger
HPC Engineer
MEGWARE Computer Vertrieb und Service GmbH
Nordstraße 19, 09247 Chemnitz-Röhrsdorf, Germany
Tel: +49 3722 528-47
markus.hil...@megware.com
www.megware.com
Managing directors: André Singer, Dr. Axel Auweter
Registry court: Amtsgericht Chemnitz, HRB 584

________________________________
From: Vinícius Ferrão via xCAT-user <xcat-user@lists.sourceforge.net>
Sent: Wednesday, October 23, 2024 00:56
To: Jarrod Johnson <jjohns...@lenovo.com>
Cc: Vinícius Ferrão <fer...@versatushpc.com.br>; xCAT Users Mailing list 
<xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] [External] Confluent dnsmasq not resolving local names

Hmmm.

I suspect DNS because some nodes occasionally hang for 20-30 s during ssh, 
and in my experience that has always turned out to be DNS. After several 
consecutive queries it eventually resolves itself.

When connecting with debug it hangs here:
debug1: Authentication succeeded (publickey).
Authenticated to cn001 ([172.28.0.1]:22).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessi...@openssh.com
debug1: Entering interactive session.
debug1: pledge: network

nslookup was just a way to test whether everything is fine.

Another issue with Confluent is that if I use an external NS in 
/etc/resolv.conf, it obviously no longer resolves internal names. On xCAT, 
since it shipped bind9, both internal and external queries always worked 
without any issues.

What is the expectation with Confluent? Is my setup correct? It seems pretty 
unreliable, to be honest.

Thanks Jarrod,


On 22 Oct 2024, at 11:35, Jarrod Johnson <jjohns...@lenovo.com> wrote:

Oh, I see.

[root@r3u20 ~]# nslookup -query=A r3u21.devcluster.net
Server:         127.0.0.1
Address:        127.0.0.1#53

Name:   r3u21.devcluster.net
Address: 172.30.193.21

[root@r3u20 ~]# nslookup -query=AAAA r3u21.devcluster.net
Server:         127.0.0.1
Address:        127.0.0.1#53

** server can't find r3u21.devcluster.net: REFUSED

By default, nslookup sends an A query and then an AAAA query, and dnsmasq is 
refusing the AAAA one. If you don't want that, use -query=A.
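
To see the two response codes separately without nslookup, a minimal Python 
sketch along these lines could be used. It builds a bare DNS query by hand 
(header plus one question, per RFC 1035), sends it over UDP, and reads the 
RCODE from the reply. The helper names build_query, parse_rcode and query are 
made up for this illustration, and the default server 127.0.0.1:53 is an 
assumption taken from the nslookup output above.

```python
import socket
import struct

QTYPE = {"A": 1, "AAAA": 28}
RCODE = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype, txid=0x1234):
    """Build a minimal DNS query packet with a single question."""
    # Header: id, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Question: encoded name, root label, QTYPE, QCLASS=IN (1)
    return header + labels + b"\x00" + struct.pack("!HH", QTYPE[qtype], 1)


def parse_rcode(response):
    """Return the response code name from the low 4 bits of the flags field."""
    if len(response) < 12:
        raise ValueError("response shorter than a DNS header")
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODE.get(code, str(code))


def query(name, qtype, server="127.0.0.1", port=53, timeout=2.0):
    """Send one UDP query and return the rcode name, or 'NO RESPONSE'.

    Sketch only: the transaction id of the reply is not verified.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name, qtype), (server, port))
            response, _ = sock.recvfrom(512)
        except OSError:
            return "NO RESPONSE"
    return parse_rcode(response)
```

Calling query("r3u21.devcluster.net", "A") and then the same with "AAAA" 
should show whether only the AAAA lookup is being refused.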

[root@r3u20 ~]# nslookup r3u21
Server:         127.0.0.1
Address:        127.0.0.1#53

Name:   r3u21.devcluster.net
Address: 172.30.193.21
** server can't find r3u21.devcluster.net: REFUSED

[root@r3u20 ~]# fg
vim /etc/hosts
[root@r3u20 ~]# systemctl restart dnsmasq
[root@r3u20 ~]# nslookup r3u21
Server:         127.0.0.1
Address:        127.0.0.1#53

Name:   r3u21.devcluster.net
Address: 172.30.193.21
Name:   r3u21.devcluster.net
Address: fdec:46f7:9b7f:3001::3:21
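
The session above suggests that once /etc/hosts also carries an IPv6 entry 
for the node, dnsmasq can answer the AAAA query locally. Assuming that 
reading is right, a minimal Python sketch like the following could list the 
names in a hosts file that only have IPv4 entries. names_without_ipv6 is a 
hypothetical helper name, not part of Confluent or dnsmasq.

```python
import ipaddress
from collections import defaultdict


def names_without_ipv6(hosts_text):
    """Return the sorted names that appear only on IPv4 lines of a hosts file."""
    families = defaultdict(set)
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # strip comments, split on whitespace
        if len(fields) < 2:
            continue
        try:
            version = ipaddress.ip_address(fields[0]).version
        except ValueError:
            continue  # first field is not an IP address
        for name in fields[1:]:
            families[name].add(version)
    return sorted(name for name, seen in families.items() if seen == {4})
```

It can be fed the content of /etc/hosts, for example 
names_without_ipv6(open("/etc/hosts").read()).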




________________________________
From: Vinícius Ferrão <fer...@versatushpc.com.br>
Sent: Tuesday, October 22, 2024 9:33 AM
To: Jarrod Johnson <jjohns...@lenovo.com>
Cc: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [External] [xcat-user] Confluent dnsmasq not resolving local names

Hi Jarrod,

[root@mmgt01 ~]# grep cn003 /etc/hosts
172.28.0.3 cn003 cn003.cluster.domain.com
172.27.0.3    cn003-ib0   cn003-ib0.cluster.domain.com

On 22 Oct 2024, at 08:44, Jarrod Johnson <jjohns...@lenovo.com> wrote:

What does the following show?

grep cn003 /etc/hosts
________________________________

From: Vinícius Ferrão via xCAT-user <xcat-user@lists.sourceforge.net>
Sent: Monday, October 21, 2024 11:40 PM
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc: Vinícius Ferrão <fer...@versatushpc.com.br>
Subject: [External] [xcat-user] Confluent dnsmasq not resolving local names

Hello,

I'm running Confluent on a cluster and it seems that dnsmasq does not work as 
expected:

[root@mmgt01 etc]# !nslookup
nslookup cn003
Server: 127.0.0.1
Address: 127.0.0.1#53

Name: cn003.cluster.domain.com
Address: 172.28.0.3
** server can't find cn003.cluster.domain.com: REFUSED


Any idea why it gets refused but still resolves? This issue often slows down 
name resolution.

Here's /var/log/messages:

Oct 22 00:29:02 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:29:02 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:29:02 mmgt01 dnsmasq[400928]: read /etc/hosts - 181 addresses
Oct 22 00:32:14 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:32:14 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:32:15 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:32:15 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:32:53 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:32:53 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:32:53 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:32:53 mmgt01 dnsmasq[400928]: ignoring nameserver 172.28.253.1 - local interface
Oct 22 00:34:32 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:34:32 mmgt01 dnsmasq[400928]: ignoring nameserver 127.0.0.1 - local interface
Oct 22 00:34:32 mmgt01 dnsmasq[400928]: reading /etc/resolv.conf
Oct 22 00:34:32 mmgt01 dnsmasq[400928]: ignoring nameserver 127.0.0.1 - local interface
Oct 22 00:34:38 mmgt01 dnsmasq[400928]: exiting on receipt of SIGTERM
Oct 22 00:34:38 mmgt01 systemd[1]: dnsmasq.service: Succeeded.
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: started, version 2.79 cachesize 150
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth DNSSEC loop-detect inotify
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: reading /etc/resolv.conf
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: ignoring nameserver 127.0.0.1 - local interface
Oct 22 00:34:38 mmgt01 dnsmasq[405484]: read /etc/hosts - 181 addresses


And here's /etc/resolv.conf:

[root@mmgt01 etc]# cat /etc/resolv.conf
# Generated by NetworkManager
search cluster.domain.com
nameserver 127.0.0.1



The dnsmasq settings are the defaults from the Confluent instructions:
[root@mmgt01 etc]# cat /etc/dnsmasq.conf  | grep -v \# |uniq

user=dnsmasq
group=dnsmasq

conf-dir=/etc/dnsmasq.d,.rpmnew,.rpmsave,.rpmorig




mmgt01 is the management node.


Ideas?

Regards,





_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
