Hi Sandra,
 
I defined the mgt0 node on my MN, deleted `mgt0-pub:172.16.13.99... ...` from `otherinterfaces` in the `hosts` table, and added `nicaliases.eth0=mgt0-mgt` to the node definition. After running `makehosts mgt0`, the `mgt0-pub` and `mgt0-data` entries are still generated in the /etc/hosts file; the domain `cluster.com` comes from my `site` table. I'm curious why you need to define mgt0-pub and mgt0-data in `otherinterfaces` in the `hosts` table.
 
BTW: is there a DHCP server on the eth1 network before the node is provisioned?
 
My example here:
 
[root@bybc0607 ~]# lsdef mgt0
Object name: mgt0
    arch=x86_64
    authdomain=mcri.edu.au
    chain=standby
    conserver=xcat
    currchain=boot
    currstate=boot
    domaintype=activedirectory
    groups=mgt,vm
    hostnames=mgt0
    ip=10.40.113.99
    mac=<snip>
    mgt=esx
    netboot=pxe
    nfsdir=/install
    nfsserver=xcat
    nicaliases.eth0=mgt0-mgt
    nichostnamesuffixes.eth2=-data
    nichostnamesuffixes.eth1=-pub
    nicips.eth2=10.50.113.99
    nicips.eth1=172.16.13.99
    nicips.eth0=10.40.113.99
    nicnetworks.eth2=Data
    nicnetworks.eth1=Public
    nicnetworks.eth0=Management
    nictypes.eth2=Ethernet
    nictypes.eth1=Ethernet
    nictypes.eth0=Ethernet
    os=centos7.5
    ou=<snip>
    postbootscripts=otherpkgs,<snip>
    postscripts=syslog,remoteshell,syncfiles,setupntp,confignics,<snip>
    profile="">
    provmethod=centos7-mgt
    routenames=14NetRoute,MySQLUCSCRoute
    servicenode=xcat
    status=failed
    statustime=10-29-2018 14:14:01
    updatestatus=failed
    updatestatustime=10-29-2018 13:53:40
[root@bybc0607 ~]# makehosts mgt0
 
[root@bybc0607 ~]# grep mgt0 /etc/hosts
10.40.113.99 mgt0 mgt0.cluster.com mgt0-mgt
10.50.113.99 mgt0-data mgt0-data.cluster.com
172.16.13.99 mgt0-pub mgt0-pub.cluster.com

[root@bybc0607 ~]# lsxcatd -v
Version 2.14.4 (git commit 51bd7fea2746d1812aa0eba3d655d63e16b718e2, built Wed Oct 17 06:15:55 EDT 2018)
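
For readers following along, the mapping visible in the output above can be modelled roughly as below. This is a simplified sketch inferred from this one example, not xCAT's actual `makehosts` code: the function name `hosts_lines`, the rule that a NIC without a suffix gets the bare node name, and the sort-by-IP ordering are all assumptions made for illustration.

```python
import ipaddress

def hosts_lines(node, domain, nicips, suffixes=None, aliases=None):
    """Build /etc/hosts-style lines from per-NIC attributes (illustrative model only)."""
    suffixes = suffixes or {}
    aliases = aliases or {}
    lines = []
    # Assumption: order entries by numeric IP, which matches the grep output above.
    for nic, ip in sorted(nicips.items(), key=lambda kv: ipaddress.ip_address(kv[1])):
        # Assumption: a NIC with no hostname suffix maps to the bare node name.
        name = node + suffixes.get(nic, "")
        fields = [ip, name, f"{name}.{domain}"]
        if nic in aliases:
            fields.append(aliases[nic])
        lines.append(" ".join(fields))
    return lines

# Values copied from the lsdef output above.
for line in hosts_lines(
    "mgt0",
    "cluster.com",
    nicips={"eth0": "10.40.113.99", "eth1": "172.16.13.99", "eth2": "10.50.113.99"},
    suffixes={"eth1": "-pub", "eth2": "-data"},
    aliases={"eth0": "mgt0-mgt"},
):
    print(line)
```

The point of the sketch is that the `nicips`, `nichostnamesuffixes` and `nicaliases` attributes are enough to produce all three entries, with nothing in `otherinterfaces`.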
 
 
Best Regards
--------------------------------------------------
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM Ring Building
Building 28, Zhongguancun Software Park, No. 8 Dongbeiwang West Road, Haidian District, Beijing
Postal code: 100193
 
 
----- Original message -----
From: Sandra Maksimovic <sandra.maksimo...@mcri.edu.au>
To: 'xCAT Users Mailing list' <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] unexpected hostname
Date: Thu, Nov 1, 2018 1:29 PM
 

Btw I managed to work around this issue by setting eth1 to use DHCP and eth0 to send DHCP_HOSTNAME using a postscript.
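
The postscript itself is not shown in this thread. As a rough sketch of the eth0 half of the workaround, and assuming it is done by setting `DHCP_HOSTNAME` in the CentOS 7 interface file `/etc/sysconfig/network-scripts/ifcfg-eth0`, the edit could look something like this. The helper name `set_ifcfg_key` and the sample file contents are hypothetical.

```python
def set_ifcfg_key(text, key, value):
    """Return ifcfg text with KEY=value set, replacing an existing KEY line if present."""
    out, found = [], False
    for line in text.splitlines():
        # Compare only the part before the first '=' so values are never matched.
        if line.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"

# Hypothetical ifcfg-eth0 contents, for illustration only.
eth0 = "DEVICE=eth0\nBOOTPROTO=dhcp\nONBOOT=yes\n"
print(set_ifcfg_key(eth0, "DHCP_HOSTNAME", "mgt0"), end="")
```

A real postscript would read and write the file in place and then restart the interface; that part is left out here because it depends on the site's setup.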

 

Cheers,

Sandra

 

From: Sandra Maksimovic
Sent: Tuesday, 30 October 2018 6:30 PM
To: 'xCAT Users Mailing list' <xcat-user@lists.sourceforge.net>
Subject: RE: [xcat-user] unexpected hostname

 

Hi Yuan,

 

Just to let you know, it seems that when I remove otherinterfaces="mgt0-pub:172.16.13.99,mgt0-data:10.50.113.99" from the mgt0 definition, the /etc/hosts file is not regenerated with the mgt0-pub or mgt0-data entries; only mgt0 and its FQDN are listed.

 

The xCAT servicenode should be managing nodes over the 10.40.0.0/24 network; however, I don’t think this has been set up properly because the servicenode table is blank. A lot of this new cluster’s configuration has been carried over from our current prod iteration, so I’m not sure whether some of these definitions are still relevant.

 

The /var/lib/dhclient directory is missing the dhclient.leases file but contains the following:

 

# cat chrony.servers.eth0

10.40.115.100 iburst

 

# cat ntp.conf.predhclient.eth0

<blank>

 

The IP 10.40.115.100 is the management NIC on my xCAT server, which seems to indicate the correct provisioning network…

 

I’ve just noticed that when I run ‘dhclient’ manually on the ‘mgt0-pub’ node the leases file appears along with some others…

 

dhcp-server-identifier on eth0 (which is the mgt/provisioning NIC on the 10.40.0.0 net) is 10.40.115.100

host-name is “mgt0”

 

I’m now wondering what would have stopped this information from being generated during deployment? And would this have managed to impact the hostname?

 

Many thanks,

Sandra

 

From: Yuan Y Bai <by...@cn.ibm.com>
Sent: Monday, 29 October 2018 4:40 PM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] unexpected hostname

 

Hi Sandra

 

From your node definition, `nichostnamesuffixes.eth1=-pub nicips.eth1=172.16.13.99` will generate a `172.16.13.99 mgt0-pub ......` entry in the /etc/hosts file. There is no need to put `mgt0-pub:172.16.13.99` in otherinterfaces.

 

Also, you use a service node, `servicenode=xcat`. Which network does the service node use?

 

Could you log in to `mgt0-pub` and check the lease file under the `/var/lib/dhclient` directory to see what `dhcp-server-identifier` and `host-name` are?

It seems the `mgt0` node gets the hostname `mgt0-pub` from the DHCP server on the 172.xx.xx.xx network. The provisioning network should be the 10.xx.xx.xx network.
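
As a convenience for the check above, here is a minimal sketch that pulls those two fields out of a dhclient lease file. It assumes the usual `lease { ... }` block layout written by ISC dhclient; the function name `parse_leases` is hypothetical and the sample text is illustrative, not taken from the node.

```python
import re

# Matches the lease statements of interest; "option" prefixes the DHCP options.
_FIELD = re.compile(
    r'^\s*(?:option\s+)?(interface|dhcp-server-identifier|host-name)\s+(.+?);\s*$'
)

def parse_leases(text):
    """Return one dict per lease block containing the fields found in it."""
    leases, current = [], None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("lease") and stripped.endswith("{"):
            current = {}
        elif stripped == "}" and current is not None:
            leases.append(current)
            current = None
        elif current is not None:
            m = _FIELD.match(line)
            if m:
                current[m.group(1)] = m.group(2).strip('"')
    return leases

# Illustrative lease text; real files live under /var/lib/dhclient.
SAMPLE = '''
lease {
  interface "eth0";
  fixed-address 10.40.113.99;
  option dhcp-server-identifier 10.40.115.100;
  option host-name "mgt0";
}
'''

for lease in parse_leases(SAMPLE):
    print(lease)
```

If the `host-name` reported for the interface on the 172.xx.xx.xx network is `mgt0-pub`, that would support the theory that the hostname is coming from the other DHCP server.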

 

 

Best Regards
--------------------------------------------------
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM Ring Building
Building 28, Zhongguancun Software Park, No. 8 Dongbeiwang West Road, Haidian District, Beijing
Postal code: 100193

 

 

----- Original message -----
From: Sandra Maksimovic <sandra.maksimo...@mcri.edu.au>
To: "'xcat-user@lists.sourceforge.net'" <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] unexpected hostname
Date: Mon, Oct 29, 2018 12:15 PM
 

Hi Bin,

 

Thanks for your response.

 

mgt0 and mgt0-pub do not point to the same IP address nor are they in the same subnet. Please see the output below:

 

Object name: mgt0
    arch=x86_64
    authdomain=mcri.edu.au
    chain=standby
    conserver=xcat
    currchain=boot
    currstate=boot
    domaintype=activedirectory
    groups=mgt,vm
    hostnames=mgt0
    ip=10.40.113.99
    mac=<snip>
    mgt=esx
    netboot=pxe
    nfsdir=/install
    nfsserver=xcat
    nichostnamesuffixes.eth0=-mgmt
    nichostnamesuffixes.eth1=-pub
    nichostnamesuffixes.eth2=-data
    nicips.eth0=10.40.113.99
    nicips.eth1=172.16.13.99
    nicips.eth2=10.50.113.99
    nicnetworks.eth0=Management
    nicnetworks.eth1=Public
    nicnetworks.eth2=Data
    nictypes.eth0=Ethernet
    nictypes.eth1=Ethernet
    nictypes.eth2=Ethernet
    os=centos7.5
    otherinterfaces=mgt0-pub:172.16.13.99,mgt0-data:10.50.113.99
    ou=<snip>
    postbootscripts=otherpkgs,<snip>
    postscripts=syslog,remoteshell,syncfiles,setupntp,confignics,<snip>
    profile="">
    provmethod=centos7-mgt
    routenames=14NetRoute,MySQLUCSCRoute
    servicenode=xcat
    status=failed
    statustime=10-29-2018 14:14:01
    updatestatus=failed
    updatestatustime=10-29-2018 13:53:40
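
The "not in the same subnet" statement can be checked with a few lines of Python. The thread never states the netmasks, so the /16 prefix below is purely an assumption for illustration, and `same_subnet` is a hypothetical helper name.

```python
import ipaddress

def same_subnet(ip_a, ip_b, prefix_len):
    """True if both addresses fall in the same network for the given prefix length."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b

# Assumption: /16 networks; the real netmasks are not given in this thread.
print(same_subnet("10.40.113.99", "172.16.13.99", 16))   # mgt0 vs mgt0-pub
print(same_subnet("10.40.113.99", "10.50.113.99", 16))   # mgt0 vs mgt0-data
```

Because 10.x and 172.x addresses already differ in their first bit, the mgt0 and mgt0-pub addresses land in different networks whatever prefix length is assumed.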

 

FYI some of our postscripts are failing during deployment which is why the updatestatus=failed.

 

Also, thanks Brian for your suggestion, I shall look into this further regarding the NIC setup. I did a quick test and this doesn’t appear to be what I’m after at this stage since the deployed node’s hostname is unaffected when specifying the nicaliases.

 

Thanks,

Sandra

 

 

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On Friday, October 26, 2018 5:17 PM, Bin XA Xu <bx...@cn.ibm.com> wrote:

 

Hi Sandra,

 

    Are mgt0 and mgt0-pub pointing to the same IP address, or are they in the same subnet? Also, what is your `mgt01` definition? You can use `lsdef mgt01` to get the information and hide the sensitive attributes.

 

    And Yuan, do you have more suggestions?

 

Bin Xu

HPC Software Development
Software Defined Infrastructure, IBM Systems

Phone: 86-010-82454067

 

 

----- Original message -----

From: Sandra Maksimovic via xCAT-user <xcat-user@lists.sourceforge.net>

Cc: Sandra Maksimovic <sm....@pm.me>

Subject: [xcat-user] unexpected hostname

Date: Thu, Oct 25, 2018 11:35 PM

 

Hi all,

 

xCAT/HPC/list newbie here!

 

I have recently configured an xCAT node and am attempting to provision a separate management node, but for some reason xCAT is not applying the expected hostname.

 

I'd like the resulting hostname on the node to just be "mgt0", but instead it's tacking on the public NIC suffix as well as the FQDN, i.e. mgt0-pub.meerkat.mcri.edu.au

 

The cluster is entirely CentOS7 based and will be eventually utilising MOAB and PBS/Torque for scheduling and resource management. The version of xCAT for this particular build is v2.14.4.

 

I've trawled through the debug enabled build logs and stepped through post.rh.common and from what I can tell the node should just be named "mgt0" (sans all suffixes).

 

Also, the DNS on the xCAT node contains entries for "mgt0", "mgt0-data", "mgt0-pub", but (if this is indeed the issue) I'm not sure why xCAT would have selected "mgt0-pub" to hand out when the node is being provisioned via its management IP which is actually associated with "mgt0" (as opposed to its public one which is associated with "mgt0-pub").

 

Any ideas on other avenues that might be worth investigating?

 

Also, please feel free to recommend some useful resources for learning xCAT and/or HPC in general! I'm already heavily utilising the official xCAT docs and the Sourceforge Wiki/mailing list search...

 

Cheers,

Sandra

 


 

 

_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user

 

 

