Regarding "genesis" functionality:

If you have a dynamic range defined on your provisioning network (the 
dynamicrange field of the networks table), xCAT will try to discover anything 
that PXE boots on that network. This assumes you have site.dhcpinterfaces set 
to limit DHCP to the provisioning network. To disable automatic discovery, 
remove the dynamic range and run "makedhcp -n".
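
As a rough sketch of those two steps, assuming the dynamic range lives on a 
network object (the name "provnet" is hypothetical; "lsdef -t network" lists 
the real ones) and that clearing the attribute with chdef is acceptable in 
your setup:

```shell
# Sketch only. "provnet" is a hypothetical network object name;
# run "lsdef -t network" to see the names defined on your system.
disable_discovery() {
    net="$1"
    # Clear the dynamic range on the provisioning network object.
    chdef -t network -o "$net" dynamicrange="" || return 1
    # Regenerate the DHCP configuration without the dynamic range.
    makedhcp -n
}
```

You would call it as "disable_discovery provnet". It may be worth saving the 
current value first (for example with "lsdef -t network -o provnet") in case 
you want discovery back later.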

If you are trying to keep recently installed cluster nodes from PXE booting, 
the easiest way is to modify the boot order. With a UEFI install, the OS should 
do this for you by inserting its own boot entry at the top of the list (e.g., 
"Red Hat Enterprise Linux"). Once that entry is there, the node should not PXE 
boot unless the boot entry is corrupted. With legacy boot, you can put the HDD 
before the network (pasu <node> set BootOrder.BootOrder "Legacy Only=Hard 
Disk=PXE Network"). In either case, if you want to reinstall, you can use the 
rsetboot command (or nodeboot in confluent) to PXE on the next reboot.

If recently installed nodes are still being discovered, my guess would be that 
the MAC the node boots from is not the one recorded for it. A MAC might have 
been discovered, but not necessarily the one the node is trying to boot from. 
You can enter multiple MACs for a node and specify NOIP for MACs you don't want 
it to boot from (mac table entry: node,,"1:2:3:4|5:6:7:8!NOIP"). Make sure to 
run makedhcp after doing this.

If all this doesn't work, what does "nodeset <node> status" return? Perhaps the 
node's install status is not getting updated correctly.
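
For the reinstall case, something along these lines might do, once the disk is 
first in the boot order. This is only a sketch: the function name is made up, 
and it assumes the BMC is reachable over IPMI (mgt=ipmi) and that the node 
should be pointed at an existing osimage before the one-time network boot.

```shell
# Sketch only: reinstall_node is a hypothetical helper, not an xCAT command.
reinstall_node() {
    node="$1"
    image="$2"
    # Tell xCAT which osimage the node should install on its next netboot.
    nodeset "$node" osimage="$image" || return 1
    # Ask the BMC to boot from the network on the next boot.
    rsetboot "$node" net || return 1
    # Reboot the node (powers it on if it is currently off).
    rpower "$node" boot
}
```

Example: "reinstall_node node1 centos7.8-x86_64-install-compute". After the 
install, the node should fall back to its normal boot order and come up from 
local disk.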

Regarding the ssh permissions, I'm not sure. It looks like the remoteshell 
postscript is slated to run. Perhaps check /var/log/xcat/ on the node for 
postscript logs to get a hint? What does /install/postscripts/_ssh look like on 
the management node? How about root's .ssh dir also on the management node?

Regarding otherpkgs:

The otherpkgs postscript should be under /install/postscripts along with the 
other postscripts. Things to remember when using otherpkgs:

- The otherpkgdir is a repo. Make sure you run createrepo on the directory 
  after adding packages.
- The otherpkglist is a list of package names, not file names. So you would 
  want "ssh-server" rather than "ssh-server-1.2.3-4.el7.x86_64.rpm". If you 
  want the 32-bit version of a package, you can specify "ssh-server.i686", for 
  example.
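
To illustrate the name-versus-file-name point, here is a small sketch that 
strips the version, release, and architecture from RPM file names. The 
function name is made up, and it assumes the files follow the usual 
name-version-release.arch.rpm layout; "rpm -qp --qf '%{NAME}\n' <file>" is the 
more reliable way to get the name if rpm is available.

```shell
# Sketch only: rpm_names is a hypothetical helper.
# Reads RPM file names on stdin, one per line, and prints the package
# name by removing the trailing -<version>-<release>.<arch>.rpm part.
rpm_names() {
    sed -E 's/-[^-]+-[^-]+\.[^.-]+\.rpm$//'
}
```

For example, "ls <otherpkgdir> | rpm_names" shows the names you could put in 
the otherpkglist. You normally only need to list the top-level packages (such 
as ansible); dependencies should be pulled from the repo, provided createrepo 
has been run on it.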

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

-----Original Message-----
From: James Ault via xCAT-user <xcat-user@lists.sourceforge.net> 
Sent: Sunday, August 9, 2020 23:50
To: xcat-user@lists.sourceforge.net
Cc: James Ault <aul...@yahoo.com>
Subject: [External] [xcat-user] XCAT 2.16 Discovery Mode - Genesis

Hello. 

I am running XCAT 2.16 on CentOS 7 on an air-gapped network and I have a few 
questions:

1) how may I disable the "genesis" discovery functionality?

A previous node management system had a default behavior of booting from 
network, and only the nodes that were marked for install would be installed, 
but all other nodes would boot from local disk with a slight delay. 

Many of these nodes are getting stuck in a boot loop where they boot genesis 
instead of their own OS from local disk, and the genesis process never 
completes, it just loops forever.

I need to make this discovery process work properly or shut it off completely.

One relevant error message on console: 
 "xcat.genesis.minixcatd: The request is already processed by xCAT master, but 
not matched."
(repeats every 5 minutes, forever)


2) I have successfully installed an OS (CentOS 7.8) on a few nodes, but the 
nodes somehow are configured with ssh key files that have permissions "640" 
which cause the sshd to fail and exit with an error, which means I cannot login 
remotely.   This does not seem like a reasonable default.  If there is a 
configuration setting in XCAT that will help me fix this before the OS install 
is finished that would be very helpful.

3) I want to install other packages during the post install process, but any 
attempts so far have not succeeded:

Example typed from printed logs:

xcatmn# lsdef node1

node1:
   arch=x86_64
   bmc=10.10.10.10
   bmcpassword=(enter_password_here)
   cons=ipmi
   consoleenabled=1
   groups=all,x86_64
   mac=1:2:3:4:5:6
   mgt=ipmi
   netboot=xnba
   os=centos7.8
   postbootscripts=otherpkgs 
   postscripts=systlog,remoteshell,syncfiles
   profile=compute
   provmethod=centos7.8-x86_64-install-compute
   routenames=defaultroute
   status=failed
   status=(insert date here)
xcatmn# lsdef -t osimage centos7.8-x86_64-install-compute
  imagetype=linux
  osarch=x86_64
  osdistroname=centos7.8-x86_64
  osname=Linux
  osvers=centos7.8
  otherpkgdir=/install/post/otherpkgs/centos7.8/x86_64
  otherpkglist=/install/custom/pkglist/compute.otherpkglist.txt
  partitionfile=/install/custom/partition/compute-default-partition.txt
  pkgdir=/install/centos7.8/x86_64
  pkglist=/opt/xcat/share/xcat/install/centos/compute.centos7.pkglist
  profile=compute
  provmethod=install
  template=/opt/xcat/share/xcat/install/centos/compute.centos7.tmpl

I cannot find the "otherpkgs" script. 
I attempted to use "chdef" to create an "otherpkgs" attribute on the osimage, 
but it gave an error saying that attribute was only valid for an OS type of 
"NIM" or something that did not match my environment. 
I am trying to install "ansible"; all the dependent packages have been 
gathered and copied to the "otherpkgdir" above, and the list of packages in 
that dir is included in the "otherpkglist".

Sorry if I made typos above, and thanks in advance for your help.
  
-Jim 


_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user

