Vinícius,
When I'm working with otherpkgs, I usually do the following:
1. Do a 'yum install <packages>' with the "don't actually install
anything" flag, just to get a list of the packages that are going to be
installed, including dependencies. This should also show any updates or
deletions that will happen.
2. If everything looks good, run 'yum install --downloadonly
--downloaddir=<some temp dir or your otherpkgs dir> <package names>'.
That will download the RPMs with deps locally.
3. Add the package names to the otherpkgs list and run createrepo on
the otherpkgs dir to pick up the new RPMs.
4. Test with 'updatenode -P otherpkgs <nodename>' or redeploy (best).
Probably not the only way to do it, but that works consistently for me.
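The steps above can be sketched as a small shell function. This is only a sketch under assumptions: --assumeno is a guess at which "don't actually install anything" flag is meant (it prints the transaction and then answers "no" at the prompt), the name stage_otherpkgs is made up for illustration, and the otherpkgs directory is passed in rather than assumed.

```shell
# Sketch only: stage extra RPMs into an xCAT otherpkgs directory on the head node.
# stage_otherpkgs is a hypothetical helper name, not an xCAT command.
# Usage: stage_otherpkgs <otherpkgdir> <package> [<package> ...]
stage_otherpkgs() {
    local otherpkgdir=$1
    shift

    # 1. Preview the transaction. --assumeno is an assumption about which
    #    "don't actually install" flag is meant; yum may exit non-zero after
    #    declining, so the exit status is ignored here.
    yum install --assumeno "$@" || true

    # 2. Download the packages and their dependencies without installing them.
    yum install -y --downloadonly --downloaddir="$otherpkgdir" "$@"

    # 3. Rebuild the repository metadata so the new RPMs are visible.
    createrepo "$otherpkgdir"
}
```

After a call such as 'stage_otherpkgs /install/post/otherpkgs/ol8.4.0/x86_64 fping libunwind', the package names still have to be added to the otherpkgs list by hand before testing on a node.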
So in short, the extra repos would only be on the head node, for the
purpose of downloading RPMs locally. The only repos the compute nodes
should look at are the xCAT repos on the head node (for OS and
otherpkgs). That's just a general HPC best practice, but edit to suit
your taste. For the most part, compute nodes should not talk to the
internet, but you can use firewalls to route requests through the head
node to talk to other repos and still retain some security.
-Brian J
On 6/6/21 01:46, Vinícius Ferrão via xCAT-user wrote:
Alright, there are some gotchas.
* pkgdir can in fact only be used for packages available from the
distro itself (copycds); perhaps updates would be viable?
* I've changed to a scheme like the one Brian described: otherpkgs, but
it's still confusing.
I wasn't able to use my "dnf reposync" mirrors with otherpkgdir.
Instead I've added them to pkgdir, just like in the last message, and I
was able to install the packages that are listed in otherpkglist.
Here are my definitions right now:
# lsdef -t osimage ol8.4.0-x86_64-install-compute
Object name: ol8.4.0-x86_64-install-compute
    imagetype=linux
    osarch=x86_64
    osdistroname=ol8.4.0-x86_64
    osname=Linux
    osvers=ol8.4.0
    otherpkgdir=/install/post/otherpkgs/ol8.4.0/x86_64
    otherpkglist=/install/custom/install/compute.otherpkgs.pkglist
    pkgdir=/install/ol8.4.0/x86_64,/install/openhpc-2.2/CentOS_8,/install/epel-8/x86_64
    pkglist=/install/custom/install/compute.pkglist
    profile=compute
    provmethod=install
    synclists=/install/custom/install/compute.synclist
    template=/opt/xcat/share/xcat/install/ol/compute.ol8.tmpl
/install/post/otherpkgs/ol8.4.0/x86_64 is just an empty folder on which
I've run createrepo, with nothing in it, so dnf/yum will not break on
the first boot.
That's the content of: /install/custom/install/compute.otherpkgs.pkglist:
fping
libconfuse
libunwind
ohpc-base-compute
lmod-ohpc
So it seems OK, but again, I'm not sure if this is correct or not. I
was trying to keep the files in otherpkgdir, but xCAT couldn't create
proper repositories, probably because I messed up the otherpkgs.pkglist.
On the first try I created the file with the following contents:
epel-8/x86_64/fping
epel-8/x86_64/libconfuse
epel-8/x86_64/libunwind
openhpc-2.2/CentOS_8/updates/x86_64/ohpc-base-compute
openhpc-2.2/CentOS_8/x86_64/lmod-ohpc
But that was a no go. It just didn't work: xCAT only created a broken
repo file that mashes the EPEL and OpenHPC paths into a single URL.
I still have the issue regarding the online repos, but I just put an
rm -f in a postscript to "fix" it. Definitely not "The Right Way (tm)"
to do it.
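A gentler alternative to deleting the files would be a postscript that flips enabled=1 to enabled=0 in the unwanted repo files. The following is a minimal sketch, assuming GNU sed and that the files live in a directory such as /etc/yum.repos.d. The function name is hypothetical, and the thread does not say which repo files the installer leaves behind, so the file names must be supplied by the caller.

```shell
# disable_repo_files is a hypothetical helper, not part of xCAT.
# Usage: disable_repo_files <repo dir> <file name> [<file name> ...]
disable_repo_files() {
    local repodir=$1
    local name f
    shift
    for name in "$@"; do
        f="$repodir/$name"
        # Skip names that do not exist on this node.
        [ -f "$f" ] || continue
        # Turn every "enabled=1" line into "enabled=0". Sections without an
        # explicit enabled= line are left as they are (and default to enabled).
        sed -i 's/^enabled[[:space:]]*=[[:space:]]*1[[:space:]]*$/enabled=0/' "$f"
    done
}
```

Unlike rm -f, this keeps the repo definitions on the node, so they can be switched back on later if the node ever gets a route to the internet.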
Thanks again,
Vinícius.
On 6 Jun 2021, at 00:52, Vinícius Ferrão <fer...@versatushpc.com.br> wrote:
Thanks Mark and Brian.
I'm trying to find my way around. Right now I've modified the following:
===> observe: pkgdir, pkglist and synclists.
# lsdef -t osimage ol8.4.0-x86_64-install-compute
Object name: ol8.4.0-x86_64-install-compute
    imagetype=linux
    osarch=x86_64
    osdistroname=ol8.4.0-x86_64
    osname=Linux
    osvers=ol8.4.0
    otherpkgdir=/install/post/otherpkgs/ol8.4.0/x86_64
    pkgdir=/install/ol8.4.0/x86_64,/install/openhpc-2.2/CentOS_8,/install/epel-8/x86_64
    pkglist=/opt/xcat/share/xcat/install/ol/compute.ol8.pkglist,/install/custom/install/compute.pkglist
    profile=compute
    provmethod=install
    synclists=/install/custom/install/compute.synclist
    template=/opt/xcat/share/xcat/install/ol/compute.ol8.tmpl
# cat /install/custom/install/compute.pkglist
yum-utils
perl
fping
libconfuse
libunwind
ohpc-base-compute
kernel-uek
lmod-ohpc
@infiniband
# cat /install/custom/install/compute.synclist
MERGE:
/etc/passwd -> /etc/passwd
/etc/group -> /etc/group
/etc/shadow -> /etc/shadow
The issue now is that pkglist seems to have been ignored. I think I
should have added the extra packages to otherpkgs instead. Right?
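A sketch of what moving the extra packages over to otherpkgs could look like. chdef and the otherpkglist attribute are real xCAT pieces. The helper name set_otherpkglist is made up for illustration, and which packages belong in this list rather than in pkglist is left to the caller.

```shell
# set_otherpkglist is a hypothetical helper name, not an xCAT command.
# Usage: set_otherpkglist <osimage> <otherpkglist file> <package> [<package> ...]
set_otherpkglist() {
    local osimage=$1
    local listfile=$2
    shift 2

    # Make sure the directory for the list file exists.
    mkdir -p "$(dirname "$listfile")" || return 1

    # Write one package name per line.
    printf '%s\n' "$@" > "$listfile" || return 1

    # Point the osimage definition at the list.
    chdef -t osimage -o "$osimage" otherpkglist="$listfile"
}
```

For example: 'set_otherpkglist ol8.4.0-x86_64-install-compute /install/custom/install/compute.otherpkgs.pkglist fping libconfuse libunwind ohpc-base-compute lmod-ohpc'. The RPMs themselves still have to be reachable through otherpkgdir (or pkgdir) for the install to succeed.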
The postscript seems to be ignored too:
===> observe: postscripts
[root@headnode epel-8]# lsdef -t node node01
Object name: node01
    arch=x86_64
    bmc=172.25.0.1
    bmcpassword=calvin
    bmcusername=root
    cons=ipmi
    consoleenabled=1
    currchain=boot
    currstate=boot
    groups=compute,all
    ip=172.26.0.1
    mac=bc:97:e1:ca:35:10
    mgt=ipmi
    netboot=xnba
    nicips.ib0=172.27.0.1
    nicnetworks.ib0=ib0
    nictypes.ib0=Infiniband
    os=ol8.4.0
    postbootscripts=otherpkgs,confignics
    postscripts=syslog,remoteshell,syncfiles,versatushpc/postinstall
    profile=compute
    provmethod=ol8.4.0-x86_64-install-compute
    serialport=0
    serialspeed=115200
    status=failed
    statustime=06-06-2021 00:11:15
# cat /install/postscripts/versatushpc/postinstall
exec 1> >(logger -s -t xCAT -p local4.info) 2>&1
# Create directories
mkdir -p /opt/spack
mkdir -p /opt/intel
# Configure limits
perl -pi -e 's/# End of file/\* soft memlock unlimited\n$&/s' /etc/security/limits.conf
perl -pi -e 's/# End of file/\* hard memlock unlimited\n$&/s' /etc/security/limits.conf
# Enable RDMA if it isn't enabled yet and start it
systemctl enable --now rdma
# Configure and enable OpenPBS
perl -pi -e "s/PBS_SERVER=\S+/PBS_SERVER=headnode/" /etc/pbs.conf
echo "PBS_LEAF_NAME=headnode" >> /etc/pbs.conf
/opt/pbs/libexec/pbs_habitat
perl -pi -e "s/\$clienthost \S+/\$clienthost headnode/" /var/spool/pbs/mom_priv/config
echo "\$usecp *:/home /home" >> /var/spool/pbs/mom_priv/config
systemctl enable --now pbs
I'm not sure whether the postscript didn't run because something went
wrong in the install phase or not.
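One way to tell the two cases apart is to re-run the postscript by hand once the node is up and then read what it logged. A sketch, assuming the node is reachable from the head node: updatenode and xdsh are real xCAT commands, the helper name is made up, and the node-side log path /var/log/xcat/xcat.log is an assumption that may differ between xCAT versions.

```shell
# rerun_postscript is a hypothetical helper name, not an xCAT command.
# Usage: rerun_postscript <noderange> <script, relative to /install/postscripts>
rerun_postscript() {
    local noderange=$1
    local script=$2

    # Re-run only the named postscript on the already-installed node.
    updatenode "$noderange" -P "$script" || return 1

    # Show the tail of the node-side log (the path is an assumption).
    xdsh "$noderange" 'tail -n 50 /var/log/xcat/xcat.log'
}
```

For the node above that would be 'rerun_postscript node01 versatushpc/postinstall'. If the script works when run this way, the problem is more likely in the install phase than in the script itself.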
And as a last question: for stateful nodes, shouldn't internet
repositories be disabled by default? I'm asking this because the nodes
don't have an internet connection.
Thanks again.
PS: For me stateless mode seems way easier, but this is probably due
to the fact that I'm used to it.
On 4 Jun 2021, at 11:05, Mark Gurevich <gurev...@us.ibm.com> wrote:
For #4, you can also include one pkglist inside another with "INCLUDE":
[]$ cat cudafull.rhels7.ppc64le.pkglist
#INCLUDE:compute.rhels7.pkglist#
#For Cuda
kernel-devel
gcc
pciutils
dkms
cuda
[]$
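Applied to the osimage discussed in this thread, that could look like the sketch below: a custom pkglist that pulls in the stock one through an INCLUDE line, so the file shipped under /opt/xcat stays untouched. The helper name and the use of an absolute path inside the INCLUDE line are assumptions.

```shell
# write_custom_pkglist is a hypothetical helper name, not an xCAT command.
# Usage: write_custom_pkglist <output file> <stock pkglist> <package> [<package> ...]
write_custom_pkglist() {
    local out=$1
    local stock=$2
    shift 2
    {
        # Pull in the stock list first, using the #INCLUDE:...# syntax.
        printf '#INCLUDE:%s#\n' "$stock"
        # Then one extra package (or @group) per line.
        printf '%s\n' "$@"
    } > "$out"
}
```

For example: 'write_custom_pkglist /install/custom/install/compute.pkglist /opt/xcat/share/xcat/install/ol/compute.ol8.pkglist yum-utils perl kernel-uek @infiniband', after which the osimage pkglist attribute only needs to name the custom file.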
Mark Gurevich
Poughkeepsie Development Lab
HPC Software Development - xCAT
"If we knew what it was we were doing, it would not be called
research, would it?"
--Albert Einstein
From: Brian Joiner <martinitime1...@gmail.com>
To: xcat-user@lists.sourceforge.net
Date: 06/04/2021 09:32 AM
Subject: [EXTERNAL] Re: [xcat-user] Stateful provisioning customization
------------------------------------------------------------------------
1. Syncfiles is a default postbootscript, so it's supposed to run on
every deploy. If it's not running, check your postscript table to
make sure it's there as a default. You can call it again if you
think a package install is overwriting one of your custom files;
just add it to the node postbootscript line in the desired order.
2. Otherpkgs works fine. The difference is that it runs as a
postbootscript after the reboot, as if you were running a yum
command from the OS (as opposed to stateless, which packages them up
in the image).
3. For stateful there really is no "image" as far as I know; all
customizations are handled with OS/group/node definitions and
postscripts. I don't like to mess with the osdef too much other
than the syncfiles.list and otherpkgs stuff. Keep in mind, you can
create any script you want, for example to install extra RPMs after
the main OS deploys but before the reboot (like, say, Mellanox
drivers, which may require a reboot).
4. I don't think so, but again, if you need other packages just
create a script with a yum command and attach it to the group/node
def postbootscript or postscript line (making sure the order is what
you want).
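The table check in point 1 and the "attach it to the node" step can be sketched with two small wrappers. tabdump and chdef are real xCAT commands. The wrapper names are hypothetical, and the sketch only covers node objects, not groups.

```shell
# Both function names are hypothetical helpers, not xCAT commands.

# Dump the postscripts table; the xcatdefaults row holds the default
# postscripts and postbootscripts applied to every node.
show_default_postscripts() {
    tabdump postscripts
}

# Usage: append_postbootscript <node> <script>
append_postbootscript() {
    # -p appends to the existing comma-separated value instead of replacing
    # it, so the new script runs after the ones already listed.
    chdef -t node -o "$1" -p postbootscripts="$2"
}
```

For example, 'append_postbootscript node01 syncfiles' would make syncfiles run once more at the end of the postbootscripts, after otherpkgs has installed its packages.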
Thanks,
Brian Joiner
On 6/4/21 00:30, Vinícius Ferrão via xCAT-user wrote:
Hello,
I'm doing a stateful install right now, and I have some
questions for those who use the stateful method. Since I'm
already used to stateless provisioning, I'm trying to adapt its
concepts to stateful.
So here we go:
1. Can I use syncfiles to issue "updatenode all -F" when needed?
The idea is to have a custom file with the synclist and run a
command similar to: chdef -t osimage -o
ol8.4.0-x86_64-install-compute
synclists="/install/custom/install/compute.synclist"
2. Does otherpkgs work with a stateful profile?
Can I add otherpkgdir and otherpkglist to -install images? Will it
install the packages during the provisioning phase? Is there
any use case for it?
3. Where should I do the customization inside the image?
On stateless I just chroot after "genimage", do whatever I need
to do (change confs, enable/disable services, etc.), and then
"packimage". How can I achieve something similar with stateful
nodes?
4. Can I have multiple pkglist and otherpkglist files?
The idea here is to keep the default ones from xCAT untouched
and just add additional ones, separated by commas, in the osimage
definition.
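For question 1, the sequence being asked about would look roughly like the sketch below. It restates the question in runnable form rather than confirming that this is the recommended way; the helper name is hypothetical, while chdef and updatenode -F are the commands already named in the question.

```shell
# apply_synclist is a hypothetical helper name, not an xCAT command.
# Usage: apply_synclist <osimage> <synclist file> <noderange>
apply_synclist() {
    # Attach the synclist to the osimage definition...
    chdef -t osimage -o "$1" synclists="$2" || return 1
    # ...then push the listed files to nodes that are already running.
    updatenode "$3" -F
}
```

For example: 'apply_synclist ol8.4.0-x86_64-install-compute /install/custom/install/compute.synclist all'.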
Thanks all.
PS: I did some reading before, but I was only able to find
precise information for hierarchical clusters and specific cases
like CUDA, so I'm not sure which is "The Right Way (tm)" to
achieve the functionality mentioned.
https://xcat-docs.readthedocs.io/en/stable/advanced/hierarchy/provision/diskful_sn.html
https://xcat-docs.readthedocs.io/en/stable/advanced/gpu/nvidia/osimage/rhels.html#diskless-images
https://myxcat.readthedocs.io/en/latest/advanced/networks/infiniband/mlnxofed_ib_install_v2_diskful.html?highlight=Infiniband%20Support
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user