One solution to the package conflicts between MLNX OFED and packages in the
OS repository is to raise the priority of the Mellanox repo by setting
priority=1 (in dnf, a lower number means a higher priority). This prevents
the MLNX OFED packages from being upgraded by packages from the OS
repository.

Example of what this looks like after the system is installed:
(On compute node)
# cat /etc/yum.repos.d/mlnx-ofed.repo
[mlnx-ofed]
name=mlnx-ofed
baseurl=http://<MN_IP_ADDRESS>:80//install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le/4.9-3.1.5.3/repo/ppc64le/MLNX_LIBS
enabled=1
gpgcheck=0
priority=1

Replace <MN_IP_ADDRESS> with the value for your specific configuration.
priority=1 is the important piece: it gives the older MLNX OFED packages
priority over the OS versions of the packages.

# dnf list perftest
...
Available Packages
perftest.ppc64le    4.5.0.mlnxlibs-0.3.g1121951.49315    mlnx-ofed

# dnf list --showduplicates perftest
...
Available Packages
perftest.ppc64le    4.5-1.el8                            local-rhels8.5.0-ppc64le--install-shared_repo-ISOs-rhels8.5.0-ppc64le-BaseOS
perftest.ppc64le    4.5.0.mlnxlibs-0.3.g1121951.49315    mlnx-ofed

In the output of these commands, observe that dnf is prioritizing the
perftest from the mlnx-ofed repo over the version from the OS repo, even
though the OS repo has the newer version.
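
If you want to confirm which priority a repo file actually sets, a small
helper along these lines may be useful. This is only a sketch: the function
name repo_priority is made up for illustration, it assumes the plain
key=value layout shown above (no spaces around "="), and it prints 99, the
dnf default, when the section sets no priority.

```shell
#!/bin/sh
# Print the priority configured for one repo section in a .repo file.
# Hypothetical helper; assumes "key=value" lines without spaces around "=".
# Prints 99 (the dnf default) when the section sets no priority.
repo_priority() {
    file=$1
    section=$2
    awk -F= -v want="[$section]" '
        # A line starting with "[" opens a new section.
        /^\[/ { in_section = ($0 == want) }
        in_section && $1 == "priority" { print $2; found = 1; exit }
        END { if (!found) print 99 }
    ' "$file"
}
```

On a compute node set up as above, running
"repo_priority /etc/yum.repos.d/mlnx-ofed.repo mlnx-ofed" should print 1.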

Here is a rough outline of one approach to automating the MLNX OFED install
with xCAT while incorporating the increased REPO priority for MLNX OFED:

(On the management node)
1.)
cp /opt/xcat/share/xcat/ib/scripts/Mellanox/mlnxofed_ib_install /install/postscripts/custom


2.) Create a wrapper script similar to this:
cat /install/postscripts/custom/MOFED.postscript
#!/bin/sh

set -x

MLNX_OFED_PATH="/install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le"
MLNX_OFED_ISO="MLNX_OFED_LINUX-4.9-3.1.5.3-rhel8.4-ppc64le.iso"
#get mlnx-ofed version
MLNX_OFED_NAME=$(basename "$MLNX_OFED_ISO" ."iso")
MLNX_OFED_VERSION=$(echo "$MLNX_OFED_NAME" | awk -F- '{print $2}')-$(echo "$MLNX_OFED_NAME" | awk -F- '{print $3}')

custom/mlnxofed_ib_install -p ${MLNX_OFED_PATH}/${MLNX_OFED_ISO} -m \
    --with-nvmf --add-kernel-support --without-fw-update --force -end-

systemctl enable openibd

#create mlnx-ofed repo
REPO_FILE="/etc/yum.repos.d/mlnx-ofed.repo"
echo "[mlnx-ofed]" > $REPO_FILE
echo "name=mlnx-ofed" >>  $REPO_FILE
echo "baseurl=http://${MASTER}:80/${MLNX_OFED_PATH}/${MLNX_OFED_VERSION}/repo/ppc64le/MLNX_LIBS" >>  $REPO_FILE
echo "enabled=1" >>  $REPO_FILE
echo "gpgcheck=0" >>  $REPO_FILE
echo "priority=1" >>  $REPO_FILE

3.) For diskful install images, include MOFED.postscript in the
postscripts.
lsdef -t osimage rhels8.5.0-ppc64le-install-compute-custom -i postscripts
Object name: rhels8.5.0-ppc64le-install-compute-custom
    postscripts=custom/MOFED.postscript

4.) For diskless netboot images, include the MOFED.postscript logic in the
postinstall script.
# lsdef -t osimage rhels8.5.0-ppc64le-netboot-compute-custom -i postinstall
Object name: rhels8.5.0-ppc64le-netboot-compute-custom
    postinstall=/install/postinstall/compute.postinstall

5.) Copy the MLNX OFED ISO to the location referenced by MOFED.postscript
and make sure it is remotely accessible by the compute nodes.
ls /install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le/MLNX_OFED_LINUX-4.9-3.1.5.3-rhel8.4-ppc64le.iso
/install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le/MLNX_OFED_LINUX-4.9-3.1.5.3-rhel8.4-ppc64le.iso

6.) Copy the MLNX OFED repo packages to the location referenced by the
/etc/yum.repos.d/mlnx-ofed.repo file and make sure it is remotely
accessible by the compute nodes.
ls /install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le/4.9-3.1.5.3/repo/ppc64le/MLNX_LIBS/
ar_mgr-1.0-0.2.MLNX20201014.g8577618.49315.ppc64le.rpm
mlnx-fw-updater-4.9-3.1.5.3.ppc64le.rpm
dapl-2.1.10.1.mlnx-OFED.4.9.0.1.4.49315.ppc64le.rpm
mlnx-iproute2-5.4.0-1.49315.ppc64le.rpm
dapl-devel-2.1.10.1.mlnx-OFED.4.9.0.1.4.49315.ppc64le.rpm
mlnx-ofa_kernel-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm
dapl-devel-static-2.1.10.1.mlnx-OFED.4.9.0.1.4.49315.ppc64le.rpm
mlnx-ofa_kernel-devel-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm
dapl-utils-2.1.10.1.mlnx-OFED.4.9.0.1.4.49315.ppc64le.rpm
mlnx-ofed-all-4.9-3.1.5.3.rhel8.4.noarch.rpm
dump_pr-1.0-0.2.MLNX20201014.g8577618.49315.ppc64le.rpm
mlnx-ofed-all-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
hcoll-4.4.2968-1.49315.ppc64le.rpm
mlnx-ofed-basic-4.9-3.1.5.3.rhel8.4.noarch.rpm
ibacm-41mlnx1-OFED.4.3.3.0.0.49315.ppc64le.rpm
mlnx-ofed-basic-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
ibacm-devel-41mlnx1-OFED.4.3.3.0.0.49315.ppc64le.rpm
mlnx-ofed-bluefield-4.9-3.1.5.3.rhel8.4.noarch.rpm
ibdump-6.0.0-1.49315.ppc64le.rpm
mlnx-ofed-bluefield-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
ibsim-0.10-1.49315.ppc64le.rpm
mlnxofed-docs-4.9-3.1.5.3.noarch.rpm
ibutils2-2.1.1-0.121.MLNX20200324.g061a520.49315.ppc64le.rpm
mlnx-ofed-dpdk-4.9-3.1.5.3.rhel8.4.noarch.rpm
infiniband-diags-5.6.0.MLNX20200211.354e4b7-0.1.49315.ppc64le.rpm
mlnx-ofed-dpdk-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
infiniband-diags-compat-5.6.0.MLNX20200211.354e4b7-0.1.49315.ppc64le.rpm
mlnx-ofed-eth-only-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
infiniband-diags-guest-5.6.0.MLNX20200211.354e4b7-0.1.49315.ppc64le.rpm
mlnx-ofed-guest-4.9-3.1.5.3.rhel8.4.noarch.rpm
kernel-mft-mlnx-utils-4.15.1-1.rhel8u4.ppc64le.rpm
mlnx-ofed-guest-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-iser-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm
mlnx-ofed-hpc-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-isert-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm
mlnx-ofed-hpc-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-kernel-mft-mlnx-4.15.1-1.rhel8u4.ppc64le.rpm
mlnx-ofed-hypervisor-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.rhel8u4.ppc64le.rpm
mlnx-ofed-hypervisor-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-mlnx-en-4.9-3.1.5.0.g7e619ca.rhel8u4.ppc64le.rpm
mlnx-ofed-kernel-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-mlnx-ofa_kernel-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm
mlnx-ofed-vma-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-rshim-1.18-0.gb99e894.rhel8u4.ppc64le.rpm
mlnx-ofed-vma-eth-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-srp-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm
mlnx-ofed-vma-eth-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.rhel8u4.ppc64le.rpm
mlnx-ofed-vma-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
libibcm-41mlnx1-OFED.4.1.0.1.0.49315.ppc64le.rpm
mlnx-ofed-vma-vpi-4.9-3.1.5.3.rhel8.4.noarch.rpm
libibcm-devel-41mlnx1-OFED.4.1.0.1.0.49315.ppc64le.rpm
mlnx-ofed-vma-vpi-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49315.ppc64le.rpm
mpi-selector-1.0.3-1.49315.ppc64le.rpm
libibmad-devel-5.4.0.MLNX20190423.1d917ae-0.1.49315.ppc64le.rpm
mpitests_openmpi-3.2.20-e1a0676.49315.ppc64le.rpm
libibmad-static-5.4.0.MLNX20190423.1d917ae-0.1.49315.ppc64le.rpm
mstflint-4.14.0-3.49315.ppc64le.rpm
libibumad-43.1.1.MLNX20200211.078947f-0.1.49315.ppc64le.rpm
neohost-backend-1.5.0-102.ppc64le.rpm
libibumad-devel-43.1.1.MLNX20200211.078947f-0.1.49315.ppc64le.rpm
neohost-sdk-1.5.0-102.ppc64le.rpm
libibumad-static-43.1.1.MLNX20200211.078947f-0.1.49315.ppc64le.rpm
ofed-scripts-4.9-OFED.4.9.3.1.5.ppc64le.rpm
libibverbs-41mlnx1-OFED.4.9.3.0.0.49315.ppc64le.rpm
openmpi-4.0.3rc4-1.49315.ppc64le.rpm
libibverbs-devel-41mlnx1-OFED.4.9.3.0.0.49315.ppc64le.rpm
opensm-5.7.2.MLNX20201014.9378048-0.1.49315.ppc64le.rpm
libibverbs-devel-static-41mlnx1-OFED.4.9.3.0.0.49315.ppc64le.rpm
opensm-devel-5.7.2.MLNX20201014.9378048-0.1.49315.ppc64le.rpm
libibverbs-utils-41mlnx1-OFED.4.9.3.0.0.49315.ppc64le.rpm
opensm-libs-5.7.2.MLNX20201014.9378048-0.1.49315.ppc64le.rpm
libmlx4-41mlnx1-OFED.4.7.3.0.3.49315.ppc64le.rpm
opensm-static-5.7.2.MLNX20201014.9378048-0.1.49315.ppc64le.rpm
libmlx4-devel-41mlnx1-OFED.4.7.3.0.3.49315.ppc64le.rpm
perftest-4.5.0.mlnxlibs-0.3.g1121951.49315.ppc64le.rpm
libmlx5-41mlnx1-OFED.4.9.0.1.2.49315.ppc64le.rpm
qperf-0.4.11-1.49315.ppc64le.rpm
libmlx5-devel-41mlnx1-OFED.4.9.0.1.2.49315.ppc64le.rpm
repodata
librdmacm-41mlnx1-OFED.4.7.3.0.6.49315.ppc64le.rpm
sharp-2.1.2.MLNX20200428.ddda184-1.49315.ppc64le.rpm
librdmacm-devel-41mlnx1-OFED.4.7.3.0.6.49315.ppc64le.rpm
sockperf-3.7-0.gita1e8e835a689.49315.ppc64le.rpm
librdmacm-utils-41mlnx1-OFED.4.7.3.0.6.49315.ppc64le.rpm
srptools-41mlnx1-5.49315.ppc64le.rpm
libvma-9.0.2-1.ppc64le.rpm
ucx-1.8.0-1.49315.ppc64le.rpm
libvma-devel-9.0.2-1.ppc64le.rpm
ucx-cma-1.8.0-1.49315.ppc64le.rpm
libvma-utils-9.0.2-1.ppc64le.rpm
ucx-devel-1.8.0-1.49315.ppc64le.rpm
mft-4.15.1-9.ppc64le.rpm
ucx-ib-1.8.0-1.49315.ppc64le.rpm
mlnx-en-doc-4.9-3.1.5.0.g7e619ca.rhel8u4.ppc64le.rpm
ucx-ib-cm-1.8.0-1.49315.ppc64le.rpm
mlnx-en-sources-4.9-3.1.5.0.g7e619ca.rhel8u4.ppc64le.rpm
ucx-knem-1.8.0-1.49315.ppc64le.rpm
mlnx-en-utils-4.9-3.1.5.0.g7e619ca.rhel8u4.ppc64le.rpm
ucx-rdmacm-1.8.0-1.49315.ppc64le.rpm
mlnx-ethtool-5.4-1.49315.ppc64le.rpm
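
The version parsing and repo-file creation from the wrapper script in step 2
can also be sketched as two small functions that use a here-document instead
of repeated echo lines. The function names (mlnx_ofed_version,
write_mlnx_repo) and the argument order are my own choices for illustration,
not part of xCAT or MLNX OFED; the generated file content matches the repo
file shown at the top of this mail.

```shell
#!/bin/sh
# Derive the OFED version (for example 4.9-3.1.5.3) from the ISO file name.
# Hypothetical helper name; the logic mirrors the awk calls in step 2.
mlnx_ofed_version() {
    name=$(basename "$1" .iso)
    # Fields 2 and 3 of the dash-separated name hold the version.
    echo "$name" | cut -d- -f2,3
}

# Write a dnf repo file that gives the MLNX OFED repo priority 1.
# Arguments: management node address, OFED base path, ISO name, output file.
write_mlnx_repo() {
    master=$1
    ofed_path=$2
    iso=$3
    repo_file=$4
    version=$(mlnx_ofed_version "$iso")
    cat > "$repo_file" <<EOF
[mlnx-ofed]
name=mlnx-ofed
baseurl=http://${master}:80/${ofed_path}/${version}/repo/ppc64le/MLNX_LIBS
enabled=1
gpgcheck=0
priority=1
EOF
}
```

In a postscript this could be called as
write_mlnx_repo "$MASTER" "$MLNX_OFED_PATH" "$MLNX_OFED_ISO" /etc/yum.repos.d/mlnx-ofed.repo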

Install your nodes the way you normally would. Once the MLNX OFED repo is
created on the compute nodes with higher priority, any future invocation of
dnf should prioritize the package versions in the mlnx-ofed repo over those
in the OS repositories.
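
Before kicking off an install, it may be worth checking on the management
node that the files from steps 5 and 6 are really in place. The function
below is a rough sketch with a made-up name (check_mlnx_ofed_files); it only
tests that the ISO exists and that the repo directory contains
repodata/repomd.xml, and it does not verify that the compute nodes can reach
them over HTTP.

```shell
#!/bin/sh
# Return success when both the OFED ISO and the repo metadata exist locally.
# Hypothetical helper; arguments: path to the ISO, path to the repo directory.
check_mlnx_ofed_files() {
    iso=$1
    repo_dir=$2
    if [ ! -f "$iso" ]; then
        echo "missing ISO: $iso" >&2
        return 1
    fi
    # dnf needs repodata/repomd.xml to treat the directory as a repository.
    if [ ! -f "$repo_dir/repodata/repomd.xml" ]; then
        echo "missing repo metadata in: $repo_dir" >&2
        return 1
    fi
    return 0
}
```

It takes the ISO path from step 5 and the MLNX_LIBS directory from step 6 as
its two arguments.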

Note: this same principle can be generalized to other situations where
packages in different repositories are in conflict with each other.

I hope this helps,
Nate



From:   "Vinícius Ferrão via xCAT-user"
            <xcat-user@lists.sourceforge.net>
To:     "xCAT Users Mailing list" <xcat-user@lists.sourceforge.net>
Cc:     "Vinícius Ferrão" <fer...@versatushpc.com.br>
Date:   02/01/2022 04:37 PM
Subject:        [EXTERNAL] Re: [xcat-user] Versionlock a given package in
            stateless compute



Hi guys, still on this matter. As of today, MLNX OFED is becoming a pain to
maintain with xCAT, since perftest from the OS needs libefa.so.1, which is
unavailable with MLNX OFED.

If we have a stateless image, we cannot run updatenode <node> -S to deploy
packages, since dnf will be in a broken state due to the perftest
requirements.

One solution would be to blacklist perftest from the OS, which is basically
the same question as the one in this thread.

So if there’s no way to version lock, is there at least a way to blacklist
them?

Thanks.

> On 28 Jan 2022, at 20:38, Vinícius Ferrão <fer...@versatushpc.com.br> wrote:
>
> Hello,
>
> I would like to know if there’s a way to versionlock a given package in a
> stateless compute environment.
>
> Specifically I would like to fix the redhat-release package version. I
> tried adding the pinned versions to a pkglist file and running genimage but
> that was a no go. It just ignored the older packages and installed the
> newest ones.
>
> [root@headnode compute]# lsdef -t osimage rhels8.5.0-x86_64-netboot-compute -i pkglist
> Object name: rhels8.5.0-x86_64-netboot-compute
>     pkglist=/opt/xcat/share/xcat/netboot/rh/compute.rhels8.x86_64.pkglist,/install/custom/netboot/compute.pkglist
>
> cat /install/custom/netboot/compute.pkglist
> kernel-4.18.0-305.25.1.el8_4
> kernel-core-4.18.0-305.25.1.el8_4
> kernel-devel-4.18.0-305.25.1.el8_4
> kernel-modules-4.18.0-305.25.1.el8_4
> kernel-modules-extra-4.18.0-305.25.1.el8_4
> kernel-headers-4.18.0-305.25.1.el8_4
> redhat-release-8.4
>
> During the genimage process it downloaded and installed the correct
> package, but it was updated later on:
>
> ================================================================================
> Package              Arch   Version                  Repository            Size
> ================================================================================
> Installing:
> kernel               x86_64 4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2  7.0 M
> kernel-core          x86_64 4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2   38 M
> kernel-devel         x86_64 4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2   20 M
> kernel-modules       x86_64 4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2   30 M
> kernel-modules-extra x86_64 4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2  7.7 M
> Upgrading:
> kernel-headers       x86_64 4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2  8.3 M
> redhat-release       x86_64 8.5-0.8.el8              rhels8.5.0-x86_64-1   44 k
> Transaction Summary
> ================================================================================
> Install  5 Packages
> Upgrade  2 Packages
> Total size: 110 M
> Downloading Packages:
> Running transaction check
> Transaction check succeeded.
> Running transaction test
> Transaction test succeeded.
> Running transaction
>  Preparing        :                                                    1/1
>  Running scriptlet: kernel-core-4.18.0-348.12.2.el8_5.x86_64           1/1
>  Installing       : kernel-core-4.18.0-348.12.2.el8_5.x86_64           1/9
>  Running scriptlet: kernel-core-4.18.0-348.12.2.el8_5.x86_64           1/9
>  Installing       : kernel-modules-4.18.0-348.12.2.el8_5.x86_64        2/9
>  Running scriptlet: kernel-modules-4.18.0-348.12.2.el8_5.x86_64        2/9
>  Installing       : kernel-4.18.0-348.12.2.el8_5.x86_64                3/9
>  Installing       : kernel-modules-extra-4.18.0-348.12.2.el8_5.x86_64  4/9
>  Running scriptlet: kernel-modules-extra-4.18.0-348.12.2.el8_5.x86_64  4/9
>  Upgrading        : kernel-headers-4.18.0-348.12.2.el8_5.x86_64        5/9
>  Upgrading        : redhat-release-8.5-0.8.el8.x86_64                  6/9
>  Installing       : kernel-devel-4.18.0-348.12.2.el8_5.x86_64          7/9
>  Running scriptlet: kernel-devel-4.18.0-348.12.2.el8_5.x86_64          7/9
>  Cleanup          : kernel-headers-4.18.0-305.25.1.el8_4.x86_64        8/9
>  Cleanup          : redhat-release-8.4-0.6.el8.x86_64                  9/9
>  Running scriptlet: kernel-core-4.18.0-348.12.2.el8_5.x86_64           9/9
> dracut: No '/dev/log' or 'logger' included for syslog logging
> dracut: Turning off host-only mode: '/run' is not mounted!
> dracut: Turning off host-only mode: '/dev' is not mounted!
>  Running scriptlet: redhat-release-8.4-0.6.el8.x86_64                  9/9
>  Verifying        : kernel-4.18.0-348.12.2.el8_5.x86_64                1/9
>  Verifying        : kernel-devel-4.18.0-348.12.2.el8_5.x86_64          2/9
>  Verifying        : kernel-modules-4.18.0-348.12.2.el8_5.x86_64        3/9
>  Verifying        : kernel-modules-extra-4.18.0-348.12.2.el8_5.x86_64  4/9
>  Verifying        : kernel-core-4.18.0-348.12.2.el8_5.x86_64           5/9
>  Verifying        : redhat-release-8.5-0.8.el8.x86_64                  6/9
>  Verifying        : redhat-release-8.4-0.6.el8.x86_64                  7/9
>  Verifying        : kernel-headers-4.18.0-348.12.2.el8_5.x86_64        8/9
>  Verifying        : kernel-headers-4.18.0-305.25.1.el8_4.x86_64        9/9
> Installed products updated.
> Upgraded:
>  kernel-headers-4.18.0-348.12.2.el8_5.x86_64
>  redhat-release-8.5-0.8.el8.x86_64
> Installed:
>  kernel-4.18.0-348.12.2.el8_5.x86_64
>  kernel-core-4.18.0-348.12.2.el8_5.x86_64
>  kernel-devel-4.18.0-348.12.2.el8_5.x86_64
>  kernel-modules-4.18.0-348.12.2.el8_5.x86_64
>  kernel-modules-extra-4.18.0-348.12.2.el8_5.x86_64
> Complete!
>
> Any ideas?
>
> Thanks.
>


_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user