One solution to the package conflicts between MLNX OFED and packages in the OS repository is to raise the priority of the Mellanox repo to 1, which prevents the MLNX OFED packages from being upgraded by packages from the OS repository.
Example of what this looks like after the system is installed:

(On compute node)

# cat /etc/yum.repos.d/mlnx-ofed.repo
[mlnx-ofed]
name=mlnx-ofed
baseurl=http://<MN_IP_ADDRESS>:80//install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le/4.9-3.1.5.3/repo/ppc64le/MLNX_LIBS
enabled=1
gpgcheck=0
priority=1

MN_IP_ADDRESS needs to be replaced with the value for your specific configuration. priority=1 is the important piece: it gives the older MLNX OFED packages priority over the OS versions of the packages.

# dnf list perftest
...
Available Packages
perftest.ppc64le    4.5.0.mlnxlibs-0.3.g1121951.49315    mlnx-ofed

# dnf list --showduplicates perftest
...
Available Packages
perftest.ppc64le    4.5-1.el8                            local-rhels8.5.0-ppc64le--install-shared_repo-ISOs-rhels8.5.0-ppc64le-BaseOS
perftest.ppc64le    4.5.0.mlnxlibs-0.3.g1121951.49315    mlnx-ofed

In the command output, observe that dnf prioritizes the perftest from the mlnx-ofed repo over the version from the OS repo, even though the OS repo carries the newer version.

Here is a rough outline of one approach to automating the MLNX OFED install with xCAT while incorporating the increased repo priority for MLNX OFED:

(On the management node)

1.) cp /opt/xcat/share/xcat/ib/scripts/Mellanox/mlnxofed_ib_install /install/postscripts/custom

2.) Create a wrapper script similar to this:

cat /install/postscripts/custom/MOFED.postscript
#!/bin/sh
set -x

MLNX_OFED_PATH="/install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le"
MLNX_OFED_ISO="MLNX_OFED_LINUX-4.9-3.1.5.3-rhel8.4-ppc64le.iso"

# get mlnx-ofed version (fields 2 and 3 of the ISO name, e.g. 4.9-3.1.5.3)
MLNX_OFED_NAME=$(basename "$MLNX_OFED_ISO" .iso)
MLNX_OFED_VERSION=$(echo $MLNX_OFED_NAME | awk -F- '{print $2}')-$(echo $MLNX_OFED_NAME | awk -F- '{print $3}')

custom/mlnxofed_ib_install -p ${MLNX_OFED_PATH}/${MLNX_OFED_ISO} -m --with-nvmf --add-kernel-support --without-fw-update --force -end-

systemctl enable openibd

# create mlnx-ofed repo
REPO_FILE="/etc/yum.repos.d/mlnx-ofed.repo"
echo "[mlnx-ofed]" > $REPO_FILE
echo "name=mlnx-ofed" >> $REPO_FILE
echo "baseurl=http://${MASTER}:80/${MLNX_OFED_PATH}/${MLNX_OFED_VERSION}/repo/ppc64le/MLNX_LIBS" >> $REPO_FILE
echo "enabled=1" >> $REPO_FILE
echo "gpgcheck=0" >> $REPO_FILE
echo "priority=1" >> $REPO_FILE

3.) For diskful install images, include MOFED.postscript in the postscripts.

lsdef -t osimage rhels8.5.0-ppc64le-install-compute-custom -i postscripts
Object name: rhels8.5.0-ppc64le-install-compute-custom
    postscripts=custom/MOFED.postscript

4.) For diskless netboot images, include the MOFED.postscript logic in the postinstall script.

# lsdef -t osimage rhels8.5.0-ppc64le-netboot-compute-custom -i postinstall
Object name: rhels8.5.0-ppc64le-netboot-compute-custom
    postinstall=/install/postinstall/compute.postinstall

5.) Copy the MLNX OFED ISO to the location referenced by MOFED.postscript and make sure it is remotely accessible by the compute nodes.

ls /install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le/MLNX_OFED_LINUX-4.9-3.1.5.3-rhel8.4-ppc64le.iso
/install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le/MLNX_OFED_LINUX-4.9-3.1.5.3-rhel8.4-ppc64le.iso

6.) Copy the MLNX OFED repo packages to the location referenced by the /etc/yum.repos.d/mlnx-ofed.repo file and make sure it is remotely accessible by the compute nodes.

ls /install/REPO/software/mellanox/ofed/iso/redhat/8.4/ppc64le/4.9-3.1.5.3/repo/ppc64le/MLNX_LIBS/
ar_mgr-1.0-0.2.MLNX20201014.g8577618.49315.ppc64le.rpm  mlnx-fw-updater-4.9-3.1.5.3.ppc64le.rpm
dapl-2.1.10.1.mlnx-OFED.4.9.0.1.4.49315.ppc64le.rpm  mlnx-iproute2-5.4.0-1.49315.ppc64le.rpm
dapl-devel-2.1.10.1.mlnx-OFED.4.9.0.1.4.49315.ppc64le.rpm  mlnx-ofa_kernel-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm
dapl-devel-static-2.1.10.1.mlnx-OFED.4.9.0.1.4.49315.ppc64le.rpm  mlnx-ofa_kernel-devel-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm
dapl-utils-2.1.10.1.mlnx-OFED.4.9.0.1.4.49315.ppc64le.rpm  mlnx-ofed-all-4.9-3.1.5.3.rhel8.4.noarch.rpm
dump_pr-1.0-0.2.MLNX20201014.g8577618.49315.ppc64le.rpm  mlnx-ofed-all-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
hcoll-4.4.2968-1.49315.ppc64le.rpm  mlnx-ofed-basic-4.9-3.1.5.3.rhel8.4.noarch.rpm
ibacm-41mlnx1-OFED.4.3.3.0.0.49315.ppc64le.rpm  mlnx-ofed-basic-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
ibacm-devel-41mlnx1-OFED.4.3.3.0.0.49315.ppc64le.rpm  mlnx-ofed-bluefield-4.9-3.1.5.3.rhel8.4.noarch.rpm
ibdump-6.0.0-1.49315.ppc64le.rpm  mlnx-ofed-bluefield-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
ibsim-0.10-1.49315.ppc64le.rpm  mlnxofed-docs-4.9-3.1.5.3.noarch.rpm
ibutils2-2.1.1-0.121.MLNX20200324.g061a520.49315.ppc64le.rpm  mlnx-ofed-dpdk-4.9-3.1.5.3.rhel8.4.noarch.rpm
infiniband-diags-5.6.0.MLNX20200211.354e4b7-0.1.49315.ppc64le.rpm  mlnx-ofed-dpdk-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
infiniband-diags-compat-5.6.0.MLNX20200211.354e4b7-0.1.49315.ppc64le.rpm  mlnx-ofed-eth-only-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
infiniband-diags-guest-5.6.0.MLNX20200211.354e4b7-0.1.49315.ppc64le.rpm  mlnx-ofed-guest-4.9-3.1.5.3.rhel8.4.noarch.rpm
kernel-mft-mlnx-utils-4.15.1-1.rhel8u4.ppc64le.rpm  mlnx-ofed-guest-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-iser-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm  mlnx-ofed-hpc-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-isert-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm  mlnx-ofed-hpc-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-kernel-mft-mlnx-4.15.1-1.rhel8u4.ppc64le.rpm  mlnx-ofed-hypervisor-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.rhel8u4.ppc64le.rpm  mlnx-ofed-hypervisor-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-mlnx-en-4.9-3.1.5.0.g7e619ca.rhel8u4.ppc64le.rpm  mlnx-ofed-kernel-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-mlnx-ofa_kernel-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm  mlnx-ofed-vma-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-rshim-1.18-0.gb99e894.rhel8u4.ppc64le.rpm  mlnx-ofed-vma-eth-4.9-3.1.5.3.rhel8.4.noarch.rpm
kmod-srp-4.9-OFED.4.9.3.1.5.1.rhel8u4.ppc64le.rpm  mlnx-ofed-vma-eth-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.rhel8u4.ppc64le.rpm  mlnx-ofed-vma-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
libibcm-41mlnx1-OFED.4.1.0.1.0.49315.ppc64le.rpm  mlnx-ofed-vma-vpi-4.9-3.1.5.3.rhel8.4.noarch.rpm
libibcm-devel-41mlnx1-OFED.4.1.0.1.0.49315.ppc64le.rpm  mlnx-ofed-vma-vpi-user-only-4.9-3.1.5.3.rhel8.4.noarch.rpm
libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49315.ppc64le.rpm  mpi-selector-1.0.3-1.49315.ppc64le.rpm
libibmad-devel-5.4.0.MLNX20190423.1d917ae-0.1.49315.ppc64le.rpm  mpitests_openmpi-3.2.20-e1a0676.49315.ppc64le.rpm
libibmad-static-5.4.0.MLNX20190423.1d917ae-0.1.49315.ppc64le.rpm  mstflint-4.14.0-3.49315.ppc64le.rpm
libibumad-43.1.1.MLNX20200211.078947f-0.1.49315.ppc64le.rpm  neohost-backend-1.5.0-102.ppc64le.rpm
libibumad-devel-43.1.1.MLNX20200211.078947f-0.1.49315.ppc64le.rpm  neohost-sdk-1.5.0-102.ppc64le.rpm
libibumad-static-43.1.1.MLNX20200211.078947f-0.1.49315.ppc64le.rpm  ofed-scripts-4.9-OFED.4.9.3.1.5.ppc64le.rpm
libibverbs-41mlnx1-OFED.4.9.3.0.0.49315.ppc64le.rpm  openmpi-4.0.3rc4-1.49315.ppc64le.rpm
libibverbs-devel-41mlnx1-OFED.4.9.3.0.0.49315.ppc64le.rpm  opensm-5.7.2.MLNX20201014.9378048-0.1.49315.ppc64le.rpm
libibverbs-devel-static-41mlnx1-OFED.4.9.3.0.0.49315.ppc64le.rpm  opensm-devel-5.7.2.MLNX20201014.9378048-0.1.49315.ppc64le.rpm
libibverbs-utils-41mlnx1-OFED.4.9.3.0.0.49315.ppc64le.rpm  opensm-libs-5.7.2.MLNX20201014.9378048-0.1.49315.ppc64le.rpm
libmlx4-41mlnx1-OFED.4.7.3.0.3.49315.ppc64le.rpm  opensm-static-5.7.2.MLNX20201014.9378048-0.1.49315.ppc64le.rpm
libmlx4-devel-41mlnx1-OFED.4.7.3.0.3.49315.ppc64le.rpm  perftest-4.5.0.mlnxlibs-0.3.g1121951.49315.ppc64le.rpm
libmlx5-41mlnx1-OFED.4.9.0.1.2.49315.ppc64le.rpm  qperf-0.4.11-1.49315.ppc64le.rpm
libmlx5-devel-41mlnx1-OFED.4.9.0.1.2.49315.ppc64le.rpm  repodata
librdmacm-41mlnx1-OFED.4.7.3.0.6.49315.ppc64le.rpm  sharp-2.1.2.MLNX20200428.ddda184-1.49315.ppc64le.rpm
librdmacm-devel-41mlnx1-OFED.4.7.3.0.6.49315.ppc64le.rpm  sockperf-3.7-0.gita1e8e835a689.49315.ppc64le.rpm
librdmacm-utils-41mlnx1-OFED.4.7.3.0.6.49315.ppc64le.rpm  srptools-41mlnx1-5.49315.ppc64le.rpm
libvma-9.0.2-1.ppc64le.rpm  ucx-1.8.0-1.49315.ppc64le.rpm
libvma-devel-9.0.2-1.ppc64le.rpm  ucx-cma-1.8.0-1.49315.ppc64le.rpm
libvma-utils-9.0.2-1.ppc64le.rpm  ucx-devel-1.8.0-1.49315.ppc64le.rpm
mft-4.15.1-9.ppc64le.rpm  ucx-ib-1.8.0-1.49315.ppc64le.rpm
mlnx-en-doc-4.9-3.1.5.0.g7e619ca.rhel8u4.ppc64le.rpm  ucx-ib-cm-1.8.0-1.49315.ppc64le.rpm
mlnx-en-sources-4.9-3.1.5.0.g7e619ca.rhel8u4.ppc64le.rpm  ucx-knem-1.8.0-1.49315.ppc64le.rpm
mlnx-en-utils-4.9-3.1.5.0.g7e619ca.rhel8u4.ppc64le.rpm  ucx-rdmacm-1.8.0-1.49315.ppc64le.rpm
mlnx-ethtool-5.4-1.49315.ppc64le.rpm

Install your nodes the way you normally would.

Once the MLNX OFED repo is created on the compute nodes with higher priority, any future invocation of dnf should prioritize the package versions in the mlnx-ofed repo over the OS repositories.

Note: this same principle can be generalized to other situations where packages in different repositories are in conflict with each other.

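As a rough illustration of how the repo-file step could be generalized to other repositories, here is a minimal sketch. The helper names ofed_version_from_iso and write_repo_file, and the host name mn.example in the baseurl, are made up for this example and are not part of xCAT or MLNX OFED. It writes to a scratch file rather than /etc/yum.repos.d so it can be tried safely on any machine.

```shell
#!/bin/sh
# Sketch only: the helper names and the mn.example host are hypothetical.

# Derive the version (e.g. 4.9-3.1.5.3) from an ISO name such as
# MLNX_OFED_LINUX-4.9-3.1.5.3-rhel8.4-ppc64le.iso by taking
# fields 2 and 3 of the name when split on "-".
ofed_version_from_iso() {
    name=$(basename "$1" .iso)
    echo "$name" | awk -F- '{print $2 "-" $3}'
}

# Write a repo definition.
# $1 = output file, $2 = repo id, $3 = baseurl, $4 = priority (default 1)
write_repo_file() {
    cat > "$1" <<EOF
[$2]
name=$2
baseurl=$3
enabled=1
gpgcheck=0
priority=${4:-1}
EOF
}

# Example run against a scratch file instead of /etc/yum.repos.d.
iso="MLNX_OFED_LINUX-4.9-3.1.5.3-rhel8.4-ppc64le.iso"
version=$(ofed_version_from_iso "$iso")
repo_file=$(mktemp)
write_repo_file "$repo_file" mlnx-ofed \
    "http://mn.example/install/ofed/${version}/repo/ppc64le/MLNX_LIBS"
cat "$repo_file"
```

In a postscript, the output file would be the real path under /etc/yum.repos.d and the baseurl would be built from ${MASTER} as in the wrapper script above. The optional fourth argument lets another repo be given a different priority value.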
I hope this helps,
Nate

From: "Vinícius Ferrão via xCAT-user" <xcat-user@lists.sourceforge.net>
To: "xCAT Users Mailing list" <xcat-user@lists.sourceforge.net>
Cc: "Vinícius Ferrão" <fer...@versatushpc.com.br>
Date: 02/01/2022 04:37 PM
Subject: [EXTERNAL] Re: [xcat-user] Versionlock a given package in stateless compute

Hi guys, still on this matter.

As of today, MLNX OFED is becoming a pain to maintain with xCAT, since perftest from the OS needs libefa.so.1, which is unavailable with MLNX OFED.

If we have a stateless image, we cannot run updatenode <node> -S to deploy packages, since dnf will be in a broken state due to the perftest requirements.

A solution would be to blacklist perftest from the OS, which is basically the same question as in this thread.

So if there’s no way to version lock, is there at least a way to blacklist them?

Thanks.

> On 28 Jan 2022, at 20:38, Vinícius Ferrão <fer...@versatushpc.com.br> wrote:
>
> Hello,
>
> I would like to know if there’s a way to versionlock a given package in a stateless compute environment.
>
> Specifically I would like to fix the redhat-release package version. I tried adding the pinned versions to a pkglist file and running genimage but that was a no go. It just ignored the older packages and installed the newest ones.
>
> [root@headnode compute]# lsdef -t osimage rhels8.5.0-x86_64-netboot-compute -i pkglist
> Object name: rhels8.5.0-x86_64-netboot-compute
>     pkglist=/opt/xcat/share/xcat/netboot/rh/compute.rhels8.x86_64.pkglist,/install/custom/netboot/compute.pkglist
>
> cat /install/custom/netboot/compute.pkglist
> kernel-4.18.0-305.25.1.el8_4
> kernel-core-4.18.0-305.25.1.el8_4
> kernel-devel-4.18.0-305.25.1.el8_4
> kernel-modules-4.18.0-305.25.1.el8_4
> kernel-modules-extra-4.18.0-305.25.1.el8_4
> kernel-headers-4.18.0-305.25.1.el8_4
> redhat-release-8.4
>
> During the genimage process it downloaded and installed the correct package, but it was updated later on:
>
> ================================================================================
>  Package                Arch     Version                  Repository            Size
> ================================================================================
> Installing:
>  kernel                 x86_64   4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2   7.0 M
>  kernel-core            x86_64   4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2    38 M
>  kernel-devel           x86_64   4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2    20 M
>  kernel-modules         x86_64   4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2    30 M
>  kernel-modules-extra   x86_64   4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2   7.7 M
> Upgrading:
>  kernel-headers         x86_64   4.18.0-348.12.2.el8_5    rhels8.5.0-x86_64-2   8.3 M
>  redhat-release         x86_64   8.5-0.8.el8              rhels8.5.0-x86_64-1    44 k
> Transaction Summary
> ================================================================================
> Install 5 Packages
> Upgrade 2 Packages
> Total size: 110 M
> Downloading Packages:
> Running transaction check
> Transaction check succeeded.
> Running transaction test
> Transaction test succeeded.
> Running transaction
>   Preparing        :                                                    1/1
>   Running scriptlet: kernel-core-4.18.0-348.12.2.el8_5.x86_64           1/1
>   Installing       : kernel-core-4.18.0-348.12.2.el8_5.x86_64           1/9
>   Running scriptlet: kernel-core-4.18.0-348.12.2.el8_5.x86_64           1/9
>   Installing       : kernel-modules-4.18.0-348.12.2.el8_5.x86_64        2/9
>   Running scriptlet: kernel-modules-4.18.0-348.12.2.el8_5.x86_64        2/9
>   Installing       : kernel-4.18.0-348.12.2.el8_5.x86_64                3/9
>   Installing       : kernel-modules-extra-4.18.0-348.12.2.el8_5.x86_64  4/9
>   Running scriptlet: kernel-modules-extra-4.18.0-348.12.2.el8_5.x86_64  4/9
>   Upgrading        : kernel-headers-4.18.0-348.12.2.el8_5.x86_64        5/9
>   Upgrading        : redhat-release-8.5-0.8.el8.x86_64                  6/9
>   Installing       : kernel-devel-4.18.0-348.12.2.el8_5.x86_64          7/9
>   Running scriptlet: kernel-devel-4.18.0-348.12.2.el8_5.x86_64          7/9
>   Cleanup          : kernel-headers-4.18.0-305.25.1.el8_4.x86_64        8/9
>   Cleanup          : redhat-release-8.4-0.6.el8.x86_64                  9/9
>   Running scriptlet: kernel-core-4.18.0-348.12.2.el8_5.x86_64           9/9
> dracut: No '/dev/log' or 'logger' included for syslog logging
> dracut: Turning off host-only mode: '/run' is not mounted!
> dracut: Turning off host-only mode: '/dev' is not mounted!
>   Running scriptlet: redhat-release-8.4-0.6.el8.x86_64                  9/9
>   Verifying        : kernel-4.18.0-348.12.2.el8_5.x86_64                1/9
>   Verifying        : kernel-devel-4.18.0-348.12.2.el8_5.x86_64          2/9
>   Verifying        : kernel-modules-4.18.0-348.12.2.el8_5.x86_64        3/9
>   Verifying        : kernel-modules-extra-4.18.0-348.12.2.el8_5.x86_64  4/9
>   Verifying        : kernel-core-4.18.0-348.12.2.el8_5.x86_64           5/9
>   Verifying        : redhat-release-8.5-0.8.el8.x86_64                  6/9
>   Verifying        : redhat-release-8.4-0.6.el8.x86_64                  7/9
>   Verifying        : kernel-headers-4.18.0-348.12.2.el8_5.x86_64        8/9
>   Verifying        : kernel-headers-4.18.0-305.25.1.el8_4.x86_64        9/9
> Installed products updated.
> Upgraded:
>   kernel-headers-4.18.0-348.12.2.el8_5.x86_64 redhat-release-8.5-0.8.el8.x86_64
> Installed:
>   kernel-4.18.0-348.12.2.el8_5.x86_64
>   kernel-core-4.18.0-348.12.2.el8_5.x86_64
>   kernel-devel-4.18.0-348.12.2.el8_5.x86_64
>   kernel-modules-4.18.0-348.12.2.el8_5.x86_64
>   kernel-modules-extra-4.18.0-348.12.2.el8_5.x86_64
> Complete!
>
> Any ideas?
>
> Thanks.
>
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user