I don't know if this is your issue, but I had similar symptoms recently while testing RHEL/CentOS 7 images on our cluster, which has worked fine for years with CentOS 6.x. My problem turned out to be related to console redirection: for some reason systemd was hanging while trying to send console messages to the SOL (serial-over-LAN) console. When I disabled that redirection by deleting the nodehm.cons attribute for the node I was testing, the node booted fine.

If you try that and it works, you'll know what is wrong, and you can probably find out from somebody at Lenovo how it is supposed to be configured. The hardware where I'm having this problem is from SuperMicro, so I'm on my own for figuring out how it is supposed to work.
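
A minimal sketch of that test, in case it helps. It assumes a hypothetical node name (node01) and that the console method is exposed as the cons attribute of the node definition (the nodehm.cons column). The run wrapper and the DRY_RUN variable are only illustrative: by default the script prints the commands instead of running them, so nothing changes until you set DRY_RUN=0 on the management node.

```shell
#!/bin/sh
# Sketch only: "node01" is a hypothetical node name; substitute your own.
NODE="${NODE:-node01}"
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show the current console method so it can be restored later.
run lsdef "$NODE" -i cons
# Clear nodehm.cons for this node (an empty value removes the attribute).
run chdef "$NODE" cons=
# Reboot the node and watch whether it now gets past the stall.
run rpower "$NODE" boot
```

If the node boots with the attribute cleared, setting cons back to the value that lsdef reported restores the original configuration.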

Mike

On 7/3/18 2:26 PM, Sam Davis wrote:

I have left it trying to boot overnight with no success. Earlier I found I was getting an NTP error during the boot cycle. I configured ntpd on the management node, and the client node now reports syncing early in the boot process.

*From:* david_john...@brown.edu <david_john...@brown.edu>
*Sent:* Tuesday, July 03, 2018 4:01 PM
*To:* xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
*Subject:* Re: [xcat-user] New XCAT installtion PXE boot issue.

So a quick question: how long is never? I had a similar situation today, where the setupntp script took a really long time before finally giving up. The problem for me was that chronyd on the management node was not configured to respond on any network interface. Chronyd has replaced ntpd as the default on Red Hat 7.
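
As a rough sketch of one way to let chronyd answer clients, assuming the missing piece is an allow directive in /etc/chrony.conf: the helper name ensure_allow and the 10.0.0.0/8 subnet are placeholders, not anything from the actual setup. Whether this matches the fix used here is an assumption.

```shell
#!/bin/sh
# Append "allow <subnet>" to a chrony config file unless it is already there.
# ensure_allow is a hypothetical helper, not part of chrony or xCAT.
ensure_allow() {
    conf="$1"
    subnet="$2"
    line="allow $subnet"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Example (run as root on the management node; the subnet is a placeholder):
#   ensure_allow /etc/chrony.conf 10.0.0.0/8
#   systemctl restart chronyd
#   chronyc clients    # lists the clients that have contacted the server
```

The helper only appends the line when it is missing, so running it more than once does not pile up duplicate entries.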

  -- ddj

Dave Johnson


On Jul 3, 2018, at 2:55 PM, Sam Davis <aractha...@gmail.com> wrote:

    Hello,

      I am trying to set up a new HPC cluster using xCAT (2.14). I
    have installed the management node and created the boot image
    (RHEL 7.5). The node has been discovered and PXE boots, downloading
    the image file, but the boot process stalls and never finishes.
    I have even copied over a working RHEL 7.3 image from our other
    cluster to see if the image is the issue. I’ve tried disabling and
    enabling hyperthreading on the client machine. I’ve also updated
    the firmware on the client machine. Does anyone have any ideas of
    what I might try next?



    Node Hardware

    IBM x3850 X5

    256 GB RAM

    Machine Type 7143 AC1

    4 x Intel Xeon E7 4820

    <shcpboot.jpg>

    
------------------------------------------------------------------------------
    Check out the vibrant tech community on one of the world's most
    engaging tech sites, Slashdot.org <http://Slashdot.org>!
    http://sdm.link/slashdot

    _______________________________________________
    xCAT-user mailing list
    xCAT-user@lists.sourceforge.net
    <mailto:xCAT-user@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/xcat-user

