This might be easier for me to look at interactively…
We do have x3850 X5s, but nothing about this looks familiar offhand…
From: Song BJ Yang <yang...@cn.ibm.com>
Sent: Monday, July 09, 2018 3:14 AM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net; Jarrod B Johnson <johns...@us.ibm.com>
Subject: [External] Re: [xcat-user] New XCAT installtion PXE boot issue.
Hi Jarrod,
Has your team verified rhels7.5 diskless on the x3850? Do you have any thoughts
on how to debug this? Thanks.
Hi Sam,
From the http log, the root image tarball has been downloaded successfully. The
system hangs during boot while switching root to the uncompressed root image
directory; it has not just lost the console.
Have you tried the suggestion from Michael Robber and David Johnson?
> the “hard” needs to be nulled out, that is setting hardware flow control…
When you "tried removing the cons setting, to no avail", did you rerun nodeset
to apply the change?
We do not have an IBM x3850 X5 machine on hand at the moment, so let's ask the
Lenovo staff on the list whether they have any suggestions.
------------------------------------------------------------------------------
YANG Song (杨嵩)
IBM China System Technology Laboratory
Tel: 86-10-82452903
Email: yang...@cn.ibm.com
Address: Building 28, ZhongGuanCun Software Park,
No.8, Dong Bei Wang West Road, Haidian District Beijing 100193, PRC
北京市海淀区东北旺西路8号中关村软件园28号楼
邮编: 100193
----- Original message -----
From: "Sam Davis" <aractha...@gmail.com>
To: "'xCAT Users Mailing list'" <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] New XCAT installtion PXE boot issue.
Date: Fri, Jul 6, 2018 11:43 PM
SSH Attempt:
ssh: connect to host node01 port 22: No route to host
I did a fresh reboot to ensure I got everything and only that which is
applicable.
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/xnba/nodes/node01.uefi HTTP/1.1" 200 105 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/elilo-x64.efi HTTP/1.1" 200 242929 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/xnba/nodes/node01.elilo HTTP/1.1" 200 365 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/osimage/KSU-rhels7.3-netboot-compute/kernel HTTP/1.1" 200 5391264 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/osimage/KSU-rhels7.3-netboot-compute/initrd-stateless.gz HTTP/1.1" 200 34271424 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:53:06 -0400] "GET //install/netboot/rhels7.3/x86_64/compute/rootimg.cpio.gz HTTP/1.1" 200 1519404946 "-" "Wget/1.14 (linux-gnu)"
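As a rough aid for reading the log entries above, here is a small Python sketch that parses the six quoted requests and reports the time between the initrd request and the root image request. It is not part of the original message; the regular expression and the helper name `parse_entries` are illustrative assumptions, and it only handles GET lines shaped like the ones quoted here.

```python
import re
from datetime import datetime

# The six access_log entries quoted in the message, one request per line.
ACCESS_LOG = '''\
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/xnba/nodes/node01.uefi HTTP/1.1" 200 105 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/elilo-x64.efi HTTP/1.1" 200 242929 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/xnba/nodes/node01.elilo HTTP/1.1" 200 365 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/osimage/KSU-rhels7.3-netboot-compute/kernel HTTP/1.1" 200 5391264 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:36:50 -0400] "GET /tftpboot/xcat/osimage/KSU-rhels7.3-netboot-compute/initrd-stateless.gz HTTP/1.1" 200 34271424 "-" "iPXE/1.0.3-131028 (d603e)"
172.30.101.3 - - [06/Jul/2018:10:53:06 -0400] "GET //install/netboot/rhels7.3/x86_64/compute/rootimg.cpio.gz HTTP/1.1" 200 1519404946 "-" "Wget/1.14 (linux-gnu)"
'''

# Hypothetical pattern: timestamp, request path, status code, response size.
ENTRY = re.compile(
    r'\[(?P<ts>[^\]]+)\] "GET (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)

def parse_entries(text):
    """Return (timestamp, path, status, size) tuples for each GET line."""
    entries = []
    for line in text.splitlines():
        m = ENTRY.search(line)
        if m:
            ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
            entries.append((ts, m["path"], int(m["status"]), int(m["size"])))
    return entries

entries = parse_entries(ACCESS_LOG)
for ts, path, status, size in entries:
    # Print time, status, size in bytes, and the file name of each request.
    print(ts.strftime("%H:%M:%S"), status, f"{size:>12}", path.rsplit("/", 1)[-1])

# Seconds between the initrd request and the root image request.
gap_seconds = int((entries[-1][0] - entries[-2][0]).total_seconds())
print("gap before rootimg request:", gap_seconds, "s")
```

On the quoted data, every request returned status 200, and the root image request is logged 976 seconds (about 16 minutes) after the initrd request.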
From: Song BJ Yang <yang...@cn.ibm.com>
Sent: Thursday, July 05, 2018 10:51 PM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] New XCAT installtion PXE boot issue.
Hi Sam,
The status "netbooting" means the provisioning process has passed the initrd
boot phase and is now in the root image boot process. If only the console is
lost, the remaining boot steps can still continue. What happens if you try to
ssh to the node?
Also, what are the last entries you see in /var/log/httpd/access_log on the
management node (MN) when the provisioning hangs?
Thanks
----- Original message -----
From: "Sam Davis" <aractha...@gmail.com>
To: "'xCAT Users Mailing list'" <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] New XCAT installtion PXE boot issue.
Date: Fri, Jul 6, 2018 2:42 AM
~]# lsdef node01 -i status
Object name: node01
status=netbooting
Also, when we try to use rcons on the machine we get the following error:
~]# rcons node01
[Enter `^Ec?' for help]
Acquiring startup lock...done
Info: SOL payload already de-activated
Info: SOL payload disabled
I have tried removing the cons setting, to no avail. The console speed is set
to “115200”.
Group entry from the nodehm table:
“"ipmi","ipmi","ipmi",,,,,"0","115200","hard",,,,,,”
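To make the positional fields in that row easier to read, here is a small Python sketch that labels them. It is not from the original message, and the column order in `NODEHM_COLUMNS` is an assumption covering only the first ten columns; check it against the header line that `tabdump nodehm` prints on your management node.

```python
import csv

# Assumed order of the first ten nodehm columns; confirm against the
# header line printed by `tabdump nodehm` before relying on it.
NODEHM_COLUMNS = ["node", "power", "mgt", "cons", "termserver", "termport",
                  "conserver", "serialport", "serialspeed", "serialflow"]

# The group row quoted in the message, without the surrounding curly quotes.
ROW_TEXT = '"ipmi","ipmi","ipmi",,,,,"0","115200","hard",,,,,,'

def parse_nodehm_row(text):
    """Map the leading fields of one CSV row onto the assumed column names."""
    fields = next(csv.reader([text]))
    # zip() stops at the shorter sequence, so trailing columns are ignored.
    return dict(zip(NODEHM_COLUMNS, fields))

entry = parse_nodehm_row(ROW_TEXT)
for name in ("cons", "serialport", "serialspeed", "serialflow"):
    print(f"{name:12} = {entry[name]!r}")
```

Under that assumed column order, `cons` is empty and the tenth field, `serialflow`, holds "hard", which would be the hardware flow control value discussed elsewhere in this thread.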
Thank you all,
Sam
From: Song BJ Yang <yang...@cn.ibm.com>
Sent: Tuesday, July 03, 2018 11:35 PM
To: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] New XCAT installtion PXE boot issue.
Hi Sam,
Was the screenshot captured during the rootimg boot, or during the initrd boot
before the root image tarball is downloaded? Please check the status of the
node with `lsdef <XX> -i status`.
----- Original message -----
From: "Sam Davis" <aractha...@gmail.com>
To: <xcat-user@lists.sourceforge.net>
Cc:
Subject: [xcat-user] New XCAT installtion PXE boot issue.
Date: Wed, Jul 4, 2018 2:56 AM
Hello,
I am trying to set up a new HPC cluster using xCAT (2.14). I have installed
the management node and created the boot image (RHEL 7.5). The node has been
discovered and PXE boots, downloading the image file, but the boot process
stalls and never finishes. I have even copied over a working RHEL 7.3 image
from our other cluster to see whether the image is the issue. I’ve tried
disabling and enabling hyperthreading on the client machine. I’ve also updated
the firmware on the client machine. Does anyone have any ideas about what I
might try next?
Node Hardware:
IBM x3850 X5
256 GB RAM
Machine Type 7143 AC1
4 x Intel Xeon E7-4820
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net<mailto:xCAT-user@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/xcat-user