You can run: nodeset <nodes> shell
That'll get you an environment that should boot on them regardless, complete
with ssh and all.
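A rough sketch of that workflow, in case it helps. The node name n001 is a placeholder, and the rpower step is an assumption on my part (it presumes the nodes network-boot by default); only the nodeset line comes from the suggestion above. The DRY_RUN wrapper is hypothetical and only exists so the commands can be previewed without touching a cluster.

```shell
#!/bin/sh
# Sketch: point nodes at the xCAT shell environment, then reboot them into it.
# NODES is a placeholder noderange; substitute the real M5 node names.
NODES="${NODES:-n001}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, anything else = run them

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

boot_to_shell() {
    run nodeset "$1" shell   # next network boot lands in the xCAT shell environment
    run rpower "$1" boot     # assumption: power-cycle so the node PXE boots
}

boot_to_shell "$NODES"
# Once the node is up, log in with ssh (for example: ssh root@n001) and look
# at which interfaces and drivers actually came up.
```

Run it once with DRY_RUN=1 to check the noderange, then with DRY_RUN=0 on the management node.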
From: David D Johnson [mailto:[email protected]]
Sent: Thursday, June 25, 2015 2:00 PM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] NextScale deployment kernel crash
I may have jumped to conclusions about the reason, but in any case the two new
M5 machines don't boot.
This is the line specifying drivers from our build script:
./genimage -i eth0 -n dca,8021q,igb,bnx2,tg3 -o centos6.5 -k 2.6.32-358.23.2.el6.x86_64 -p comp
As to the Ethernet interfaces, from an M4 machine the relevant ASU output looks
like:
PXE.NicPortMacAddress.1=6C:AE:8B:08:94:ED
PXE.NicPortMacAddress.2=6C:AE:8B:08:94:EE
IntelRI350GigabitNetworkConnection-6CAE8B0894ED.LinkStatus=Connected
IntelRI350GigabitNetworkConnection-6CAE8B0894ED.AlternateMACAddress=6C:AE:8B:08:94:ED
IntelRI350GigabitNetworkConnection-6CAE8B0894ED.LinkSpeed=AutoNeg
IntelRI350GigabitNetworkConnection-6CAE8B0894ED.WakeonLAN=Enabled
IntelRI350GigabitNetworkConnection-6CAE8B0894EE.LinkStatus=Disconnected
IntelRI350GigabitNetworkConnection-6CAE8B0894EE.AlternateMACAddress=6C:AE:8B:08:94:EE
IntelRI350GigabitNetworkConnection-6CAE8B0894EE.LinkSpeed=AutoNeg
IntelRI350GigabitNetworkConnection-6CAE8B0894EE.WakeonLAN=Enabled
On the new M5 machines, there are only two ports on the front (no dedicated IMM
port), but there are now four PXE MAC lines; #3 is the shared port.
PXE.NicPortMacAddress.1=E4:1D:2D:73:56:01
PXE.NicPortMacAddress.2=E4:1D:2D:73:56:02
PXE.NicPortMacAddress.3=40:F2:E9:C5:51:12
PXE.NicPortMacAddress.4=40:F2:E9:C5:51:13
Now I realize the first two are from the dual port FDR IB mezzanine card
(ConnectX-3 Pro). They can be used as 10/40/56 GbE, I suppose, but we want to
use one of them for FDR IB only, and the other one isn't connected to anything.
The other two are Broadcom / tg3.
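To make the pairing of those four ports easier to see, here is a minimal sketch that groups the PXE.NicPortMacAddress lines by their first three octets. The function name group_by_oui is made up for illustration, and the script does not look up which vendor owns each prefix; it only shows which ports share one.

```shell
#!/bin/sh
# Group PXE.NicPortMacAddress entries by the first three octets of the MAC,
# so ports that belong to the same adapter family are listed together.
group_by_oui() {
    awk -F= '/^PXE\.NicPortMacAddress\./ {
        port = $1
        sub(/^.*\./, "", port)          # keep only the trailing port index
        oui = substr($2, 1, 8)          # first three octets, e.g. E4:1D:2D
        if (oui in ports) {
            ports[oui] = ports[oui] "," port
        } else {
            ports[oui] = port
            order[++n] = oui            # remember first-seen order
        }
    }
    END {
        for (i = 1; i <= n; i++) print order[i], "ports", ports[order[i]]
    }'
}

# The four lines reported by ASU on the M5 node.
group_by_oui <<'EOF'
PXE.NicPortMacAddress.1=E4:1D:2D:73:56:01
PXE.NicPortMacAddress.2=E4:1D:2D:73:56:02
PXE.NicPortMacAddress.3=40:F2:E9:C5:51:12
PXE.NicPortMacAddress.4=40:F2:E9:C5:51:13
EOF
```

With the M5 data this prints "E4:1D:2D ports 1,2" and "40:F2:E9 ports 3,4", which matches the split between the ConnectX-3 Pro mezzanine ports and the two tg3 ports described above.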
I wish I could ssh to the machine so I could poke around and see what the NICs
are called. Maybe I will have to boot off a USB key. No disks in these hosts.
-- ddj
On Jun 25, 2015, at 1:33 PM, Jarrod Johnson
<[email protected]<mailto:[email protected]>> wrote:
What NIC driver was built into the initrd? M4 was igb, M5 uses tg3.
" extra unusable Ethernet ports on the motherboard that mess up the interface
naming. Is there a workaround for this???"
I'm interested in what this means and if I can help on that.
From: David Johnson [mailto:[email protected]]
Sent: Thursday, June 25, 2015 11:30 AM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] NextScale deployment kernel crash
Yes, we are seeing exactly the same problem. 300 nodes from Nehalem to
NextScale M4 all work fine with the same CentOS 6.5 image, but not so for the
Lenovo NextScale M5 nodes. They seem to have extra unusable Ethernet ports
on the motherboard that mess up the interface naming. Is there a workaround for
this???
-- ddj
Dave Johnson
On Jun 25, 2015, at 10:49 AM, Damir Krstic
<[email protected]<mailto:[email protected]>> wrote:
We are trying to boot NextScale nodes with our RedHat 6.4 stateless image. They
are crashing during the initrd boot process with the following error:
dracut Warning: No root device "1" found
dracut Warning: Boot has failed. To debug this issue add "rdshell" to the
kernel command line.
dracut Warning: Signal caught!
dracut Warning: Boot has failed. To debug this issue add "rdshell" to the
kernel command line.
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Tainted: G --------------- H
2.6.32-358.el6.x86_64 #1
Call Trace:
[<ffffffff8150cfc8>] ? panic+0xa7/0x16f
[<ffffffff81073ae2>] ? do_exit+0x862/0x870
[<ffffffff81182885>] ? fput+0x25/0x30
[<ffffffff81073b48>] ? do_group_exit+0x58/0xd0
[<ffffffff81073bd7>] ? sys_exit_group+0x17/0x20
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:117 native_smp_send_reschedule+0x5c/0x60()
(Tainted: G --------------- H )
Hardware name: IBM NeXtScale nx360 M5: -[5465AC1]-
Modules linked in: sd_mod crc_t10dif ahci mlx4_core [last unloaded:
scsi_wait_scan]
Pid: 1, comm: init Tainted: G --------------- H
2.6.32-358.el6.x86_64 #1
Call Trace:
<IRQ> [<ffffffff8106e2e7>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff8106e33a>] ? warn_slowpath_null+0x1a/0x20
[<ffffffff8102dd9c>] ? native_smp_send_reschedule+0x5c/0x60
[<ffffffff8105ae28>] ? scheduler_tick+0x208/0x260
[<ffffffff810a7fd0>] ? tick_sched_timer+0x0/0xc0
[<ffffffff810811de>] ? update_process_times+0x6e/0x90
[<ffffffff810a8036>] ? tick_sched_timer+0x66/0xc0
[<ffffffff8109b38e>] ? __run_hrtimer+0x8e/0x1a0
[<ffffffff810a182f>] ? ktime_get_update_offsets+0x4f/0xd0
[<ffffffff8107700f>] ? __do_softirq+0x11f/0x1e0
[<ffffffff8109b6f6>] ? hrtimer_interrupt+0xe6/0x260
[<ffffffff81516d7b>] ? smp_apic_timer_interrupt+0x6b/0x9b
[<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
<EOI> [<ffffffff8150d06d>] ? panic+0x14c/0x16f
[<ffffffff8150cffa>] ? panic+0xd9/0x16f
[<ffffffff81073ae2>] ? do_exit+0x862/0x870
[<ffffffff81182885>] ? fput+0x25/0x30
[<ffffffff81073b48>] ? do_group_exit+0x58/0xd0
[<ffffffff81073bd7>] ? sys_exit_group+0x17/0x20
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Any help would be appreciated.
Thanks,
Damir
------------------------------------------------------------------------------
Monitor 25 network devices or servers for free with OpManager!
OpManager is web-based network management software that monitors
network devices and physical & virtual servers, alerts via email & sms
for fault. Monitor 25 devices for free with no restriction. Download now
http://ad.doubleclick.net/ddm/clk/292181274;119417398;o
_______________________________________________
xCAT-user mailing list
[email protected]<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/xcat-user