You can run nodeset <nodes> shell.

That'll get you an environment that should boot on them regardless, complete 
with ssh and all.
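
Once a node is up in that shell environment, something along these lines could 
show what the NICs are called and which driver each one is bound to. This is 
only a sketch: list_nics is a made-up helper name, and it assumes the shell 
image has /sys mounted and provides basename and readlink. It has not been 
checked against the M5 nodes.

```shell
# Hypothetical helper: print "<interface> <MAC> <driver>" for each NIC.
# The optional first argument overrides the sysfs directory (default
# /sys/class/net), which is only useful for trying it against a test tree.
list_nics() {
    net_dir="${1:-/sys/class/net}"
    for dev in "$net_dir"/*; do
        # Skip anything that is not a network device entry.
        [ -e "$dev/address" ] || continue
        name=$(basename "$dev")
        mac=$(cat "$dev/address")
        # device/driver is a symlink to the bound kernel driver, if any.
        if [ -L "$dev/device/driver" ]; then
            driver=$(basename "$(readlink "$dev/device/driver")")
        else
            driver="(none)"
        fi
        printf '%s %s %s\n' "$name" "$mac" "$driver"
    done
}

list_nics
```

Comparing the MAC column with the PXE.NicPortMacAddress values from ASU should 
show which interface name each physical port ended up with.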

From: David D Johnson [mailto:[email protected]]
Sent: Thursday, June 25, 2015 2:00 PM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] NextScale deployment kernel crash

I may have jumped to conclusions about the reason, but in any case the two new 
M5 machines don't boot.

This is the line specifying drivers from our build script:
./genimage -i eth0 -n dca,8021q,igb,bnx2,tg3 -o centos6.5 -k 2.6.32-358.23.2.el6.x86_64 -p comp


As for the Ethernet interfaces, the relevant ASU output from an M4 machine 
looks like this:
PXE.NicPortMacAddress.1=6C:AE:8B:08:94:ED
PXE.NicPortMacAddress.2=6C:AE:8B:08:94:EE
IntelRI350GigabitNetworkConnection-6CAE8B0894ED.LinkStatus=Connected
IntelRI350GigabitNetworkConnection-6CAE8B0894ED.AlternateMACAddress=6C:AE:8B:08:94:ED
IntelRI350GigabitNetworkConnection-6CAE8B0894ED.LinkSpeed=AutoNeg
IntelRI350GigabitNetworkConnection-6CAE8B0894ED.WakeonLAN=Enabled
IntelRI350GigabitNetworkConnection-6CAE8B0894EE.LinkStatus=Disconnected
IntelRI350GigabitNetworkConnection-6CAE8B0894EE.AlternateMACAddress=6C:AE:8B:08:94:EE
IntelRI350GigabitNetworkConnection-6CAE8B0894EE.LinkSpeed=AutoNeg
IntelRI350GigabitNetworkConnection-6CAE8B0894EE.WakeonLAN=Enabled

On the new M5 machines, there are only two ports on the front (no dedicated IMM 
port), but there are now four PXE MAC lines; #3 is the shared port.
PXE.NicPortMacAddress.1=E4:1D:2D:73:56:01
PXE.NicPortMacAddress.2=E4:1D:2D:73:56:02
PXE.NicPortMacAddress.3=40:F2:E9:C5:51:12
PXE.NicPortMacAddress.4=40:F2:E9:C5:51:13

Now I realize the first two are from the dual port FDR IB mezzanine card 
(ConnectX-3 Pro). They can be used as 10/40/56 GbE, I suppose, but we want to 
use one of them for FDR IB only, and the other one isn't connected to anything.
The other two are Broadcom (tg3).

I wish I could ssh to the machine so I could poke around and see what the NICs 
are called. Maybe I will have to boot off a USB key.  No disks in these hosts.

 -- ddj

On Jun 25, 2015, at 1:33 PM, Jarrod Johnson <[email protected]> wrote:


What NIC driver was built into the initrd?  M4 used igb; M5 uses tg3.

"extra unusable Ethernet ports on the motherboard that mess up the interface 
naming. Is there a workaround for this???"

I'm interested in what this means and if I can help on that.

From: David Johnson [mailto:[email protected]]
Sent: Thursday, June 25, 2015 11:30 AM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] NextScale deployment kernel crash

Yes, we are seeing exactly the same problem. 300 nodes from Nehalem to 
NextScale M4 all work fine with the same CentOS 6.5 image, but not so for the 
Lenovo NextScale M5 nodes. They seem to have extra unusable Ethernet ports 
on the motherboard that mess up the interface naming. Is there a workaround for 
this???

  -- ddj
Dave Johnson

On Jun 25, 2015, at 10:49 AM, Damir Krstic <[email protected]> wrote:

We are trying to boot NextScale nodes with our Red Hat 6.4 stateless image. They 
are crashing during the initrd boot process with the following error:


dracut Warning: No root device "1" found
dracut Warning: Boot has failed. To debug this issue add "rdshell" to the kernel command line.
dracut Warning: Signal caught!
dracut Warning: Boot has failed. To debug this issue add "rdshell" to the kernel command line.
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Tainted: G           --------------- H  2.6.32-358.el6.x86_64 #1
Call Trace:
 [<ffffffff8150cfc8>] ? panic+0xa7/0x16f
 [<ffffffff81073ae2>] ? do_exit+0x862/0x870
 [<ffffffff81182885>] ? fput+0x25/0x30
 [<ffffffff81073b48>] ? do_group_exit+0x58/0xd0
 [<ffffffff81073bd7>] ? sys_exit_group+0x17/0x20
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:117 native_smp_send_reschedule+0x5c/0x60() (Tainted: G           --------------- H )
Hardware name: IBM NeXtScale nx360 M5: -[5465AC1]-
Modules linked in: sd_mod crc_t10dif ahci mlx4_core [last unloaded: scsi_wait_scan]
Pid: 1, comm: init Tainted: G           --------------- H  2.6.32-358.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff8106e2e7>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106e33a>] ? warn_slowpath_null+0x1a/0x20
 [<ffffffff8102dd9c>] ? native_smp_send_reschedule+0x5c/0x60
 [<ffffffff8105ae28>] ? scheduler_tick+0x208/0x260
 [<ffffffff810a7fd0>] ? tick_sched_timer+0x0/0xc0
 [<ffffffff810811de>] ? update_process_times+0x6e/0x90
 [<ffffffff810a8036>] ? tick_sched_timer+0x66/0xc0
 [<ffffffff8109b38e>] ? __run_hrtimer+0x8e/0x1a0
 [<ffffffff810a182f>] ? ktime_get_update_offsets+0x4f/0xd0
 [<ffffffff8107700f>] ? __do_softirq+0x11f/0x1e0
 [<ffffffff8109b6f6>] ? hrtimer_interrupt+0xe6/0x260
 [<ffffffff81516d7b>] ? smp_apic_timer_interrupt+0x6b/0x9b
 [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
 <EOI>  [<ffffffff8150d06d>] ? panic+0x14c/0x16f
 [<ffffffff8150cffa>] ? panic+0xd9/0x16f
 [<ffffffff81073ae2>] ? do_exit+0x862/0x870
 [<ffffffff81182885>] ? fput+0x25/0x30
 [<ffffffff81073b48>] ? do_group_exit+0x58/0xd0
 [<ffffffff81073bd7>] ? sys_exit_group+0x17/0x20
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b



Any help would be appreciated.

Thanks,
Damir
_______________________________________________
xCAT-user mailing list
[email protected]<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/xcat-user