Evidently, though, something in his xCAT setup is creating the files in
/tftpboot/pxelinux.cfg/ with reference to xnba, just like my
installation. Where does xCAT grab the configuration for that? Maybe
it is because I did an in-place upgrade rather than a completely clean
install, but my cluster actually works perfectly with both xnba and
genesis installed: it uses xnba first to bootstrap and then requests
the Genesis image. xCAT must support that scenario, or else I haven't
the slightest idea by what miracle my installation is running with
such a configuration. :-)

-Josh

On Tue, Jan 21, 2014 at 2:58 PM, Russell Jones
<russell-l...@jonesmail.me> wrote:
> xNBA is a customized gpxe image that xCAT uses.
>
> NBFS is the older maintenance image that was used when you set a
> node to boot to shell or booted a runimage script. NBFS is deprecated,
> and Genesis replaced it as the maintenance image for these tasks.
>
> In a standard 2.8 install, there should no longer be any nbk/nbfs RPMs
> installed - Genesis replaced them.
>
> perl-xCAT-2.8.3-snap201311122316.noarch
> xCAT-2.8.3-snap201311122318.x86_64
> xCAT-client-2.8.3-snap201311122316.noarch
> xCAT-genesis-base-x86_64-2.8-snap201308090229.noarch
> elilo-xcat-3.14-4.noarch
> xCAT-server-2.8.3-snap201311122316.noarch
> xCAT-genesis-scripts-x86_64-2.8.3-snap201311122318.noarch
> ipmitool-xcat-1.8.11-3.x86_64
> conserver-xcat-8.1.16-10.x86_64
> xCAT-buildkit-2.8.3-snap201311122318.noarch
> syslinux-xcat-3.86-2.noarch
>
>
>
> On 1/21/2014 2:38 PM, Josh Nielsen wrote:
>> Hi Jonathan,
>>
>> Yes, I definitely think that would cause a problem. This is jogging my
>> memory: I think I faced a similar problem when the new Genesis boot
>> loader was rolled out in the first version of xCAT that supported it.
>> I had assumed that only Genesis was needed, but xNBA is still used as
>> an intermediate image even if it is no longer the final image. I will
>> check my yum repos as soon as I can, but by some unfortunate
>> coincidence I just discovered that YUM is not working since our RHEL
>> license expired three days ago (unbeknownst to me until 10 minutes
>> ago). Do you have xCAT-genesis-x86_64 and elilo-xCAT? You may even
>> have to pull xNBA images from an older install(?) and then run mknb
>> to build the images.
>>
>> I remember downloading the tarred files with the RPM manually and
>> creating a local repo for xCAT. Whenever I get YUM back I'll give you
>> more specifics if I can.
>>
>> -Josh
>>
>> On Tue, Jan 21, 2014 at 1:54 PM, Jonathan Mills <jonmi...@renci.org> wrote:
>>> Josh,
>>>
>>> I don't doubt that you're on to something.  But if this is the case, it
>>> means my systems are missing some files, namely:
>>>
>>> /tftpboot/xcat/nbk.x86_64
>>> /tftpboot/xcat/nbfs.x86_64.gz
>>>
>>> Can you tell me what RPM installed those files on your system?  They
>>> don't exist on mine, and even a 'yum provides' doesn't find them.
>>>
>>>
>>> On 01/21/2014 11:51 AM, Josh Nielsen wrote:
>>>> Hi Jonathan,
>>>>
>>>> It is my understanding, from extensive debugging and notes that I have
>>>> taken about the xCAT netbooting process in the past, that xCAT uses a
>>>> two-stage image deployment method. The client first comes up with a more
>>>> "generic" boot image (normally xnba or sometimes yaboot). When that image
>>>> contacts the xCAT headnode (or the node handling DHCP requests), the
>>>> headnode recognizes which image the client is currently running and
>>>> tells it to load another image, based on the subnet and the image type
>>>> it is currently using. For example my headnode's /etc/dhcpd.conf file
>>>> has an entry that looks like this:
>>>>
>>>> shared-network eth0 {
>>>>     subnet 10.20.0.0 netmask 255.255.0.0 {
>>>>       max-lease-time 43200;
>>>>       min-lease-time 43200;
>>>>       default-lease-time 43200;
>>>>       next-server  10.20.0.1;
>>>>       option log-servers 10.20.0.1;
>>>>       option ntp-servers 10.20.0.1;
>>>>       option domain-name "xxxxxxxxx";
>>>>       option domain-name-servers  10.20.0.1;
>>>>       if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
>>>>          always-broadcast on;
>>>>          filename = "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16";
>>>>       } else if option user-class-identifier = "xNBA" and option client-architecture = 00:09 { #x86, xCAT Network Boot Agent
>>>>          filename = "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16.uefi";
>>>>       } else if option client-architecture = 00:00  { #x86
>>>>         filename "xcat/xnba.kpxe";
>>>>       } else if option vendor-class-identifier = "Etherboot-5.4"  { #x86
>>>>         filename "xcat/xnba.kpxe";
>>>>       } else if option client-architecture = 00:07 { #x86_64 uefi
>>>>          filename "xcat/xnba.efi";
>>>>       } else if option client-architecture = 00:09 { #x86_64 uefi alternative id
>>>>          filename "xcat/xnba.efi";
>>>>       } else if option client-architecture = 00:02 { #ia64
>>>>          filename "elilo.efi";
>>>>       } else if substring(filename,0,1) = null { #otherwise, provide yaboot if the client isn't specific
>>>>          filename "/yaboot";
>>>>       }
>>>>       range dynamic-bootp 10.20.200.254 10.20.254.254;
>>>>     } # 10.20.0.0/255.255.0.0 subnet_end
>>>>
>>>> So if a client boots with the xNBA image, DHCP then directs it to
>>>> http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16, which has the
>>>> genesis boot instructions in it:
>>>>
>>>> #!gpxe
>>>> imgfetch -n kernel http://${next-server}/tftpboot/xcat/genesis.kernel.x86_64 quiet xcatd=10.20.0.1:3001 BOOTIF=01-${netX/machyp}
>>>> imgfetch -n nbfs http://${next-server}/tftpboot/xcat/genesis.fs.x86_64.gz
>>>> imgload kernel
>>>> imgexec kernel
>>>>
>>>> So first it boots with xnba (first stage of boot) and contacts the DHCP
>>>> server, which gives it a "next-server" option pointing at itself (saying
>>>> to the client: request the next image from me - the headnode - again) and
>>>> a boot file with instructions for the next image. The client then
>>>> executes that file and finally loads genesis. You will also notice that
>>>> the very last option (if nothing else matches) is yaboot, which is
>>>> another generic image, which will in turn probably request the next
>>>> image. Try watching your log for the tftp daemon messages to see what is
>>>> being sent.
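
To make the decision chain in the dhcpd.conf excerpt easier to follow, here is a minimal Python sketch of it. This is not xCAT or dhcpd code: the function name boot_filename and its parameters are made up for illustration, the client-architecture values are plain integers (0, 2, 7, 9) rather than dhcpd's 00:00 notation, and the final "substring(filename,0,1) = null" test is simplified to a plain fallback.

```python
def boot_filename(user_class=None, client_arch=None, vendor_class=None,
                  server="10.20.0.1", net="10.20.0.0_16"):
    """Mirror the if/else chain from the dhcpd.conf excerpt (illustrative only)."""
    base = f"http://{server}/tftpboot/xcat/xnba/nets/{net}"
    if user_class == "xNBA" and client_arch == 0:
        return base                  # second stage: xNBA already running, BIOS
    if user_class == "xNBA" and client_arch == 9:
        return base + ".uefi"        # second stage: xNBA already running, UEFI
    if client_arch == 0:
        return "xcat/xnba.kpxe"      # first stage: plain x86 PXE ROM
    if vendor_class == "Etherboot-5.4":
        return "xcat/xnba.kpxe"
    if client_arch in (7, 9):
        return "xcat/xnba.efi"       # first stage: x86_64 UEFI
    if client_arch == 2:
        return "elilo.efi"           # ia64
    return "/yaboot"                 # simplified fallback when nothing matches


# Stage 1: a bare PXE ROM asks for a boot file.
print(boot_filename(client_arch=0))
# Stage 2: the same node asks again, now identifying itself as xNBA.
print(boot_filename(user_class="xNBA", client_arch=0))
```

The two calls at the bottom show the two stages: the first request is answered with the xNBA binary, the second with the per-network script that fetches genesis.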
>>>>
>>>> I wonder if you are having problems at the first-stage DHCP redirect,
>>>> though. Check the option statements in /etc/dhcpd.conf to see
>>>> where it is directing xNBA images.
>>>>
>>>> Regards,
>>>> Josh Nielsen
>>>>
>>>>
>>>> On Tue, Jan 21, 2014 at 10:26 AM, Jonathan Mills <jonmi...@renci.org> wrote:
>>>>
>>>>      Wang,
>>>>
>>>>      Thank you for your response.  I did some digging and here is what I
>>>>      found.
>>>>
>>>>      cat /tftpboot/xcat/xnba/nets/10.100.0.0_24
>>>>      #!gpxe
>>>>      imgfetch -n kernel http://${next-server}/tftpboot/xcat/genesis.kernel.x86_64 quiet xcatd=10.100.0.1:3001 BOOTIF=01-${netX/machyp}
>>>>      imgfetch -n nbfs http://${next-server}/tftpboot/xcat/genesis.fs.x86_64.lzma
>>>>      imgload kernel
>>>>      imgexec kernel
>>>>
>>>>
>>>>
>>>>      cat /tftpboot/pxelinux.cfg/0A6400
>>>>      DEFAULT xCAT
>>>>          LABEL xCAT
>>>>          KERNEL xcat/nbk.x86_64
>>>>          APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.100.0.1:3001
>>>>
>>>>
>>>>
>>>>      So, clearly, those things don't match up.  That strikes me as an xCAT
>>>>      issue, but never mind.  I manually modified /tftpboot/pxelinux.cfg/0A6400
>>>>      to make it look like:
>>>>
>>>>      DEFAULT xCAT
>>>>          LABEL xCAT
>>>>          KERNEL xcat/genesis.kernel.x86_64
>>>>          APPEND initrd=xcat/genesis.fs.x86_64.lzma quiet xcatd=10.100.0.1:3001 BOOTIF=eth0
>>>>
>>>>
>>>>      (It is safe, in this case, to designate BOOTIF as 'eth0' -- with Cisco
>>>>      UCS hardware, and using vNICs, the first interface will always show up
>>>>      in Linux as eth0 -- at least, that is my experience).
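
The mismatch described above (a pxelinux.cfg entry naming files that are not on disk) can be checked mechanically. Here is a minimal Python sketch, assuming the simple DEFAULT/LABEL/KERNEL/APPEND layout shown in this thread. The helper names referenced_files and missing_files are made up for illustration, and /tftpboot is only the default root used in this thread; this is not part of xCAT.

```python
import os


def referenced_files(cfg_text):
    """Return the relative paths named by KERNEL lines and initrd= arguments."""
    paths = []
    for line in cfg_text.splitlines():
        words = line.split()
        if not words:
            continue
        keyword = words[0].upper()
        if keyword == "KERNEL" and len(words) > 1:
            paths.append(words[1])
        elif keyword == "APPEND":
            for word in words[1:]:
                if word.startswith("initrd="):
                    # initrd= may carry a comma-separated list of files
                    paths.extend(word[len("initrd="):].split(","))
    return paths


def missing_files(cfg_text, tftp_root="/tftpboot"):
    """Return the referenced paths that do not exist under tftp_root."""
    return [p for p in referenced_files(cfg_text)
            if not os.path.exists(os.path.join(tftp_root, p))]
```

Feeding it the contents of /tftpboot/pxelinux.cfg/0A6400 would list every kernel or initrd path that has no matching file under the TFTP root.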
>>>>
>>>>      After this change, I was indeed able to PXE boot the first node, and I
>>>>      was hopeful that node discovery would then take place.  However, this
>>>>      still did not occur.  On console, I dug into the running genesis image
>>>>      on the first node, and I found that it had no ethernet interfaces
>>>>      whatsoever, because the genesis kernel has no driver support for Cisco
>>>>      UCS hardware.
>>>>
>>>>      For example, this is the ethtool output of a Cisco UCS vNIC:
>>>>
>>>>      [root@ncsu-hn nets]# ethtool -i eth0
>>>>      driver: enic
>>>>      version: 2.1.1.39
>>>>      firmware-version: 2.0(4b)
>>>>      bus-info: 0000:06:00.0
>>>>      supports-statistics: yes
>>>>      supports-test: no
>>>>      supports-eeprom-access: no
>>>>      supports-register-dump: no
>>>>      supports-priv-flags: no
>>>>
>>>>
>>>>      You can see it requires the 'enic' kernel module, usually located at:
>>>>      /lib/modules/`uname -r`/kernel/drivers/net/enic/enic.ko
>>>>
>>>>      This module isn't found within the genesis image, so the node PXE boots,
>>>>      and then can do no more.  Node discovery fails.
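
Checking whether a driver is present in an image comes down to searching a file listing of the unpacked image for the module name. A minimal Python sketch follows, assuming you already have such a listing as a sequence of paths (how you obtain it is left open here). The function name find_module and the set of compressed suffixes it accepts are illustrative choices, not anything xCAT provides.

```python
def find_module(paths, module="enic"):
    """Return listing entries whose file name is <module>.ko, optionally compressed."""
    wanted = {f"{module}.ko", f"{module}.ko.gz", f"{module}.ko.xz"}
    return [p for p in paths if p.strip().rsplit("/", 1)[-1] in wanted]


# Example listing; these paths are invented for the illustration.
listing = [
    "lib/modules/2.6.32/kernel/drivers/net/e1000/e1000.ko",
    "lib/modules/2.6.32/kernel/drivers/net/enic/enic.ko",
]
print(find_module(listing))
```

An empty result for the genesis image listing would be consistent with the symptom described here: the node boots, but no ethernet interface ever appears.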
>>>>
>>>>      On 01/20/2014 09:19 PM, Xiao Peng Wang wrote:
>>>>       > xCAT uses genesis (an xCAT-customized PXE tool) to drive the
>>>>       > discovery process. The configuration for genesis is put in
>>>>       > /tftpboot/xcat/xnba/nets/ for each specific network. Could you check
>>>>       > whether the xnba configuration file for your deployment network has
>>>>       > been put in /tftpboot/xcat/xnba/nets/?
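
The per-network file names seen in this thread (10.100.0.0_24, 10.20.0.0_16) look like the network address followed by the prefix length. Under that assumption, which is inferred from the examples here and not taken from xCAT documentation, this minimal Python sketch derives the expected name from a subnet and netmask. The function name nets_filename is made up for illustration.

```python
import ipaddress


def nets_filename(network, netmask):
    """Build '<network>_<prefixlen>', e.g. 10.20.0.0 / 255.255.0.0 -> 10.20.0.0_16."""
    net = ipaddress.IPv4Network(f"{network}/{netmask}")
    return f"{net.network_address}_{net.prefixlen}"


print(nets_filename("10.20.0.0", "255.255.0.0"))
print(nets_filename("10.100.0.0", "255.255.255.0"))
```

Comparing the result against the contents of /tftpboot/xcat/xnba/nets/ is a quick way to see whether a file exists for the deployment network.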
>>>>       >
>>>>       > The prerequisite for booting genesis is that the node gets a
>>>>       > dynamic IP address. Did you configure the dynamic IP range for your
>>>>       > deployment network? Could you take a look at your syslog to see whether
>>>>       > the nodes have sent out DHCP requests and what your DHCP server replied
>>>>       > to them?
>>>>       >
>>>>       > Thanks
>>>>       > Best Regards
>>>>       >
>>>>       > ----------------------------------------------------------------------
>>>>       > Wang Xiaopeng (王晓朋)
>>>>       > IBM China System Technology Laboratory
>>>>       > Tel: 86-10-82453455
>>>>       > Email: w...@cn.ibm.com
>>>>       > Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
>>>>       > Haidian District Beijing P.R.China 100193
>>>>       >
>>>>       > From: Jonathan Mills <jonmi...@renci.org>
>>>>       > To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>>>>       > Date: 2014/01/19 06:24
>>>>       > Subject: [xcat-user] Frustrating time with sequential node discovery
>>>>       >
>>>>       >
>>>>       > ------------------------------------------------------------------------
>>>>       >
>>>>       >
>>>>       >
>>>>       > I'm running xCAT 2.8.3 and CentOS 6.4 atop Cisco UCS-C hardware.  I'm
>>>>       > attempting to do a sequential nodediscovery.  I've pre-populated the
>>>>       > nodelist table with the nodenames, so I shouldn't need to do anything
>>>>       > more than
>>>>       >
>>>>       > nodediscoverystart noderange=node[1-15]
>>>>       >
>>>>       > However, none of the nodes ever gets discovered.
>>>>       >
>>>>       > Digging deeper, it seems that none of them ever successfully PXE boot at
>>>>       > all.  They should be PXE booting off of the genesis netboot image and
>>>>       > speaking back to the xcatmaster, correct?
>>>>       >
>>>>       > When I run 'mknb x86_64', it populates /tftpboot/pxelinux.cfg with
>>>>       > entries pointing to non-existent netboot images.  Watch:
>>>>       >
>>>>       > [root@ncsu-hn ~]# rpm -qf /opt/xcat/sbin/mknb
>>>>       > xCAT-client-2.8.3-snap201311122316.noarch
>>>>       > [root@ncsu-hn ~]# mknb x86_64
>>>>       > Creating genesis.fs.x86_64.lzma in /tftpboot/xcat
>>>>       > [root@ncsu-hn ~]# cd /tftpboot/pxelinux.cfg/
>>>>       > [root@ncsu-hn pxelinux.cfg]# ls
>>>>       > 0A6400  0A6500  0A6600  7F  98300D  98300DE6  98300DE7  C0A86B
>>>>       > [root@ncsu-hn pxelinux.cfg]# cat *
>>>>       > DEFAULT xCAT
>>>>       >    LABEL xCAT
>>>>       >    KERNEL xcat/nbk.x86_64
>>>>       >    APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.100.0.1:3001
>>>>       > DEFAULT xCAT
>>>>       >    LABEL xCAT
>>>>       >    KERNEL xcat/nbk.x86_64
>>>>       >    APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.101.0.1:3001
>>>>       > DEFAULT xCAT
>>>>       >    LABEL xCAT
>>>>       >    KERNEL xcat/nbk.x86_64
>>>>       >    APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.102.0.1:3001
>>>>       > DEFAULT xCAT
>>>>       >    LABEL xCAT
>>>>       >    KERNEL xcat/nbk.x86_64
>>>>       >    APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=127.0.0.1:3001
>>>>       > DEFAULT xCAT
>>>>       >    LABEL xCAT
>>>>       >    KERNEL xcat/nbk.x86_64
>>>>       >    APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=152.48.13.3:3001
>>>>       > DEFAULT xCAT
>>>>       >    LABEL xCAT
>>>>       >    KERNEL xcat/nbk.x86_64
>>>>       >    APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=152.48.13.230:3001
>>>>       > DEFAULT xCAT
>>>>       >    LABEL xCAT
>>>>       >    KERNEL xcat/nbk.x86_64
>>>>       >    APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=152.48.13.231:3001
>>>>       > DEFAULT xCAT
>>>>       >    LABEL xCAT
>>>>       >    KERNEL xcat/nbk.x86_64
>>>>       >    APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=192.168.107.10:3001
>>>>       > [root@ncsu-hn pxelinux.cfg]# cd ../xcat/
>>>>       > [root@ncsu-hn xcat]# ls -la
>>>>       > total 21528
>>>>       > drwxr-xr-x  4 root root     4096 Jan 17 13:06 .
>>>>       > drwxr-xr-x. 7 root root     4096 Jan 18 22:02 ..
>>>>       > -rwxr-xr-x  1 root root   242929 Jan 15  2012 elilo-x64.efi
>>>>       > -rw-r--r--  1 root root 17573621 Jan 18 22:03 genesis.fs.x86_64.lzma
>>>>       > -rwxr-xr-x  1 root root  3986608 Aug  9 06:29 genesis.kernel.x86_64
>>>>       > drwxr-xr-x  3 root root     4096 Jan 17 13:06 osimage
>>>>       > drwxr-xr-x  3 root root     4096 Dec 23 07:42 xnba
>>>>       > -rw-r--r--  1 root root   139200 Oct 28 16:16 xnba.efi
>>>>       > -rw-r--r--  1 root root    74792 Oct 28 16:16 xnba.kpxe
>>>>       >
>>>>       >
>>>>       >
>>>>       > As you can see....it ought to be netbooting the genesis kernel, but
>>>>       > instead all my pxelinux.cfg/* files are instructing clients to boot the
>>>>       > non-existent "nbk.x86_64" image.
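
For context on the file names in that directory listing: PXELINUX looks for a config file named after the client's IPv4 address in uppercase hex, dropping one hex digit at a time, and finally falls back to "default" (it tries UUID- and MAC-based names before that, which this sketch leaves out). A minimal Python illustration; the function name pxelinux_ip_candidates is made up.

```python
import ipaddress


def pxelinux_ip_candidates(ip):
    """List the hex-IP config names PXELINUX tries for a client, most specific first."""
    hex_ip = f"{int(ipaddress.IPv4Address(ip)):08X}"
    return [hex_ip[:n] for n in range(8, 0, -1)] + ["default"]


# A client at 10.100.0.5 would match the file 0A6400 from the listing.
print(pxelinux_ip_candidates("10.100.0.5"))
```

This is why 0A6400 corresponds to 10.100.0.x, 98300DE6 to 152.48.13.230, and C0A86B to 192.168.107.x: one file per network or address that the headnode has an interface on.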
>>>>       >
>>>>       > Your advice is appreciated.
>>>>       >
>>>>       > --
>>>>       > Jonathan Mills
>>>>       > Systems Administrator
>>>>       > Renaissance Computing Institute
>>>>       > UNC-Chapel Hill
>>>>       >
>>>>       >
>>>>       > ------------------------------------------------------------------------------
>>>>       > CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>>>>       > Learn Why More Businesses Are Choosing CenturyLink Cloud For
>>>>       > Critical Workloads, Development Environments & Everything In Between.
>>>>       > Get a Quote or Start a Free Trial Today.
>>>>       > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>>>>       > _______________________________________________
>>>>       > xCAT-user mailing list
>>>>       > xCAT-user@lists.sourceforge.net
>>>>       > https://lists.sourceforge.net/lists/listinfo/xcat-user
>>>>       >
>>>>       >
>>>>
>>>>      --
>>>>      Jonathan Mills
>>>>      Systems Administrator
>>>>      Renaissance Computing Institute
>>>>      UNC-Chapel Hill
>>>>
>>>>
>>> --
>>> Jonathan Mills
>>> Systems Administrator
>>> Renaissance Computing Institute
>>> UNC-Chapel Hill
>>>
>
>
