Thanks for the help! This is the last thing I have to get working to
complete this engagement!

I ran xdsh service -K, but no hostkeys were copied. The /etc/xcat/hostkeys
directory exists on the MN, and noderes.xcatmaster and
noderes.servicenode are both the IP of the MN. I verified the root password
on both nodes by logging in through the console. The xdsh command came back
with:

/usr/bin/ssh setup is complete
return code = 0

Running xdsh service date and xdsh service tabdump site both return the
correct results.

sn01: rpm -qa | grep xCAT
xCAT-nbroot-oss-x86-2.0-snap200804021050
xCAT-nbroot-oss-ppc64-2.0-snap200801291320
perl-xCAT-2.5.2-snap201103041120
xCAT-nbkernel-x86-2.6.18_164-8
xCAT-nbroot-oss-x86_64-2.0-snap200801291344
xCAT-nbroot-core-x86_64-2.5.1-snap201011121008
xCAT-nbroot-core-x86-2.5.1-snap201011121008
xCAT-nbroot-core-ppc64-2.5.1-snap201011121008
xCAT-client-2.5.2-snap201102251840
xCAT-nbkernel-x86_64-2.6.18_164-8
xCAT-nbkernel-ppc64-2.6.18_92-4
xCATsn-2.5.1-snap201011101325
xCAT-server-2.5.2-snap201103041121
xCATsn-2.5.1-snap201011101325

sn02: rpm -qa | grep xCAT
xCAT-nbroot-oss-x86-2.0-snap200804021050
xCAT-nbroot-oss-ppc64-2.0-snap200801291320
perl-xCAT-2.5.2-snap201103041120
xCAT-nbkernel-x86-2.6.18_164-8
xCAT-nbroot-oss-x86_64-2.0-snap200801291344
xCAT-nbroot-core-x86_64-2.5.1-snap201011121008
xCAT-nbroot-core-x86-2.5.1-snap201011121008
xCAT-nbroot-core-ppc64-2.5.1-snap201011121008
xCAT-client-2.5.2-snap201102251840
xCAT-nbkernel-x86_64-2.6.18_164-8
xCAT-nbkernel-ppc64-2.6.18_92-4
xCATsn-2.5.1-snap201011101325
xCAT-server-2.5.2-snap201103041121
xCATsn-2.5.1-snap201011101325
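
A minimal Python sketch for summarizing a listing like the ones above. It assumes each line has the usual name-version-release shape and that the Management Node meta-package is named exactly "xCAT"; the function names are made up for illustration. It reports whether xCATsn and xCAT are present, which names are listed more than once, and the versions of a few core packages. Whether a duplicate line or a 2.5.1/2.5.2 mix matters for this problem is not established here.

```python
from collections import Counter

# Listing pasted from sn01 above (sn02 is identical).
SN_PACKAGES = """
xCAT-nbroot-oss-x86-2.0-snap200804021050
xCAT-nbroot-oss-ppc64-2.0-snap200801291320
perl-xCAT-2.5.2-snap201103041120
xCAT-nbkernel-x86-2.6.18_164-8
xCAT-nbroot-oss-x86_64-2.0-snap200801291344
xCAT-nbroot-core-x86_64-2.5.1-snap201011121008
xCAT-nbroot-core-x86-2.5.1-snap201011121008
xCAT-nbroot-core-ppc64-2.5.1-snap201011121008
xCAT-client-2.5.2-snap201102251840
xCAT-nbkernel-x86_64-2.6.18_164-8
xCAT-nbkernel-ppc64-2.6.18_92-4
xCATsn-2.5.1-snap201011101325
xCAT-server-2.5.2-snap201103041121
xCATsn-2.5.1-snap201011101325
""".splitlines()


def parse_rpm_names(lines):
    """Split 'name-version-release' strings into (name, version, release)."""
    parsed = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The last two dash-separated fields are version and release.
        name, version, release = line.rsplit("-", 2)
        parsed.append((name, version, release))
    return parsed


def summarize(lines):
    """Report meta-package presence, repeated names, and core versions."""
    pkgs = parse_rpm_names(lines)
    counts = Counter(name for name, _, _ in pkgs)
    versions = {name: version for name, version, _ in pkgs}
    core = ("perl-xCAT", "xCAT-client", "xCAT-server", "xCATsn")
    return {
        "has_xCATsn": "xCATsn" in counts,
        "has_xCAT": "xCAT" in counts,  # assumed name of the MN meta-package
        "listed_more_than_once": sorted(n for n, c in counts.items() if c > 1),
        "core_versions": {n: versions[n] for n in core if n in versions},
    }


report = summarize(SN_PACKAGES)
for key, value in report.items():
    print(key, "=", value)
```

The same function could be fed the output of xdsh service "rpm -qa | grep xCAT" after stripping the node-name prefix from each line.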

As for the DHCP, I gathered from an earlier email from Linda Mellor on the
recent thread on SNs:

"And the network possibilities with all of this can start to be
mind-boggling, but we try to address the most common ones as best we can:
      - the entire cluster on one flat network. This means there will be
      multiple DHCP servers (and tftp servers), and xCAT needs to configure
      any DHCP server to respond correctly to a broadcast request on the
      network, so all dhcpd.leases files will need to be identical, with
      the "next-server" value set to the designated tftpserver for a given
      node."

When I run nodeset node001 install, the tftpboot/pxelinux.cfg/node001 file
shows sn02 as the host it pulls its files from. Also, in /install/autoinst
on the MN and both SNs, the kickstart file shows url=http://sn02/... and
all of that appears to work. When I manually run nodeset node001 boot and
the node boots up to the OS, I can see that DNS is working properly. Could
the problem be that DNS is supposed to be configured on the SNs (I have
servicenode.nameserver=1) but is not, even though site.nameservers shows
the IP of the xCAT MN (I edited that out of my earlier email)? Should
site.nameservers list the SN IPs as well for DNS to work? What about
site.dhcpinterfaces? Will adding the SNs there cause makedhcp to build
working DHCP configs on the SNs?
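
For comparing these settings between the MN and the SNs, a minimal Python sketch that turns tabdump site output into a dictionary. The sample text is the site table from my original message; parse_tabdump is a made-up helper name, and the sketch only reads values. It does not say what the correct values should be.

```python
import csv
import io

# Output of "tabdump site" as posted in the original message.
TABDUMP_SITE = '''#key,value,comments,disable
"xcatdport","3001",,
"xcatiport","3002",,
"tftpdir","/tftpboot",,
"master","mn01",,
"domain","cluster.net",,
"installdir","/install",,
"timezone","America/Chicago",,
"forwarders","XXX",,
"dhcpinterfaces","bond0",,
"ntpservers","mn01",,
"consoleondemand","yes",,
"sharedtftp","0",,
"nameservers","mn01",,
"installloc","/install",,
'''


def parse_tabdump(text):
    """Return {key: value} from tabdump-style CSV with a '#'-prefixed header."""
    lines = text.strip().splitlines()
    header = lines[0].lstrip("#").split(",")
    body = io.StringIO("\n".join(lines[1:]))
    reader = csv.DictReader(body, fieldnames=header)
    return {row["key"]: row["value"] for row in reader}


site = parse_tabdump(TABDUMP_SITE)
for key in ("master", "nameservers", "dhcpinterfaces", "sharedtftp"):
    print(f"site.{key} = {site.get(key)}")
```

Running the same parse on the output of xdsh service tabdump site (with the node-name prefix removed) would show whether each SN sees the same values as the MN.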

Christian D. Caruthers
Linux HPC Consultant
STG Lab Services
757-656-9675


                                                                                
                                                                   
From:       Lissa Valletta/Poughkeepsie/IBM@IBMUS
To:         xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:         xcat-user@lists.sourceforge.net
Date:       07/21/2011 01:26 PM
Subject:    Re: [xcat-user] service nodes node working correctly





So first I think we need to make sure your /etc/xcat/hostkeys are set up
correctly on the Service Nodes. That should have been done during the
install of the Service Nodes. Not sure what happened there. You can run
xdsh <service> -K and that will correctly set up all the credentials and
keys on the Service Nodes, assuming service is your servicenode group.
You will be prompted for root's password.

A couple of things to check after this: can you ssh to the service node
with no password prompt, and can you access the database from the
service node?
  xdsh sn01,sn02 date
  xdsh sn01,sn02 tabdump site

One other thing: run rpm -qa | grep xCAT on each service node and send back
the output. Just checking that the xCATsn-* rpm is on the service nodes, not
the xCAT-* rpm, which is for the Management Node. This has happened
before.

It looks like you are set up to have a dhcp server running on both service
nodes. Isn't this bad on a flat network? I thought we could only have
one dhcp server. This is not my strong area.

Having the install directory mounted is typical.

Looking at your definition of node0001, you have the servicenode name the
same as xcatmaster. The servicenode is supposed to be the IP address as
known by the management node. The xcatmaster is supposed to be the
servicenode address as known by node0001. Is it correct for them to be the
same? Also, will the xcatmaster name be resolved correctly during install?
servicenode=sn02
xcatmaster=sn02

Lissa K. Valletta
2-3/T12
Poughkeepsie, NY 12601
(tie 293) 433-3102





From:            Christian Caruthers/Richmond/IBM@IBMUS
To:              xcat-user@lists.sourceforge.net
Date:            07/21/2011 10:50 AM
Subject:                 [xcat-user] service nodes node working correctly




I have a number of compute nodes, 2 service nodes and an xCAT MN (xCAT
2.5.2) on a flat network. I would like to have all available services
running off the service nodes, but I am running into some problems.

When a compute node PXE boots for stateful install (rhel 6.1) it gets a
DHCP response from the xCAT MN rather than the SNs. Looking on the SNs, I
see an empty dhcpd.leases file. Running makedhcp doesn't resolve this.

After the DHCP response, the node pulls its boot image and installation
from the correct SN, but it fails in updating its status and reinstalls
after rebooting. If I run nodeset node boot from the MN, some of the
postscripts don't appear to run correctly. For example, remoteshell doesn't
run, and when I run it using updatenode from the MN, I get an error:

<error>Unable to read private DSA key from /etc/xcat/hostkeys</error>
<error>Unable to read private RSA key from /etc/xcat/hostkeys</error>

Looking on the SNs, I don't see any /etc/xcat/hostkeys directory. What's
supposed to set this up?

On sharing the /install directory: currently, my SNs are configured to
NFS-mount the /install directory from the MN on boot. Is this correct, or
should they be syncing that directory? I may have missed it, but the wiki
page was unclear to me on this.

Finally, looking on the node that was installed by the SN, I see syslog is
configured to log to the SN, but I don't see that happening.

nodels sn01 servicenode
qservice01: servicenode.dhcpserver: 1
qservice01: servicenode.tftpserver: 1
qservice01: servicenode.node: sn01
qservice01: servicenode.nameserver: 1
qservice01: servicenode.nimserver: 1
qservice01: servicenode.ftpserver: 1
qservice01: servicenode.conserver: 1
qservice01: servicenode.monserver: 1
qservice01: servicenode.nfsserver: 1
qservice01: servicenode.comments:
qservice01: servicenode.ldapserver:
qservice01: servicenode.ntpserver:
qservice01: servicenode.ipforward:
qservice01: servicenode.disable:

tabdump site
#key,value,comments,disable
"xcatdport","3001",,
"xcatiport","3002",,
"tftpdir","/tftpboot",,
"master","mn01",,
"domain","cluster.net",,
"installdir","/install",,
"timezone","America/Chicago",,
"forwarders","XXX",,
"dhcpinterfaces","bond0",,
"ntpservers","mn01",,
"consoleondemand","yes",,
"sharedtftp","0",,
"nameservers","mn01",,
"installloc","/install",,

nodels node0001 noderes
qgpu0001: noderes.primarynic: eth0
qgpu0001: noderes.xcatmaster: sn02
qgpu0001: noderes.installnic: eth0
qgpu0001: noderes.netboot: pxe
qgpu0001: noderes.servicenode: sn02
qgpu0001: noderes.node: node0001
qgpu0001: noderes.nfsserver: sn02
qgpu0001: noderes.tftpserver:
qgpu0001: noderes.comments:
qgpu0001: noderes.nfsdir:
qgpu0001: noderes.disable:
qgpu0001: noderes.discoverynics:
qgpu0001: noderes.nimserver:
qgpu0001: noderes.cmdinterface:
qgpu0001: noderes.next_osimage:
qgpu0001: noderes.current_osimage:
qgpu0001: noderes.monserver:
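
A minimal Python sketch for pulling individual attributes out of nodels output like the above, so that noderes.servicenode and noderes.xcatmaster can be compared side by side. parse_nodels is a made-up helper name, and the sketch assumes every line has the form "node: table.attribute: value". It only reports whether the two values match; it does not decide whether they should.

```python
# A few lines of the "nodels node0001 noderes" output shown above.
NODELS_OUTPUT = """
qgpu0001: noderes.primarynic: eth0
qgpu0001: noderes.xcatmaster: sn02
qgpu0001: noderes.installnic: eth0
qgpu0001: noderes.netboot: pxe
qgpu0001: noderes.servicenode: sn02
qgpu0001: noderes.nfsserver: sn02
qgpu0001: noderes.tftpserver:
"""


def parse_nodels(text):
    """Return {node: {attribute: value}} from 'node: table.attr: value' lines."""
    attrs = {}
    for line in text.strip().splitlines():
        # Split on the first two colons only, so values may contain colons.
        node, key, value = (part.strip() for part in line.split(":", 2))
        attribute = key.split(".", 1)[1]  # drop the table-name prefix
        attrs.setdefault(node, {})[attribute] = value
    return attrs


res = parse_nodels(NODELS_OUTPUT)["qgpu0001"]
same = res["servicenode"] == res["xcatmaster"]
print("servicenode =", res["servicenode"])
print("xcatmaster  =", res["xcatmaster"])
print("identical   =", same)
```

Empty attributes such as noderes.tftpserver come back as empty strings, which makes unset values easy to spot.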

lsdef sn02

Object name: sn02
    arch=x86_64
    bmc=sn02-bmc
    bmcport=0
    currchain=boot
    currstate=boot
    groups=service,ipmi,bnt102-service,x3650m2,all
    initrd=xcat/rhels5.4/x86_64/initrd.img
    installnic=eth0
    interface=eth0
    ip=XXXXXX
    kcmdline=nofb utf8 ks=http://mn01/install/autoinst/qservice02 ksdevice=eth0 console=tty0 console=ttyS0,115200 noipv6
    kernel=xcat/rhels5.4/x86_64/vmlinuz
    mac=E4:1F:13:44:F5:9C
    mgt=ipmi
    mtm=7945AC1
    netboot=pxe
    nfsserver=mn01
    os=rhels5.4
    postbootscripts=otherpkgs,setupntp,setupntp
    postscripts=syslog,remoteshell,syncfiles,nwu.service,servicenode,xcatserver,xcatclient
    primarynic=eth0
    profile=service
    provmethod=install
    serial= 06GA470
    serialport=0
    serialspeed=115200
    servicenode=mn01
    setupconserver=1
    setupdhcp=1
    setupftp=1
    setupnameserver=1
    setupnfs=1
    setupnim=1
    setuptftp=1
    status=booting
    statustime=07-20-2011 16:25:39
    switch=bnt102
    switchport=8
    tftpserver=mn01
    xcatmaster=mn01

lsdef node0001

Object name: node0001
    arch=x86_64
    bmc=node0001-bmc
    bmcport=0
    chain=runcmd=bmcsetup,standby
    currchain=boot
    currstate=boot
    groups=gpu,ipmi,dx360m3,gpubnt01,gpurack01,all,allgpu
    initrd=xcat/rhels6.1/x86_64/initrd.img
    installnic=eth0
    interface=eth0
    ip=XXXXXX
    kcmdline=nofb utf8 ks=http://sn02/install/autoinst/qgpu0001 ksdevice=eth0 console=tty0 console=ttyS0,115200n8r noipv6
    kernel=xcat/rhels6.1/x86_64/vmlinuz
    mac=e4:1f:13:f0:80:9c
    mgt=ipmi
    mtm=6391AC1
    netboot=pxe
    nfsserver=sn02
    ondiscover=nodediscover
    os=rhels6.1
    postbootscripts=otherpkgs,setupntp,nwu.ipoib
    postscripts=syslog,remoteshell,syncfiles,nwu.ofed
    primarynic=eth0
    profile=gpu
    provmethod=install
    serial=06CGM96
    serialflow=hard
    serialport=0
    serialspeed=115200
    servicenode=sn02
    status=booted
    statustime=07-20-2011 22:23:36
    supportedarchs=x86,x86_64
    switch=gpubnt01
    switchinterface=eth0
    switchport=1
    switchvlan=1
    xcatmaster=sn02

Christian D. Caruthers
Linux HPC Consultant
STG Lab Services
757-656-9675


------------------------------------------------------------------------------


5 Ways to Improve & Secure Unified Communications
Unified Communications promises greater efficiencies for business. UC can
improve internal communications as well as offer faster, more efficient
ways
to interact with customers and streamline customer service. Learn more!
http://www.accelacomm.com/jaw/sfnl/114/51426253/
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user


