First, I think we need to make sure /etc/xcat/hostkeys is set up correctly
on the Service Nodes.  That should have been done during the install of the
Service Nodes; I am not sure what happened there.  You can run
  xdsh <service> -K
and that will correctly set up all the credentials and keys on the Service
Nodes, assuming service is your servicenode group.  You will be prompted
for root's password.

A couple of things to check after this: can you ssh to the service nodes
with no password prompt, and can you access the database from the service
nodes?
  xdsh sn01,sn02 date
  xdsh sn01,sn02 tabdump site

One other thing: run rpm -qa | grep xCAT on each service node and send the
output.  I am just checking that the xCATsn-* rpm is on the service nodes,
not the xCAT-* rpm, which is for the Management Node.  This has happened
before.
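
To make that check less error-prone, here is a minimal sketch of a shell
helper that reads rpm -qa output on stdin and reports which meta-package it
sees.  The name classify_xcat_rpms is made up for illustration, and it
assumes the meta-packages appear as xCATsn-<version> and xCAT-<version>,
so other packages whose names merely start with xCAT- are ignored.

```shell
# classify_xcat_rpms: hypothetical helper, not part of xCAT.
# Reads `rpm -qa` output on stdin and prints one of:
#   service-node, management-node, both, none
# Assumption: the meta-package name is followed directly by a version
# that starts with a digit (e.g. xCATsn-2.5.2-...).
classify_xcat_rpms() {
    awk '
        /^xCATsn-[0-9]/ { sn = 1 }   # service node meta-package
        /^xCAT-[0-9]/   { mn = 1 }   # management node meta-package
        END {
            if (sn && mn) print "both"
            else if (sn)  print "service-node"
            else if (mn)  print "management-node"
            else          print "none"
        }
    '
}
```

Run it on each service node as rpm -qa | classify_xcat_rpms; anything
other than service-node is worth a closer look.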

It looks like you are set up to have a DHCP server running on both service
nodes.  Isn't this bad on a flat network?  I thought we could only have
one DHCP server.  This is not my strong area.

Having the install directory mounted is typical.

Looking at your definition of node0001, you have servicenode set to the
same name as xcatmaster.  The servicenode is supposed to be the service
node's address as known by the management node.  The xcatmaster is supposed
to be the service node's address as known by node0001.  Is it correct for
them to be the same?  Also, will the xcatmaster name resolve correctly
during install?
servicenode=sn02
xcatmaster=sn02
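
If it helps to compare the two values side by side, here is a small sketch
that pulls a single attribute out of lsdef output.  lsdef_attr is a
hypothetical helper name, not an xCAT command, and it assumes the
attr=value layout shown in your lsdef listing.

```shell
# lsdef_attr: hypothetical helper, not part of xCAT.
# Usage: lsdef <node> | lsdef_attr <attribute>
# Prints the value of the first line of the form "<attribute>=<value>".
lsdef_attr() {
    awk -F= -v key="$1" '
        {
            name = $1
            gsub(/^[ \t]+/, "", name)    # drop the leading indentation
        }
        name == key {
            sub(/^[^=]*=/, "")           # strip "attr=", keep any later "="
            print
            exit
        }
    '
}
```

For example, lsdef node0001 | lsdef_attr xcatmaster should print sn02 for
the definition you posted.  Running getent hosts sn02 on the installed node
would then show whether that name resolves there.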

Lissa K. Valletta
2-3/T12
Poughkeepsie, NY 12601
(tie 293) 433-3102





From:   Christian Caruthers/Richmond/IBM@IBMUS
To:     xcat-user@lists.sourceforge.net
Date:   07/21/2011 10:50 AM
Subject:        [xcat-user] service nodes node working correctly




I have a number of compute nodes, 2 service nodes and an xCAT MN (xCAT
2.5.2) on a flat network. I would like to have all available services
running off the service nodes, but I am running into some problems.

When a compute node PXE boots for stateful install (rhel 6.1) it gets a
DHCP response from the xCAT MN rather than the SNs. Looking on the SNs, I
see an empty dhcpd.leases file. Running makedhcp doesn't resolve this.

After the DHCP response, the node pulls its boot image and installation
from the correct SN, but it fails in updating its status and reinstalls
after rebooting. If I run nodeset node boot from the MN, some of the
postscripts don't appear to run correctly. For example, remoteshell doesn't
run, and when I run it using updatenode from the MN, I get an error:

<error>Unable to read private DSA key from /etc/xcat/hostkeys</error>
<error>Unable to read private RSA key from /etc/xcat/hostkeys</error>

Looking on the SNs, I don't see any /etc/xcat/hostkeys directory. What's
supposed to set this up?

Sharing the /install directory. Currently, my SNs are configured to
NFS-mount the /install directory from the MN on boot. Is this correct or
should they be syncing that directory? I may have missed it, but the wiki
page was unclear on this to me.

Finally, looking on the node that was installed by the SN, I see syslog is
configured to log to the SN, but I don't see that happening.

nodels sn01 servicenode
qservice01: servicenode.dhcpserver: 1
qservice01: servicenode.tftpserver: 1
qservice01: servicenode.node: sn01
qservice01: servicenode.nameserver: 1
qservice01: servicenode.nimserver: 1
qservice01: servicenode.ftpserver: 1
qservice01: servicenode.conserver: 1
qservice01: servicenode.monserver: 1
qservice01: servicenode.nfsserver: 1
qservice01: servicenode.comments:
qservice01: servicenode.ldapserver:
qservice01: servicenode.ntpserver:
qservice01: servicenode.ipforward:
qservice01: servicenode.disable:

tabdump site
#key,value,comments,disable
"xcatdport","3001",,
"xcatiport","3002",,
"tftpdir","/tftpboot",,
"master","mn01",,
"domain","cluster.net",,
"installdir","/install",,
"timezone","America/Chicago",,
"forwarders","XXX",,
"dhcpinterfaces","bond0",,
"ntpservers","mn01",,
"consoleondemand","yes",,
"sharedtftp","0",,
"nameservers","mn01",,
"installloc","/install",,

nodels node0001 noderes
qgpu0001: noderes.primarynic: eth0
qgpu0001: noderes.xcatmaster: sn02
qgpu0001: noderes.installnic: eth0
qgpu0001: noderes.netboot: pxe
qgpu0001: noderes.servicenode: sn02
qgpu0001: noderes.node: node0001
qgpu0001: noderes.nfsserver: sn02
qgpu0001: noderes.tftpserver:
qgpu0001: noderes.comments:
qgpu0001: noderes.nfsdir:
qgpu0001: noderes.disable:
qgpu0001: noderes.discoverynics:
qgpu0001: noderes.nimserver:
qgpu0001: noderes.cmdinterface:
qgpu0001: noderes.next_osimage:
qgpu0001: noderes.current_osimage:
qgpu0001: noderes.monserver:

lsdef sn02

Object name: sn02
    arch=x86_64
    bmc=sn02-bmc
    bmcport=0
    currchain=boot
    currstate=boot
    groups=service,ipmi,bnt102-service,x3650m2,all
    initrd=xcat/rhels5.4/x86_64/initrd.img
    installnic=eth0
    interface=eth0
    ip=XXXXXX
    kcmdline=nofb utf8 ks=http://mn01/install/autoinst/qservice02 ksdevice=eth0 console=tty0 console=ttyS0,115200 noipv6
    kernel=xcat/rhels5.4/x86_64/vmlinuz
    mac=E4:1F:13:44:F5:9C
    mgt=ipmi
    mtm=7945AC1
    netboot=pxe
    nfsserver=mn01
    os=rhels5.4
    postbootscripts=otherpkgs,setupntp,setupntp
    postscripts=syslog,remoteshell,syncfiles,nwu.service,servicenode,xcatserver,xcatclient
    primarynic=eth0
    profile=service
    provmethod=install
    serial= 06GA470
    serialport=0
    serialspeed=115200
    servicenode=mn01
    setupconserver=1
    setupdhcp=1
    setupftp=1
    setupnameserver=1
    setupnfs=1
    setupnim=1
    setuptftp=1
    status=booting
    statustime=07-20-2011 16:25:39
    switch=bnt102
    switchport=8
    tftpserver=mn01
    xcatmaster=mn01

lsdef node0001

Object name: node0001
    arch=x86_64
    bmc=node0001-bmc
    bmcport=0
    chain=runcmd=bmcsetup,standby
    currchain=boot
    currstate=boot
    groups=gpu,ipmi,dx360m3,gpubnt01,gpurack01,all,allgpu
    initrd=xcat/rhels6.1/x86_64/initrd.img
    installnic=eth0
    interface=eth0
    ip=XXXXXX
    kcmdline=nofb utf8 ks=http://sn02/install/autoinst/qgpu0001 ksdevice=eth0 console=tty0 console=ttyS0,115200n8r noipv6
    kernel=xcat/rhels6.1/x86_64/vmlinuz
    mac=e4:1f:13:f0:80:9c
    mgt=ipmi
    mtm=6391AC1
    netboot=pxe
    nfsserver=sn02
    ondiscover=nodediscover
    os=rhels6.1
    postbootscripts=otherpkgs,setupntp,nwu.ipoib
    postscripts=syslog,remoteshell,syncfiles,nwu.ofed
    primarynic=eth0
    profile=gpu
    provmethod=install
    serial=06CGM96
    serialflow=hard
    serialport=0
    serialspeed=115200
    servicenode=sn02
    status=booted
    statustime=07-20-2011 22:23:36
    supportedarchs=x86,x86_64
    switch=gpubnt01
    switchinterface=eth0
    switchport=1
    switchvlan=1
    xcatmaster=sn02

Christian D. Caruthers
Linux HPC Consultant
STG Lab Services
757-656-9675


------------------------------------------------------------------------------

5 Ways to Improve & Secure Unified Communications
Unified Communications promises greater efficiencies for business. UC can
improve internal communications as well as offer faster, more efficient
ways
to interact with customers and streamline customer service. Learn more!
http://www.accelacomm.com/jaw/sfnl/114/51426253/
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user


