I went through and restored every updated file in /install/postscripts
to its previous version, and now my node boots again. One of the updated
scripts probably depends on something in the newer binaries, and the
binaries were not updated. Thanks for the debugging tips.

 — ddj
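
For anyone who lands in the same situation, here is a minimal sketch of how the changed files could be identified before restoring them. It assumes a known-good copy of the postscripts is available, for example the /xcatpost directory from a node provisioned before go-xcat ran (as suggested earlier in this thread), copied to a scratch directory on the management node. The function name and the scratch path in the usage line are hypothetical, and this is not necessarily how the restore above was done.

```shell
#!/bin/sh
# Sketch: report files that differ between a known-good copy of the
# postscripts and the live tree. Nothing is modified; diff only reads.

list_changed_postscripts() {
    ref_dir=$1    # known-good copy (assumption: taken from a node's /xcatpost)
    live_dir=$2   # tree to check, e.g. /install/postscripts

    status=0
    # -r recurses into subdirectories, -q prints only which files differ
    # or exist on one side only.
    diff -rq "$ref_dir" "$live_dir" || status=$?

    # diff exits 0 when the trees match, 1 when differences were found,
    # and 2 or more when diff itself failed (for example a missing directory).
    [ "$status" -le 1 ]
}
```

Example call, with /tmp/xcatpost.good standing in for wherever the known-good copy was placed:

    list_changed_postscripts /tmp/xcatpost.good /install/postscripts

Lines starting with "Files ... differ" are candidates for restoring; lines starting with "Only in" are files present on one side only, such as site-specific scripts.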

> On Jul 2, 2018, at 3:24 PM, David Johnson <david_john...@brown.edu> wrote:
> 
> .
> .
> .
> Running command on mgt5.oscar.ccv.brown.edu: chmod -R a+r /install/postscripts 2>&1
> 
>   mgt5.oscar.ccv.brown.edu: Internal call command: xdsh node552 --nodestatus -s -v -e /install/postscripts/xcatdsklspost 1 -m 172.20.0.6 'setupntp' --tftp /tftpboot --installdir /install --nfsv4 no -c -V
> Running command on mgt5.oscar.ccv.brown.edu: /bin/hostname 2>&1
> Running command on mgt5.oscar.ccv.brown.edu: ip -4 --oneline addr show |awk -F ' ' '{print $4}'|awk -F '/' '{print $1}' 2>&1
> Running command on mgt5.oscar.ccv.brown.edu: hostname 2>&1
> Running command on mgt5.oscar.ccv.brown.edu: /opt/xcat/bin/pping node552 2>&1
> node552: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
> 
> Error: node552 remote shell had error code: 255
> [root@mgt5 xcat]# 
> 
> 
>> On Jul 2, 2018, at 3:20 PM, David Johnson <david_john...@brown.edu> wrote:
>> 
>> After the chdef command and a reboot, the message in /var/log/xcat/xcat.log 
>> is exactly the same:
>> [root@node552 xcat]# cat xcat.log 
>> Mon Jul  2 15:18:10 EDT 2018 [info]: xcat.xcatdsklspost: trying to download postscripts...
>> Mon Jul  2 15:18:26 EDT 2018 [err]: xcat.xcatdsklspost: failed to download the postscripts from the xCAT server for node node552.oscar.ccv.brown.edu
>> 
>> [root@mgt5 xcat]# tabdump site | grep debug
>> "xcatdebugmode","1",,
>> [root@mgt5 xcat]# 
>> 
>> 
>>> On Jul 2, 2018, at 3:13 PM, Casandra H Qiu <cxh...@us.ibm.com> wrote:
>>> 
>>> updatenode issues the command from the management node and runs the 
>>> postscripts on the compute node after the node has booted. 
>>> The postscripts in /xcatpost/ on the compute node will be downloaded again 
>>> from MN:/install/postscripts (not sure if you want to do this).
>>> 
>>> ]# updatenode mid21tor24cn01 setupntp -V
>>> [boston01]: Running command on boston01: ip -4 --oneline addr show |awk -F ' ' '{print $4}'|awk -F '/' '{print $1}' 2>&1
>>> 
>>> [boston01]: Running command on boston01: chmod -R a+r /install/postscripts 2>&1
>>> 
>>> [boston01]: boston01: Internal call command: xdsh mid21tor24cn01 --nodestatus -s -v -e /install/postscripts/xcatdsklspost 1 -m 172.16.37.1 'setupntp' --tftp /tftpboot --installdir /install --nfsv4 no -c -V
>>> [boston01]: Running command on boston01: ip -4 --oneline addr show |awk -F ' ' '{print $4}'|awk -F '/' '{print $1}' 2>&1
>>> [boston01]: Running command on boston01: hostname 2>&1
>>> [boston01]: Running command on boston01: /opt/xcat/bin/pping mid21tor24cn01 2>&1
>>> [boston01]: mid21tor24cn01: Running /tmp/filez8tn6J.dsh 1 -m 172.16.37.1 setupntp --tftp /tftpboot --installdir /install --nfsv4 no -c -V
>>> [boston01]: mid21tor24cn01: trying to download postscripts...
>>> [boston01]: mid21tor24cn01: trying to download postscripts from http://172.16.37.1/install/postscripts/
>>> [boston01]: mid21tor24cn01: postscripts are downloaded from 172.16.37.1 successfully.
>>> [boston01]: mid21tor24cn01: postscripts downloaded successfully
>>> [boston01]: mid21tor24cn01: trying to get mypostscript from 172.16.37.1...
>>> [boston01]: mid21tor24cn01: trying to download http://172.16.37.1/tftpboot/mypostscripts/mypostscript.mid21tor24cn01...
>>> [boston01]: mid21tor24cn01: mypostscript.mid21tor24cn01 is downloaded successfully.
>>> [boston01]: mid21tor24cn01: Running //xcatpost/mypostscript
>>> [boston01]: mid21tor24cn01: Mon Jul 2 14:39:39 EDT 2018 Running postscript: setupntp
>>> [boston01]: mid21tor24cn01: Failed to set time zone: Invalid time zone 'America/New_York'
>>> [boston01]: mid21tor24cn01: inactive
>>> [boston01]: mid21tor24cn01: syncing the clock ...
>>> [boston01]: mid21tor24cn01: WARNING: NTP Sync Failed before timeout. ntp server will try to sync...
>>> [boston01]: mid21tor24cn01: Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.
>>> [boston01]: mid21tor24cn01: postscript: setupntp exited with code 0
>>> [boston01]: mid21tor24cn01: //xcatpost/mypostscript return with 0
>>> [boston01]: mid21tor24cn01: Running of postscripts has completed.
>>> 
>>> The -V option shows more debug information; hopefully we can see why it 
>>> failed to download the postscripts. 
>>> 
>>> By the way, after you rebooted the node, did you see any error messages in 
>>> /var/log/xcat/xcat.log?
>>> 
>>> 
>>> ...................................................................
>>> Casandra Hong Qiu
>>> Phone: (845) 433-9291, t/l 293-9291
>>> Office: Building 8, 3-B-04
>>> cxh...@us.ibm.com
>>> 
>>> 
>>> 
>>> 
>>> From: David Johnson <david_john...@brown.edu>
>>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>>> Date: 07/02/2018 02:41 PM
>>> Subject: Re: [xcat-user] go-xcat accidentally run with /install mounted from production xcat master
>>> 
>>> 
>>> 
>>> 
>>> I have never used updatenode. Just reboot. Would that help debug?
>>> On Jul 2, 2018, at 2:16 PM, Casandra H Qiu <cxh...@us.ibm.com> wrote:
>>> In the site table, turn on "xcatdebugmode":
>>> 
>>> chdef -t site xcatdebugmode=1
>>> 
>>> Did you try running "updatenode" again for the current compute node? Add 
>>> the verbose option "-V"; that should produce more logs.
>>> 
>>> 
>>> Thanks,
>>> Casandra Qiu
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: David Johnson <david_john...@brown.edu>
>>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>>> Date: 07/02/2018 11:37 AM
>>> Subject: Re: [xcat-user] go-xcat accidentally run with /install mounted from production xcat master
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Thanks, it’s good to know I’ve got a hundred or so copies with changes back 
>>> to 9/2017 on our compute nodes.
>>> However, even after restoring postscripts/hostkeys, postscripts/_xcat and 
>>> postscripts/_ssh, and checking that the files we created are still there, 
>>> the diskless boot still fails:
>>> Mon Jul 2 10:27:35 EDT 2018 [info]: xcat.xcatdsklspost: trying to download postscripts...
>>> Mon Jul 2 10:28:14 EDT 2018 [err]: xcat.xcatdsklspost: failed to download the postscripts from the xCAT server for node node552.oscar.ccv.brown.edu
>>> 
>>> 
>>> I will try restoring all the changed scripts, but would like to know at 
>>> what point it decided it was a failure.
>>> All the /xcatboot files seem to be there. I tried prepopulating the 
>>> /etc/ssh directory in the image, but that didn’t help.
>>> 
>>> Where do I turn up the debugging?
>>> 
>>> Thanks!
>>> ― ddj
>>> On Jul 2, 2018, at 5:29 AM, Song BJ Yang <yang...@cn.ibm.com> wrote:
>>> 
>>> Hi David Johnson,
>>> 
>>> In the scenario you described, I think the xCAT installation will affect the 
>>> contents of `/install/postscripts` in two ways:
>>> 1) files that have the same name as files shipped by xCAT under 
>>> `/install/postscripts` will be overwritten 
>>> 2) the credentials under "/install/postscripts/hostkeys", 
>>> "/install/postscripts/_xcat", "/install/postscripts/_ssh/" and 
>>> "/install/postscripts/ca/" will be overwritten
>>> 
>>> You can restore the original "/install/postscripts" directory from the 
>>> "/xcatpost" directory on compute nodes (non-hierarchical cluster) or 
>>> service nodes (hierarchical cluster) that have not been provisioned since 
>>> xCAT was accidentally reinstalled with `go-xcat`.
>>> 
>>> 
>>> Good luck 
>>> ------------------------------------------------------------------------------
>>> YANG Song (杨嵩)
>>> IBM China System Technology Laboratory
>>> Tel: 86-10-82452903
>>> Email: yang...@cn.ibm.com
>>> Address: Building 28, ZhongGuanCun Software Park,
>>> No.8, Dong Bei Wang West Road, Haidian District Beijing 100193, PRC
>>> 
>>> 北京市海淀区东北旺西路8号中关村软件园28号楼
>>> 邮编: 100193
>>> 
>>> 
>>> ----- Original message -----
>>> From: "Victor Hu" <v...@us.ibm.com>
>>> To: xcat-user@lists.sourceforge.net
>>> Cc:
>>> Subject: Re: [xcat-user] go-xcat accidentally run with /install mounted 
>>> from production xcat master
>>> Date: Fri, Jun 29, 2018 11:11 PM
>>> 
>>> Hi Dave,
>>> 
>>> Thanks for reporting this. Can you open an issue here? 
>>> https://github.com/xcat2/xcat-core/issues
>>> If it's inconvenient, please let me know and I'll open one on your behalf.
>>> 
>>> Thanks,
>>> Victor
>>> 
>>> ----- Original message -----
>>> From: David Johnson <david_john...@brown.edu>
>>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
>>> Cc:
>>> Subject: [xcat-user] go-xcat accidentally run with /install mounted from production xcat master
>>> Date: Thu, Jun 28, 2018 3:05 PM
>>> 
>>> I’m trying to set up a separate cluster with a new xcat master node, and 
>>> downloaded and ran the go-xcat script.
>>> I realized afterwards that it had updated a whole bunch of files in 
>>> /install, which I had forgotten was nfs-mounted
>>> from our main cluster’s master node.
>>> 
>>> Right off the bat, I’m thinking there might be a problem with ssh keys, and 
>>> ca.pem.
>>> Any other likely trouble spots?   I put all our netboot stuff in 
>>> /install/custom, and we
>>> have added postinstall scripts rather than customizing ones from the 
>>> distribution.
>>> 
>>> I can probably go out to any of the nodes still running and grab individual 
>>> files back.
>>> 
>>> Thanks,
>>> ― ddj
>>> Dave Johnson
>>> 
>>> 
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org <http://slashdot.org/>! 
>>> http://sdm.link/slashdot <http://sdm.link/slashdot>
>>> _______________________________________________
>>> xCAT-user mailing list
>>> xCAT-user@lists.sourceforge.net <mailto:xCAT-user@lists.sourceforge.net>
>>> https://lists.sourceforge.net/lists/listinfo/xcat-user 
>>> <https://lists.sourceforge.net/lists/listinfo/xcat-user>
>>> 
>>> 
>>> 
>> 
> 

