Hi Song,

Thanks for your feedback. It turns out the issue had nothing to do with the osimage definition or the partition file.
We noticed that line 2720 in /opt/xcat/lib/perl/xCAT_plugin/dhcp.pm was related to a TFTP boot issue:

    push @netent, " filename = \"http://$tftp.':' . $httpport/tftpboot/xcat/xnba/nets/" . $net . "_" . $maskbits . "\";\n";

This was resulting in the following line (line 55) in /etc/dhcp/dhcpd.conf after running ‘makedhcp -n’:

    filename = "http://xcat.':' . 80/tftpboot/xcat/xnba/nets/10.40.0.0_16";

So when running the discovery process on the node, this was resulting in the following error:

    Filename: filename = http://xcat.':' . 80/tftpboot/xcat/xnba/nets/10.40.0.0_16
    http://xcat.%27':' . 80/tftpboot/xcat/xnba/nets/10.40.0.0_16… Error 0x3e11613b
    No more network devices
    PXE-MOF: Exiting Broadcom PXE ROM.

To get around this I have to run a sed command to remove the extraneous “.':' . ” characters in the tftpboot URL to correct the address:

    http://xcat:80/tftpboot/xcat/xnba/nets/10.40.0.0_16

Do you have any idea why this might have started happening? This was not the case previously.

The xCAT version is 2.14.5.

Thanks,
Sandra

From: Song BJ Yang <yang...@cn.ibm.com>
Sent: Saturday, 16 March 2019 1:23 AM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] anaconda: DEBUG Gtk cannot be initialized

What is the osimage definition? If you are provisioning rh7/centos7 and using a customized partition file with disks specified as /dev/sdx, note that the sdx disk name is not persistent across reboots, nor between the installer and the booted system.

Another hint is to enable xcatdebug mode by setting site.xcatdebugmode=2 and then reprovisioning the node; you can then ssh into anaconda to obtain more information on what happened.
------------------------------------------------------------------------------
YANG Song (杨嵩)
IBM China System Technology Laboratory
Tel: 86-10-82452903
Email: yang...@cn.ibm.com
Address: Building 28, ZhongGuanCun Software Park, No.8, Dong Bei Wang West Road, Haidian District, Beijing 100193, PRC
北京市海淀区东北旺西路8号中关村软件园28号楼 邮编: 100193

----- Original message -----
From: Sandra Maksimovic <sandra.maksimo...@mcri.edu.au>
To: xcat-user@lists.sourceforge.net
Cc:
Subject: [xcat-user] anaconda: DEBUG Gtk cannot be initialized
Date: Fri, Mar 15, 2019 11:12 AM

Hi all,

While attempting to build an r720 host with 2 configured physical disks (sda, sdb) I continually receive the following kinds of errors (I’ve also included some suspicious messages higher up in the log):

    blivet: DEBUG partitions: []
    blivet: DEBUG checking whether disk sda has an empty extended
    blivet: DEBUG IGNORED: Caught exception, continuing.
    blivet: DEBUG IGNORED: Begin exception details.
    blivet: DEBUG IGNORED: Traceback (most recent call last):
    blivet: DEBUG IGNORED: File "/usr/lib/python2.7/site-packages/blivet/formats/disklabel.py", line 348, in extendedPartition
    blivet: DEBUG IGNORED: extended = self.partedDisk.getExtendedPartition()
    blivet: DEBUG IGNORED: AttributeError: 'NoneType' object has no attribute 'getExtendedPartition'
    blivet: DEBUG IGNORED: End exception details.
    blivet: DEBUG IGNORED: Caught exception, continuing.
    blivet: DEBUG IGNORED: Begin exception details.
    blivet: DEBUG IGNORED: Traceback (most recent call last):
    blivet: DEBUG IGNORED: File "/usr/lib/python2.7/site-packages/blivet/formats/disklabel.py", line 357, in logicalPartitions
    blivet: DEBUG IGNORED: logicals = self.partedDisk.getLogicalPartitions()
    blivet: DEBUG IGNORED: AttributeError: 'NoneType' object has no attribute 'getLogicalPartitions'
    blivet: DEBUG IGNORED: End exception details.
    blivet: DEBUG extended is None ; logicals is []
    blivet: DEBUG checking whether disk sdb has an empty extended
    blivet: DEBUG extended is None ; logicals is []
    blivet: DEBUG DeviceTree.getDependentDevices: dep: existing 1786.5 GiB disk sdb (9) with existing msdos disklabel ; hidden: False ;

… (the rest of the stuff about sdb seems OK so leaving it out) …

    anaconda: DEBUG running handleException
    anaconda: CRIT Traceback (most recent call last):
      File "/sbin/anaconda", line 1374, in <module>
        anaconda._intf.setup(ksdata)
      File "/usr/lib64/python2.7/site-packages/pyanaconda/ui/tui/__init__.py", line 171, in setup
        should_schedule = obj.setup(self.ENVIRONMENT)
      File "/usr/lib64/python2.7/site-packages/pyanaconda/ui/tui/hubs/summary.py", line 64, in setup
        spoke.execute()
      File "/usr/lib64/python2.7/site-packages/pyanaconda/ui/tui/spokes/storage.py", line 439, in execute
        doKickstartStorage(self.storage, self.data, self.instclass)
      File "/usr/lib64/python2.7/site-packages/pyanaconda/kickstart.py", line 2529, in doKickstartStorage
        ksdata.partition.execute(storage, ksdata, instClass)
      File "/usr/lib64/python2.7/site-packages/pyanaconda/kickstart.py", line 1268, in execute
        doPartitioning(storage)
      File "/usr/lib/python2.7/site-packages/blivet/partitioning.py", line 974, in doPartitioning
        allocatePartitions(storage, disks, partitions, free)
      File "/usr/lib/python2.7/site-packages/blivet/partitioning.py", line 1101, in allocatePartitions
        disklabel = disklabels[_disk.path]
    KeyError: u'/dev/sda'
    anaconda: DEBUG Gtk cannot be initialized
    anaconda: DEBUG In the main thread, running exception handler
    anaconda: INFO Running kickstart %%onerror script(s)
    anaconda: INFO All kickstart %%onerror script(s) have been run
    anaconda: INFO Running kickstart %%traceback script(s)
    anaconda: INFO All kickstart %%traceback script(s) have been run

The weird thing is that the node was building properly before I reverted a small postscript change I had made yesterday, so I am not quite sure why this issue would be occurring now. I also went ahead and re-did the RAID controller setup on this host, and tried out various other configurations in the kickstart file (e.g. excluding /dev/sda from the config altogether), but so far it has not made a difference.

Just wondering if anyone has encountered this issue before, or could perhaps suggest another investigative path?

Prior to re-configuring the RAID I was experiencing the following errors (if it’s helpful):

    Not enough space in file systems for the current software selection. An additional XXXX MiB is needed.

and

    keyerror: u'/dev/sda'

Thanks,

Sandra Maksimovic
Systems Administrator
Information Technology
Murdoch Children's Research Institute
The Royal Children's Hospital, 50 Flemington Road
Parkville, Victoria 3052 Australia
T +61 3 8341 6498
E sandra.maksimo...@mcri.edu.au
W mcri.edu.au

This e-mail and any attachments to it (the "Communication") are, unless otherwise stated, confidential, may contain copyright material and is for the use only of the intended recipient. If you receive the Communication in error, please notify the sender immediately by return e-mail, delete the Communication and the return e-mail, and do not read, copy, retransmit or otherwise deal with it.
Any views expressed in the Communication are those of the individual sender only, unless expressly stated to be those of Murdoch Children’s Research Institute (MCRI) ABN 21 006 566 972 or any of its related entities. MCRI does not accept liability in connection with the integrity of or errors in the Communication, computer virus, data corruption, interference or delay arising from or in respect of the Communication.

_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user