OS deployment development in confluent has progressed to the point where I can 
demonstrate most of its usage, with a few suboptimal behaviors remaining (e.g. I 
want more informative errors in some cases, and there are a few commands run 
once to bootstrap manually that I want to pull together into one 'wizard'). 
That is enough for me to ask if anyone wants to discuss or review. It is 
obviously influenced by my work on xCAT, but different in a few ways:

Different SSH strategy:
-No private keys are ever moved around. To the extent there is interaction 
with SSH or TLS keys, there is no longer a need to move them between systems, 
even if you want root to ssh between nodes without a password.
-ssh_known_hosts are now automatically handled across a cluster for all users 
(confluent takes responsibility by managing an SSH CA, meaning a one line 
known_hosts file can suffice for thousands, though it'd be one line per 
collective member)
-Users no longer need private/public keys in their home directories to SSH 
between nodes (root and all users may have SSH enabled using host based 
authentication. Connections are authenticated according to known_hosts rather 
than user keys)
-The net result is SSH keying material that is usable within a cluster, but not 
useful between clusters
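
As a rough illustration of the "one line known_hosts" idea, the sketch below 
builds an entry using the standard OpenSSH @cert-authority marker. How 
confluent actually generates its file is not shown in this mail; the function 
name, the host pattern and the CA key string are placeholders made up for the 
example.

```python
def cert_authority_line(host_pattern, ca_public_key):
    """Build a known_hosts entry that trusts an SSH CA for matching hosts.

    ca_public_key is the text of a CA .pub file: "<type> <base64> [comment]".
    The trailing comment, if any, is dropped.
    """
    key_type, key_data = ca_public_key.split()[:2]
    return f"@cert-authority {host_pattern} {key_type} {key_data}"


# Placeholder key text, not real key material (assumption for illustration).
example_ca = "ssh-ed25519 AAAAPLACEHOLDER ca@example"
print(cert_authority_line("*", example_ca))
```

A single line of this form lets a client accept any host whose host key was 
signed by that CA, which is why the file does not grow with node count.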

Confluent no longer runs as root (this is already the case, but will continue 
to be the case into OS deployment)
Confluent is tested with SELinux (documentation will include the couple of 
setsebool settings that would commonly be needed when dealing with NFS or 
Gluster)
Confluent collective doesn't require 'master' and 'service' nodes, all 
members are equal (though you are free to think of one as a master for 
convenience, the actual installation and code will not care about the 
distinction)
    -Additionally, /var/lib/confluent should unambiguously be synchronized or 
     on a cluster filesystem across the board; no data unique to a collective 
     member is anywhere in the tree

The default templates will be closer to the default behavior of the respective 
OSes. For example, install RHEL/CentOS and it will have SELinux enabled and 
firewall enabled unless the user customizes it.

The default root password behavior changes from 'won't install' to 'password 
authentication disabled'. If the user does not specify one, then only ssh keys 
may be used to log in as root.
    -When a root password is specified, confluent will refuse to store the 
     password in the clear and will always apply a one-way crypt to the input 
     before retaining it
    -Even when configured, passwords are not present in the 
     kickstart/autoyast/autoinstall files that get downloaded; they get 
     handled differently
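
To make the "never store it in the clear" point concrete, here is a minimal 
sketch of keeping only a salted one-way hash. It uses PBKDF2 from Python's 
hashlib as a stand-in; it is not the crypt format or the code confluent uses, 
and the function names and round count are assumptions for the example.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, rounds=100_000):
    """Return (salt, digest); only these are retained, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, rounds
    )
    return salt, digest


def verify_password(password, salt, digest, rounds=100_000):
    """Re-hash the candidate with the stored salt and compare digests."""
    _, candidate = hash_password(password, salt, rounds)
    return hmac.compare_digest(candidate, digest)
```

The stored pair can confirm a password but cannot be reversed to recover it, 
which is the property the one-way crypt provides.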

Scripted install files are no longer templates that get filled in with 
node-unique values and put on the web server. Every node downloads the exact 
same scripted install file, which contains no node or site specific content 
(unless added through customization).

By the same token, nodes no longer receive unique kernel command line 
arguments. PXE, HTTP, and remote media based boot are working and the identical 
boot image may be used across many nodes.

Modifying the profile to have 'console=tty...' is optional. If it is not 
specified and a serial console is detected from firmware, then a best effort 
will be made during install to give info on 'nodeconsole', and the installed 
system will have the autodetected value put into console=.. on your behalf 
(unless overridden).

Generally speaking, distribution specific logic is moved out of the server. 
Mostly this means that to the extent templates are specialized, it is done on 
the node during install time rather than at the server in advance.

OS images are no longer spread all around the filesystems and glued together by 
database entries across multiple tables; each profile is a directory. The 
equivalent of 'lsdef -t osimage' can be done by 'ls 
/var/lib/confluent/public/os/'. To the extent that large content is in multiple 
profiles, symbolic links are used to share the data in an obvious way. For 
example:
[root@pompeius os]# pwd
/var/lib/confluent/public/os
[root@pompeius os]# ls
centos-8.1-x86_64-default  opensuse_leap-15.1-x86_64-hpc  
ubuntu-20.04-x86_64-default
[root@pompeius os]# find centos-8.1-x86_64-default/ ! -type d
centos-8.1-x86_64-default/kickstart
centos-8.1-x86_64-default/profile.yaml
centos-8.1-x86_64-default/initprofile.sh
centos-8.1-x86_64-default/scripts/firstboot.service
centos-8.1-x86_64-default/scripts/post.sh
centos-8.1-x86_64-default/scripts/pre.sh
centos-8.1-x86_64-default/scripts/firstboot.sh
centos-8.1-x86_64-default/boot/initramfs/addons.cpio
centos-8.1-x86_64-default/boot/initramfs/site.cpio
centos-8.1-x86_64-default/boot/initramfs/distribution
centos-8.1-x86_64-default/boot/kernel
centos-8.1-x86_64-default/boot/efi/boot/BOOTX64.EFI
centos-8.1-x86_64-default/boot/efi/boot/grubx64.efi
centos-8.1-x86_64-default/boot/efi/boot/grub.cfg
centos-8.1-x86_64-default/boot/boot.ipxe
centos-8.1-x86_64-default/distribution
centos-8.1-x86_64-default/boot.img
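
Because each profile is just a directory, enumerating profiles from a script 
needs nothing beyond the filesystem. The sketch below is one way to do what 
the 'ls' above does; list_profiles is a made-up helper name, not part of 
confluent, and the default path is the one shown in the listing.

```python
import os


def list_profiles(os_root="/var/lib/confluent/public/os"):
    """Return the sorted names of the profile directories under os_root."""
    return sorted(
        entry.name
        for entry in os.scandir(os_root)
        if entry.is_dir()  # follows symlinks, so linked directories count
    )
```

Calling list_profiles() on the system above would be expected to return the 
three directory names that 'ls' printed.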


Generally speaking, OS image
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
