Re: [xcat-user] [External] Re: confluent 3.0.1 release

Jarrod Johnson Sun, 13 Sep 2020 09:44:18 -0700

Thanks.

To clarify, confluent is not strictly based on xCAT code, but instead is more 
like how I would have suggested xCAT 2 be done had I known then all the things 
I know now, and had the general landscape looked the way it does today. The 
most straightforward illustration: confluent is a Python project, while xCAT 
is a Perl project.


Since I was inspired by xCAT 1 and contributed heavily to the design of both 
xCAT 2 and confluent, there are similarities:
-The noderange syntax and capabilities are familiar (though the confluent 
implementation is faster and the code is easier to read)
-The general command style is similar (though with tab completion and client 
side argument parsing)
-The concept of defining things at group or node level and formulaic expansion 
is similar (though confluent is faster, safer at evaluating formulae, and 
generally has a gentler learning curve, but currently lacks the full power of 
regex)
-Broadly speaking, look and feel has a lot in common
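
As a rough illustration of what "safer at evaluating formulae" can mean in 
Python, here is a minimal sketch that walks a parsed expression tree and 
rejects everything outside a small arithmetic whitelist, instead of handing 
the string to eval(). This is not taken from the confluent source: the 
function names (evaluate, expand), the {formula} brace syntax, and the 
variable n are all assumptions made for the example.

```python
import ast
import operator

# Whitelisted arithmetic operators; any other syntax is rejected.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}


def evaluate(expr, variables):
    """Evaluate a small integer formula without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
        # Calls, attribute access, subscripts, etc. all land here.
        raise ValueError('unsupported syntax in formula: %r' % expr)
    return walk(ast.parse(expr, mode='eval'))


def expand(template, variables):
    """Replace each {formula} in template with its evaluated value."""
    out = []
    rest = template
    while '{' in rest:
        head, _, tail = rest.partition('{')
        formula, sep, rest = tail.partition('}')
        if not sep:
            raise ValueError('unterminated formula in template')
        out.append(head)
        out.append(str(evaluate(formula, variables)))
    out.append(rest)
    return ''.join(out)


# Hypothetical usage: derive an address from a node index.
print(expand('10.0.{n // 256}.{n % 256}', {'n': 300}))
```

Because only literal integers, known variable names, and the listed operators 
are accepted, a formula such as __import__("os") fails with ValueError rather 
than executing.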

However, with an evolved landscape and a lot learned since xCAT 2, there are 
differences:
-Ensuring it can run without root (e.g. better managing ownership, switching 
away from loop mounts for media import, leveraging systemd managed ambient 
capabilities for exceptions while using an interpreted language)
-Making many things faster (xCAT did all inheritance and evaluation on read, 
confluent does it on write, no fork per individual request, attribute expansion 
without an expensive ‘safe’, among other things)
-The multi-manager model evolves to no longer externalize the multi-system 
access to a SQL database, and all participant managers are more equal
-A tighter focus on security (more careful handling of TLS certs, no movement 
of private keys, forming a collective forces the user to help prove mutual 
authentication, SELinux enabled during dev and testing, deployment having a 
simpler and more restricted mechanism to get node credentials, conveniences 
that may be needed but are arguably insecure are opt-in rather than always on, 
not even a crypted root password in kickstart/autoyast files)
-A client-server model that is natively more sysfs-like or REST-like and is 
available either through a bespoke protocol or as REST over normal HTTP, 
without either side being a frontend for the other.
-An extended discovery facility that can discover some equipment in standby 
power mode, and advances PXE discovery to occur without offering an address 
(PXE discovery without a dynamic range is supported)
-Easier access to the bits and pieces that comprise functionality, so that 
it's easier to still use things like discovery from confluent but use 
different software as the engine of OS deployment (e.g. the cluster-wide 
switch fdb is available to external software, discovery can be stopped at the 
mac collection step without ever offering an OS to boot, etc).
-Moving away from a SQL database model with a translation layer to pretend to 
be key-value to a natively key-value datastore with categorized attribute names 
to resemble the ‘table.column’ behavior of xCAT, and a reworked structure to 
avoid oddities like the same conceptual thing being either ‘ipmi.bmc’ or 
‘mp.mpa’ by happenstance of plugin preference.
-Rather than OS deployment content being a bit spread out across the filesystem 
and held together only by database definition, have each OS deployment profile 
be a directory, using symlinks or copies as appropriate to keep things in one 
place for easy modification/examination without crazy amounts of disk space 
being used.
-Customization of OS deployment in confluent is always on the OS profile, and 
the server side never makes per-node ‘kickstart’ files, moving that onto the 
node, and comma-delimited sequences of scripts to run are replaced with editing 
script content freely. The former is strictly less powerful, but easier to 
understand and modify, and the latter is pretty much more powerful apart from 
not supporting different behaviors of a single profile across two nodes.
-Canned scripts are segmented by general OS category they are designed for. No 
longer is there a single ssh script that has a lot of conditionals to handle 
distro to distro differences.
-No dependency or conflict with an external DHCP server, more emphasis on 
static IP configuration being supported (though DHCP is also supported), easier 
to use without dedicated networks or in cases where new DHCP servers are 
forbidden.
-Moving away from a hard requirement of name forward/reverse lookup as bound to 
node name, making cases where the deployment network is not the ‘primary’ 
network a bit easier.
-Going with the current winner of the general interpreted language popularity 
contest: Python. It’s probably the least objectionable choice overall, though 
no choice can make everyone happy.
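
To make the "inheritance on write rather than on read" point above concrete, 
here is a minimal sketch of a key-value attribute store that recomputes each 
node's effective attributes whenever a group or node value is written, so a 
read is a plain dictionary lookup. It is an illustration only, not confluent's 
datastore: the class name AttributeStore, its methods, the group names, and 
the attribute name example.gateway are all hypothetical.

```python
class AttributeStore:
    """Toy key-value store that resolves group inheritance at write time."""

    def __init__(self):
        self.group_attrs = {}     # group -> {attribute: value}
        self.node_groups = {}     # node -> [groups, highest priority first]
        self.node_own = {}        # node -> {attribute: value} set directly
        self.node_effective = {}  # node -> {attribute: value}, precomputed

    def _recompute(self, node):
        effective = {}
        # Apply lowest-priority groups first so later updates override them.
        for group in reversed(self.node_groups.get(node, [])):
            effective.update(self.group_attrs.get(group, {}))
        # Values set directly on the node win over anything inherited.
        effective.update(self.node_own.get(node, {}))
        self.node_effective[node] = effective

    def add_node(self, node, groups):
        self.node_groups[node] = list(groups)
        self.node_own.setdefault(node, {})
        self._recompute(node)

    def set_group_attr(self, group, name, value):
        self.group_attrs.setdefault(group, {})[name] = value
        # The cost of inheritance is paid here, once per write.
        for node, groups in self.node_groups.items():
            if group in groups:
                self._recompute(node)

    def set_node_attr(self, node, name, value):
        self.node_own.setdefault(node, {})[name] = value
        self._recompute(node)

    def get(self, node, name):
        # Reads do no inheritance work at all.
        return self.node_effective[node].get(name)


store = AttributeStore()
store.add_node('n1', ['rack1', 'everything'])
store.set_group_attr('everything', 'example.gateway', '10.0.0.1')
store.set_group_attr('rack1', 'example.gateway', '10.0.1.1')
print(store.get('n1', 'example.gateway'))
```

The trade-off is that a group write touches every member node, which suits a 
workload where attributes are read far more often than they are changed.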

There are probably others, but this gives the general idea. It was a bit rough 
to go clean slate but it forced re-evaluating every little design choice in a 
more modern context with some experience under the belt and making some choices 
that make migration difficult. Conceptually it may be like ‘xCAT 3’ in my mind, 
but I don’t have the vision to make a migration from xCAT 2 to confluent 
trivial, so I’m reluctant to suggest it would be that straightforward.

Currently, it stops short of some xCAT functionality. The loudest feedback 
has been the lack of stateless. Currently I have two things in mind:
-Ability to import an xCAT genimage stateless and ‘confluent-ify’ its startup
-Talk to the Warewulf guys to see if they want to do a joint effort, as they 
too are trying to reinvent themselves

I have ideas about ways to improve stateless for the modern era, but I’m hoping 
that a clean Warewulf collaboration would leave room for that to be on the same 
page for more people.

From: Vinícius Ferrão via xCAT-user <xcat-user@lists.sourceforge.net>
Sent: Saturday, September 12, 2020 5:57 PM
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc: Vinícius Ferrão <fer...@versatushpc.com.br>
Subject: [External] Re: [xcat-user] confluent 3.0.1 release

Hi Jarrod,

Indeed, Confluent has a lot of cool features while still using xCAT as its 
basis. The SSH infrastructure and security features are the best additions, 
IMO. A lot of folks, including myself, are adapting xCAT to be more secure 
(SELinux/Firewall) and more compliant, changing or removing entire postscripts 
for example, because they are now dated, especially the SSH ones.

With this in mind, aren’t xCAT devs willing to incorporate those changes so 
everybody can benefit from it?

Thanks all.


On 10 Sep 2020, at 16:05, Jarrod Johnson <jjohns...@lenovo.com> wrote:

For those that may be interested, confluent 3 is out now:
https://hpc.lenovo.com/users/hpc/update/2020/09/10/20brelease.html

This marks the first time that confluent may be used for OS deployment for 
those that are interested.
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
