Hi, I'm looking for some advice about the best way to implement a build environment in the cloud for multiple dev teams, one which will scale as the number of dev teams grows.
Our devs are saying:

*What do we want?*
To scale our server-based build infrastructure so that engineers can build branches on the same infrastructure that produces a releasable artefact, before pushing them into develop. We want as much of this automated as possible.

*Blocker:*
We can’t just scale the current system; we can’t keep throwing more hardware at it, particularly storage. The main contributor to storage requirements is the local cache in each build workspace, and there is one workspace for each branch, per Jenkins agent:

3 teams x 10 branches per team x 70 GB per branch/workspace x number of build agents (let’s say 5) = roughly 10 TB

As you can see, this doesn’t scale well as we add branches, teams or build agents. Most of this 10 TB is the caches in each workspace, and most of the contents of each individual cache are identical.

*A possible solution:*
Disclaimer/admission: I’ve not really researched or considered _all_ possible solutions to the problem above; I just started searching and reading and felt led towards this one. I think there is some value in spending some of the meeting exploring other options to see if anything sounds better (for what definition of better?).

The idea is to use the built-in cache mirror in Yocto. There are a few ways it can do this, as it’s essentially a file share somewhere. https://pelux.io/2017/06/19/How-to-create-a-shared-sstate-dir.html shows, for example, how to share it via NFS, but you can also use HTTP or FTP.

Having a single cache largely solves the storage issue, as there is only one cache, but it introduces a few more questions and constraints:

* How do we manage the size of the cache? There’s no built-in expiry mechanism that I could find. This means we’d probably have to create something ourselves (parse access logs from the server hosting the cache and apply a garbage collector process).
* How/when do we update the cache? All environments contributing to the cache need to be identical (that Ansible playbook just grabs the latest of everything) to avoid subtle differences in the build artefacts depending on which environment populated the cache.
* How much time will fetching the cache from a remote server add to the build? I think this is probably something we will just have to live with, but if it’s all in the cloud, the network speed between VMs is fast.

This shared cache solution removes the per-agent storage cost and also, to a varying extent, the per-branch cost (assuming that you’re not working on something at the start of the dependency tree) from the equation above.

I’d love to see some other ideas as well, as I worry I’m missing something easier, more obvious, or simply better. Any thoughts?

Thanks,
Phill
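
P.S. To make the garbage collector idea above a bit more concrete, here is a rough sketch in Python. It is only an illustration under some assumptions: the shared cache is a plain directory tree, the filesystem hosting it records access times (i.e. it is not mounted with noatime), and file access time is used in place of parsing the server's access logs. The function name, its parameters and the example path are hypothetical, not anything Yocto provides.

```python
import time
from pathlib import Path


def prune_cache(cache_dir, max_age_days, now=None, dry_run=True):
    """List cache files not accessed for max_age_days; delete them unless dry_run.

    Assumption: st_atime is meaningful on the filesystem holding the cache.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for path in Path(cache_dir).rglob("*"):
        # Only regular files are candidates; directories are left in place.
        if path.is_file() and path.stat().st_atime < cutoff:
            stale.append(path)
            if not dry_run:
                path.unlink()
    return sorted(stale)
```

Something like prune_cache("/srv/sstate-cache", 30) would then only report what it would remove, and passing dry_run=False would actually delete. It defaults to a dry run so that a mistake in the path or the age does not wipe the cache.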
