Hi Philip,

A *very* quick and vague answer, as it's not something I'm doing right now;
I can only give hints as to where to look next.

On Mon, Feb 17, 2020 at 04:27:17AM -0800, [email protected] wrote:
> Hi,
> 
> I'm looking for some advice about the best way to implement a build 
> environment in the cloud for multiple dev teams which will scale as the 
> number of dev teams grow.
> 
> Our devs are saying:
> 
> *What do we want?*
> 
> To scale our server-based build infrastructure, so that engineers can build 
> branches using the same infrastructure that produces a releasable artefact, 
> before pushing it into develop. As much automation of this as possible is 
> desired.
> 
> *Blocker* : Can’t just scale current system – can’t keep throwing more 
> hardware at it, particularly storage. The main contributor to storage 
> requirements is using a local cache in each build workspace and there will be 
> one workspace for each branch, per Jenkins agent: 3 teams x 10 branches per
> team x 70 GB per branch/workspace x number of build agents (let's say 5) ≈
> 10 TB. As you can see, this doesn't scale well as we add branches, teams or
> build agents. Most of this 10 TB is the caches in each workspace, where most
> of the contents of each individual cache is identical.
> 

Have you had a look at INHERIT += "rm_work"? It should get rid of most of
the space in the work directories (we use this one, with tremendous benefit
in terms of storage space).

cf.
https://www.yoctoproject.org/docs/current/mega-manual/mega-manual.html#ref-classes-rm-work
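If it helps, a minimal sketch of what that looks like in conf/local.conf
(the RM_WORK_EXCLUDE recipe name is just an illustrative example):

```
# Delete each recipe's work directory as soon as it has been built;
# the sstate objects and deploy artefacts are kept.
INHERIT += "rm_work"

# Optionally keep the workdir of recipes you are actively debugging
# (recipe name here is only an example):
RM_WORK_EXCLUDE += "busybox"
```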

Incidentally, it also highlights broken recipes (e.g. one getting files from
other sysroots or from elsewhere in the filesystem).

> *A possible solution:*
> 
> Disclaimer/admission: I've not really researched/considered _all_ possible
> solutions to the problem above; I just started searching and reading and came
> up with/felt led towards this. I think there is some value in spending some
> of the meeting exploring other options to see if anything sounds better (for
> what definition of better?).
> 
> 
> Something using the built-in cache mirror in Yocto – there are a few ways it
> can do this, as it's essentially a file share somewhere.
> https://pelux.io/2017/06/19/How-to-create-a-shared-sstate-dir.html shows, as
> an example, how to share it via NFS, but you can also use HTTP or FTP.
> 
> Having a single cache largely solves the storage issue as there is only one 
> cache, so having solved that issue, it introduces a few more questions and 
> constraints:
> 
> * How do we manage the size of the cache?
> 
> There’s no built-in expiry mechanism I could find. This means we’d probably 
> have to create something ourselves (parse access logs from the server hosting 
> the cache and apply a garbage collector process).
> 

Provided you're not using a webserver with a cache in front of the files (or
a cache that is only refreshed every now and then), a cron job with
find -atime +N -delete and you're good.
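For illustration, a sketch of such a cron-driven garbage collector. The cache
path and the 30-day threshold are made up, and it assumes atimes are actually
updated on access (i.e. the filesystem isn't mounted noatime and nothing
caches reads in front of it):

```shell
#!/bin/sh
# Expire sstate objects that have not been read recently.
# SSTATE_CACHE is a hypothetical path -- point it at your shared cache.
SSTATE_CACHE="${SSTATE_CACHE:-/srv/sstate-cache}"

if [ -d "$SSTATE_CACHE" ]; then
    # Delete files whose last access is more than 30 days ago...
    find "$SSTATE_CACHE" -type f -atime +30 -delete
    # ...then prune the directories this leaves empty.
    find "$SSTATE_CACHE" -type d -empty -delete
fi
```

Run it from cron (daily is plenty); anything still being fetched by builds
keeps a fresh atime and survives.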

> * How/When do we update the cache?
> 
> All environments contributing to the cache need to be identical (that ansible 
> playbook just grabs the latest of everything) to avoid subtle differences in 
> the build artefacts depending on which environment populated the cache.
> 
> * How much time will fetching the cache from a remote server add to the build?
> 
> I think this is probably something we will have to just live with, but if 
> it’s all in the cloud the network speed between VMs is fast.
> 

I remember (wrongly?) reading that sharing the sstate-cache over NFS isn't a
very good idea (the latency outweighs the benefits in terms of storage/shared
sstate cache).

> This shared cache solution removes the per-agent cost on storage, and also –
> to a varying extent – the per-branch costs (assuming that you're not working
> on something at the top/start of the dependency tree) from the equation
> above.
> 
> I’d love to see some other ideas as well, as I worry I’m missing something
> easier, more obvious, or better.
> 

I'm not sure I've understood the exact use case, but maybe you
would want to have a look at:

 - a shared DL_DIR (this one can be served over NFS; there isn't too much
 access to it during a build);
 - SSTATE_MIRRORS (cf.
https://www.yoctoproject.org/docs/current/mega-manual/mega-manual.html#var-SSTATE_MIRRORS),
 which is basically a webserver serving the sstate-cache from an already-built
 image/system. This is read-only and would make sense if your Jenkins is
 building a system and your devs are basing their work on top of it: they
 would get the sstate-cache from your Jenkins and, AFAIK, it does not
 duplicate the sstate-cache locally = more free storage space;
 - Docker containers for a guaranteed identical build environment;
 Pyrex has often been suggested on IRC:
 https://github.com/garmin/pyrex/
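For concreteness, a hypothetical local.conf fragment combining a shared
DL_DIR with a read-only sstate mirror (the path and URL are placeholders for
your own infrastructure):

```
# Shared download directory, e.g. on an NFS mount (hypothetical path).
DL_DIR = "/mnt/nfs/yocto/downloads"

# Read-only sstate mirror served over HTTP by the CI builder
# (hypothetical URL); PATH is expanded by BitBake, keep it literal.
SSTATE_MIRRORS = "file://.* http://ci.example.com/sstate/PATH;downloadfilename=PATH"
```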

That's all I could think of about your issue; I unfortunately don't have
more knowledge to share on this topic.

Good luck, let us know what you decided to do :)

Quentin
-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#48449): https://lists.yoctoproject.org/g/yocto/message/48449
-=-=-=-=-=-=-=-=-=-=-=-