Hi Philip,

We have done this with many Yocto Project builds using AWS EC2, Docker,
GitLab and Artifactory.

Rest inlined below.

:rjs

On 2/17/20 4:27 AM, [email protected] wrote:
> Hi,
>
> I'm looking for some advice about the best way to implement a build
> environment in the cloud for multiple dev teams which will scale as
> the number of dev teams grow.
>
> Our devs are saying:
>
> *What do we want?*
>
> To scale our server-based build infrastructure, so that engineers can
> build branches using the same infrastructure that produces a
> releasable artefact, before pushing it into develop. As much
> automation of this as possible is desired.
>
Any check-in to a branch can be configured to trigger a build. That is
what we do for developers on their own branches as well as for the
master branches. The master branch is the integration branch; there are
also release and development branches, but they all use the same build
environment.
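A minimal sketch of what such a branch-triggered pipeline can look like
in GitLab CI; the image name and the bitbake target below are
placeholders, not our actual configuration:

```yaml
# Hypothetical .gitlab-ci.yml: any push to any branch triggers a build
# inside a pinned Docker build-environment image.
yocto-build:
  stage: build
  image: registry.example.com/yocto-build-env:1.0   # placeholder image
  script:
    - source oe-init-build-env build
    - bitbake core-image-minimal                    # placeholder target
  rules:
    - if: '$CI_COMMIT_BRANCH'   # matches pushes to every branch, incl. master
```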
>
>
> *Blocker*: Can’t just scale current system – can’t keep throwing more
> hardware at it, particularly storage. The main contributor to storage
> requirements is using a local cache in each build workspace and there
> will be one workspace for each branch, per Jenkins agent: 3 teams x 10
> branches per team x 70 GB per branch/workspace x number of build agents
> (let’s say 5) = roughly 10 TB. As you can see this doesn’t scale well
> as we add branches, teams or build agents. Most of this 10 TB is the
> caches in each workspace, where most of the contents of each individual
> cache is identical.
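The arithmetic checks out; a quick sanity check of the figures quoted
above:

```python
# Sanity check of the storage estimate quoted above.
teams = 3
branches_per_team = 10
gb_per_workspace = 70   # dominated by the per-workspace local cache
build_agents = 5

total_gb = teams * branches_per_team * gb_per_workspace * build_agents
print(total_gb)         # 10500 GB, i.e. roughly 10 TB
```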
>
> *A possible solution:*
>
> Disclaimer/admission: I’ve not really researched/considered _all_
> possible solutions to the problem above; I just started searching and
> reading and came up with / felt led towards this. I think there is
> some value in spending some of the meeting exploring other options to
> see if anything sounds better (for what definition of better?).
>
>
We do this with GitLab runners and worker instances on EC2. Since it
can take some time to spin up a new instance, we keep a certain number
running during business hours; if more are needed, more are spun up
transparently. Of course this costs money, particularly when large
instances with a lot of memory and many vCPUs are used. Instances can
be terminated automatically when there is overcapacity, and there are
other cost-control options. Docker images inside the instances provide
the controlled build environment.
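The scaling itself is handled by the runner. A sketch of the relevant
config.toml fragment; the instance type, image name and idle settings
are illustrative assumptions, not our production values:

```toml
# Hypothetical gitlab-runner config.toml fragment: the docker+machine
# executor spins EC2 instances up and down on demand.
concurrent = 10

[[runners]]
  name = "yocto-ec2-autoscale"
  executor = "docker+machine"
  [runners.docker]
    image = "registry.example.com/yocto-build-env:1.0"  # placeholder
  [runners.machine]
    IdleCount = 2        # warm instances kept running during business hours
    IdleTime = 1800      # seconds of overcapacity before termination
    MachineDriver = "amazonec2"
    MachineOptions = ["amazonec2-instance-type=c5.4xlarge"]  # builds want RAM/vCPUs
```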
>
> Something using the built-in cache mirror in Yocto – there are a few
> ways to do this, as it’s essentially a file share somewhere.
> https://pelux.io/2017/06/19/How-to-create-a-shared-sstate-dir.html
> shows an example of sharing it via NFS, but you can also use http or
> ftp.
>
AWS EFS (Elastic File System) works via NFS (pretty straightforward).
Artifactory can be used too.
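Pointing BitBake at either kind of mirror is a one-line change in
local.conf. A sketch, where the host name and mount path are
placeholders:

```
# Shared sstate over an NFS mount (e.g. EFS mounted on every builder):
SSTATE_MIRRORS = "file://.* file:///mnt/shared-sstate/PATH"

# Or a read-only mirror served over HTTP (e.g. out of Artifactory):
# SSTATE_MIRRORS = "file://.* http://sstate.example.com/PATH;downloadfilename=PATH"
```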
>
> Having a single cache largely solves the storage issue as there is
> only one cache, so having solved that issue, it introduces a few more
> questions and constraints:
>
>  1. How do we manage the size of the cache?
>
> There’s no built-in expiry mechanism I could find. This means we’d
> probably have to create something ourselves (parse access logs from
> the server hosting the cache and apply a garbage collector process).
>
You have to prune it yourself, typically based on age, plus a wholesale
cleanout when development moves to a new release of the Yocto Project.
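The poky scripts directory also ships sstate-cache-management.sh for
this job. If you want something simpler, an age-based sweep is only a
few lines; a sketch, where the threshold and the use of access time
are assumptions (the cache must not be mounted noatime, otherwise fall
back to mtime):

```python
# Minimal age-based sstate-cache pruner; a sketch, not a hardened tool.
import os
import time

def prune_sstate(cache_dir, max_age_days=30):
    """Remove cache files whose last access is older than max_age_days."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.stat(path).st_atime < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```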
>
>  2. How/When do we update the cache?
>
> All environments contributing to the cache need to be identical (that
> ansible playbook just grabs the latest of everything) to avoid subtle
> differences in the build artefacts depending on which environment
> populated the cache.
>
We update the shared cache from the release builds only.
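Keeping the contributing environments identical is easiest to enforce
with a pinned container image rather than a playbook that grabs the
latest of everything. A hypothetical Dockerfile sketch; the base tag
and package list are illustrative (the packages follow the usual Yocto
host-requirements list):

```dockerfile
# Hypothetical build-environment image: pin the base image (ideally by
# digest) so every builder feeding the shared cache is identical.
FROM ubuntu:18.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        gawk wget git diffstat unzip texinfo gcc-multilib \
        build-essential chrpath socat cpio python3 python3-pip xz-utils \
    && rm -rf /var/lib/apt/lists/*

# Rebuild and retag this image deliberately; never let individual
# builders update their toolchains on their own.
```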
>
>  3. How much time will fetching the cache from a remote server add to
>     the build?
>
> I think this is probably something we will have to just live with, but
> if it’s all in the cloud the network speed between VMs is fast.
>
There is no generic answer to this. It depends on the storage and of
course the networking infrastructure.
>
> This shared cache solution removes the per-agent cost on storage, and
> also – to a varying extent – the per-branch costs (assuming that
> you’re not working on something at the top/start of the dependency
> tree) from the equation above.
>
>  
>
Yes, that is the idea. Since the builds run inside a Docker container,
there is a local cache, but it is discarded when the container is
discarded and the VM is spun down. Cache misses require additional
time, but that's the nature of it.
>
> I’d love to see some other ideas as well as I worry I’m missing
> something easier or more obvious – better.
>
> Any thoughts?
> Thanks
> Phill
>
:rjs
>
> 

-- 
-----
Rudolf J Streif
CEO/CTO ibeeto
+1.855.442.3386 x700

