On Wed, Dec 10, 2025 at 12:28:19PM -0800, Stefano Stabellini wrote:
> On Wed, 10 Dec 2025, Marek Marczykowski-Górecki wrote:
> > On Tue, Dec 09, 2025 at 04:02:06PM -0800, Stefano Stabellini wrote:
> > > On Sat, 6 Dec 2025, Marek Marczykowski-Górecki wrote:
> > > > Set up a simple two-domU system: one with the network backend, running
> > > > the xendriverdomain service, and one with the frontend, trying to ping
> > > > the backend.
> > > > 
> > > > Contrary to other similar tests, use a disk image instead of an initrd,
> > > > to allow a bigger rootfs without adding more RAM (for both dom0 and
> > > > domU). But keep using pxelinux as the bootloader, as it's easier to set
> > > > up than installing grub on the disk. Theoretically, it could be started
> > > > via direct kernel boot in QEMU, but pxelinux is slightly closer to a
> > > > real-world deployment.
> > > > 
> > > > Use fakeroot to preserve file owners/permissions. This is especially
> > > > important for suid binaries like /bin/mount - without fakeroot, they
> > > > would end up suid to a non-root user.
> > > > 
> > > > Signed-off-by: Marek Marczykowski-Górecki <[email protected]>
> > > > ---
> > > > Changes in v3:
> > > > - add fakeroot
> > > > - run ldconfig at the disk image creation time, to avoid running it at
> > > >   dom0/domU boot time (which is much slower)
> > > > Changes in v2:
> > > > - use heredoc
> > > > - limit ping loop iterations
> > > > - use full "backend" / "frontend" in disk image names
> > > > - print domU consoles directly to /dev/console, to avoid systemd-added
> > > >   messages prefix
> > > > - terminate test on failure, don't wait for timeout
> > > > ---
> > > >  automation/build/debian/13-x86_64.dockerfile    |   2 ++
> > > >  automation/gitlab-ci/test.yaml                  |   8 ++
> > > >  automation/scripts/qemu-driverdomains-x86_64.sh | 138 +++++++++++++++++-
> > > >  3 files changed, 148 insertions(+)
> > > >  create mode 100755 automation/scripts/qemu-driverdomains-x86_64.sh
> > > > 
> > > > diff --git a/automation/build/debian/13-x86_64.dockerfile b/automation/build/debian/13-x86_64.dockerfile
> > > > index 2c6c9d4a5098..6382bafbd5bd 100644
> > > > --- a/automation/build/debian/13-x86_64.dockerfile
> > > > +++ b/automation/build/debian/13-x86_64.dockerfile
> > > > @@ -55,7 +55,9 @@ RUN <<EOF
> > > >  
> > > >          # for test phase, qemu-* jobs
> > > >          busybox-static
> > > > +        e2fsprogs
> > > >          expect
> > > > +        fakeroot
> > > >          ovmf
> > > >          qemu-system-x86
> > > >  
> > > > diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> > > > index 7b36f1e126ca..abc5339a74ab 100644
> > > > --- a/automation/gitlab-ci/test.yaml
> > > > +++ b/automation/gitlab-ci/test.yaml
> > > > @@ -656,6 +656,14 @@ qemu-alpine-x86_64-gcc:
> > > >      - *x86-64-test-needs
> > > >      - alpine-3.22-gcc
> > > >  
> > > > +qemu-alpine-driverdomains-x86_64-gcc:
> > > > +  extends: .qemu-x86-64
> > > > +  script:
> > > > +    - ./automation/scripts/qemu-driverdomains-x86_64.sh 2>&1 | tee ${LOGFILE}
> > > > +  needs:
> > > > +    - *x86-64-test-needs
> > > > +    - alpine-3.22-gcc
> > > > +
> > > >  qemu-smoke-x86-64-gcc:
> > > >    extends: .qemu-smoke-x86-64
> > > >    script:
> > > > diff --git a/automation/scripts/qemu-driverdomains-x86_64.sh b/automation/scripts/qemu-driverdomains-x86_64.sh
> > > > new file mode 100755
> > > > index 000000000000..c0241da54168
> > > > --- /dev/null
> > > > +++ b/automation/scripts/qemu-driverdomains-x86_64.sh
> > > > @@ -0,0 +1,138 @@
> > > > +#!/bin/bash
> > > > +
> > > > +set -ex -o pipefail
> > > > +
> > > > +dom0_rootfs_extra_comp=()
> > > > +dom0_rootfs_extra_uncomp=()
> > > > +
> > > > +cd binaries
> > > > +
> > > > +# DomU rootfs
> > > > +
> > > > +mkdir -p rootfs
> > > > +cd rootfs
> > > > +mkdir -p etc/local.d
> > > > +passed="ping test passed"
> > > > +failed="TEST FAILED"
> > > > +cat > etc/local.d/xen.start << EOF
> > > > +#!/bin/bash
> > > > +
> > > > +set -x
> > > > +
> > > > +if grep -q test=backend /proc/cmdline; then
> > > > +    brctl addbr xenbr0
> > > > +    ip link set xenbr0 up
> > > > +    ip addr add 192.168.0.1/24 dev xenbr0
> > > > +    bash /etc/init.d/xendriverdomain start
> > > > +    # log backend-related logs to the console
> > > > +    tail -F /var/log/xen/xldevd.log /var/log/xen/xen-hotplug.log >>/dev/console 2>/dev/null &
> > > > +else
> > > > +    ip link set eth0 up
> > > > +    ip addr add 192.168.0.2/24 dev eth0
> > > > +    timeout=6 # 6*10s
> > > > +    until ping -c 10 192.168.0.1; do
> > > > +        sleep 1
> > > > +        if [ \$timeout -le 0 ]; then
> > > > +            echo "${failed}"
> > > > +            exit 1
> > > > +        fi
> > > > +        ((timeout--))
> > > > +    done
> > > > +    echo "${passed}"
> > > > +fi
> > > > +EOF
> > > > +chmod +x etc/local.d/xen.start
> > > > +fakeroot sh -c "
> > > > +    zcat ../rootfs.cpio.gz | cpio -imd
> > > > +    zcat ../xen-tools.cpio.gz | cpio -imd
> > > > +    ldconfig -r .
> > > > +    touch etc/.updated
> > > > +    mkfs.ext4 -d . ../domU-rootfs.img 1024M
> > > 
> > > Do we really need 1GB? I would rather use a smaller size if possible,
> > > and consume as few resources as possible on the build server, as we
> > > might run a few of these jobs in parallel one day soon.
> > 
> > This will be a sparse file, so it won't really use all the space. But
> > this size is the upper bound of what can be put inside.
> > That said, it's worth checking if sparse files do work properly on all
> > runners in /build. AFAIR some older docker versions had issues with that
> > (was it aufs not supporting sparse files?).
> 
> I ran the same command on my local baremetal Ubuntu dev environment
> (arm64) and it created a new file of the size passed on the command
> line (1GB in this case). It looks like they are not sparse on my end. If
> the result depends on versions and configurations, I would rather err on
> the side of caution and use the smallest possible number that works.

Hm, interesting. What filesystem is that on?

On my side it's definitely sparse (ext4):

    [user@disp8129 Downloads]$ du -sch
    12K .
    12K total
    [user@disp8129 Downloads]$ mkfs.ext4 -d . ../domU-rootfs.img 1024M
    mke2fs 1.47.2 (1-Jan-2025)
    Creating regular file ../domU-rootfs.img
    Creating filesystem with 262144 4k blocks and 65536 inodes
    Filesystem UUID: f50a5dfe-4dcf-4f3e-82d0-3dc54a788ab0
    Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376

    Allocating group tables: done                            
    Writing inode tables: done                            
    Creating journal (8192 blocks): done
    Copying files into the device: done
    Writing superblocks and filesystem accounting information: done

    [user@disp8129 Downloads]$ ls -lhs ../domU-rootfs.img 
    33M -rw-r--r--. 1 user user 1.0G Dec 10 21:45 ../domU-rootfs.img
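
For what it's worth, whether a file ends up sparse can be checked directly by
comparing its apparent size with the actual on-disk usage. A quick sketch,
assuming GNU coreutils and a filesystem with sparse-file support:

```shell
# Create a 1 GiB file with no data written; on filesystems with
# sparse-file support, no blocks are allocated for the hole.
truncate -s 1G sparse-test.img

# Apparent size vs. actual on-disk usage, both in KiB; a sparse file
# shows a much smaller number in the second command.
du --apparent-size -sk sparse-test.img
du -sk sparse-test.img

rm sparse-test.img
```

If the two numbers are (nearly) equal, the filesystem (or a layer like older
aufs) is not keeping the file sparse.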

> > > Moreover this script will be run inside a container which means this
> > > data is probably in RAM.
> > 
> > Are runners configured to use tmpfs for /build? I don't think it's the
> > default.
> 
> I don't know for sure, they are just using the default. My goal was to
> make our solution more reliable as defaults and configurations might
> change.
> 
> 
> > > The underlying rootfs is 25M on both ARM and x86. This should be at most
> > > 50M.
> > 
> > Rootfs itself is small, but for driver domains it needs to include the
> > toolstack too, and xen-tools.cpio is over 600MB (for a debug build).
> > I might be able to pick just the parts needed for the driver domain (xl
> > with its deps, maybe some startup scripts, probably a few more files),
> > but it's rather fragile.
> 
> My first thought is to avoid creating a 1GB file in all cases when it
> might only be needed for certain individual tests. Now, I realize that
> this script might end up being used only in driver domain tests, but if not, 

Indeed, this script is specifically for the driver domains test.

> I
> would say to use the smallest number depending on the tests, especially
> as there seems to be a huge difference, e.g. 25MB versus 600MB.
> 
> My second thought is that 600MB for just the Xen tools is way too large.
> I have alpine linux rootfs'es with just the Xen tools installed that are
> below 50MB total. I am confused about how we get to 600MB. It might be
> due to QEMU and its dependencies, but still, going from 25MB to 600MB is
> incredible!

Indeed, it's mostly QEMU (its main binary alone takes 55MB), including all
the bundled firmware etc. (the various flavors of edk2 take 270MB by
themselves). There is also usr/lib/debug, which takes 85MB.
But then, usr/lib/libxen* combined takes almost 50MB.
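
A quick way to see where the space goes is to list the archive entries sorted
by size. A sketch, assuming the xen-tools.cpio.gz produced by the build job is
in the current directory (cpio's verbose listing prints the size in the fifth
column):

```shell
# List the ten largest entries in the tools archive.
zcat xen-tools.cpio.gz | cpio -tv 2>/dev/null | sort -k5 -rn | head -n 10
```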

OTOH, non-debug xen-tools.cpio takes "just" 130MB.
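
If shrinking the image is the goal, the size passed to mkfs.ext4 could also be
derived from the staged tree instead of being a fixed 1024M. A sketch, where
"rootfs" stands for the staging directory from the script and the 30% headroom
for filesystem metadata is a guess, not a measured value:

```shell
# Compute the image size from the contents plus headroom, with a small
# floor added so mkfs.ext4 always gets a workable size.
used_kb=$(du -sk rootfs | cut -f1)
size_mb=$(( used_kb * 130 / 100 / 1024 + 8 ))
echo "mkfs.ext4 -d rootfs domU-rootfs.img ${size_mb}M"
```

That keeps the upper bound tied to the actual xen-tools payload (130MB vs.
600MB+) rather than a worst-case constant.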

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
