On 10/09/2025 2:11 pm, Andrew Cooper wrote:
> On 10/09/2025 12:57 pm, Marek Marczykowski-Górecki wrote:
>> On Wed, Sep 10, 2025 at 12:34:16PM +0100, Andrew Cooper wrote:
>>> Testing on staging-4.19 is hitting a reliable failure, caused by alpine/3.18
>>> being a root build container, but debian/12-x86_64 being a non-root test
>>> container. Specifically, the test container can't copy XEN_PAGING_DIR and
>>> XEN_DUMP_DIR (both 700) from the build root in order to construct the
>>> initrd.
>>>
>>> staging-4.20 and later do not repack the initrd in this way, so are not
>>> affected.
>>>
>>> Switch both alpine containers to being non-root. This is still slightly
>>> fragile, but better than depending on using root containers for both.
>> This will likely explode done as is...
>>
>> First, grub.cfg is not writable anymore:
>> https://gitlab.com/xen-project/hardware/xen-staging/-/jobs/11305545275#L170
>>
>> I'm not sure what 'user' gets remapped to here, but the whole container
>> is running under rootless podman, as gitlab-runner user. Files on the
>> host are owned by gitlab-runner user.
>>
>> But second, repacking initrd as non-root, without any extra care will
>> result in broken initrd. At the very least /sbin/mount is suid root -
>> when repacked as normal user, it will end up as suid to non-root,
>> breaking it quite effectively. I've run into this issue when needing to
>> repack rootfs anyway and ended up using fakeroot (again):
>> https://gitlab.com/xen-project/people/marmarek/xen/-/commit/bab939159127a9f8e56e119c1fa553c7bbb6d4f7
>>
>> At least your "CI: Create initrd fragments explicitly as root" patch may
>> need backporting, but TBH I'm not sure if that's enough. /dev will
>> likely be messed up too.
> There's a lot of collateral damage here. Summarising things a little.
>
> * We cannot change the root-ness of alpine/3.18-arm64v8. Like
> xilinx-xenial, it does need root to drive real hardware
>
> * We can change the root-ness of alpine/3.18. It is only used as a
> build container, not a test container.
>
> * Contrary to my previous analysis, we have backported the test-artefact
> CPIO work to 4.19, but not the build step CPIO archive.
>
>
> And, what to do:
>
> * We really do need to be rootless in the build containers. Therefore I
> think we need to split alpine/3.18-arm64v8 in two, and when bumping to
> the next version is the obvious point to do this. We should make a
> dedicated qubes container similar to xilinx-xenial, and separate it from
> the build step.
>
> * I should see about backporting the build step CPIO archive. I'm
> beginning to think that was an oversight of mine, because I didn't
> intend to end up with a split like this. This should avoid the step
> causing us problems here.
>
> * I'm very inclined to change the root-ness of alpine/3.18. Testing
> suggests this is fine.

To follow up here, backporting the remaining CPIO patches has resolved
the issue.

https://gitlab.com/xen-project/hardware/xen-staging/-/pipelines/2032200270

~Andrew
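
For anyone still repacking an initrd by hand, the suid concern quoted above (a binary such as /sbin/mount ending up suid to a non-root user) can be checked mechanically. The following is only a minimal sketch and is not part of the Xen CI scripts: the function names are hypothetical, and it assumes the initrd contents have already been unpacked into a directory tree. It inspects on-disk ownership, so it says nothing about ownership that a wrapper such as fakeroot would record in the final archive.

```python
import os
import stat


def find_setuid_files(root):
    """Return sorted (relative_path, owner_uid) pairs for every regular
    file under `root` that has the setuid bit set."""
    hits = []
    # os.walk does not follow directory symlinks by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow file symlinks
            if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_ISUID:
                hits.append((os.path.relpath(path, root), st.st_uid))
    return sorted(hits)


def misowned_setuid_files(root):
    """Setuid files not owned by uid 0. Archiving these as-is would
    produce binaries that are suid to a non-root user."""
    return [(path, uid) for path, uid in find_setuid_files(root) if uid != 0]
```

Running misowned_setuid_files() on a tree unpacked as a normal user would be expected to list every suid binary, which is the breakage described for a non-root repack.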