On 10/09/2025 2:11 pm, Andrew Cooper wrote:
> On 10/09/2025 12:57 pm, Marek Marczykowski-Górecki wrote:
>> On Wed, Sep 10, 2025 at 12:34:16PM +0100, Andrew Cooper wrote:
>>> Testing on staging-4.19 is hitting a reliable failure, caused by alpine/3.18
>>> being a root build container, but debian/12-x86_64 being a non-root test
>>> container.  Specifically, the test container can't copy XEN_PAGING_DIR and
>>> XEN_DUMP_DIR (both 700) from the build root in order to construct the 
>>> initrd.
>>>
>>> staging-4.20 and later do not repack the initrd in this way, so are not
>>> affected.
>>>
>>> Switch both alpine containers to being non-root.  This is still slightly
>>> fragile, but better than depending on using root containers for both.
>> This will likely explode if done as is...
>>
>> First, grub.cfg is not writable anymore:
>> https://gitlab.com/xen-project/hardware/xen-staging/-/jobs/11305545275#L170
>>
>> I'm not sure what 'user' gets remapped to here, but the whole container
>> is running under rootless podman, as gitlab-runner user. Files on the
>> host are owned by gitlab-runner user.
>>
>> But second, repacking the initrd as non-root, without any extra care,
>> will result in a broken initrd. At the very least /sbin/mount is suid
>> root - when repacked as a normal user, it will end up as suid to non-root,
>> breaking it quite effectively. I've run into this issue when needing to
>> repack rootfs anyway and ended up using fakeroot (again):
>> https://gitlab.com/xen-project/people/marmarek/xen/-/commit/bab939159127a9f8e56e119c1fa553c7bbb6d4f7
>>
>> At least your "CI: Create initrd fragments explicitly as root" patch may
>> need backporting, but TBH I'm not sure if that's enough. /dev will
>> likely be messed up too.
> There's a lot of collateral damage here.  Summarising things a little.
>
> * We cannot change the root-ness of alpine/3.18-arm64v8.  Like
> xilinx-xenial, it does need root to drive real hardware.
>
> * We can change the root-ness of alpine/3.18.  It is only used as a
> build container, not a test container.
>
> * Contrary to my previous analysis, we have backported the test-artefact
> CPIO work to 4.19, but not the build step CPIO archive.
>
>
> And, what to do:
>
> * We really do need to be rootless in the build containers.  Therefore I
> think we need to split alpine/3.18-arm64v8 in two, and bumping to the
> next version is the obvious point to do this.  We should make a
> dedicated qubes container similar to xilinx-xenial, and separate it from
> the build step.
>
> * I should see about backporting the build step CPIO archive.  I'm
> beginning to think that was an oversight of mine, because I didn't
> intend to end up with a split like this.  This should avoid the step
> causing us problems here.
>
> * I'm very inclined to change the root-ness of alpine/3.18.  Testing
> suggests this is fine.

To follow up here, backporting the remaining CPIO patches has resolved
the issue:
https://gitlab.com/xen-project/hardware/xen-staging/-/pipelines/2032200270
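
For reference, the two failure modes discussed in this thread (mode-700
directories that a non-root test container cannot copy, and suid binaries
that end up suid to a non-root user when repacked) can be checked for with
a rough sketch like the one below.  This is not part of the Xen CI
scripts; find_repack_hazards is a hypothetical helper, and it
deliberately ignores group permissions, ACLs and capabilities.

```python
import os
import stat


def find_repack_hazards(root, uid):
    """Scan `root` for entries that tend to break a non-root initrd repack.

    `uid` is the (non-root) user that would do the copy/repack.
    Returns (setuid_files, blocked_dirs), each a sorted list of paths.
    Simplification: group permissions, ACLs and capabilities are ignored.
    """
    setuid_files = []
    blocked_dirs = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISDIR(st.st_mode):
                continue  # symlink to a directory; os.walk will not descend
            # Owner bits apply if uid owns the directory, else the "other" bits.
            if st.st_uid == uid:
                bits = (st.st_mode >> 6) & 0o7
            else:
                bits = st.st_mode & 0o7
            if (bits & 0o5) != 0o5:  # need both read and search (r-x)
                blocked_dirs.append(path)
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # A suid file repacked by a normal user becomes suid to that user.
            if stat.S_ISREG(st.st_mode) and (st.st_mode & stat.S_ISUID):
                setuid_files.append(path)
    return sorted(setuid_files), sorted(blocked_dirs)
```

Running it over the build root with the test container's uid would be
expected to flag directories such as XEN_PAGING_DIR and XEN_DUMP_DIR, and
binaries such as /sbin/mount, assuming they have the modes described
in this thread.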

~Andrew
