An interesting issue, and I think I hit it as well (my best guess).

Here is my issue:
https://github.com/mguentner/cannelloni/issues/35

> During the thud-to-hardknott upgrade process, we did nightly
> builds of the new hardknott based target image from our thud
> based SDK VM. I assumed that since GCC10 was being built
> as part of the build sysroot bootstrap process, we were getting
> a clean and consistent result irrespective of the underlying
> build server OS.

Maybe you can try the following: insert this line into your local.conf for
the hardknott release:

GCCVERSION = "9.%"

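For context, a minimal local.conf sketch of what I mean (a sketch, not a
tested setup: oe-core hardknott itself only ships gcc 10.x recipes, so a
"9.%" match assumes a layer in your stack still provides the gcc 9 recipes,
e.g. ones carried over from dunfell):

# Pin the toolchain to the newest matching gcc recipe; "%" is the
# standard BitBake wildcard used for PREFERRED_VERSION matching.
GCCVERSION = "9.%"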

I need to try this myself; I just used gcc as is (the default one that
comes with the release, which I assume is 10).
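If it helps, this is how I would confirm which version was actually
selected (a sketch; GCCVERSION is set in oe-core's tcmode-default.inc and
ends up in the global datastore, so no recipe target is needed):

$ bitbake -e | grep '^GCCVERSION='

On a stock hardknott setup I would expect that to report a 10.x match.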

I have no idea if this is possible at the current Yocto development stage:

GCCVERSION = "11.%"

to fast-forward to GCC 11.
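Before trying that, it is probably worth checking whether an 11.x recipe
even exists to match against (a sketch, assuming the standard oe-core
recipe layout; if nothing matches the wildcard, bitbake only warns that the
preferred version is unavailable and falls back to whatever gcc recipe it
does find):

$ ls meta/recipes-devtools/gcc/gcc_*.bb

As far as I know the 11.x recipes live in master, not in hardknott.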

Zee
_______

On Fri, Jun 25, 2021 at 6:48 AM Chuck Wolber <chuckwol...@gmail.com> wrote:
>
> All,
>
> Please accept my apologies in advance for the detailed submission. I think it 
> is warranted in this
> case.
>
> There is something... "odd" about the GCC 10 compiler that is delivered with 
> Hardknott. I am still
> chasing it down, so I am not yet ready to declare a root cause or submit a 
> bug, but I am posting
> what I have now in case anyone has some insights to offer.
>
> For all I know it is something unusual that I am doing, but we have a lot of 
> history with our
> build/dev/release methods, so I would be surprised if that was actually the 
> case. I have also
> discussed aspects of this on IRC for the last few days, so some of this may 
> be familiar to some
> of you.
>
> Background: We maintain a virtual machine SDK for our developers that is as 
> close as possible to
> the actual embedded hardware environment that we target. The SDK image is our 
> baseline Linux
> OS plus lots of the expected dev and debugging tools. The image deployed to 
> our target devices is
> the baseline Linux OS plus the core application suite. It is also important 
> to note that we only
> support the x86_64 machine architecture in our target devices and development 
> workstations.
>
> We also spin up and spin down the SDK VM for our nightly builds. This 
> guarantees strict consistency
> and eliminates lots of variables when we are trying to troubleshoot something 
> hairy.
>
> We just upgraded from Thud to Hardknott. This means we built our new 
> Hardknott based SDK VM
> image from our Thud based SDK VM (GCC 8 / glibc 2.28). When we attempted to 
> build our target
> device image in the new Hardknott based SDK VM, we consistently got a 
> segfault when any build
> task involved bison issuing a warning of some sort. I traced this down for a 
> very long time and it
> seemed to have something to do with the libtextstyle library from gettext and 
> the way bison used it.
> But I now believe this to be a red herring. Bison seems to be very 
> fragile, but in this case,
> that may have actually been a good thing.
>
> After some experimentation I found that the issue went away when I dropped 
> down to the 3.6.4
> recipe of bison found at OE-Core:bc95820cd. But this did not sit right with 
> me. There is no way I
> should be the only person seeing this issue.
>
> Then I tried an experiment... I assumed I was encountering a compiler 
> bootstrap issue with such a
> big jump (GCC8 -> GCC10), so I rebuilt our hardknott based SDK VM with the 
> 3.3.1 version of
> buildtools-extended. The build worked flawlessly, but when I booted into the 
> new SDK VM and
> kicked off the build I got the same result (bison segfault when any build 
> warnings are encountered).
>
> This is when I started to mentally put a few more details together with other 
> post-upgrade issues that
> had been discovered in our lab. We attributed them to garden variety API and 
> behavioral changes
> expected during a Yocto upgrade, but now I am not so sure.
>
> During the thud-to-hardknott upgrade process, we did nightly builds of the 
> new hardknott based
> target image from our thud based SDK VM. I assumed that since GCC10 was being 
> built as part of
> the build sysroot bootstrap process, we were getting a clean and consistent 
> result irrespective of the
> underlying build server OS.
>
> One of the issues we were seeing in the lab was a periodic hang during the 
> initramfs phase of the
> boot process. We run a couple of setup scripts to manage the sysroot before 
> the switch_root, so it
> is not unusual to see some "growing pains" after an upgrade. The hangs were 
> random with no
> obvious cause, but systemd is very weird anyway so we attributed it to a new 
> dependency or race
> condition that we had to address after going from systemd 239 to 247.
>
> It is also worth noting that systemd itself was not hung; it responded to the 
> ol' "three finger salute"
> and dutifully filled the screen with shutdown messages. It was just that the 
> boot process randomly
> stopped cold in initramfs before the switch root. We would also occasionally 
> see systemd
> complaining in the logs, "Starting requested but asserts failed".
>
> Historically, when asserts fail, it is a sign of a much larger problem, so I 
> did another experiment...
>
> Since we could build our SDK VM successfully with buildtools-extended, why 
> not build the target
> images? So I did. After a day of testing in the lab, none of the testers have 
> seen the boot hang up in
> the initramfs stage, whereas before it was happening about 50% of the time. I 
> need a good week of
> successful test activity before I am willing to declare success, but the 
> results were convincing
> enough to make it worth this summary post.
>
> I did an extensive amount of trial and error testing, including meticulously 
> comparing
> buildtools-extended with our own versions of the same files. The only 
> intersection point was gcc.
>
> The gcc delivered with buildtools-extended works great. When I build 
> hardknott's gcc10 from the
> gcc in buildtools-extended, we are not able to build our target images with 
> the resulting compiler.
> When I build our target images from the old thud environment, we get a 
> mysterious hang and
> systemd asserts triggering during boot. Since GCC10 is an intermediate piece 
> of the build, it is
> also implicated despite the native environment running GCC8.
>
> I will continue to troubleshoot this but I was hoping for some insight (or 
> gentle guidance if I am
> making a silly mistake). Overall, I am at a loss to think of a reason why I 
> should not be able to build
> a compiler from the buildtools-extended compiler and then use it to reliably 
> build our target images.
>
> Thank you,
>
> ..Ch:W..
>
>
> P.S. For those who are curious, we started out on Pyro hosted on Ubuntu 
> 16.04. From there we made
> the jump to self hosting when we used that environment to build a thud based 
> VM SDK. After years of
> successful builds, we are now in the process of upgrading to Hardknott.
>
> P.P.S. For the sake of completeness, I had to add the following files to the 
> buildtools-extended
> sysroot to fully complete the build of our images:
>
> /usr/include/magic.h -> util-linux "more" command requires this.
> /usr/include/zstd.h -> I do not recall which recipe required this.
> /usr/bin/free -> The OpenJDK 8 build scripts need this.
> /usr/include/sys/* -> openjdk-8-native
> /lib/libcap.so.2 -> The binutils "dir" command quietly breaks the build
> without this. I am not a fan of the lack of error checking in the binutils
> build...
> /usr/include/sensors/error.h and sensors.h -> mesa-native
> /usr/include/zstd_errors.h -> qemu-system-native
>
> --
> "Perfection must be reached by degrees; she requires the slow hand of time." 
> - Voltaire