Yocto Technical Team Minutes, Engineering Sync, for August 31, 2021
archive: 
https://docs.google.com/document/d/1ly8nyhO14kDNnFcW2QskANXW3ZT7QwKC5wWVDg9dDH4/edit

== disclaimer ==
Best efforts are made to ensure the below is accurate and valid. However,
errors sometimes happen. If any errors or omissions are found, please feel
free to reply to this email with any corrections.

== attendees ==
Trevor Woerner, Stephen Jolley, Richard Purdie, Saul Wold, Trevor Gamblin,
Scott Murray, Richard Elberger, Peter Kjellerstedt, Michael Opdenacker,
Joshua Watt, Randy MacLeod, Jon Mason, Armin Kuster, Michael Halstead, Ross
Burton, Jan-Simon Möller, Tim Orling, Alejandro Hernandez, Bruce Ashfield

== project status ==
- feature freeze for 3.4 (honister)
- there are a number of issues holding up an -m3 build:
  - rust was merged into oe-core but then a problem was found with
    cargo-native on centos7
  - a kernel issue caused by a change to kernel module versioning
- a pseudo fix went in for the glibc 2.34 problem, but it (a binary shim)
    isn’t the ideal way to solve the problem; investigation into other
    approaches would be good
- the glibc 2.34 upgrade introduced a bug with docker and the clone3 syscall
    which now returns EPERM instead of ENOSYS. this is an upstream problem,
    but we’ll probably need a local patch until this is resolved
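
The EPERM-versus-ENOSYS distinction above can be illustrated with a small
model. This is only a sketch, assuming the caller tries the newer call first
and falls back to the older one only when told the newer call does not exist;
it is not glibc's code, and the helper names (create_thread, blocked_with,
legacy_clone) are hypothetical.

```python
import errno


def create_thread(clone3, clone):
    """Try the newer call first; fall back only if it is reported missing."""
    try:
        return clone3()
    except OSError as exc:
        if exc.errno == errno.ENOSYS:
            # "Not implemented": safe to retry with the older call.
            return clone()
        # Anything else (e.g. EPERM) looks like a genuine failure, so the
        # fallback path is never taken.
        raise


def blocked_with(code):
    """Build a stand-in syscall that always fails with the given errno."""
    def call():
        raise OSError(code, errno.errorcode[code])
    return call


def legacy_clone():
    """Stand-in for the older call, which is assumed to be permitted."""
    return "created via clone"
```

With ENOSYS the model falls back and succeeds; with EPERM the error
propagates, which matches the failure mode described for containers.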

== discussion ==
RP: -m3 not built yet, waiting on stuff, e.g. there’s a glibc issue with
    docker (probably an issue with docker, but we’ll probably need to
    carry a patch) which probably means a new uninative release as well
    unfortunately, there’s a kernel issue with newly added full version
    strings but there is a patch so it’ll need testing, and there’s a rust
    issue on centos7
Randy: i’ll take a look at the rust thing, i didn’t realize it was
    blocking -m3, i was looking at upgrading librsvg and the things that
    depend on it
RP: librsvg is important, but thanks
Randy: i was hoping for a smooth upgrade and that i’d have something for
    you today, but there were some problems, so i think it’ll have to wait
    until the next release.
RP: were the problems with rust, or rsvg?
Randy: rsvg builds, but gstreamer-bad (or something) has a configuration
    issue, so that’s where i left it

RP: there’s pressure to get a hardknott build in, but we’ll wait until we
    get the openssl fix in. so whether there’ll be a new hardknott release
    or it’ll be -m3 i’m not sure. i’m hoping it’ll be -m3 but we’ll
    see how things go with builds
SJ: i’ll update the schedule
RP: there’s a little bit of pressure there because of the overrides changes
    and i think people are anxious for a hardknott release so we’ll try to
    fit one in. i’m told there is QA time available.

Ross: what’s the state of the sbom work?
JPEW: Saul found a bug with packaging native recipes, i need a way to figure
    out how to skip reading the subpackage metadata when there’s no packages
    actually created. i haven’t figured out a good way to do that other
    than: if it inherits class native? seems a bit kludgy, but i can do that 
Ross: that’s probably the easiest way
JPEW: we can skip the step where we read in the package variables and try to
    create all the packages if it inherits from native. i already had the
    check in to say “if there’s nothing actually packaged, skip creating
    the package spdx files” and i thought that was sufficient, but apparently
    it’s not as you have to disregard all things related to packaging in
    native recipes, not just whether it created packages or not. so we can do
    that fairly simply, i think, after that, as far as i’m aware, it’s
    ready to go. it generates a bunch of warnings because of licensing things.
    we’re validating the licenses against the spdx license list so that we
    generate a valid spdx license and most of the warnings are if the user
    specifies “BSD” instead of BSD-2/3/4 clause. so i’ve been slowly
    fixing those
Ross: i thought that was fixed. i thought we had code to rationalize the
    license terms to spdx names?
JPEW: we do, but just “BSD”, by itself, is not a valid spdx license
    string. so i’ve gone in and changed a bunch of recipes where it was
    obvious that it was a BSD 2/3/4 clause. there is support in spdx for
    including your own custom licenses, but i don’t have that in yet. we
    could pull the generic license text from the paths that we have, so for
    any license string that we don’t recognize as spdx we could go and put
    our own. but i don’t have that working yet; the code to find a generic
    license file is… “intense” so i wasn’t up for dissecting it
    right now. so for now it just adds a placeholder license and issues a
    warning. but the vast majority of them are the “BSD” thing needing
    to specify the 2/3/4 clause. other than that i think it’s ready to go.
    unfortunately, until they’re all fixed we’ll get warnings on the AB,
    but we could put it in and maybe not turn it on
RP: or we could just test a smaller subset
JPEW: i’m just only doing core-image-minimal and core-image-base now
RP: oh
JPEW: if you did world i think you’d get thousands of errors
Saul: i’ve been playing with world builds and meta-openembedded as well;
    there’s a lot of issues
JPEW: like i said, we could just include the generic license text, but those
    BSD ones are almost assuredly wrong because they’re not going to match
    the generic BSD license we have
Ross: the generic one we ship is 3-clause so we could just tell it that
    “BSD” means 3-clause?
JPEW: from what i’ve seen that’s not correct often enough
Ross: okay… and we are trying to generate correct data
JPEW: in the long run we should eliminate the bare BSD license. we should just
    force people to specify the clause correctly. there are also a bunch of
    one-off licensing things, not sure what to do about that
RP: there was an effort a while back to get rid of the generic BSD stuff, and
    that went so far but it sounds like it hasn’t gone far enough. we should
    go through and we should rationalize those. in spdx i think there’s a
    way to do an external license? an additional license that is not spdx?
JPEW: yes, if we encounter a license that doesn’t have an spdx mapping
    then we put a placeholder that is an external license but it’s still
    “wrong”. but what we could do is search through the license search
    path and pick out the generic license file that corresponds with that and
    put that in there. and then that would cover all those unknown licenses
    (e.g. bzip2 1.0.4). so it’ll know it can put that in as an external
    reference. however i would rather not do that until all the BSD ones are
    handled. it’ll eliminate the warning but then all these BSD ones would
    be silently wrong
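
The placeholder behaviour described above might look roughly like the
following. This is a sketch only, not the code under review: SPDX_IDS is a
tiny illustrative subset of the SPDX license list and to_spdx is a
hypothetical helper name. The "LicenseRef-" prefix is the SPDX mechanism for
user-defined licenses.

```python
import re

# Tiny illustrative subset; the real SPDX license list is far larger.
SPDX_IDS = {"MIT", "BSD-2-Clause", "BSD-3-Clause", "GPL-2.0-only"}


def to_spdx(license_name):
    """Return (spdx_identifier, warning_or_None) for one license name."""
    if license_name in SPDX_IDS:
        return license_name, None
    # Unknown names become a user-defined reference. The idstring may only
    # contain letters, digits, "." and "-", so other characters are replaced.
    ref = "LicenseRef-" + re.sub(r"[^A-Za-z0-9.-]", "-", license_name)
    return ref, "%s is not an SPDX license identifier" % license_name
```

Here a bare "BSD" yields a placeholder plus a warning rather than a guess at
the 2/3/4-clause variant, which is the trade-off being discussed.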
RP: i’m thinking about having something we can actually merge
JPEW: i think we can merge what we have, it will generate some warnings and
    people should go fixup the licenses
RP: i’d like to get rid of the warnings before we merge it. if we allow
    one warning on the AB, then it breeds quietly and then suddenly lots of
    warnings get ignored
JPEW: i could comment out the warning. it does still generate the placeholder,
    it’s just not a very good placeholder; it literally says: “this is a
    BSD license”
RP: i think the spdx people would be interested in having a list of the
    licenses required to create a basic linux system that aren’t in their
    system
JPEW: the spdx spec does say that putting in a placeholder like that is okay,
    the warning is just nice to let you know where you have to fix things up
RP: i think having the warning commented out in order to get it merged is
    desirable at this point
JPEW: ok
RP: i saw Saul’s patches for native packages. what we’ve tried in the
    past, and is only half complete, is rather than zeroing out the packages
    field what we tried to do was preserve the packages field and then just do
    detection on whether native was being inherited or not. sometimes there is
    dependency information that is hidden in the RDEPENDS fields, and in order
    to know which RDEPENDS fields to lookup we need the PACKAGES variable.
    and in the native case if you zero out the PACKAGES variable then you no
    longer know which packages that thing might be producing to know
    which variables to go query. the overrides changes might help clarify. but
    the intent was to stop destroying the PACKAGES variable so we could use it
    in more places
JPEW: if detecting whether or not it “inherits native” is the way to go
    then i’m happy to go with that
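
The point about preserving PACKAGES can be sketched as below. A plain dict
stands in for the datastore, and the "INHERITS" key and both function names
are hypothetical; in a real class the native check would presumably use
bb.data.inherits_class("native", d) instead.

```python
def runtime_deps(recipe):
    """Collect per-package RDEPENDS, using PACKAGES as the index."""
    deps = {}
    for pkg in recipe.get("PACKAGES", "").split():
        deps[pkg] = recipe.get("RDEPENDS:" + pkg, "").split()
    return deps


def should_write_package_spdx(recipe):
    """Skip package output for native recipes without emptying PACKAGES."""
    return "native" not in recipe.get("INHERITS", ())
```

If PACKAGES were zeroed out for the native case, runtime_deps would return
nothing and the RDEPENDS information would no longer be reachable.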

TrevorW: there were further replies to your email about the pseudo problem on
    the libc-help mailing list. one suggestion was to use ptrace, another was
    to use seccomp, yet another was to use the newer seccomp notify mechanism.
    also crOS does something similar to pseudo but uses lddtree.
RP: the seccomp notify is interesting, but there is some small print in the
    man page of particular note: we can’t write data back to the process,
    so while you can do all this cool stuff in the supervisory process, we
    can only change return value, but not the return data. we can poke into
    the process’s memory to read things but there’s all sorts of locking
    constraints. you have to be very careful how you read the data because
    you have to make sure that the process didn’t disappear while you were
    halfway through reading from its memory. so because of that you could
    never write data back to the process. pseudo is modifying data because it
    wants to fake the file permissions and therefore it would have to write
    data back and change the return data and i don’t think that would be
    viable. so i don’t think the new module would work. i’m not familiar
    with the existing seccomp stuff to know whether a module like that would
    be able to do the intercept and be able to do the writing. i don’t think
    doing this for every linux syscall (and all their quirks) is easier than
    doing it for every libc call (and all their quirks). we’ll just end up
    replacing one set of problems with another
JPEW: i suspect the advantage of doing it at the syscall level would be that
    you’d catch things that don’t use the standard C library, e.g. Rust
RP: totally. there are advantages yes. you would catch things that are
    statically linked (and don’t have dynamic library support) so yes
    there are some advantages to doing that. whether it’s enough of an
    advantage… the jury’s still out; not sure. ptrace would be far too
    slow for what we need
TrevorW: what about if we used musl instead of glibc?
RP: that would give us a whole load of headaches because we would end up
    writing an entire intercept library. all of these libraries are linked
    against glibc, therefore pseudo links against glibc and intercepts glibc
    syscalls. it would be interesting to see whether musl has a pthread
    implementation that is simpler than glibc so it would let us statically
    link it into libpseudo
TrevorW: then maybe the versioning problem wouldn’t be there?
RP: no, there are some problems it would solve, but there are some problems it
    would create. libpseudo is a glibc intercept library, as built. because
    it’s intercepting things on a glibc system. if you put musl in there
    you’ll then have it linking against 2 different C libraries. which
    means you’ll have the .start and .end sections from 2 different C
    libraries involved. you can’t replace glibc with musl, you would have to
    have both
Scott: with pseudo we’re intercepting things from the host side as well as
    in the sysroot, correct? so the problem is on both sides because the host
    tools are getting intercepted
RP: it doesn’t matter what the target is because the host tools are linked
    against glibc and are dynamic
Scott: does the buildtools tarball cover everything we need from the host?
RP: no, probably 90%, but not everything
Scott: would it help if it did cover everything, then we could say always
    glibc 2.34 and not have the mismatch
RP: not sure. we don’t build our own git for example, and we do the same
    for a couple others. we do 2 levels of build tools because we do the
    basic one, then we do the one which has gcc and some of the “heavier”
    stuff in. so at the moment we do have a split-level approach. regardless,
    people could add to the host tools and bypass the thing, and it would have
    to work regardless. unless we mandate that we only ever build using our
    tools, but that feels like a backwards step
Scott: down the road, the problem goes away once everything is glibc 2.34?
RP: there are things like centos7 that hang around for a long time
Scott: and i’ve seen customers who play around with host tools, so it would
    be problematic for sure
RP: it’s an interesting idea. you’re getting to the point where you’re
    mandating the host environment, which i get nervous about. might as well
    use a container or force everyone to use docker (lol)
Scott: down the road we could use pseudo with user-id name-spacing but at that
    point we might as well use a full container
RP: it would be nice if those features were available in such a way that we
    could actually use them for this. so far i haven’t seen anything that we
    could use; everything so far always needs privilege
Scott: it’s coming along, but not there yet
JPEW: podman might be a possibility, but it struggles with the number of file
    descriptors that it can open
RP: the seccomp interface was designed by someone with a specific use-case,
    i can’t help but wonder if we couldn’t get a kernel interface designed
    that would work for us. our requirements aren’t that high. and there are
    other users of this sort of use-case (e.g. fakeroot) that might benefit as
    well
Scott: if you use BPF there might be a whole bunch of people interested
RP: …and write it in Rust (lol)
TrevorW: it’s a coin toss
RP: sometimes the act of putting together the proposal and getting feedback;
    it might not fly on the first attempt, but you would learn enough to make
    a second attempt fly. that would be the long shot, but perhaps worth it;
    it’s not out of this world and we’ve used fakeroot technology for most
    of that time in one form or another
JPEW: according to the seccomp man page regarding seccomp notify, i think
    what it’s saying about the writing is that when you write back into the
    target process’s memory space you can’t know whether the syscall you’re
    writing for has been interrupted in the meantime, and thus you could be
    corrupting something
RP: yes, basically you can’t use that to write the data back. and i
    couldn’t spot any other mechanism that was making that available either.
    the kernel should know, but perhaps the supervisor can’t know. i don’t
    think a supervisor module could work, but i was curious about seccomp
    itself, with the filtering, but i don’t think the program itself can
    make its own library calls
JPEW: the supervisor seems quite limited; it’s just for blocking system
    calls?
RP: yes, it just modifies return codes. we already have a model in pseudo
    where everything gets serialized and stuffed over a socket to the actual
    pseudo database. we’re mostly there with our code, it’s just a
    question of whether seccomp can get us the rest of the way. i suspect
    seccomp, having a security focus, has a different set of constraints. what
    we want might be counter to how a security mechanism needs to work.

Randy: we’re still finding edge cases with the overrides thing. it would be
    nice to document the rationale for why this was done as a flag day instead
    of a warning period?
RP: what would you warn on?
Randy: i’d like to document the reasoning
RP: the reason for the change is because bitbake could never tell what was an
    override and what isn’t. so there was no way to tell where that colon
    should or shouldn’t be. take SRC_URI, would you like a warning that the
    underscore in SRC_URI may or may not be an override. that one is obvious,
    but there are a number that aren’t. have you looked at the migration
    guide?
Randy: no, i didn’t check that
RP: we did try to document it. we did find a quirk in bitbake that sometimes
    because of the order of processing various configs/layers, it would
    produce nicer error messages for some things but not for others. it would
    have been nice if all the warning/error messages would be uniform and
    clear, but they’re not
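
The ambiguity RP describes can be shown with a toy comparison. This is not
bitbake's parser; both function names are hypothetical and the sketch only
contrasts splitting on "_" with splitting on ":".

```python
def split_new_style(key):
    """Split 'VAR:op:override' into (variable, [operators/overrides])."""
    name, *rest = key.split(":")
    return name, rest


def candidate_old_style_splits(key):
    """List every way an underscore-separated key could be read."""
    parts = key.split("_")
    return [
        ("_".join(parts[:i]), parts[i:])
        for i in range(1, len(parts) + 1)
    ]
```

"SRC_URI:append:qemux86" has exactly one reading, while
"SRC_URI_append_qemux86" has four candidate readings (including the
variable being "SRC" with "URI" as an override), so there was nothing
reliable to base a warning on.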

TimO: what’s the status of Rust on the centos7 worker?
RP: cargo-native fails to build with a glibc symbol mismatch error that looks
    a lot like the uninative issues. the interesting thing is that centos7 is
    using the buildtools tarball, so it does look like a bad interaction with
    the buildtools tarball. but this doesn’t seem to happen anywhere else
    we’re using the buildtools tarball (including my own local worker). so
    there’s something odd the centos7 machine is doing that the others are
    not. it looks like there’s something in the centos7 environment that’s
    letting it use the wrong linker. because buildtools should be using its
    own linker, and the host paths are all pointing at the buildtools linker.
    and the buildtools compiler should know how to use the buildtools linker.
    but something is escaping that somehow, but i’m struggling to prove
    what’s going on. but i think we need to investigate this to understand
    what’s going on to make sure it’s not going to affect other systems in
    some weird way.
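
One way to start checking the "wrong linker" theory is to see which
executable a given search path resolves to. This is a diagnostic sketch under
the assumption that PATH ordering is involved, which the discussion above
does not confirm; the function names and the prefix argument are
hypothetical.

```python
import shutil


def first_on_path(tool, search_path):
    """Return the first executable named `tool` on `search_path`, or None."""
    return shutil.which(tool, path=search_path)


def resolved_outside(tool, search_path, expected_prefix):
    """True if `tool` resolves to something not under `expected_prefix`."""
    found = first_on_path(tool, search_path)
    return found is not None and not found.startswith(expected_prefix)
```

For example, resolved_outside("ld", os.environ["PATH"], prefix) with prefix
set to the buildtools install directory would flag a host linker being found
first. A PATH check alone cannot show which linker the compiler driver itself
invokes; running gcc -print-prog-name=ld reports that for GCC.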

RP: we were having some issue with dunfell with meta-aws (yocto check layer),
    there was a patch that was supposed to be on the AB but wasn’t. but we
    got the patch in and it looks like everything is okay now
RE: awesome, thank you