On Thu, Feb 20, 2020, 9:22 AM Paul Barker <[email protected]> wrote:
> On Thu, 20 Feb 2020 at 12:04, Richard Purdie
> <[email protected]> wrote:
> >
> > On Thu, 2020-02-20 at 11:59 +0000, Paul Barker wrote:
> > > I'm now looking into this...
> > >
> > > In sstate_checkhashes() we mark sstate as available if
> > > fetcher.checkstatus() succeeds. Then at a later point
> > > sstate_setscene() calls sstate_installpkg() calls pstaging_fetch()
> > > calls fetcher.download() to actually get the sstate artifact. If the
> > > artifact is removed from the mirror between these two accesses (due
> > > to an sstate mirror clean up running in parallel to a build), or if
> > > there is an intermittent download failure we could see checkstatus()
> > > succeed then download() fail.
> > >
> > > I don't think we should ignore all setscene errors but in the
> > > specific case where it's the download step that fails I think that
> > > should be a warning. Or it could be an error by default with a
> > > variable we can set to turn it into a warning. Does that sound
> > > reasonable? If so I'll work up a patch.
> >
> > Thinking about the code, I'm not sure how you're generically going to
> > tell the difference between a setscene task that fails as the file
> > disappeared compared to a setscene failure with another real error? :/
> >
> > We could make all failed setscene tasks warnings but I think that
> > buries actual real errors.
> >
> > This is probably why I've not changed the code before now.
> >
> > Special exit code values? :/
> >
> > I'm open to proposals.
> >
> > I know we could put in some configuration option but in general I hate
> > these as it just means more test matrix combinations and more ways for
> > people to see different behaviours. They have a time/place but I'm not
> > sure its here.
>
> I agree - I really don't want to have to add additional complexity
> here. But I do think we need to fix this in some way, others are
> affected by this as can be seen from previous discussions. And in the
> case of a public sstate mirror we can't control when users decide to
> run builds, there will always be the chance of a user running a build
> on an old commit while old sstate artifacts are cleaned or starting a
> build just as the mirror is taken offline for some maintenance.
>
> I think we might be able to make this work if we can avoid adding any
> new conditional logic to the fetcher itself. I can see that almost
> every call to logger.error() is followed by raising an error - perhaps
> we could rework the code to include all the relevant info in the
> raised error object and allow higher level code to catch the exception
> and decide what to do with it. Because once logger.error() is called,
> knotty counts an error and bitbake will exit non-zero even if the
> error is safely handled. Once the fetcher simply raises exceptions in
> the case of failed downloads we could handle this neatly in
> sstate.bbclass. Would that be a viable way forward? Or would that
> break the other fetcher use cases?
>

FWIW we also have this problem because our CI nodes all update the
sstate cache via rsync after they finish, which causes races. This
hasn't affected our developers, but I suspect that is only because they
aren't doing builds at 1 AM.

The way we worked around it was to split up the build into two
invocations of bitbake:

    bitbake --setscene-only <target> || true
    bitbake --skip-setscene <target>

Although this will likely not work very well with hash equivalence.

> Thanks,
> Paul
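
To make the "raise instead of logger.error()" idea quoted above a bit more concrete, here is a rough sketch of the pattern. It is not BitBake code: DownloadFailed, download() and install_from_sstate() are hypothetical stand-ins for the fetcher and the sstate.bbclass caller, and the mirror is modelled as a plain set of URLs. The point is only that the low-level code raises an exception carrying the details, and the caller decides whether that is a warning or an error.

```python
import logging

logger = logging.getLogger("sstate-sketch")


class DownloadFailed(Exception):
    """Hypothetical error carrying the details the fetcher would otherwise log."""

    def __init__(self, url, reason):
        super().__init__(f"failed to fetch {url}: {reason}")
        self.url = url
        self.reason = reason


def download(url, available):
    """Stand-in fetcher: raises instead of logging an error itself."""
    if url not in available:
        raise DownloadFailed(url, "artifact not found on mirror")
    # Pretend the artifact was stored under a local directory.
    return f"/tmp/sstate/{url.rsplit('/', 1)[-1]}"


def install_from_sstate(url, available):
    """Stand-in for the sstate caller: a failed download is only a warning.

    Returns the local path on success, or None so the real task can run
    instead. Any other exception still propagates as a genuine error.
    """
    try:
        return download(url, available)
    except DownloadFailed as exc:
        logger.warning("sstate unavailable, falling back to a real build: %s", exc)
        return None
```

Because only DownloadFailed is caught, other setscene failures would still surface as errors, which seems to be the distinction Richard was asking about.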
