On 6/16/23 18:30, Richard Purdie wrote:
> On Fri, 2023-06-16 at 16:58 +0200, Alexis Lothoré wrote:
>> On 6/15/23 22:34, Alexis Lothoré wrote:
>>> Hello Richard, Michael,
>>> On 6/15/23 15:41, Richard Purdie wrote:
>>>> On Wed, 2023-06-14 at 10:56 +0200, Alexis Lothoré via lists.yoctoproject.org wrote:
>>>>> From: Alexis Lothoré <[email protected]>
>>>>>
>>>>> There must be a more robust rework to do (because the issue will likely
>>>>> happen on each major delivery), but I aimed for the quick and small fix to
>>>>> quickly bring back tests results storage without breaking other things in
>>>>> the process
>>>>
>>>> Thanks, I've merged this as it is a good first set of steps.
>>>>
>>>> As I mentioned, I think we should hardcode poky + "not ending with
>>>> -next" as the test, then we shouldn't run into this issue again.
>>>
>>> ACK, will do the fix
>>>>
>>>> I'd also like to retroactively push the test results for 4.2 since we
>>>> have them and should be able to merge them onto the branch. I'd then
>>>> like to see what the revised 4.3 M1 report looks like.
>>>
>>> I have started importing the archive kindly prepared by Michael into the
>>> poky-contrib test-results repository, but I am struggling a bit with
>>> regression report generation from the freshly imported results. I still have
>>> to confirm whether the generated tag is faulty or whether this is an edge
>>> case in resulttool.
>>
>> So, I have managed to generate the regression report locally (there's likely a
>> tag issue for older tests stored in test-results to be circumvented in
>> resulttool), and it is a bit disappointing. The report is 13MB large, and is
>> once again filled with false positives, likely due to non-static ptest names,
>> which in turn are likely due to leaky build logs. Here's a sample:
>>
>> ptestresult.gcc-g++-user.c-c++-common/Wbidi-chars-ranges.c  -std=gnu++14
>> expected multiline pattern lines 13-17 was found: "\s*/\*<U\+202E> \}
>> <U\+2066>if \(isAdmin\)<U\+2069> <U\+2066> begin admins only \*/[^\n\r]*\n
>> ~~~~~~~~                                ~~~~~~~~                    \^\n
>> \|                                       \|
>> \|[^\n\r]*\n       \|                                       \|
>>         end of bidirectional context[^\n\r]*\n       U\+202E \(RIGHT-TO-LEFT
>> OVERRIDE\)         U\+2066 \(LEFT-TO-RIGHT ISOLATE\)[^\n\r]*\n": PASS -> None
>>     ptestresult.gcc-g++-user.c-c++-common/Wbidi-chars-ranges.c  -std=gnu++14
>> expected multiline pattern lines 26-31 was found: "     /\* end admins only
>> <U\+202E> \{ <U\+2066>\*/[^\n\r]*\n                        ~~~~~~~~   
>> ~~~~~~~~
>> \^\n                        \|          \|        \|[^\n\r]*\n
>>      \|          \|        end of bidirectional context[^\n\r]*\n
>>         \|          U\+2066 \(LEFT-TO-RIGHT ISOLATE\)[^\n\r]*\n
>>       U\+202E \(RIGHT-TO-LEFT OVERRIDE\)[^\n\r]*\n": PASS -> None
>>
>> Most of this noise is about gcc ptests; there is also a bit about python3 and
>> ltp. I manually trimmed the gcc false positives to reach a reasonable size;
>> here it is:
>> https://pastebin.com/rYZ3qYMK
> 
> Thanks for getting us the diff!
> 
> Going through the details there, most of it is "expected" due to
> changes in version of the components. I did wonder if we could somehow
> show that version change?
> 
> I'm starting to wonder if we should:
> 
> a) file two bugs for cleaning up the python3 and gcc test results
> b) summarise the python3 and gcc test results in the processing rather
> than printing in full if the differences exceed some threshold (40
> changes?)

I would say yes to both, and I like the idea of setting a general threshold,
either an absolute one or a percentage of the total number of test cases in the
current test.
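
To make the threshold idea concrete, here is a minimal sketch in Python. It is
not resulttool code: the function name, the (suite, test, old, new) tuple
layout, the "totals" mapping and the default limits are all assumptions made
for illustration (the 40 default only mirrors the value floated above).

```python
from collections import defaultdict


def summarise_changes(changes, totals, abs_limit=40, pct_limit=None):
    """Group status changes by suite and collapse suites above a threshold.

    changes:   iterable of (suite, test, old_status, new_status) tuples
               (hypothetical layout, not the resulttool data model)
    totals:    dict mapping suite name to its total number of test cases
    abs_limit: absolute number of changes above which a suite is summarised
    pct_limit: if set, use a percentage of the suite's test cases instead
    """
    by_suite = defaultdict(list)
    for suite, test, old, new in changes:
        by_suite[suite].append((test, old, new))

    lines = []
    for suite in sorted(by_suite):
        entries = by_suite[suite]
        count = len(entries)
        if pct_limit is not None:
            total = totals.get(suite, 0)
            too_many = total > 0 and 100.0 * count / total > pct_limit
        else:
            too_many = count > abs_limit
        if too_many:
            # Too noisy to print in full: keep a one-line summary only.
            lines.append(f"{suite}: {count} changes (details omitted)")
        else:
            for test, old, new in entries:
                lines.append(f"{suite}.{test}: {old} -> {new}")
    return lines
```

With the absolute limit, a suite with 41 changes collapses to a single line
while a suite with one change is still printed in full. With pct_limit set,
the decision depends on the size of the suite instead.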

> 
> Basically we need to make this report useful somehow, even if we have
> to exclude some data for now until we can better process it.

Absolutely. I will use this report as a base for a new batch of improvements.
I will also add the stats I mentioned earlier, to know, for example, whether
the noise generated for a given test case really affects the whole test or is
just a drop in the ocean.
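
As a rough sketch of the kind of stats I have in mind (again, the function
names and input dictionaries are hypothetical, not existing resulttool API),
the share of changed test cases per suite could be computed like this:

```python
def noise_ratio(changed, total):
    """Return the share of test cases whose status changed, as a percentage."""
    if total <= 0:
        return 0.0
    return 100.0 * changed / total


def format_stats(changed_by_suite, total_by_suite):
    """Build one summary line per suite from two {suite: count} dicts."""
    lines = []
    for suite in sorted(changed_by_suite):
        changed = changed_by_suite[suite]
        total = total_by_suite.get(suite, 0)
        ratio = noise_ratio(changed, total)
        lines.append(f"{suite}: {changed}/{total} changed ({ratio:.1f}%)")
    return lines
```

A low percentage would suggest that the noise is marginal for that suite; a
high one would suggest that the suite results need cleaning up first.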
> 
> I'm open to other ideas...
> 
> Cheers,
> 
> Richard

-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
