On 2025-04-30 06:56, Marek Marczykowski-Górecki wrote:
On Tue, Apr 29, 2025 at 08:59:45PM -0400, Jason Andryuk wrote:
Hi Marek,
On Wed, Apr 23, 2025 at 8:42 AM Marek Marczykowski-Górecki
<marma...@invisiblethingslab.com> wrote:
I've got some more reports confirming it's still happening on Linux
6.12.18. Is there anything I can do to help fix this? Maybe ask users
to enable some extra logging?
Have you been able to capture a crash with debug symbols and run it
through scripts/decode_stacktrace.sh?
Not really, as I don't have debug symbols for this kernel. And I can't
reliably reproduce it myself (for me it happens about once a
month...). I can try regenerating the debug symbols; in theory I should
have all the ingredients for it.
I'm curious what process_msg+0x18e/0x2f0 is. process_writes() has a
direct call to wake_up(), but process_msg() calling req->cb(req) may
be xs_wake_up() which is a thin wrapper over wake_up().
There is a code dump in the crash message, does it help?
That's a little deeper in the call chain. If you have a vmlinux or
bzImage with a matching stacktrace, that would work to look up the
address in the disassembly. So if you don't have a matching pair, maybe
try to catch it the next time.
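
For reference, an entry such as process_msg+0x18e/0x2f0 reads as "offset
0x18e into process_msg, which is 0x2f0 bytes long". Below is a minimal
sketch of splitting such an entry and turning it into an address to look
up in the disassembly. The helper names (parse_frame, address_in_function)
and the function start address are made up for illustration; the real
start address would have to come from the matching vmlinux.

```python
import re

# Matches stack trace entries of the form "symbol+0xOFFSET/0xSIZE".
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+(?P<offset>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def parse_frame(entry):
    """Split a stack trace entry into (symbol, offset, size)."""
    m = FRAME_RE.search(entry)
    if m is None:
        raise ValueError(f"not a symbol+offset/size entry: {entry!r}")
    return m["symbol"], int(m["offset"], 16), int(m["size"], 16)


def address_in_function(function_start, entry):
    """Return the address to look up, given the function's start address.

    function_start is an assumption supplied by the caller, for example
    taken from the symbol table of the matching vmlinux.
    """
    _symbol, offset, _size = parse_frame(entry)
    return function_start + offset


if __name__ == "__main__":
    symbol, offset, size = parse_frame("process_msg+0x18e/0x2f0")
    print(symbol, hex(offset), hex(size))
    # 0x1000 is a placeholder start address, not a real kernel address.
    print(hex(address_in_function(0x1000, "process_msg+0x18e/0x2f0")))
```

This only does the arithmetic; it does not replace
scripts/decode_stacktrace.sh, which also needs the matching vmlinux.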
They make me wonder if req has been free()ed and at least partially
zero-ed, but it still has wake_up() called. The call stack here is
reminiscent of the one here
https://lore.kernel.org/xen-devel/Z_lJTyVipJJEpWg2@mail-itl/ and the
unexpected value there is 0.
That's an interesting idea, the one above I've seen only on 6.15-rc1 (and
no later rc). But maybe?
I am guessing, so I could be wrong. NULL pointer and unexpected zero
value are both 0 at least. Also Whonix looks like it may use
init_on_free=1 to zero memory at free time.
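
To make the guess concrete, here is a toy model of it, not kernel code.
It assumes a made-up request layout (an 8-byte callback address followed
by an 8-byte state word) stored in a byte buffer, and a free routine that
optionally zeroes the buffer the way init_on_free=1 zeroes memory at free
time. All names and values here are hypothetical.

```python
import struct

# Hypothetical layout: 8-byte callback address, then 8-byte state word.
REQ_FORMAT = "<QQ"
REQ_SIZE = struct.calcsize(REQ_FORMAT)


def alloc_req(slab, cb_addr, state):
    """Write a request into the buffer."""
    struct.pack_into(REQ_FORMAT, slab, 0, cb_addr, state)


def free_req(slab, init_on_free):
    """Model freeing the request; zero the bytes only if init_on_free is set."""
    if init_on_free:
        slab[:REQ_SIZE] = bytes(REQ_SIZE)


def read_req(slab):
    """Model a stale reader that still looks at the request after the free."""
    return struct.unpack_from(REQ_FORMAT, slab, 0)


if __name__ == "__main__":
    for zeroing in (False, True):
        slab = bytearray(REQ_SIZE)
        alloc_req(slab, 0xDEADBEEF, 3)
        free_req(slab, zeroing)
        cb_addr, state = read_req(slab)
        print(f"init_on_free={int(zeroing)}: cb={cb_addr:#x} state={state}")
```

In this model a stale user sees the old contents without zeroing, and
sees 0 in both fields with it, which would fit both a NULL pointer
dereference and an unexpected zero value. Whether that is what actually
happens here is still only a guess.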
Regards,
Jason