On 2025-04-30 06:56, Marek Marczykowski-Górecki wrote:
On Tue, Apr 29, 2025 at 08:59:45PM -0400, Jason Andryuk wrote:
Hi Marek,

On Wed, Apr 23, 2025 at 8:42 AM Marek Marczykowski-Górecki
<marma...@invisiblethingslab.com> wrote:

I've got some more reports confirming it's still happening on Linux
6.12.18. Is there anything I can do to help fix this? Maybe ask users
to enable some extra logging?

Have you been able to capture a crash with debug symbols and run it
through scripts/decode_stacktrace.sh?

Not really, as I don't have debug symbols for this kernel. And I can't
reliably reproduce it myself (for me it happens about once a
month...). I can try regenerating the debug symbols; in theory I should
have all the ingredients for it.

I'm curious what process_msg+0x18e/0x2f0 is.  process_writes() has a
direct call to wake_up(), but process_msg() calling req->cb(req) may
be xs_wake_up() which is a thin wrapper over wake_up().
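
The call chain described above can be sketched as a small user-space model. This is not the kernel code: the struct layout and every model_* name are assumptions made for illustration, and the wait queue is reduced to a counter. It only shows how a wakeup can be reached indirectly through req->cb rather than by a direct call.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only; the layout and names are assumed, not taken
 * from the kernel sources. */
struct model_req {
    void (*cb)(struct model_req *req);  /* completion callback */
    int woken;                          /* stands in for a wait queue */
};

/* Stand-in for wake_up(): just records that a wakeup happened. */
static void model_wake_up(struct model_req *req)
{
    req->woken++;
}

/* Thin wrapper, analogous to how xs_wake_up() is described above. */
static void model_xs_wake_up(struct model_req *req)
{
    model_wake_up(req);
}

/* Reply path: the wakeup is reached only through the req->cb pointer,
 * so a stale or corrupted req changes what gets called here. */
static void model_process_msg(struct model_req *req)
{
    assert(req != NULL);
    req->cb(req);
}
```

With cb set to model_xs_wake_up, one call to model_process_msg() bumps woken by one, which is the indirect path being asked about.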

There is a code dump in the crash message, does it help?

That's a little deeper in the call chain. If you have a vmlinux or bzImage with a matching stacktrace, that would work to look up the address in the disassembly. So if you don't have a matching pair, maybe try to catch it the next time.
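
For reference, a frame such as process_msg+0x18e/0x2f0 reads as symbol+offset/size: the faulting instruction is 0x18e bytes into a function that is 0x2f0 bytes long. A minimal sketch of turning that into an address to look up in a disassembly follows. The helper names are hypothetical, and the symbol's base address has to come from a matching vmlinux (for example from its symbol table), which is assumed here.

```c
#include <assert.h>
#include <stdio.h>

/* One decoded "symbol+offset/size" stack frame. */
struct frame {
    char sym[64];
    unsigned long off;   /* offset of the instruction inside the function */
    unsigned long size;  /* total size of the function */
};

/* Parse e.g. "process_msg+0x18e/0x2f0". Returns 0 on success, -1 on error. */
static int parse_frame(const char *s, struct frame *f)
{
    assert(s != NULL && f != NULL);
    if (sscanf(s, "%63[^+]+%lx/%lx", f->sym, &f->off, &f->size) != 3)
        return -1;
    return 0;
}

/* Address to search for in the disassembly, given the symbol's base
 * address from a matching binary. Returns 0 if the offset does not
 * fit inside the function. */
static unsigned long frame_address(unsigned long sym_base,
                                   const struct frame *f)
{
    if (f->off >= f->size)
        return 0;
    return sym_base + f->off;
}
```

The result is only meaningful if the binary matches the kernel that produced the trace, which is why a matching pair is needed.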

They make me wonder if req has been free()ed and at least partially
zeroed, but it still has wake_up() called.  The call stack here is
reminiscent of the one here
https://lore.kernel.org/xen-devel/Z_lJTyVipJJEpWg2@mail-itl/ and the
unexpected value there is 0.

That's an interesting idea, the one above I've seen only on 6.15-rc1 (and
no later rc). But maybe?

I am guessing, so I could be wrong. A NULL pointer and an unexpected zero value are both 0, at least. Also, Whonix looks like it may use init_on_free=1 to zero memory at free time.
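
To illustrate the guess, here is a user-space model of an object being zeroed at free time while a stale pointer to it is still used. It is a sketch under assumptions, not the kernel's behaviour: the struct and function names are made up, the storage is deliberately not returned to an allocator so that reading it stays well defined, and it relies on all-zero bytes reading back as a NULL function pointer, which holds on Linux targets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request object with a completion callback. */
struct zreq {
    void (*cb)(struct zreq *r);
    int state;
};

static void zreq_done(struct zreq *r)
{
    r->state = 1;
}

/* Models a free with init_on_free=1: the object is zeroed when freed.
 * The storage itself is kept so the model can inspect it afterwards. */
static void model_free_zeroing(struct zreq *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
}

/* Completes a request through a possibly stale pointer. Returns 1 if
 * the callback ran, 0 if the object looks zeroed. Code without such a
 * check would call through a NULL pointer at this point. */
static int model_complete(struct zreq *r)
{
    if (r->cb == NULL)
        return 0;
    r->cb(r);
    return 1;
}
```

Before the free, model_complete() runs the callback; after model_free_zeroing(), both the callback pointer and the state read as 0, which is consistent with seeing either a NULL pointer or an unexpected zero value.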

Regards,
Jason
