On 16/12/2021 13:33, Jan Beulich wrote:
> On 16.12.2021 12:54, Andrew Cooper wrote:
>> On 13/12/2021 15:12, Jan Beulich wrote:
>>> show_hvm_stack() requires interrupts to be enabled to avoids triggering
>>> the consistency check in check_lock() for the p2m lock. To do so in
>>> spurious_interrupt() requires adding reentrancy protection / handling
>>> there.
>>>
>>> Fixes: adb715db698b ("x86/HVM: also dump stacks from
>>> show_execution_state()")
>>> Signed-off-by: Jan Beulich <[email protected]>
>>> ---
>>> The obvious (but imo undesirable) alternative is to suppress the call to
>>> show_hvm_stack() when interrupts are disabled.
>> show_execution_state() need to work in any context including the #DF
>> handler,
> Why? There's no show_execution_state() on that path.
Yes there is - it's reachable from any BUG().
It's also reachable on the NMI path via fatal_trap().
Talking of, didn't you say you'd found an unexplained deadlock with NMI
handling... ?
>
>> and
>>
>> /*
>> * Stop interleaving prevention: The necessary P2M lookups
>> * involve locking, which has to occur with IRQs enabled.
>> */
>> console_unlock_recursive_irqrestore(flags);
>>
>> show_hvm_stack(curr, regs);
>>
>> is looking distinctly dodgy...
> Well, yes, it does.
Because it is.
You cannot enable interrupts here, because you have no clue if it safe
to do so.
What you are doing is creating yet another instance of the broken
pattern we already have with shutdown trying to move itself to CPU0,
that occasionally explodes in the middle of a context switch.
Furthermore show_execution_state() it is already broken for any path
where interrupts are already disabled, including but not limited to the
one you've found here.
adb715db698bc8ec3b88c24eb88b21e9da5b6c07 is broken and needs reverting.
No amount of playing games with irqs here is going to improve things.
~Andrew