On 8.02.2024 12:30, Jan Beulich wrote:
> On 14.11.2023 18:50, Krystian Hebel wrote:
>> If multiple CPUs called machine_restart() before actual restart took
>> place, but after boot CPU declared itself not online,
>
> Can you help me please in identifying where this operation is? I can see
> two places where a CPU is removed from cpu_online_map, yet neither
> __stop_this_cpu() nor __cpu_disable() ought to be coming into play here.
> In fact I didn't think CPU0 was ever marked not-online. Except perhaps
> if we came through machine_crash_shutdown() -> nmi_shootdown_cpus(), but
> I'm sure you would have mentioned such a further dependency.
>
> Jan

A BUG_ON() in cpu_notifier_call_chain() (I had been playing with some of
the notifiers and one of them eventually failed) resulted in a panic()
at around the same time as an AP panicked in pm_idle(), the latter due to
inconsistent MWAIT/MONITOR support settings between the BSP and the AP
after TXT dynamic launch. There is a 5s delay between smp_send_stop() and
the actual reboot; during that time the AP spammed the output, so the
original reason for the panic() was visible only after an unreasonable
amount of scrolling.
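
To illustrate the kind of race described above, here is a minimal
standalone sketch of one generic way to let only the first CPU proceed
down a restart path. This is not the code from the patch and not Xen
code: restart_in_progress and claim_restart() are hypothetical names,
and it uses plain C11 atomics rather than Xen's own primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag (not a Xen symbol): set by the first CPU to enter restart. */
static atomic_flag restart_in_progress = ATOMIC_FLAG_INIT;

/*
 * Returns true only for the first caller. Any later caller, for example
 * an AP that panics while the first CPU is still waiting out the reboot
 * delay, gets false and is expected to park itself instead of walking
 * the restart path a second time.
 */
bool claim_restart(void)
{
    /* test_and_set returns the previous value: false means we were first. */
    return !atomic_flag_test_and_set(&restart_in_progress);
}
```

A caller would check the return value and halt or spin quietly when it
is false, which would also keep the first panic() message from being
buried under later output.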

Adding TXT support is the reason I started making AP bring-up parallel in
the first place. The MWAIT problem doesn't happen with the current code or
with the changes in this patchset, but handling of such an error is related
to SMP, so I've included it.

Best regards,

--
Krystian Hebel
Firmware Engineer
https://3mdeb.com | @3mdeb_com