On 25/04/2025 1:36 pm, Alejandro Vallejo wrote:
> On Wed Apr 23, 2025 at 12:32 PM BST, Roger Pau Monne wrote:
>> There are several errata on Intel regarding the usage of the MONITOR/MWAIT
>> instructions, all having in common that stores to the monitored region
>> might not wake up the CPU.
>>
>> Fix them by forcing the sending of an IPI for the affected models.
>>
>> The Ice Lake issue has been reproduced internally on XenServer hardware,
>> and the fix does seem to prevent it.  The symptom was APs getting stuck in
>> the idle loop immediately after bring up, which in turn prevented the BSP
>> from making progress.
> Ugh... so this is what it was... Awesome having this madness fixed.
>
> Do you happen to know if Linux has a similar fix in place?

https://lore.kernel.org/lkml/20250421192205.7cc1a...@davehans-spike.ostc.intel.com/T/#u

>
>> This would happen before the watchdog was initialized, and hence the
>> whole system would get stuck.
> That's nasty. It was the misassumption that the watchdog was already
> running that had me going in circles thinking it was a lockup rather
> than a livelock. Oh, well.
>
> I believe the kudos for finally being able to reproduce this goes to
> Frediano?

Of course.

The bit about the watchdog is a little bit of a red herring.  The
rcu_barrier() loop processes softirqs, so the watchdog wouldn't have
fired even if it had been set up.

~Andrew
