On Thu, Sep 21, 2023 at 1:23 PM George Dunlap <george.dun...@cloud.com> wrote:
>
> On large systems with many vcpus yielding due to spinlock priority
> inversion, it's not uncommon for a vcpu to yield its timeslice, only
> to be immediately stolen by another pcpu looking for higher-priority
> work.
>
> To prevent this:
>
> * Keep the YIELD flag until a vcpu is removed from a runqueue
>
> * When looking for work to steal, skip vcpus which have yielded
>
> NB that this does mean that sometimes a VM is inserted into an empty
> runqueue; handle that case.
>
> Signed-off-by: George Dunlap <george.dun...@cloud.com>
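
For anyone reading along, the two bullets above can be sketched roughly
as follows. This is a simplified illustration under my own assumptions,
not the actual scheduler code: the types and names (struct vcpu,
struct runq, steal_candidate, runq_remove) are hypothetical, and locking
is omitted entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the hypervisor's real structures. */
struct vcpu {
    int  prio;     /* larger value = higher priority */
    bool yielded;  /* kept set until the vcpu leaves its runqueue */
};

struct runq {
    struct vcpu *v[8];
    size_t       n;
};

/*
 * Pick a vcpu to steal from a peer's runqueue.  Only vcpus with
 * priority above min_prio qualify, and vcpus that have yielded are
 * skipped.  Returns NULL if nothing qualifies.
 */
static struct vcpu *steal_candidate(const struct runq *peer, int min_prio)
{
    struct vcpu *best = NULL;

    for (size_t i = 0; i < peer->n; i++) {
        struct vcpu *c = peer->v[i];

        if (c->yielded || c->prio <= min_prio)
            continue;
        if (best == NULL || c->prio > best->prio)
            best = c;
    }
    return best;
}

/* Remove a vcpu from a runqueue; this is where the yield flag is cleared. */
static void runq_remove(struct runq *q, struct vcpu *target)
{
    for (size_t i = 0; i < q->n; i++) {
        if (q->v[i] != target)
            continue;
        q->v[i] = q->v[--q->n];   /* order is not preserved in this sketch */
        target->yielded = false;
        return;
    }
}
```

The point of the sketch is only that the flag outlives the yield itself,
so a stealing pcpu can still see it and pass over that vcpu.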

Marcus,

Just wanted to verify my interpretation of the testing you did of this
patch several months ago:

1. On the problematic workload, mean execution time for the task under
heavy load was around 12 seconds
2. With only patch 2 of this series (0004 in your tests), mean
execution time under heavy load was around 5 seconds
3. With only patch 1 of this series (0003 in your tests), mean
execution time under heavy load was around 3 seconds
4. With both patch 1 and patch 2 of this series (0003+0004 in your
tests), mean execution time under heavy load was also around 3 seconds

So each patch shows an improvement on its own, but the combined
effect is about the same as that of patch 1 alone.

Assuming those results are accurate, I would argue that we should take
both patches.  Does anyone want to argue we should only take the first
one?

 -George
