On 06/10/2022 14:11, Jan Beulich wrote:
> In an entirely different context I came across Linux commit 428e3d08574b
> ("KVM: x86: Fix zero iterations REP-string"), which points out that
> we're still doing things wrong: For one, there's no zero-extension at
> all on AMD. And then while RCX is zero-extended from 32 bits uniformly
> for all string instructions on newer hardware, RSI/RDI are only for MOVS
> and STOS on the systems I have access to. (On an old family 0xf system
> I've further found that for REP LODS even RCX is not zero-extended.)
>
> Fixes: 79e996a89f69 ("x86emul: correct 64-bit mode repeated string insn handling with zero count")
> Signed-off-by: Jan Beulich <[email protected]>
> ---
> Partly RFC for none of this being documented anywhere (and it partly
> being model specific); inquiry pending.

None of this surprises me.  The rep instructions have always been
microcoded, and 0 reps is a special case which has been largely ignored
until recently.

I wouldn't be surprised if the behaviour changes with
MISC_ENABLE.FAST_STRINGS (given the KVM commit message) and I also
wouldn't be surprised if it's different between Core and Atom too (given
the Fam 0xf observation).

It's almost worth executing a zero-length rep stub, except that it
could go very wrong in certain ecx/rcx cases.

I'm not sure how important these cases are to cover.  Given that they do
differ between vendors and generations, and that compiled code using
these instructions is not going to treat the registers as live
afterwards, is the complexity really worth it?
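
To make the question concrete, the behaviour described in the patch
amounts to roughly the model below.  This is only a sketch of the
observations quoted above, not the emulator code: every name in it is
made up for illustration, it assumes the family 0xf system was a
non-AMD one which otherwise behaves like newer hardware, and it assumes
that STOS only affects %rdi since that is the only index register it
uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none are taken from the emulator. */
enum vendor { VENDOR_INTEL, VENDOR_AMD };
enum str_insn { INSN_MOVS, INSN_STOS, INSN_LODS, INSN_OTHER };

struct regs { uint64_t rcx, rsi, rdi; };

/*
 * Model a REP string insn in 64-bit mode with 32-bit address size and
 * %ecx == 0, following only the observations in the patch description.
 */
static void model_zero_count_rep(struct regs *r, enum vendor v,
                                 enum str_insn insn, bool fam_0xf)
{
    assert(r != NULL);

    if ((uint32_t)r->rcx != 0)
        return;                 /* Not a zero-iteration case. */

    if (v == VENDOR_AMD)
        return;                 /* No zero-extension at all. */

    /* %rcx is zero-extended, except for REP LODS on the old family 0xf. */
    if (!(fam_0xf && insn == INSN_LODS))
        r->rcx = (uint32_t)r->rcx;

    /* %rsi/%rdi are zero-extended for MOVS and STOS only. */
    if (insn == INSN_MOVS)
        r->rsi = (uint32_t)r->rsi;  /* Assumption: STOS leaves %rsi alone. */
    if (insn == INSN_MOVS || insn == INSN_STOS)
        r->rdi = (uint32_t)r->rdi;
}
```

That is three axes (vendor, family, instruction) for state which callers
are unlikely to look at afterwards.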

~Andrew
