On Thu, Oct 05, 2023 at 07:58:50PM +0100, Andrew Cooper wrote:
> I see this series has been committed.  But it's broken in a really
> fundamental way.
> 
> 
> This is a new extension with persistent side effects to an existing part
> of the guest ABI.

The only change in the ABI is the different return code for multiple
attempts to map the vcpu_info page, it used to be -EINVAL and it's
-EBUSY now, which seems more descriptive.

The added hypercalls are an extension of the ABI, not not a
modification of an existing part.  Or maybe I'm not understanding the
complaint.

> Yet there doesn't appear to be any enumeration that the interface is
> available to begin with.  Requiring the guest to probe subops, and
> having no way to disable it on a per-domain basis is unacceptable,

We have never mandated such disables to be part of the series adding
the new hypercalls, those have always been retro fitted in case of
need.  Not saying we shouldn't do it, but it's not something we have
asked submitters to do.

> and
> has exploded on us more times than I care to count in security fixes
> alone, and that doesn't even cover the issues Amazon have reported over
> the years.

That's fine, I can add the enumeration.  A CHANGELOG entry should also
be added.

> 
> Henry: Blocker for 4.18.   The absolutely bare minimum necessary to
> avoid reversion is some kind of positive enumeration that the two new
> hypercalls are available.
> 
> Otherwise I will be #if 0'ing out the new hypercalls before this ABI
> mistake gets set in stone.
> 
> 
> If this were x86-only it would need to be a CPUID flag, but it will need
> to be something arch-agnostic in this case.  The series should not have
> come without a proper per-domain control and toolstack integration, but
> everything else can be retrofitted in an emergency.
> 
> And on a related note, where is the documentation describing this new
> feature?  Some tests perhaps, or any single implementation of the guest
> side interface?

Not that I know, I was expecting Jan to post that once he gets back
from PTO.

I already noted somewhere that I wasn't able to test myself because I
couldn't find any Linux side patches to test the feature with, and I
didn't have time to write ones myself (was expecting Jan to have the
Linux side done already for testing reasons).

> This is engineering principles so basic that they do go without saying.
> 
> ~Andrew
> 

Reply via email to