On 06.05.24 08:42, Jan Beulich wrote:
On 03.05.2024 21:07, Stefano Stabellini wrote:
On Fri, 3 May 2024, Julien Grall wrote:
Hi Stefano,

On 02/05/2024 19:13, Stefano Stabellini wrote:
On Mon, 29 Apr 2024, Julien Grall wrote:
Hi Juergen,

On 29/04/2024 12:28, Jürgen Groß wrote:
On 29.04.24 13:04, Julien Grall wrote:
Hi Juergen,

Sorry for the late reply.

On 29/04/2024 11:33, Juergen Gross wrote:
On 08.04.24 09:10, Jan Beulich wrote:
On 27.03.2024 16:22, Juergen Gross wrote:
With lock handling now allowing up to 16384 cpus (spinlocks can handle
65535 cpus, rwlocks can handle 16384 cpus), raise the allowed limit for
the number of cpus to be configured to 16383.

The new limit is imposed by IOMMU_CMD_BUFFER_MAX_ENTRIES and
QINVAL_MAX_ENTRY_NR required to be larger than 2 *
CONFIG_NR_CPUS.
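
A constraint of this kind can be enforced at build time. The following is
a minimal sketch, not code from the patch: EXAMPLE_NR_CPUS,
EXAMPLE_QUEUE_MAX_ENTRIES and nr_cpus_fits() are hypothetical names, and
the value 32768 is an assumption chosen so that 16383 is the largest CPU
count that still satisfies "entries > 2 * cpus".

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical values for illustration only; not taken from the Xen tree. */
#define EXAMPLE_NR_CPUS            16383u
#define EXAMPLE_QUEUE_MAX_ENTRIES  32768u

/* Fail the build if the queue cannot hold more than 2 entries per CPU. */
static_assert(EXAMPLE_QUEUE_MAX_ENTRIES > 2 * EXAMPLE_NR_CPUS,
              "queue must have more than 2 * NR_CPUS entries");

/* Run-time form of the same check; returns 1 if the CPU count fits. */
int nr_cpus_fits(unsigned int nr_cpus, unsigned int max_entries)
{
    /* Widen before multiplying so that 2 * nr_cpus cannot overflow. */
    return max_entries > 2ull * nr_cpus;
}
```

Under these assumed values, 16383 cpus pass the check (32766 < 32768)
while 16384 cpus do not (32768 is not larger than 32768).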

Signed-off-by: Juergen Gross <jgr...@suse.com>

Acked-by: Jan Beulich <jbeul...@suse.com>

I'd prefer this to also gain an Arm ack, though.

Any comment from Arm side?

Can you clarify what the new limits mean in terms of (security) support?
Are we now claiming that Xen will work perfectly fine on platforms with
up to 16383 CPUs?

If so, I can't comment for x86, but for Arm, I am doubtful that it would
work without any (at least performance) issues. AFAIK, this is also an
untested configuration. In fact I would be surprised if Xen on Arm was
tested with more than a couple of hundred cores (AFAICT the Ampere CPUs
have 192 cores).

I think we should add a security support limit for the number of physical
cpus similar to the memory support limit we already have in place.

For x86 I'd suggest 4096 cpus for security support (basically the limit
we have with this patch), but I'm open to other suggestions, too.

I have no idea about any sensible limits for Arm32/Arm64.

I am not entirely sure. Bertrand, Michal, Stefano, should we use 192 (the
number of CPUs from Ampere)?

I am OK with that. If we want to be a bit more future proof we could say
256 or 512.

Sorry, I don't follow your argument. A limit can be raised at any point in
the future. The question is more whether we are confident that Xen on Arm
will run well if a user has a platform with 256/512 pCPUs.

So are you saying that, from Xen's point of view, you are expecting no
difference between 256 and 512? And that you would therefore be happy to
backport patches if someone finds differences (or even security issues)
when using > 256 pCPUs?

It is difficult to be sure about anything that is not regularly
tested. I am pretty sure someone in the community got Xen running on an
Ampere, so like you said 192 is a good number. However, that is not
regularly tested, so we don't have any regression checks in gitlab-ci or
OSSTest for it.

One approach would be to only support things regularly tested either by
OSSTest, Gitlab-ci, or also Xen community members. I am not sure what
would be the highest number with this way of thinking but likely no
more than 192, probably less. I don't know the CPU core count of the
biggest ARM machine in OSSTest.

Another approach is to support a "sensible" number: not something tested
but something we believe should work. No regular testing. (In safety,
they only believe in things that are actually tested, so this would not
be OK. But this is security, not safety, just FYI.) With this approach,
we could round up the number to a limit we think won't break anything. If
192 works, 256/512 should work? I don't know, but I couldn't think of
anything that would break going from 192 to 256.

I would suggest to aim at sticking to power-of-2 values. There are still
some calculations in Xen which can be translated to more efficient code
that way (mainly: using shifts rather than multiplications or a
combination of shifts and adds). Of course those calculations depend on
what people choose as actual values, but giving an upper bound being a
power of 2 may at least serve as a hint to them.
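
To illustrate the point about shifts, here is a minimal sketch. The names
EX_NR_CPUS_SHIFT, EX_NR_CPUS, slot_index_mul() and slot_index_shift() are
hypothetical and not taken from Xen; the value 256 is just the Arm number
discussed in this thread. With a power-of-2 bound, indexing a per-CPU
table needs a shift instead of a multiplication, and a compiler typically
makes this substitution on its own when the bound is a compile-time
constant.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Hypothetical power-of-2 CPU bound: 1 << 8 == 256. */
#define EX_NR_CPUS_SHIFT 8u
#define EX_NR_CPUS       (1u << EX_NR_CPUS_SHIFT)

static_assert((EX_NR_CPUS & (EX_NR_CPUS - 1u)) == 0,
              "EX_NR_CPUS must be a power of two");

/* Generic form: works for any bound, but needs a multiplication. */
size_t slot_index_mul(size_t row, size_t cpu, size_t nr_cpus)
{
    return row * nr_cpus + cpu;
}

/* Power-of-2 form: the multiplication becomes a single shift. */
size_t slot_index_shift(size_t row, size_t cpu)
{
    return (row << EX_NR_CPUS_SHIFT) + cpu;
}
```

For a bound such as 192 the same index would need a multiplication or a
combination of shifts and adds (192 = 128 + 64).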

It depends on how strict we want to be on testing requirements. I am not
sure what approach was taken by x86 so far. I am OK either way.

The bumping of the limit here clearly is forward-looking for x86, i.e. is
unlikely to be even possible to test right now (except maybe when running
Xen itself virtualized). I actually think there need to be two separate
considerations: One is for how many CPUs Xen can be built (and
such a build can be validated on a much smaller system), while another is
to limit what is supported (in ./SUPPORT.md).

My suggestion would be to add the following to my patch:

- introducing the number of security supported physical cpus to SUPPORT.md
  (4096 for x86, 256 for Arm64 and Arm32)

- adding the new upper bound to CHANGELOG.md

In case I don't hear any objections I'll send it out tomorrow.


Juergen
