On 2025-08-07 03:00, Jürgen Groß wrote:
On 07.08.25 03:56, Jason Andryuk wrote:
With hyperlaunch, a domU can start before its console ring is connected
by xenconsoled.  With nothing emptying the ring, it can quickly fill
during boot.  In domU_write_console(), __write_console() returns 0 when
the ring is full.  This loop spins until xenconsoled starts emptying
the ring:

         while (len) {
                 ssize_t sent = __write_console(cons, data, len);

                 if (sent < 0)
                         return sent;

                 data += sent;
                 len -= sent;

                 if (unlikely(len))
                         HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
         }

The goal of this patch is to add a way for the frontend to know when a
console is connected.  This patch adds a new flag to the end of the
console ring structure.  It is used for the backend to indicate that it
has connected and started servicing the page.

The two values are
XENCONSOLE_DISCONNECTED 1
XENCONSOLE_CONNECTED    0

XENCONSOLE_DISCONNECTED indicates to the guest that the ring is
disconnected, so it will not be serviced.  The guest can avoid writing
into it in that case.  A domU can use console hypercalls and only
transition to the ring when it is connected and won't fill and block.

Once the backend (xenconsoled) maps and starts servicing the
console, the flag will be set to XENCONSOLE_CONNECTED (0) to indicate
the backend state to the frontend.
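As a rough illustration of the frontend behaviour described above, here is
a minimal userspace sketch.  It is not the hvc_xen code: the page layout,
the field name "connection", and every demo_* name are assumptions made for
the example, and the hypercall path is replaced by a byte counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XENCONSOLE_CONNECTED    0u
#define XENCONSOLE_DISCONNECTED 1u

/* Simplified stand-in for the shared console page, not the real layout. */
struct demo_console_page {
    char out[16];
    uint32_t out_cons, out_prod;
    uint32_t connection;            /* hypothetical name for the new flag */
};

/* Counts bytes per output path so the choice can be observed. */
struct demo_stats {
    size_t via_hypercall;
    size_t via_ring;
};

/* Queue as much as fits in the ring; returns 0 when the ring is full. */
static size_t demo_ring_write(struct demo_console_page *pg,
                              const char *data, size_t len)
{
    size_t sent = 0;

    while (sent < len && (pg->out_prod - pg->out_cons) < sizeof(pg->out)) {
        pg->out[pg->out_prod % sizeof(pg->out)] = data[sent++];
        pg->out_prod++;
    }
    return sent;
}

/*
 * While the backend is disconnected, do not touch the ring; use the
 * (simulated) console hypercall instead.  A real driver would also need
 * memory barriers and would retry a short ring write.
 */
static void demo_write(struct demo_console_page *pg, struct demo_stats *st,
                       const char *data, size_t len)
{
    if (pg->connection != XENCONSOLE_CONNECTED) {
        st->via_hypercall += len;   /* stand-in for a console hypercall */
        return;
    }
    st->via_ring += demo_ring_write(pg, data, len);
}
```

With the flag set to XENCONSOLE_DISCONNECTED the ring indices stay
untouched, so nothing can fill up before the backend attaches.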

Using 0 as the connected value matches the default of a zeroed console
page.  Hyperlaunch can set the flag to XENCONSOLE_DISCONNECTED and let
xenconsoled set it to XENCONSOLE_CONNECTED.
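
The creator/backend side of that handshake could look roughly like the
sketch below.  It is a plain userspace model, not xenconsoled or
hyperlaunch code: the struct, the "connection" field name, and the demo_*
helpers are hypothetical, and setup_err stands in for whatever errors
mapping the ring and binding the event channel can produce.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XENCONSOLE_CONNECTED    0u
#define XENCONSOLE_DISCONNECTED 1u

/* Simplified stand-in for the shared console page, not the real layout. */
struct demo_console_page {
    char ring[64];
    uint32_t connection;            /* hypothetical name for the new flag */
};

/* Domain creator: a zeroed page reads as connected, so mark it explicitly. */
static void demo_builder_init(struct demo_console_page *pg)
{
    memset(pg, 0, sizeof(*pg));
    pg->connection = XENCONSOLE_DISCONNECTED;
}

/*
 * Backend: flip the flag only after setup succeeded, so the guest never
 * sees "connected" for a ring that nobody is servicing.  Real code would
 * also need a write barrier before publishing the flag.
 */
static int demo_backend_attach(struct demo_console_page *pg, int setup_err)
{
    if (setup_err)
        return setup_err;

    pg->connection = XENCONSOLE_CONNECTED;
    return 0;
}
```

A creator that never touches the flag leaves it at 0, which a new
frontend reads as connected.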

I think libxenguest should set XENCONSOLE_DISCONNECTED as well (see below).


Old domU hvc_xen drivers won't check the flag.
New domU hvc_xen running on a new xen/xenconsoled will work properly.
New domU hvc_xen on old xen/xenconsoled should only see a 0 for the flag
and behave as if connected.

Signed-off-by: Jason Andryuk <jason.andr...@amd.com>

Adapt the title of the patch?

---
v1:
Remove evtchn notify call
Set connected later when there is no error

RFC v3:
Flip flag values so 0 is connected.

The other option would be to add:
uint32_t features
uint32_t connected

New domUs would check features for a magic value and/or flag to know
they can rely on connected transitioning.
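
For comparison, a sketch of how a frontend might consume that alternative
layout.  The magic value, the struct, the polarity of "connected" (nonzero
meaning connected), and the demo_* names are all made up for illustration;
nothing like this is part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical magic advertising that "connected" is maintained. */
#define DEMO_FEATURE_MAGIC 0x434f4e4eu

/* Hypothetical extra fields at the end of the console page. */
struct demo_console_tail {
    uint32_t features;
    uint32_t connected;             /* assumed: nonzero means connected */
};

/*
 * Without the magic, the guest has no information (old backend or
 * creator) and must behave as before, i.e. treat the ring as usable.
 */
static bool demo_ring_usable(const struct demo_console_tail *t)
{
    if (t->features != DEMO_FEATURE_MAGIC)
        return true;

    return t->connected != 0;
}
```

The extra field lets a zeroed page mean "no information" instead of
"connected", at the cost of a second word in the shared page.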

I think making XENCONSOLE_CONNECTED == 0 sidesteps the need for
an additional features field, as long as assuming zeroed memory is
acceptable.  However, this only matters for a hyperlaunched guest -
xenconsoled will normally readily connect the console and set the value.

I'd like to consider other cases as well, e.g. a console driver domain.
So any instance creating a domain with a console ring page should set the
flag initially to "disconnected".

Setting disconnected for domain creation is fine.  Looking at
libxenguest, there is also domain restore.  There the console could be
set to disconnected again before domain restore.  Again, this should
work, and xenconsoled would set it to connected again.  I originally
intended a single one-way transition, disconnected -> connected.

Alternatively, restore could skip setting disconnected and just assume
xenconsoled will promptly attach.  Restore implies a toolstack is
running, so there isn't the indefinite time period that is involved with
hyperlaunch/dom0less.  But I guess an actively changing flag accurately
shows the state, so that is preferable.

This assumes that existing frontends are not using the flag space for
some other use.

Removed idea:
Send an event channel notification to let the domU know that xenconsoled
is connected.  Xenstored does similar, but for xenstore, the xenstore
driver owns the event channel/irq and can rebind it.  For hvc_xen, the
hvc subsystem owns the irq, so it isn't readily available for rebinding.
This is not implemented.

I had the idea for the kernel to use a static key and switch writing
from the hypercall to the PV ring once connected.  It didn't actually
work in my short attempt - I think changing the static key from within
an interrupt was wrong.  I fell back to just checking the flag directly.

You'd need to do the static key changing from a worker thread instead.

My static key idea has an issue.  The flag needs to be per-instance
(primary console and any additional PV consoles), but the kernel has
only a single function to handle all of them.  Either the primary
console needs dedicated ops, or the flag would need to be checked in
the function.  If the flag will toggle back and forth, then a static key
may not be appropriate.

Regards,
Jason
