Hi,
On 13/10/2023 07:31, Juergen Gross wrote:
On 13.10.23 00:36, Stefano Stabellini wrote:
On Thu, 12 Oct 2023, George Dunlap wrote:
Stop tinkering in the hope that it hides the problem. You're only
making it harder to fix properly.
Making it harder to fix properly would be a valid reason not to commit
the (maybe partial) fix. But looking at the fix again:
diff --git a/tools/xenstored/domain.c b/tools/xenstored/domain.c
index a6cd199fdc..9cd6678015 100644
--- a/tools/xenstored/domain.c
+++ b/tools/xenstored/domain.c
@@ -989,6 +989,7 @@ static struct domain *introduce_domain(const void *ctx,
talloc_steal(domain->conn, domain);
if (!restore) {
+ domain_conn_reset(domain);
/* Notify the domain that xenstore is available */
interface->connection = XENSTORE_CONNECTED;
xenevtchn_notify(xce_handle, domain->port);
@@ -1031,8 +1032,6 @@ int do_introduce(const void *ctx, struct connection *conn,
if (!domain)
return errno;
- domain_conn_reset(domain);
-
send_ack(conn, XS_INTRODUCE);
It is a 1-line movement. Textually small. Easy to understand and to
revert. It doesn't seem to be making things harder to fix? We could
revert it any time if a better fix is offered.
Maybe we could have a XXX note in the commit message or in-code
comment?
It moves a line from one function (do_introduce()) into a
completely different function (introduce_domain()), nested inside two
if() statements; with no analysis on how the change will impact
things.
I am not the original author of the patch, and I am not the maintainer
of the code, so I don't feel I have the qualifications to give you the
answers you are seeking. Julien as author of the patch and xenstore
reviewer might be in a better position to answer. Or Juergen as xenstore
maintainer.
I did already provide some feedback when the patch was sent the first time
in May.
From what I can see the patch is correct.
You removed the dom0 special casing again, which I had asked you to add
back then.
+1
And I still think there are missing barriers (at least for Arm).
Just to clarify: do you mean adding a barrier after domain_conn_reset()
but before setting interface->connection? If so, I agree that we
need a wmb(). We don't have wmb() in Xenstored, only smp_mb(). That is
stronger than necessary, but I think it is OK as I don't view this as a
hot path.
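
To make the ordering concrete, here is a minimal standalone sketch of what
I have in mind. It is not the xenstored code: struct sketch_interface,
sketch_conn_reset() and sketch_publish_connected() are hypothetical
stand-ins, I am assuming the reset clears ring indexes in the shared page,
and a C11 full fence stands in for smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the real xenstore
 * interface layout or constants. */
#define SKETCH_CONNECTED 0u
#define SKETCH_RECONNECT 1u

struct sketch_interface {
    uint32_t req_cons, req_prod;   /* request ring indexes */
    uint32_t rsp_cons, rsp_prod;   /* response ring indexes */
    _Atomic uint32_t connection;   /* state the guest polls */
};

/* Stand-in for domain_conn_reset(): assumed to clear the shared ring
 * indexes. */
static void sketch_conn_reset(struct sketch_interface *intf)
{
    intf->req_cons = intf->req_prod = 0;
    intf->rsp_cons = intf->rsp_prod = 0;
}

/* Reset first, then a barrier, then publish the connection state, so a
 * guest that observes SKETCH_CONNECTED also observes the cleared rings. */
static void sketch_publish_connected(struct sketch_interface *intf)
{
    sketch_conn_reset(intf);

    /* Full fence, playing the role of smp_mb(). A write-only barrier
     * would be enough here; the full fence is just stronger. */
    atomic_thread_fence(memory_order_seq_cst);

    atomic_store_explicit(&intf->connection, SKETCH_CONNECTED,
                          memory_order_relaxed);
}
```

The reader side would need the matching ordering (read the connection
field, then a read barrier, then the ring indexes) for this to be of any
use.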
Cheers,
--
Julien Grall