[PATCH v2] x86/gen-cpuid: correct cycle detection

Jan Beulich Mon, 01 Sep 2025 02:36:59 -0700

With the processing done linearly (rather than recursively), checking
whether any of the features was previously seen is wrong: that would
e.g. trigger for this simple set of dependencies:


    X: [A, B]
    A: [C]
    B: [C]

(observed in reality when making AMX-AVX512 dependent upon both
AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
of dependents). But checking the whole accumulated set also isn't
necessary - just checking for the feature we're processing dependents of
is sufficient. We may detect a cycle later that way (namely when
processing a feature which is itself part of the cycle), but we still
will detect it. What we need to avoid is adding a feature again when
we've already seen it.

As a result, seeding "seen[]" with "feat" isn't necessary anymore.

Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
Signed-off-by: Jan Beulich <jbeul...@suse.com>
---
Doing AMX-AVX512's dependencies as mentioned above still isn't quite
right; we really need AVX512F || AVX10, which can't be expressed right
now. I'm now handling this by some custom code in the AVX10 series.

This contextually collides with patch 2 of "x86/cpu-policy: minor
adjustments", posted almost 2 years ago and still pending (afair) any
kind of feedback.
---
v2: Adjust an error message. Reduce diff / indentation some.
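
For illustration only (not part of the patch): a minimal standalone
sketch of the adjusted loop, run against the diamond-shaped example from
the description. The names deep_dependencies() and CycleError are
hypothetical, features are plain strings rather than the numeric values
gen-cpuid.py uses, and the work list is kept sorted purely to make the
result order deterministic.

```python
class CycleError(Exception):
    """Hypothetical stand-in for the script's Fail exception."""


def deep_dependencies(deps):
    """Return {feature: list of all direct and indirect dependents}.

    Sketch of the adjusted logic: a cycle is reported only when the
    feature being processed shows up among its own dependents; features
    reached via more than one path are simply skipped.
    """
    deep = {}
    for feat in deps:
        seen = []
        to_process = sorted(deps[feat])

        while to_process:
            f = to_process.pop(0)

            if f == feat:
                raise CycleError("Cycle found when processing %s" % feat)

            if f in seen:
                # Reached again via another path: not a cycle.
                continue

            seen.append(f)
            # Assumption: sorted() is used only for deterministic output.
            to_process = sorted(set(to_process) | set(deps.get(f, [])))

        deep[feat] = seen
    return deep


# Diamond from the description: C is reachable from X via both A and B.
diamond = {"X": ["A", "B"], "A": ["C"], "B": ["C"]}
result = deep_dependencies(diamond)
print(result)
```

With the diamond input no exception is raised and C appears once in X's
list, while an input such as {"X": ["A"], "A": ["X"]} raises CycleError
when X is processed.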

--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -366,7 +366,7 @@ def crunch_numbers(state):
 
     for feat in deep_features:
 
-        seen = [feat]
+        seen = []
         to_process = list(deps[feat])
 
         while len(to_process):
@@ -379,14 +379,17 @@ def crunch_numbers(state):
 
             f = to_process.pop(0)
 
+            if f == feat:
+                raise Fail("ERROR: Cycle found when processing %s" % \
+                           (state.names[f], ))
+
             if f in seen:
-                raise Fail("ERROR: Cycle found with %s when processing %s"
-                           % (state.names[f], state.names[feat]))
+                continue
 
             seen.append(f)
             to_process = list(set(to_process + deps.get(f, [])))
 
-        state.deep_deps[feat] = seen[1:]
+        state.deep_deps[feat] = seen
 
     state.deep_features = deps.keys()
     state.nr_deep_deps = len(state.deep_deps.keys())
