On 04.09.2025 23:53, Jason Andryuk wrote:
> On 2025-04-16 05:00, Jan Beulich wrote:
>> By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
>> when linking xen.efi. Due to the nature of file symbols in COFF symbol
>> tables (see the code comment) the symbols_offsets[] entries for such
>> symbols would cause assembler warnings regarding value truncation. Of
>> course the resulting entries would also be both meaningless and useless.
>> Add a heuristic to get rid of them, really taking effect only when
>> --all-symbols is specified (otherwise these symbols are discarded
>> anyway).
>>
>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
>> ---
>> Factor 2 may in principle still be too small: We zap what looks like
>> real file symbols already in read_symbol(), so table_cnt doesn't really
>> reflect the number of symbol table entries encountered. It has proven to
>> work for me in practice though, with still some leeway left.
>>
>> --- a/xen/tools/symbols.c
>> +++ b/xen/tools/symbols.c
>> @@ -213,6 +213,16 @@ static int symbol_valid(struct sym_entry
>>  	if (strstr((char *)s->sym + offset, "_compiled."))
>>  		return 0;
>>
>> +	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
>> +	 * section name while linking xen.efi. In COFF symbol tables the
>> +	 * "value" of file symbols is a link (symbol table index) to the next
>> +	 * file symbol. Since file (and other) symbols (can) come with one
>> +	 * (or in principle more) auxiliary symbol table entries, the value in
>> +	 * this heuristic is bounded to twice the number of symbols we have
>> +	 * found. See also read_symbol() as to the '?' checked for here. */
>> +	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
>> +		return 0;
>> +
>>  	return 1;
>>  }
>
> I looked at this. It'll drop symbols, but I don't know enough to give
> an R-b. I can't give an actionable A-b either. Maybe someone else can
> chime in.
>
> Maybe this is just showing my lack of knowledge, but could any symbol
> starting "?." be considered invalid? I don't think I've ever seen any
> like that.

With quotation, almost any symbol name can appear in principle. I wouldn't
want to judge symbol validity by its name. What's more important here,
though, is that sym[0] isn't part of the name; it's the symbol's type as
taken from nm's output. We're therefore heuristically looking at symbols
of unknown type with a dot as the first character (as section names would
conventionally have it).

Jan
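
To illustrate the point about sym[0], here is a minimal, self-contained sketch of the check from the patch. The struct and function names (sym_entry_sketch, looks_like_bogus_file_symbol) are made up for this example and are not the ones in xen/tools/symbols.c. The layout (type character in sym[0], name starting at sym[1]) and the use of '?' for a type the tool does not recognise are assumptions taken from the description above, not from the tool's source.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the tool's symbol entry. */
struct sym_entry_sketch {
    unsigned long long addr; /* value column as printed by nm */
    const char *sym;         /* sym[0]: type character, sym + 1: symbol name */
};

/*
 * Returns 1 if the entry looks like a stray COFF file symbol:
 *  - sym[0] == '?'  : type not recognised (assumed marker),
 *  - sym[1] == '.'  : name starts with a dot, as section names usually do,
 *  - addr is small  : plausibly a symbol table index rather than an address.
 * table_cnt is the number of symbols collected so far; the factor of 2
 * allows for one auxiliary entry per symbol, as in the patch.
 */
static int looks_like_bogus_file_symbol(const struct sym_entry_sketch *s,
                                        size_t table_cnt)
{
    return s->sym[0] == '?' && s->sym[1] == '.' &&
           s->addr < (unsigned long long)table_cnt * 2;
}
```

With table_cnt at 100, an entry "?.data.read_mostly" with value 5 would be dropped, while the same string with a value that looks like a real address, or a name of a recognised type such as "T.text", would be kept.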