On 23.02.2024 13:23, Oleksii wrote:
>>
>>>>> As 1- and 2-byte cases are emulated I decided not to provide
>>>>> the sfx argument for emulation macros, as it will not have too
>>>>> much effect on emulated types and would just cost more
>>>>> performance on the acquire and release versions of sc/ld
>>>>> instructions.
>>>>
>>>> Question is whether the common case (4- and 8-byte accesses)
>>>> shouldn't be valued higher, with 1- and 2-byte emulation being
>>>> there just to allow things to not break altogether.
>>> If I understand you correctly, it would make sense to add the 'sfx'
>>> argument for the 1/2-byte access case, ensuring that all options
>>> are available for the 1/2-byte access case as well.
>>
>> That's one of the possibilities. As said, I'm not overly worried
>> about the emulated cases. For the initial implementation I'd
>> recommend going with what is easiest there, yielding the best
>> possible result for the 4- and 8-byte cases. If later it turns out
>> repeated acquire/release accesses are a problem in the emulation
>> loop, things can be changed to explicit barriers, without touching
>> the 4- and 8-byte cases.
> I am a little bit confused then, if the emulated case is not an issue.
>
> For the 4- and 8-byte cases, .aqrl is used for xchg; for the relaxed
> and acquire versions of xchg, barriers are used.
>
> Something similar is done for cmpxchg.
>
> If something needs to change in the emulation loop, it won't
> require changing the 4- and 8-byte cases.

I'm afraid I don't understand your reply.

Jan