On Mon, Apr 12, 2021 at 11:03 AM Jason A. Donenfeld <[email protected]> wrote:
> Sorry I'm a bit late to this thread. I'm happy to see there's a
> prototype for benchmarking, though I do wonder if this is a bit of
> overeager optimization? That is, why is this necessary and does it
> actually help?
>
> By returning packets back to the Wintun ring later, more of the ring
> winds up being used, which in turn means more cache misses as it spans
> additional cache lines. In other words, it seems like this might be
> comparing the performance of memcpy+cache vs. no-memcpy+cachemiss. Which
> is better, and is it actually measurable? Is it possible that adding
> this functionality actually has zero measurable impact on performance?
> Given the complexity this adds, it'd be nice to see some numbers to
> help make the argument, or perhaps reasoning that's more sophisticated
> than my own napkin thoughts here.

I've moved these improvements to this branch while we wait for additional argumentation: https://git.zx2c4.com/wintun/log/?h=sr/api-improvements
