On 16/01/17 21:36, Adam Jackson wrote:
> On Fri, 2017-01-13 at 15:07 +0100, Pierre Ossman wrote:
> > I'll answer myself here... This seems to be a CPU cache issue.
> > Below this limit I see:
> >
> >       4,469,985  cache-misses:u   #  0.336 % of all cache refs
> >  35,279,259,258  instructions:u   #  1.70 insn per cycle (100.00%)
> >
> > Above the limit I get:
> >
> >     194,571,782  cache-misses:u   # 30.322 % of all cache refs
> >  18,084,891,734  instructions:u   #  0.73 insn per cycle
> >
> > So no wonder things take a turn for the worse. I'll have to think a
> > bit on how to make this more efficient. Ideas are always welcome.
>
> Seems like a job for non-temporal stores?
Will that help though? I suspect the performance hit is when reading back the buffer, not writing it. The test is rather simplistic and writes linearly to memory, so write-combining should take care of the store portion.
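For reference, here is a minimal sketch of what a linear fill with non-temporal stores could look like. It assumes x86 with SSE2 and a 16-byte-aligned destination, and falls back to plain stores elsewhere. The name fill_streaming is hypothetical and is not taken from the X server or from the test program. It only changes the store side, so it may not help if the cost really is in the later read-back.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_set1_epi32, _mm_stream_si128 */
#endif

/* Hypothetical helper: fill `count` 32-bit pixels with `value`.
 * Assumption: `dst` is 16-byte aligned (required by _mm_stream_si128). */
void fill_streaming(uint32_t *dst, size_t count, uint32_t value)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i v = _mm_set1_epi32((int)value);

    /* Non-temporal stores: write 4 pixels at a time, bypassing the cache. */
    for (; i + 4 <= count; i += 4)
        _mm_stream_si128((__m128i *)(dst + i), v);

    /* Make the streamed data globally visible before anyone reads it. */
    _mm_sfence();
#endif
    /* Remaining pixels (or everything, without SSE2) use ordinary stores. */
    for (; i < count; i++)
        dst[i] = value;
}
```

Whether this beats ordinary stores would have to be measured with the same perf counters as above.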
Perhaps some clever way of making the X server upload the data to the graphics card in tandem with the application generating it?
Or perhaps I should switch to OpenGL and do it all client side for things like this?
Regards
-- 
Pierre Ossman           Software Development
Cendio AB               https://cendio.com
