2010/2/11 Michel Dänzer <[email protected]>:
> On Wed, 2010-02-10 at 23:19 +0200, Pauli Nieminen wrote:
>> On Wed, Feb 10, 2010 at 7:44 PM, Roland Scheidegger
>> <[email protected]> wrote:
>> > On 10.02.2010 15:12, Pauli Nieminen wrote:
>> >> Hi!
>> >>
>> >> I made some testing how fast my system can move data to VRAM/GTT and I
>> >> got very interestig results:
>> >>
>> >> (II) RADEON(0): BENCH: copy 3129344 bytes to vram took 78595us,
>> >> resulting in 39Mbps
>> > "Mbps" is a bit confusing I can only read that as Megabit per second
>> > which is even more pathetic :-)
>> >
>> MBps then. I never remember to write big B for byte :/
>>
>> >> (II) RADEON(0): BENCH: copy 3129344 bytes to gtt took 11411us,
>> >> resulting in 274Mbps
>> >> (II) RADEON(0): BENCH: copy 3129344 bytes to gtt took 8431us,
>> >> resulting in 371Mbps
>> >> (II) RADEON(0): BENCH: copy 3129344 bytes to vram took 75773us,
>> >> resulting in 41Mbps
>> >> (II) RADEON(0): BENCH: copy 3129344 gtt to vram took 3143us, resulting
>> >> in 995Mbps
>> >>
>> >>
>> >> So direct write to VRAM operates only at 40 mega bytes per second.
>> >> That is insanely slow. I hope we won't hit that kind of limit anywhere
>> >> in any code.
>> >>
>> >> I did check that VRAM is WC cached in /proc/mtrr. But still it is
>> >> surprising slow.
>> > Isn't that something which fast writes should help with, an option we
>> > never really got to work? I agree though it's really bad.
>> > In any case I think it would be interesting to repeat those tests on
>> > pci/pcie cards.
>> >
>> Fast writes might be solution but how reliable they are? How good
>> performance wise?
>
> While fast writes might help, colour me sceptical about the lack of them
> explaining the slowness. I suspect you're not actually getting
> write-combining for some reason (if PAT is enabled, have you tried
> disabling it?).
>

Booting with nopat doesn't work correctly for me :/ There is clearly some
memory left in UC state, and most applications run very slowly. The
performance test then shows only 25-35 MB/s to VRAM, while GTT speed is the
same as with PAT.

>
>> Also change to memcpy instead of memmoves pushes speeds to 440-450 for
>> first gtt copy and 470-490 for second copy. But still my CPU is slow
>> when GPU has to first be notified about copy and then it has to signal
>> back about work being done.
>>
>> So memove is also very slow for writes to WC cached GTT.
>
> I think it should be safe to change the X driver to use memcpy in
> RADEONCopySwap().
>

To be safe, there could be a test for overlapping areas before choosing
memcpy.

> --
> Earthling Michel Dänzer | http://www.vmware.com
> Libre software enthusiast | Debian, X and DRI developer
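
A minimal sketch of what such an overlap test could look like. The helper
names regions_overlap() and copy_checked() are hypothetical and are not
taken from the driver; the idea is only to use memcpy when the two areas are
disjoint and fall back to memmove otherwise.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the radeon driver): returns 1 if the
 * byte ranges [dst, dst + size) and [src, src + size) share any byte. */
static int regions_overlap(const void *dst, const void *src, size_t size)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (size == 0)
        return 0;
    /* Two equal-sized ranges overlap when their start addresses are
     * closer together than the size of the ranges. */
    return (d < s) ? (s - d < size) : (d - s < size);
}

/* Hypothetical wrapper: memcpy for disjoint areas, memmove as the safe
 * fallback when the areas overlap. */
static void *copy_checked(void *dst, const void *src, size_t size)
{
    if (regions_overlap(dst, src, size))
        return memmove(dst, src, size);
    return memcpy(dst, src, size);
}
```

The extra compare is cheap next to a multi-megabyte copy, so the common
non-overlapping case would still get the memcpy speed.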
