On Wed, 2010-02-10 at 12:57 +0200, Pauli Nieminen wrote:
> 2010/2/10 Michel Dänzer <[email protected]>:
> > On Tue, 2010-02-09 at 19:31 +0200, Pauli Nieminen wrote:
> >> Memmove/memcpy to VRAM is a very slow operation. To avoid the slow
> >> copy, allocate all buffer objects in GTT.
> >
> > I'm not sure that's all there is to it (Have you tried setting it to
> > VRAM instead of GTT?). With write-combining, I wouldn't expect CPU
> > writes to VRAM + blit from VRAM to be significantly slower than CPU
> > writes to GTT + blit from GTT. I think the problem is rather that
> > without specifying a domain explicitly, the BO starts out in system RAM,
> > incurring additional overhead when rendering from it.
> >
>
> I tested forcing the BO to VRAM, and it made the memmove take many
> times longer. Video handling is CPU limited because the demanding decode
> process runs on the CPU while the GPU has a lot less to do. So if the
> move over AGP is slow, then it is better to have the GPU take the
> penalty than the CPU.

The X driver just copies the decoded YUV data to the BO. So I'm still not
sure where the difference comes from.

> > So while I'm okay with the code change, I suspect the commit log could
> > be better.
> >
>
> I agree. The commit message is not good enough. I'll try to rewrite it
> to better explain the change.

Thanks.

> Also, a minor improvement might be letting the caller ask for placement
> when calling memory allocation. That way only video functions would
> request GTT placement, while others could use the default.

Hmm, true.

-- 
Earthling Michel Dänzer           |                http://www.vmware.com
Libre software enthusiast         |          Debian, X and DRI developer
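
For illustration, the per-caller placement idea from the thread could be sketched roughly as below. This is only a sketch under stated assumptions: the names sketch_bo, sketch_bo_alloc, bo_placement, alloc_video_upload and alloc_generic are hypothetical and are not the driver's or libdrm's actual API, and plain malloc stands in for a real buffer object allocation. The only point shown is that the allocation call takes a placement argument, so the video path can request GTT while other callers keep the default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical placement hints; NOT the driver's real domain constants. */
enum bo_placement {
    BO_PLACEMENT_DEFAULT = 0, /* let the allocator pick a placement */
    BO_PLACEMENT_GTT     = 1, /* system memory reachable by the GPU */
    BO_PLACEMENT_VRAM    = 2  /* on-card video memory */
};

/* Hypothetical stand-in for a buffer object. */
struct sketch_bo {
    size_t size;
    enum bo_placement placement; /* what the caller asked for */
    void *cpu_ptr;               /* malloc stands in for a CPU mapping */
};

/* Allocate a buffer and record the placement requested by the caller. */
static struct sketch_bo *sketch_bo_alloc(size_t size,
                                         enum bo_placement placement)
{
    struct sketch_bo *bo;

    assert(size > 0);
    bo = malloc(sizeof(*bo));
    if (bo == NULL)
        return NULL;
    bo->cpu_ptr = malloc(size);
    if (bo->cpu_ptr == NULL) {
        free(bo);
        return NULL;
    }
    bo->size = size;
    bo->placement = placement;
    return bo;
}

static void sketch_bo_free(struct sketch_bo *bo)
{
    if (bo == NULL)
        return;
    free(bo->cpu_ptr);
    free(bo);
}

/* Video upload path: CPU writes decoded data, so ask for GTT explicitly. */
static struct sketch_bo *alloc_video_upload(size_t size)
{
    return sketch_bo_alloc(size, BO_PLACEMENT_GTT);
}

/* Every other caller keeps whatever the allocator's default is. */
static struct sketch_bo *alloc_generic(size_t size)
{
    return sketch_bo_alloc(size, BO_PLACEMENT_DEFAULT);
}
```

With this shape, only the video functions would need to change; existing callers go through alloc_generic (or pass the default hint) and behave as before.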
