Hi, I am trying to better understand and optimize texture upload on r600- and GCN-based GPUs.
Currently I use PBOs to upload data generated by a worker thread to textures, using the following steps:

1. Unmap buffer n (filled by the worker)
2. glTexSubImage2D from buffer n-1 into texture n-1
3. Bind texture n-2, draw, and call glutSwapBuffers
4. Map buffer n-3 again and pass it to the worker thread

For each buffer only one step is executed per frame, to avoid GPU stalls.

However, after having a look at radeon_gem_objects I am not sure this approach makes a lot of sense. All PBOs are located in system memory (GTT), so as far as I understand it, unmapping a PBO is actually a no-op and doesn't trigger any transfer?

So where is the actual DMA transfer triggered: by glTexSubImage2D? And at which point does the driver check for DMA completion: at glutSwapBuffers?

Furthermore, is it possible to perform async upload and rendering in parallel when there are no data dependencies between them?

Some insights would be really great to better optimize the code.

Thank you in advance & best regards,
Clemens

_______________________________________________
xorg-driver-ati mailing list
https://lists.x.org/mailman/listinfo/xorg-driver-ati
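To make the rotation in steps 1-4 concrete, here is a minimal sketch of the index arithmetic only, assuming a ring of four PBO/texture pairs. The names NUM_BUFFERS, enum stage, buffer_for_stage and stage_for_buffer are made up for illustration and are not taken from my actual code; the GL calls appear only in comments, since the real ones need a context.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: four PBO/texture pairs, one per step described in the post. */
#define NUM_BUFFERS 4

/* Hypothetical stage names; stage k operates on buffer n-k. */
enum stage {
    STAGE_UNMAP  = 0, /* glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER) on buffer n      */
    STAGE_UPLOAD = 1, /* glTexSubImage2D sourcing buffer n-1 into texture n-1   */
    STAGE_DRAW   = 2, /* bind texture n-2, draw, glutSwapBuffers                */
    STAGE_MAP    = 3  /* glMapBuffer on buffer n-3, hand pointer to the worker  */
};

/* Which buffer does stage s touch in the given frame? (n = frame % NUM_BUFFERS) */
static unsigned buffer_for_stage(unsigned frame, enum stage s)
{
    assert((unsigned)s < NUM_BUFFERS);
    return (frame % NUM_BUFFERS + NUM_BUFFERS - (unsigned)s) % NUM_BUFFERS;
}

/* Inverse view: which stage is the given buffer in during this frame? */
static enum stage stage_for_buffer(unsigned frame, unsigned buffer)
{
    assert(buffer < NUM_BUFFERS);
    return (enum stage)((frame % NUM_BUFFERS + NUM_BUFFERS - buffer) % NUM_BUFFERS);
}

/* Print the per-frame assignment, one line per frame. */
static void print_schedule(unsigned frames)
{
    for (unsigned f = 0; f < frames; ++f) {
        printf("frame %u: unmap=%u upload=%u draw=%u map=%u\n", f,
               buffer_for_stage(f, STAGE_UNMAP),
               buffer_for_stage(f, STAGE_UPLOAD),
               buffer_for_stage(f, STAGE_DRAW),
               buffer_for_stage(f, STAGE_MAP));
    }
}
```

With this layout every buffer sees exactly one step per frame, and a given buffer goes unmap, upload, draw, map over four consecutive frames, which is the spacing I was hoping would hide the transfer latency.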
