> From: Alan Hourihane <[email protected]> > Date: Mon, 28 Feb 2011 22:32:49 +0000 > > On Mon, 2011-02-28 at 20:06 +0000, Matt Turner wrote: > > On Sat, Dec 11, 2010 at 10:24 PM, Mark Kettenis <[email protected]> > > wrote: > > >> Date: Sat, 11 Dec 2010 22:52:25 +0100 (CET) > > >> From: Mark Kettenis <[email protected]> > > >> > > >> > From: Alan Hourihane <[email protected]> > > >> > Date: Thu, 09 Dec 2010 13:00:54 +0000 > > >> > > > >> > On Thu, 2010-12-09 at 13:41 +0100, Mark Kettenis wrote: > > >> > > > Might want to check the hardware manuals on this, but there might > > >> > > > be a > > >> > > > byteswap bit someplace that does it in hardware. The PGX32 may do > > >> > > > this > > >> > > > at BIOS init time, whereas a PC version may not. There is a bit > > >> > > > for this > > >> > > > on the PM3, so there maybe something on PM2. > > >> > > > > >> > > Well, obviously the PGX32 doesn't have a BIOS, but the Open Firmware > > >> > > Forth code on the card might indeed do something like that. The > > >> > > OpenBSD kernel driver for this card also twiddles some of the > > >> > > endianness bits for the framebuffer windows to get things into the > > >> > > state expected by the glint driver. The Linux framebuffer driver > > >> > > does > > >> > > something similar, and I'm pretty sure NetBSD has something like that > > >> > > as well. > > >> > > > > >> > > I don't have access to hardware documentation. Documentation was > > >> > > only > > >> > > ever available under NDA isn't it? Do you expect there is a seperate > > >> > > bit to set the ednianness used by the YUV transformation hardware? > > >> > > > >> > Possibly. I can't remember. But it's worth checking because the code > > >> > wouldn't have been added without good reason in the first place. > > >> > > >> Did some further digging, and I think I've figured out what's going on > > >> here. > > >> > > >> The way the code works is that the YV12 data is uploaded to the > > >> texture buffer which is then translated to RGB and copied to the > > >> framebuffer by the hardware. Uploading the YV12 data to the texture > > >> buffer is done through the same aperture that maps the framebuffer. > > >> Therefore writes to the texture buffer undergo the same byte twisting > > >> as writes to the framebuffer. This has some interesting consequences. > > >> > > >> My PGX32 is a Permedia 2v and, by default, runs in 32bpp mode. So > > >> pixels are 32 bits wide and therefore the glint driver configures the > > >> aperture to do byte swapping, such that they end up as little-endian > > >> RGBA values in the framebuffer. Since the YV12 is handled as 32-bit > > >> wide units as well, they undergo the proper big-endian to > > >> little-endian transformation as well, and there is no need for > > >> CopyYV12() to do anything special. That means CopyYV12LE() is doing > > >> the right thing. > > >> > > >> However, if I tell X to run in 16bpp mode, the situation changes. > > >> Pixels are 16 bits wide and therefore the glint driver configures the > > >> aperture to do half-word swapping. But since the YV12 data is handled > > >> as 32-bit wide units, it now ends up being mangled when written to the > > >> texture buffer. Unsurprisingly this results in weird looking video as > > >> well. Note that the CopyYV12BE() code doesn't fix this. > > >> > > >> Running X in 8bpp mode is also still possible. In that case, the > > >> aperture is configured not to do any byte swapping. In this case the > > >> CopyYV12BE() code would do the right thing. But to be honest, given > > >> the limited color space that 8bpp mode provides, it is hard to tell if > > >> it does. > > >> > > >> So I guess some sort of byteswapping version of CopyYV12() will be > > >> necessary to fix the 16bpp. Alternatively we could use the second > > >> aperture and set it up to always do byteswapping and use that to > > >> upload the YV12 data to the texture buffer. > > > > > > The diff below (on top of my previous diff) does the the trick. Not > > > sure if the 8bpp and 24bpp modes are worth supporting. Video playback > > > on 8bpp looks absolutey afwul, and 24bpp is completely broken (crashes > > > the X server). > > > > > > I'll fight with git and submit a complete diff if folks think this is a > > > sensible approach. > > > > > > > > > Index: src/pm2_video.c > > > =================================================================== > > > RCS file: /cvs/xenocara/driver/xf86-video-glint/src/pm2_video.c,v > > > retrieving revision 1.3 > > > diff -u -p -r1.3 pm2_video.c > > > --- src/pm2_video.c 5 Dec 2010 20:25:26 -0000 1.3 > > > +++ src/pm2_video.c 11 Dec 2010 22:12:09 -0000 > > > @@ -1722,6 +1722,56 @@ CopyYV12(CARD8 *Y, CARD32 *dst, int widt > > > } > > > } > > > > > > +#if X_BYTE_ORDER == X_BIG_ENDIAN > > > + > > > +static void > > > +CopyYV12_16(CARD8 *Y, CARD32 *dst, int width, int height, int pitch) > > > +{ > > > + int Y_size = width * height; > > > + CARD8 *V = Y + Y_size; > > > + CARD8 *U = V + (Y_size >> 2); > > > + int pad = (pitch >> 2) - (width >> 1); > > > + int x; > > > + > > > + width >>= 1; > > > + > > > + for (height >>= 1; height > 0; height--) { > > > + for (x = 0; x < width; Y += 2, x++) > > > + *dst++ = Y[1] + (V[x] << 8) + (Y[0] << 16) + (U[x] << 24); > > > + dst += pad; > > > + for (x = 0; x < width; Y += 2, x++) > > > + *dst++ = Y[1] + (V[x] << 8) + (Y[0] << 16) + (U[x] << 24); > > > + dst += pad; > > > + U += width; > > > + V += width; > > > + } > > > +} > > > + > > > +static void > > > +CopyYV12_8(CARD8 *Y, CARD32 *dst, int width, int height, int pitch) > > > +{ > > > + int Y_size = width * height; > > > + CARD8 *V = Y + Y_size; > > > + CARD8 *U = V + (Y_size >> 2); > > > + int pad = (pitch >> 2) - (width >> 1); > > > + int x; > > > + > > > + width >>= 1; > > > + > > > + for (height >>= 1; height > 0; height--) { > > > + for (x = 0; x < width; Y += 2, x++) > > > + *dst++ = V[x] + (Y[1] << 8) + (U[x] << 16) + (Y[0] << 24); > > > + dst += pad; > > > + for (x = 0; x < width; Y += 2, x++) > > > + *dst++ = V[x] + (Y[1] << 8) + (U[x] << 16) + (Y[0] << 24); > > > + dst += pad; > > > + U += width; > > > + V += width; > > > + } > > > +} > > > + > > > +#endif > > > + > > > static void > > > CopyFlat(CARD8 *src, CARD8 *dst, int width, int height, int pitch) > > > { > > > @@ -1818,8 +1868,24 @@ Permedia2PutImage(ScrnInfoPtr pScrn, > > > > > > switch (id) { > > > case LE4CC('Y','V','1','2'): > > > - CopyYV12(buf, (CARD32 *)((CARD8 *) pGlint->FbBase + > > > pPPriv->BufferBase[0]), > > > - width, height, pPPriv->BufferStride); > > > + switch(pGlint->HwBpp) { > > > +#if X_BYTE_ORDER == X_BIG_ENDIAN > > > + case 8: > > > + case 24: > > > + CopyYV12_8(buf, (CARD32 *)((CARD8 *) pGlint->FbBase + > > > pPPriv->BufferBase[0]), > > > + width, height, pPPriv->BufferStride); > > > + break; > > > + case 15: > > > + case 16: > > > + CopyYV12_16(buf, (CARD32 *)((CARD8 *) pGlint->FbBase + > > > pPPriv->BufferBase[0]), > > > + width, height, pPPriv->BufferStride); > > > + break; > > > +#endif > > > + default: > > > + CopyYV12(buf, (CARD32 *)((CARD8 *) pGlint->FbBase + > > > pPPriv->BufferBase[0]), > > > + width, height, pPPriv->BufferStride); > > > + break; > > > + } > > > PutYUV(pPPriv, pPPriv->BufferBase[0], FORMAT_YUYV, 1, 0); > > > break; > > > > It'd be cool to get this in the next released version, so can anyone > > give this a review? I don't know enough about it to do it. > > Looks good.
Thanks Matt, for reminding me. I created a proper git diff: http://lists.x.org/archives/xorg-devel/2011-February/019760.html Alan, I should be able to commit that on wednesday if you're still ok with that. Cheers, Mark _______________________________________________ [email protected]: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
