I missed a link in an earlier email; this is the D-Bus player control spec (MPRIS) which I mentioned: https://www.freedesktop.org/wiki/Specifications/mpris-spec/
A dbus <---> OSC endpoint could actually make this super useful; perhaps I will do that when I have some time.

On Thu, 4 Apr 2024 at 21:57, salsaman <[email protected]> wrote:

> Just to re-emphasise, the nearest we have presently are the AV_PIX_FMT_*
> values, which are library specific and lack some important formats
> (there is no packed yuv888, for example). And the drm.h file is based on
> monitor standards, and also lacks values like 'R', 'G', 'B', 'A'. *
>
> I think we can agree there is a gap that could be filled by an agreed set
> of definitions. I don't mean technical definitions, we can just point to the
> standards,
> e.g. https://glenwing.github.io/docs/ITU-R-BT.709-1.pdf
> and an enumeration XDG_VIDEO_GAMMA_BT709 (== XDG_VIDEO_GAMMA_ITU709).
>
> G.
>
> * (also I would dispute their ambiguous yuv411 definition - if it were
> yuv411p I would agree, otherwise it could be the camera format UYYVYY
> packed).
>
> On Thu, 4 Apr 2024 at 18:40, salsaman <[email protected]> wrote:
>
>> I'll try to explain the rationale a bit. In the audio world it is quite
>> common for apps to send audio from one to another. Generally speaking they
>> would send or receive via an audio server, e.g. PulseAudio or JACK.
>> Now imagine the same for video: suppose you have an app that generates
>> video effects from audio, and you want to send the output to another app,
>> let's say a super optimised OpenGL video player.
>> You could imagine connecting the two apps via D-Bus, for example. The
>> first app, the generator, sends a frame sync signal each time a frame is
>> produced, and includes a pointer to the frame buffer and the frame size.
>> But how does it describe the format of the frame pixel data? Is it RGB24?
>> yuv420p? If it is RGB, is it sRGB gamma or linear?
>> Well, you could maybe guess the first 4 bytes are a fourcc code.
>> Then you write a lot of code to parse the 4cc and figure out what it
>> might be. Or, the easier way, you query the app and it responds with XDG
>> constants.
>>
>> G.
>>
>> On Thu, 4 Apr 2024 at 17:13, salsaman <[email protected]> wrote:
>>
>>> Hi,
>>> the problem with the drm.h header is that it is complicated, still needs
>>> interpretation, and lacks some commonly used formats (e.g. YUVA4444P).
>>> Also it doesn't address the gamma transfer function (linear, sRGB,
>>> bt709), the yuv subspace (e.g. Y'CbCr bt601 vs bt709), the yuv range
>>> (Y 16 - 235, U/V 16 - 240 = clamped / mpeg; 0 - 255 = unclamped / full /
>>> jpeg range) or the uv sampling position (e.g. center, top_left).
>>>
>>> I can see that having some common definitions would be useful for
>>> exchanging data between applications. E.g. my app gets a frame buffer and
>>> the metadata XDG_VIDEO_PALETTE_RGB24, XDG_VIDEO_GAMMA_LINEAR;
>>> then I know unambiguously that this is packed RGB 8:8:8 (so forget
>>> little / big endian) and that the values are encoded with linear (not sRGB)
>>> gamma.
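To make that concrete: a minimal sketch of the receiver's side, where every XDG_VIDEO_* name is invented for illustration (no such spec exists yet). The point is that an enumerated palette constant makes the frame layout a plain table lookup, with no fourcc parsing:

```c
#include <stddef.h>

/* Hypothetical constants - sketch only, names invented for illustration. */
typedef enum {
    XDG_VIDEO_PALETTE_RGB24  = 1,   /* packed RGB 8:8:8 */
    XDG_VIDEO_PALETTE_RGBA32 = 2,   /* packed RGBA 8:8:8:8 */
    XDG_VIDEO_PALETTE_UYVY   = 512, /* packed YUV 4:2:2, U Y V Y byte order */
} xdg_video_palette_t;

typedef enum {
    XDG_VIDEO_GAMMA_LINEAR = 1,
    XDG_VIDEO_GAMMA_SRGB   = 2,
    XDG_VIDEO_GAMMA_BT709  = 3,
} xdg_video_gamma_t;

/* Bytes per pixel follows directly from the palette constant. */
static size_t bytes_per_pixel(xdg_video_palette_t pal) {
    switch (pal) {
    case XDG_VIDEO_PALETTE_RGB24:  return 3;
    case XDG_VIDEO_PALETTE_RGBA32: return 4;
    case XDG_VIDEO_PALETTE_UYVY:   return 2; /* 4 bytes per 2-pixel macropixel */
    }
    return 0;
}
```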
>>>
>>> If you want to be more specific with palettes, then you could do so, but
>>> it might require defining metadata structs.
>>>
>>> For example, for my own standard (Weed effects) I have:
>>>
>>> // max number of channels in a palette
>>> #ifndef WEED_MAXPCHANS
>>> #define WEED_MAXPCHANS 8
>>> #endif
>>>
>>> // max number of planes in a palette
>>> #ifndef WEED_MAXPPLANES
>>> #define WEED_MAXPPLANES 4
>>> #endif
>>>
>>> #define WEED_VCHAN_end 0
>>>
>>> #define WEED_VCHAN_red 1
>>> #define WEED_VCHAN_green 2
>>> #define WEED_VCHAN_blue 3
>>>
>>> #define WEED_VCHAN_Y 512
>>> #define WEED_VCHAN_U 513
>>> #define WEED_VCHAN_V 514
>>>
>>> #define WEED_VCHAN_alpha 1024
>>>
>>> #define WEED_VCHAN_FIRST_CUSTOM 8192
>>>
>>> #define WEED_VCHAN_DESC_PLANAR (1 << 0) ///< planar type
>>> #define WEED_VCHAN_DESC_FP (1 << 1)     ///< floating point type
>>> #define WEED_VCHAN_DESC_BE (1 << 2)     ///< pixel data is big endian (within each component)
>>>
>>> #define WEED_VCHAN_DESC_FIRST_CUSTOM (1 << 16)
>>>
>>> typedef struct {
>>>   uint16_t ext_ref; ///< link to an enumerated type
>>>   uint16_t chantype[WEED_MAXPCHANS]; /// e.g. {WEED_VCHAN_U, WEED_VCHAN_Y, WEED_VCHAN_V, WEED_VCHAN_Y}
>>>   uint32_t flags; /// bitmap of flags, e.g. WEED_VCHAN_DESC_FP | WEED_VCHAN_DESC_PLANAR
>>>   uint8_t hsub[WEED_MAXPCHANS]; /// horiz. subsampling: 0 or 1 means no subsampling, 2 means halved etc. (planar only)
>>>   uint8_t vsub[WEED_MAXPCHANS]; /// vert. subsampling
>>>   uint8_t npixels; ///< pixels per macropixel: {0, 1} == 1
>>>   uint8_t bitsize[WEED_MAXPCHANS]; /// 8 if not specified
>>>   void *extended; ///< pointer to app defined data
>>> } weed_macropixel_t;
>>>
>>> Then I can describe all my palettes like:
>>>
>>> advp[0] = (weed_macropixel_t) {
>>>   WEED_PALETTE_RGB24,
>>>   {WEED_VCHAN_red, WEED_VCHAN_green, WEED_VCHAN_blue}
>>> };
>>>
>>> advp[6] = (weed_macropixel_t) {
>>>   WEED_PALETTE_RGBAFLOAT,
>>>   {WEED_VCHAN_red, WEED_VCHAN_green, WEED_VCHAN_blue, WEED_VCHAN_alpha},
>>>   WEED_VCHAN_DESC_FP, {0}, {0}, 1, {32, 32, 32, 32}
>>> };
>>>
>>> advp[7] = (weed_macropixel_t) {
>>>   WEED_PALETTE_YUV420P,
>>>   {WEED_VCHAN_Y, WEED_VCHAN_U, WEED_VCHAN_V},
>>>   WEED_VCHAN_DESC_PLANAR, {1, 2, 2}, {1, 2, 2}
>>> };
>>>
>>> IMO this is way superior to fourcc, and if you were to supplement this
>>> with gamma, interlace, yuv subspace, yuv clamping and yuv sampling, then
>>> you would have a very comprehensive definition for any type of video frame.
>>>
>>> G.
>>>
>>> On Thu, 4 Apr 2024 at 08:52, Pekka Paalanen <[email protected]> wrote:
>>>
>>>> On Wed, 3 Apr 2024 21:51:39 -0300
>>>> salsaman <[email protected]> wrote:
>>>>
>>>> > Regarding my expertise, I was one of the developers most involved in
>>>> > developing the "livido" standard, which was one of the main topics of
>>>> > the Piksel Festivals held in Bergen, Norway.
>>>> > In the early days (2004 - 2006) the focus of the annual event was
>>>> > precisely the formulation of free / open standards, in this case for
>>>> > video effects. Other contributors included:
>>>> > Niels Elburg, Denis "Jaromil" Rojo, Tom Schouten, Andraz Tori, Kentaro
>>>> > Fukuchi and Carlo Prelz.
>>>> > I've also been involved with and put forward proposals for common
>>>> > command / query / reply actions (Open Media Control).
>>>> > To the extent that these proposals have not gained traction, I don't
>>>> > ascribe this to a failing in the proposals, but rather to a lack of
>>>> > developer awareness.
>>>> >
>>>> > Now regarding specific areas, I went back and reviewed some of the
>>>> > available material at https://www.freedesktop.org/wiki/Specifications/
>>>> >
>>>> > free media player specifications
>>>> > https://www.freedesktop.org/wiki/Specifications/free-media-player-specs/
>>>> > metadata standards for things like comments and ratings - talks mainly
>>>> > about audio but describes video files also
>>>> >
>>>> > I am not a big fan of dbus, but this looks fine, it could be used for
>>>> > video players. I'd be happier if it were a bit more abstracted and not
>>>> > tied to a specific implementation (dbus). I could suggest some
>>>> > enhancements, but I guess this is a dbus thing and not an xdg thing.
>>>>
>>>> Thanks, these sound like they do not need to involve Wayland in any
>>>> way, so they are not on my plate.
>>>>
>>>> > IMO what would be useful would be to define a common set of constants,
>>>> > most specifically related to frame pixel formats.
>>>> > The two most common in use are fourCC and avformat.
>>>>
>>>> Wayland protocol extensions, and I suppose also Wayland compositors
>>>> internally, standardise on drm_fourcc.h formats. Their authoritative
>>>> definitions are in
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/drm/drm_fourcc.h
>>>> and they are not intentionally mirroring any other fourcc coding.
>>>>
>>>> These are strictly pixel formats, and do not define anything about
>>>> colorimetry, interlacing, field order, frame rate, quantization range,
>>>> or anything else.
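For reference, drm_fourcc.h builds its codes with little-endian character packing. A minimal standalone sketch, reproducing the kernel's fourcc_code() packing rather than including the uapi header:

```c
#include <stdint.h>

/* Same packing as fourcc_code() in the kernel's drm_fourcc.h. */
#define FOURCC_CODE(a, b, c, d) ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
                                 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* DRM_FORMAT_UYVY is exactly this value (0x59565955); note the code
 * itself says nothing about colorimetry, range, or interlacing. */
static const uint32_t drm_format_uyvy = FOURCC_CODE('U', 'Y', 'V', 'Y');
```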
>>>>
>>>> > Consider a frame in UYVY format.
>>>> >
>>>> > fourCC values:
>>>> >
>>>> > #define MK_FOURCC(a, b, c, d) (((uint32_t)a) | (((uint32_t)b) << 8) \
>>>> >                               | (((uint32_t)c) << 16) | (((uint32_t)d) << 24))
>>>> >
>>>> > MK_FOURCC('U', 'Y', 'V', 'Y')
>>>> > but also
>>>> > MK_FOURCC('I', 'U', 'Y', 'V')
>>>> > the same but with interlacing
>>>> > MK_FOURCC('H', 'D', 'Y', 'C')
>>>> > the same but with bt709 (hdtv) encoding
>>>> >
>>>> > so this requires interpretation by the sender / receiver - a simpler
>>>> > way could be with constants
>>>> >
>>>> > - probably the nearest we have are the ffmpeg / libav definitions, but
>>>> > this is the wrong way around: a lib shouldn't define a global standard,
>>>> > the standard should come first and the lib should align to that.
>>>> >
>>>> > We have AV_PIX_FMT_UYVY422, which was formerly PIX_FMT_UYVY422,
>>>> > and AVCOL_TRC_BT709, which is actually the gamma transfer function.
>>>> > There is no equivalent bt709 constant for bt709 yuv / rgb; instead this
>>>> > exists as a matrix.
>>>> >
>>>> > Now consider how much easier it would be to share data if we had the
>>>> > following constants enumerated:
>>>> >
>>>> > *XDG_VIDEO_PALETTE_UYVY*
>>>> > *XDG_VIDEO_INTERLACE_TOP_FIRST*
>>>> > *XDG_VIDEO_YUV_SUBSPACE_BT709*
>>>> > *XDG_VIDEO_GAMMA_SRGB*
>>>> >
>>>> > (an invented example, not intended as a real proposal).
>>>> >
>>>> > There is a bit more to it, but that should be enough to give a general
>>>> > idea.
>>>>
>>>> Where should this be used?
>>>>
>>>> Thanks,
>>>> pq
>>>
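Pulling the thread together: the ambiguity described in the quoted mail (UYVY vs HDYC, same byte layout with colorimetry folded into the code) disappears once each property is its own field. A sketch, again with invented XDG_VIDEO_* style names, not an existing spec:

```c
#include <stdint.h>

#define MK_FOURCC(a, b, c, d) (((uint32_t)(a)) | (((uint32_t)(b)) << 8) | \
                               (((uint32_t)(c)) << 16) | (((uint32_t)(d)) << 24))

/* With fourcc, layout and colorimetry are folded into one opaque code... */
static const uint32_t fcc_uyvy = MK_FOURCC('U', 'Y', 'V', 'Y'); /* default encoding */
static const uint32_t fcc_hdyc = MK_FOURCC('H', 'D', 'Y', 'C'); /* same layout, bt709 */

/* ...whereas with separate enumerated constants (field names invented for
 * illustration) each property stands alone. */
typedef struct {
    uint32_t palette;   /* e.g. XDG_VIDEO_PALETTE_UYVY */
    uint32_t subspace;  /* e.g. XDG_VIDEO_YUV_SUBSPACE_BT709 */
    uint32_t gamma;     /* e.g. XDG_VIDEO_GAMMA_SRGB */
    uint32_t interlace; /* e.g. XDG_VIDEO_INTERLACE_TOP_FIRST */
} xdg_video_fmt_t;

/* Same pixel layout? With fourcc you need a lookup table equating
 * UYVY == HDYC == IUYV; here it is a single field compare. */
static int same_layout(const xdg_video_fmt_t *a, const xdg_video_fmt_t *b) {
    return a->palette == b->palette;
}
```

The design choice mirrors what the thread argues: one axis per property, so adding a new gamma or subspace value never multiplies the number of pixel-format codes.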
