On 08/07/2017 06:14 PM, Oleksandr Andrushchenko wrote:


On 08/07/2017 04:55 PM, Clemens Ladisch wrote:
Oleksandr Andrushchenko wrote:
On 08/07/2017 04:11 PM, Clemens Ladisch wrote:
How does that interface work?
For the buffer received in .copy_user/.copy_kernel we send
a request to the backend and get a response back (asynchronously) once it has copied the bytes into the HW/mixer/etc., so the buffer on the frontend side can be reused.
So if the frontend sends too many (too large) requests, does the
backend wait until there is enough free space in the buffer before
it does the actual copying and then acks?
Well, the frontend should be backend-agnostic.
In our implementation the backend is a user-space application which sits
on top of either the ALSA driver or PulseAudio, so it acks accordingly,
for example when the ALSA driver completes .copy_user and returns
from the kernel.
If yes, then these acks can be used as interrupts.
We can probably teach our backend to track elapsed periods for ALSA,
but I am not sure whether that is possible for PulseAudio. Do you know if this is also
doable for Pulse?

Let's assume the backend blocks until the buffer is played/consumed...
   (You still
have to count frames, and call snd_pcm_period_elapsed() exactly
when a period boundary was reached or crossed.)
... and what if the buffer spans multiple periods, so that the backend sends a single response for multiple periods (buffers with a fractional number of periods
can be handled separately)?
We will have to either call snd_pcm_period_elapsed() once (wrong, because
multiple periods were consumed) or call it multiple times at once with no delay (wrong, because it will look as if no periods were reported for quite
a long time and then a burst of events arrives).
Either way the behavior will not be the desired one (please correct me
if I am wrong here).
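
A minimal sketch of the frame counting discussed above, written in Python rather than kernel C so it can be run standalone. The class name PeriodTracker, its fields, and the on_period_elapsed callback (a stand-in for snd_pcm_period_elapsed()) are assumptions for illustration, not part of the driver or the protocol. It issues one callback per ack that reaches or crosses at least one period boundary and returns how many boundaries were crossed; whether a single call is acceptable when several periods are covered is exactly the question raised in this thread.

```python
class PeriodTracker:
    """Counts acknowledged frames and reports crossed period boundaries."""

    def __init__(self, period_frames, buffer_frames, on_period_elapsed):
        self.period_frames = period_frames
        self.buffer_frames = buffer_frames
        # Hypothetical stand-in for snd_pcm_period_elapsed().
        self.on_period_elapsed = on_period_elapsed
        self.hw_ptr = 0        # position inside the ring buffer, in frames
        self.since_period = 0  # frames consumed since the last boundary

    def ack(self, frames):
        """Handle one backend ack covering `frames` consumed frames."""
        self.hw_ptr = (self.hw_ptr + frames) % self.buffer_frames
        crossed, self.since_period = divmod(
            self.since_period + frames, self.period_frames
        )
        if crossed:
            # One notification, even if several boundaries were crossed.
            self.on_period_elapsed()
        return crossed
```

With a period of 4 frames, an ack for 9 frames arriving right on a boundary crosses two boundaries but produces one notification, which is the case questioned above.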

Splitting a large read/write into smaller requests to the backend
would improve the granularity of the known stream position.
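
The splitting suggested here could look roughly like the following sketch. The function name split_request and the choice of a fixed chunk size (for example one period) are assumptions for illustration; the actual protocol request layout is not shown.

```python
def split_request(offset, frames, chunk_frames):
    """Yield (offset, length) pairs, in frames, covering one large write.

    Each pair would become one request to the backend, so every ack
    advances the known stream position by at most `chunk_frames`.
    """
    if chunk_frames <= 0:
        raise ValueError("chunk_frames must be positive")
    end = offset + frames
    while offset < end:
        length = min(chunk_frames, end - offset)
        yield offset, length
        offset += length
```

The cost is more requests and acks per write, which is the event-count concern discussed later in this message.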

The overall latency would be the sum of the sizes of the frontend
and backend buffers.
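
As a worked example of that sum, buffer sizes in frames can be converted to time by dividing by the sample rate. The function name and the example sizes below are assumptions chosen for illustration only.

```python
def worst_case_latency_ms(frontend_frames, backend_frames, rate_hz):
    """Latency in milliseconds when both buffers are completely full."""
    return 1000.0 * (frontend_frames + backend_frames) / rate_hz


# Example: two 4800-frame buffers at 48 kHz.
print(worst_case_latency_ms(4800, 4800, 48000))
```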


Why is the protocol designed this way?
We are also working on para-virtualizing a display device, and there we tried to use
page flip events from backend to frontend as a signal similar to the
period interrupt for audio. When multiple displays (read: multiple audio streams) were in place, we flooded the system with interrupts (which are period events in our case)
and performance dropped significantly. This is why we switched to
interrupt emulation, here via a timer for audio. The main measures were:
1. Number of events between front and back
2. Latency
With the timer approach we reduce 1) to the minimum, which is a must (no period
interrupts), but 2) is still there.
With emulated period interrupts (protocol events) we have an issue with 1),
and 2) still remains.

BTW, there is one more approach to solve this [1],
but it uses its own Xen sound protocol and relies heavily
on the Linux implementation, which cannot be part of a generic protocol.
So, to me, none of these approaches solves the problem 100%, which is why we decided
to stick to timers. I hope this gives more background on why we did things
the way we did.
  Wasn't the goal to expose
some 'real' sound card?

Yes, but it can be implemented in different ways; please see above.
Regards,
Clemens
Thank you for your interest,
Oleksandr

[1] https://github.com/OpenXT/pv-linux-drivers/blob/master/archive/openxt-audio/main.c#L356
