On 07/09/2024 8:46 am, Takashi Iwai wrote:
> On Fri, 06 Sep 2024 20:42:09 +0200,
> Ariadne Conill wrote:
>> This patch attempted to work around a DMA issue involving Xen, but
>> causes subtle kernel memory corruption.
>>
>> When I brought up this patch in the XenDevel matrix channel, I was
>> told that it had been requested by the Qubes OS developers because
>> they were trying to fix an issue where the sound stack would fail
>> after a few hours of uptime.  They wound up disabling SG buffering
>> entirely instead as a workaround.
>>
>> Accordingly, I propose that we should revert this workaround patch,
>> since it causes kernel memory corruption and that the ALSA and Xen
>> communities should collaborate on fixing the underlying problem in
>> such a way that SG buffering works correctly under Xen.
>>
>> This reverts commit 53466ebdec614f915c691809b0861acecb941e30.
>>
>> Signed-off-by: Ariadne Conill <ariadne@ariadne.space>
>> Cc: sta...@vger.kernel.org
>> Cc: xen-devel@lists.xenproject.org
>> Cc: alsa-de...@alsa-project.org
>> Cc: Takashi Iwai <ti...@suse.de>
> The relevant code has been largely rewritten for 6.12, so please check
> the behavior with sound.git tree for-next branch.  I guess the same
> issue should happen as the Xen workaround was kept and applied there,
> too, but it has to be checked at first.
>
> If the issue is persistent with there, the fix for 6.12 code would be
> rather much simpler like the blow.
>
>
> thanks,
>
> Takashi
>
> --- a/sound/core/memalloc.c
> +++ b/sound/core/memalloc.c
> @@ -793,9 +793,6 @@ static void *snd_dma_sg_alloc(struct snd_dma_buffer 
> *dmab, size_t size)
>       int type = dmab->dev.type;
>       void *p;
>  
> -     if (cpu_feature_enabled(X86_FEATURE_XENPV))
> -             return snd_dma_sg_fallback_alloc(dmab, size);
> -
>       /* try the standard DMA API allocation at first */
>       if (type == SNDRV_DMA_TYPE_DEV_WC_SG)
>               dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC;
>
>

Individual subsystems ought not to know or care about XENPV; it's a
layering violation.

If the main APIs don't behave properly, then it probably means we've got
a bug at a lower level (e.g. Xen SWIOTLB is a constant source of fun)
which is probably affecting other subsystems too.

I think we need to re-analyse the original bug.  Right now, the
behaviour resulting from 53466ebde is worse than what it was trying to fix.

~Andrew

Reply via email to