Re: [PATCH 1/1] drm/qxl: fixes qxl_fence_wait

From: Linux regression tracking (Thorsten Leemhuis)
Date: Wed Mar 20 2024 - 11:26:19 EST


On 08.03.24 02:08, Alex Constantino wrote:
> Fix OOM scenario by doing multiple notifications to the OOM handler through
> a busy wait logic.
> Changes from commit 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait") would
> result in a '[TTM] Buffer eviction failed' exception whenever it reached a
> timeout.
>
> Fixes: 5a838e5d5825 ("drm/qxl: simplify qxl_fence_wait")
> Link: https://lore.kernel.org/regressions/fb0fda6a-3750-4e1b-893f-97a3e402b9af@xxxxxxxxxxxxx
> Reported-by: Timo Lindfors <timo.lindfors@xxxxxx>
> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054514
> Signed-off-by: Alex Constantino <dreaming.about.electric.sheep@xxxxxxxxx>
> ---
> drivers/gpu/drm/qxl/qxl_release.c | 20 ++++++++++++++------
> 1 file changed, 14 insertions(+), 6 deletions(-)

Hey Dave and Gerd as well as Thomas, Maarten and Maxime (the latter two
I just added to the CC), it seems to me this regression fix did not
maybe any progress since it was posted. Did I miss something, is it just
"we are busy with the merge window", or is there some other a reason?
Just wondering, I just saw someone on a Fedora IRC channel complaining
about the regression, that's why I'm asking. Would be really good to
finally get this resolved...

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

> diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
> index 368d26da0d6a..51c22e7f9647 100644
> --- a/drivers/gpu/drm/qxl/qxl_release.c
> +++ b/drivers/gpu/drm/qxl/qxl_release.c
> @@ -20,8 +20,6 @@
> * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> */
>
> -#include <linux/delay.h>
> -
> #include <trace/events/dma_fence.h>
>
> #include "qxl_drv.h"
> @@ -59,14 +57,24 @@ static long qxl_fence_wait(struct dma_fence *fence, bool intr,
> {
> struct qxl_device *qdev;
> unsigned long cur, end = jiffies + timeout;
> + signed long iterations = 1;
> + signed long timeout_fraction = timeout;
>
> qdev = container_of(fence->lock, struct qxl_device, release_lock);
>
> - if (!wait_event_timeout(qdev->release_event,
> + // using HZ as a factor since it is used in ttm_bo_wait_ctx too
> + if (timeout_fraction > HZ) {
> + iterations = timeout_fraction / HZ;
> + timeout_fraction = HZ;
> + }
> + for (int i = 0; i < iterations; i++) {
> + if (wait_event_timeout(
> + qdev->release_event,
> (dma_fence_is_signaled(fence) ||
> - (qxl_io_notify_oom(qdev), 0)),
> - timeout))
> - return 0;
> + (qxl_io_notify_oom(qdev), 0)),
> + timeout_fraction))
> + break;
> + }
>
> cur = jiffies;
> if (time_after(cur, end))