Re: Xorg SEGV in Xen PV dom0 after updating from 5.16.18 to 5.17.5

From: Thorsten Leemhuis
Date: Wed May 04 2022 - 01:46:58 EST


Hi, this is your Linux kernel regression tracker. Sending this just to
CC the developers of the culprit mentioned below (bdd8b6c98239cad
("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")) and the
maintainers for the subsystem.

While at it, a quick note: I wonder if this is a problem similar to one
that recently turned up with amdgpu and is fixed by this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=78b12008f20

Ciao, Thorsten

On 04.05.22 02:37, Marek Marczykowski-Górecki wrote:
>
> After updating from 5.16.18 to 5.17.5 in Xen PV dom0, my Xorg started
> crashing when displaying any window mapped from a guest (domU) system.
> This is 100% reproducible.
> The system is Qubes OS, and it uses a trick that maps window contents
> from other guests using Xen grant tables, wrapped as "shared memory"
> from Xorg's point of view (so the memory that Xorg mmaps is not just from
> another process, but from another VM). That's the ShmPutImage you can
> see in the stack trace below.
>
> Stack trace of thread 12858:
> #0 0x00007f80029e17d5 raise (libc.so.6 + 0x3c7d5)
> #1 0x00007f80029ca895 abort (libc.so.6 + 0x25895)
> #2 0x00005b3469ace0e0 OsAbort (Xorg + 0x1c60e0)
> #3 0x00005b3469ad3959 AbortServer (Xorg + 0x1cb959)
> #4 0x00005b3469ad46aa FatalError (Xorg + 0x1cc6aa)
> #5 0x00005b3469acb450 OsSigHandler (Xorg + 0x1c3450)
> #6 0x00007f8002b85a90 __restore_rt (libpthread.so.0 + 0x14a90)
> #7 0x00007f8002b0a2a1 __memmove_avx_unaligned_erms (libc.so.6 + 0x1652a1)
> #8 0x00007f80015dfcc9 linear_to_xtiled_faster (iris_dri.so + 0xc91cc9)
> #9 0x00007f80015e3477 _isl_memcpy_linear_to_tiled (iris_dri.so + 0xc95477)
> #10 0x00007f8001468440 iris_texture_subdata (iris_dri.so + 0xb1a440)
> #11 0x00007f8000a76107 st_TexSubImage (iris_dri.so + 0x128107)
> #12 0x00007f8000be9a47 texture_sub_image (iris_dri.so + 0x29ba47)
> #13 0x00007f8000becd0c texsubimage_err (iris_dri.so + 0x29ed0c)
> #14 0x00007f8000bf2939 _mesa_TexSubImage2D (iris_dri.so + 0x2a4939)
> #15 0x00007f800213831f glamor_upload_boxes (libglamoregl.so + 0x1e31f)
> #16 0x00007f800213856f glamor_upload_region (libglamoregl.so + 0x1e56f)
> #17 0x00007f800212aea6 glamor_put_image (libglamoregl.so + 0x10ea6)
> #18 0x00005b3469a4d79c damagePutImage (Xorg + 0x14579c)
> #19 0x00005b3469a00a7e ProcShmPutImage (Xorg + 0xf8a7e)
> #20 0x00005b3469965a2b Dispatch (Xorg + 0x5da2b)
> #21 0x00005b3469969b04 dix_main (Xorg + 0x61b04)
> #22 0x00007f80029cc082 __libc_start_main (libc.so.6 + 0x27082)
> #23 0x00005b3469952e6e _start (Xorg + 0x4ae6e)
>
> Disassembly of the surrounding code:
>
> 0x00007596ae8c82fb <+123>: ja 0x7596ae8c8338 <__memmove_avx_unaligned_erms+184>
> 0x00007596ae8c82fd <+125>: jb 0x7596ae8c8304 <__memmove_avx_unaligned_erms+132>
> 0x00007596ae8c82ff <+127>: movzbl (%rsi),%ecx
> 0x00007596ae8c8302 <+130>: mov %cl,(%rdi)
> 0x00007596ae8c8304 <+132>: retq
> 0x00007596ae8c8305 <+133>: vmovdqu (%rsi),%xmm0
> 0x00007596ae8c8309 <+137>: vmovdqu -0x10(%rsi,%rdx,1),%xmm1
> => 0x00007596ae8c830f <+143>: vmovdqu %xmm0,(%rdi)
> 0x00007596ae8c8313 <+147>: vmovdqu %xmm1,-0x10(%rdi,%rdx,1)
> 0x00007596ae8c8319 <+153>: retq
>
>
> I don't see any related kernel or Xen messages at this time. Xorg's SEGV
> handler also prints:
>
> (EE) Segmentation fault at address 0x3c010
>
> Git bisect says it's bdd8b6c98239cad ("drm/i915: replace X86_FEATURE_PAT
> with pat_enabled()"), and indeed with this commit reverted on top of
> 5.17.5 everything works fine.
>
> I guess this part of dom0's boot dmesg may be relevant:
>
> [ 0.000949] x86/PAT: MTRRs disabled, skipping PAT initialization too.
> [ 0.000953] x86/PAT: Configuration [0-7]: WB WT UC- UC WC WP UC UC
>
> Originally reported at
> https://github.com/QubesOS/qubes-issues/issues/7479
>
>
> #regzbot introduced bdd8b6c98239cad
> #regzbot monitor: https://github.com/QubesOS/qubes-issues/issues/7479
>