RE: framebuffer corruption due to overlapping stp instructions on arm64

From: David Laight
Date: Tue Aug 07 2018 - 10:32:24 EST


From: Mikulas Patocka
> Sent: 07 August 2018 15:07
...
> Unaccelerated scrolling is still painfully slow
> even on modern computers because of slow framebuffer read.

I solved that many years ago on a StrongARM system by mapping
the screen memory at two separate virtual addresses:
one uncached, used for writes, and a second cached through the
'minicache', used for reads.
(And immediately fell foul of a memcpy() function that compared
the two virtual addresses and decided to copy backwards.)
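
To illustrate the memcpy() problem, here is a minimal sketch. It assumes
the routine picked its copy direction purely from the pointer values; the
names pick_dir() and scroll_up_dir() and the base addresses used below are
hypothetical, not taken from the original system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum copy_dir { COPY_FORWARD, COPY_BACKWARD };

/* Direction choice of a simple memmove-style routine that looks only at
 * the two pointer values (an assumption about the memcpy() in question). */
static enum copy_dir pick_dir(uintptr_t dst, uintptr_t src)
{
    return dst > src ? COPY_BACKWARD : COPY_FORWARD;
}

/* Direction chosen when scrolling up by one line: the source is line 1,
 * read through the alias at rd_base; the destination is line 0, written
 * through the alias at wr_base. Both aliases map the same screen memory. */
static enum copy_dir scroll_up_dir(uintptr_t rd_base, uintptr_t wr_base,
                                   size_t pitch)
{
    assert(pitch > 0);
    return pick_dir(wr_base + 0, rd_base + pitch);
}
```

With a single mapping (rd_base == wr_base) this picks a forward copy, which
is what an overlapping scroll-up needs. If the write alias happens to sit
above the read alias, say rd_base = 0xd0000000 and wr_base = 0xe0000000,
the same call picks a backward copy, and the overlapping lines get
overwritten before they are read.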

I suspect some modern CPUs don't like you doing that and the
graphics 'drivers' won't use different mappings.

Even in glibc you want a more general copy_to/from_io_memory()
rather than just 'copy_from_framebuffer()'.
Best to define both - even if they end up identical.
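
As a rough sketch of what "define both" could look like: the two helpers
below are hypothetical (they are not an existing glibc interface) and use
plain byte-wise, forward-only volatile accesses. A real version would pick
access widths and instructions to suit the mapping type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy n bytes out of I/O memory into normal memory.
 * Always copies forwards and never compares the two addresses. */
static void copy_from_io_memory(void *dst, const volatile void *src, size_t n)
{
    unsigned char *d = dst;
    const volatile unsigned char *s = src;

    assert(n == 0 || (dst && src));
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

/* Hypothetical helper: copy n bytes from normal memory into I/O memory.
 * The body mirrors the one above; only the volatile side differs, so
 * either direction can later be optimised on its own. */
static void copy_to_io_memory(volatile void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst && src));
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}
```

Keeping two entry points costs nothing while the bodies match, and leaves
room for the read side to use a different strategy from the write side.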
Other drivers allow PCIe space to be mmap()ed into user space.

While your tests show vmovntdqa being slightly slower than an
AVX read for uncached mappings, it is still much better than
all the other options.

David
