Re: [PATCHv8 02/17] x86/asm: Introduce inline memcpy and memset
From: David Laight
Date: Mon Jul 07 2025 - 05:34:15 EST
On Mon, 7 Jul 2025 11:02:06 +0300
"Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> On Sun, Jul 06, 2025 at 10:13:42AM +0100, David Laight wrote:
> > On Thu, 3 Jul 2025 10:13:44 -0700
> > Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >
> > > On 7/1/25 02:58, Kirill A. Shutemov wrote:
> > > > Extract memcpy and memset functions from copy_user_generic() and
> > > > __clear_user().
> > > >
> > > > They can be used as inline memcpy and memset instead of the GCC builtins
> > > > whenever necessary. LASS requires them to handle text_poke.
> > >
> > > Why are we messing with the normal user copy functions? Code reuse is
> > > great, but as you're discovering, the user copy code is highly
> > > specialized and not that easy to reuse for other things.
> > >
> > > Don't we just need a dirt simple chunk of code that does (logically):
> > >
> > > stac();
> > > asm("rep stosq...");
> > > clac();
> > >
> > > Performance doesn't matter for text poking, right? It could be stosq or
> > > anything else that you can inline. It could be a for() loop for all I
> > > care as long as the compiler doesn't transform it into some out-of-line
> > > memset. Right?
> > >
> >
> > It doesn't even really matter if there is an out-of-line memset.
> > All you need to do is 'teach' objtool it isn't a problem.
>
> PeterZ was not a fan of the idea:
>
> https://lore.kernel.org/all/20241029113611.GS14555@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> > Is this for the boot-time asm-alternatives?
>
> Not only boot-time. static_branches are switchable at runtime.
>
> > In that case I wonder why a 'low' address is being used?
> > With LASS enabled using a low address on a live kernel would make it
> > harder for another cpu to leverage the writable code page, but
> > that isn't a requirement of LASS.
>
> Because the kernel side of the address space is shared across all CPUs and
> we don't want kernel code to be writable by all CPUs.

So, as I said, it isn't a requirement for LASS.
Just something that LASS lets you do.
Although I'm sure there will be some odd effect of putting a 'supervisor'
page in the middle of 'user' pages.

Isn't there also (something like) kmap_local_page() that updates the local
page tables but doesn't broadcast the change?

>
> > If it is being used for later instruction patching you need the
> > very careful instruction sequences and cpu synchronisation.
> > In that case I suspect you need to add conditional stac/clac
> > to the existing patching code (and teach objtool it is all ok).
>
> STAC/CLAC in text poke is conditional on LASS being present on the machine.

So just change the code to use byte copy loops with a volatile
destination pointer and all will be fine.
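
Something along these lines is what I have in mind. This is only a rough
sketch: the helper names copy_bytes_volatile() and fill_bytes_volatile() are
made up for illustration and are not in the patch series, and the conditional
stac()/clac() bracket that the kernel would need around the call is left out
so the fragment builds as ordinary C. Because the destination is
volatile-qualified, each store is an observable access, so in practice the
compiler keeps the loop inline instead of turning it into a call to an
out-of-line memcpy() or memset().

```c
#include <stddef.h>

/*
 * Hypothetical helper, name chosen for illustration only.
 * Copies len bytes one at a time through a volatile destination
 * pointer, so every store must be emitted as written.
 */
static inline void copy_bytes_volatile(void *dst, const void *src, size_t len)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}

/*
 * Hypothetical memset counterpart: fills len bytes with val,
 * again one volatile byte store per iteration.
 */
static inline void fill_bytes_volatile(void *dst, unsigned char val, size_t len)
{
	volatile unsigned char *d = dst;

	while (len--)
		*d++ = val;
}
```

Performance is poor compared with 'rep movsb', but as noted earlier in the
thread that should not matter for text poking.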
David