Re: Simplifying copy_siginfo_to_user

From: Eric W. Biederman
Date: Mon Jul 24 2017 - 15:12:49 EST


Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Sat, Jul 22, 2017 at 1:25 PM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>> I played with some clever changes such as limiting the copy to 48 bytes,
>> disabling the memset and the like, but I could not get a strong enough
>> signal to say that any one change removed the extra 20ns or a clear
>> part of it.
>
> What CPU did you use? Because the SMAP bit in particular matters.
>
> The field-by-field copies are extremely slow on modern CPUs that
> implement SMAP, unless you also use the special "unsafe_put_user()"
> code (or the nasty old put_user_ex() code that some of the x86 signal
> code uses).
>
> So one of the advantages of just copy_to_user() ends up being visible
> only on Broadwell+ (or whatever the SMAP cutoff is).

Good point.

The CPU I was testing on was an AMD A10. I don't actually have a CPU
that supports SMAP handy.

If you would like, I can post the minimal patches and benchmark so anyone
who is interested can reproduce this for themselves.

I suspect that if it is down to only 20ns without SMAP this will
definitely be a performance improvement in the presence of SMAP.
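
To make the SMAP argument concrete, here is a rough userspace model, not
kernel code. It only counts how often each copy strategy would open and
close the user-access window (stac/clac on x86 with SMAP). Every name in
it (model_siginfo, model_put_user, toggles_bulk and so on) is
hypothetical and made up for illustration, the struct is not the real
siginfo layout, and the counts say nothing about actual nanoseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for siginfo; NOT the kernel's layout. */
struct model_siginfo {
	int si_signo;
	int si_errno;
	int si_code;
	int si_pid;
	int si_uid;
};

/* Number of times the model opens or closes the user-access window. */
static unsigned long window_toggles;

static void model_access_begin(void) { window_toggles++; }
static void model_access_end(void)   { window_toggles++; }

/* Models put_user(): every call opens and closes the window itself. */
static void model_put_user(int *dst, int val)
{
	model_access_begin();
	*dst = val;
	model_access_end();
}

/* Models unsafe_put_user(): the caller must already hold the window open. */
static void model_unsafe_put_user(int *dst, int val)
{
	*dst = val;
}

/* Models copy_to_user(): one open/close pair around the whole copy. */
static void model_copy_to_user(void *dst, const void *src, size_t n)
{
	assert(dst && src);
	model_access_begin();
	memcpy(dst, src, n);
	model_access_end();
}

/* Field-by-field with put_user(): one open/close pair per field. */
unsigned long toggles_field_by_field(struct model_siginfo *to,
				     const struct model_siginfo *from)
{
	window_toggles = 0;
	model_put_user(&to->si_signo, from->si_signo);
	model_put_user(&to->si_errno, from->si_errno);
	model_put_user(&to->si_code, from->si_code);
	model_put_user(&to->si_pid, from->si_pid);
	model_put_user(&to->si_uid, from->si_uid);
	return window_toggles;
}

/* Field-by-field with unsafe_put_user(): one pair for all fields. */
unsigned long toggles_batched(struct model_siginfo *to,
			      const struct model_siginfo *from)
{
	window_toggles = 0;
	model_access_begin();
	model_unsafe_put_user(&to->si_signo, from->si_signo);
	model_unsafe_put_user(&to->si_errno, from->si_errno);
	model_unsafe_put_user(&to->si_code, from->si_code);
	model_unsafe_put_user(&to->si_pid, from->si_pid);
	model_unsafe_put_user(&to->si_uid, from->si_uid);
	model_access_end();
	return window_toggles;
}

/* Single bulk copy: one pair regardless of the number of fields. */
unsigned long toggles_bulk(struct model_siginfo *to,
			   const struct model_siginfo *from)
{
	window_toggles = 0;
	model_copy_to_user(to, from, sizeof(*to));
	return window_toggles;
}
```

In this model the put_user() style grows with the number of fields,
while the batched and bulk styles stay at a single open/close pair.
That is the effect that would only show up on hardware with SMAP.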

Eric