Re: [PATCH] statx: optimize copy of struct statx to userspace

From: David Howells
Date: Sat Mar 11 2017 - 14:28:54 EST


Eric Biggers <ebiggers3@xxxxxxxxx> wrote:

> From: Eric Biggers <ebiggers@xxxxxxxxxx>
>
> I found that statx() was significantly slower than stat(). As a
> microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs
> file to the same with statx() passed a NULL path:
>
> $ time ./stat_benchmark
>
> real 0m1.464s
> user 0m0.275s
> sys 0m1.187s
>
> $ time ./statx_benchmark
>
> real 0m5.530s
> user 0m0.281s
> sys 0m5.247s
>
> statx is expected to be a little slower than stat because struct statx
> is larger than struct stat, but not by *that* much. It turns out that
> most of the overhead was in copying struct statx to userspace,
> apparently mostly in all the stac/clac instructions that got generated
> for each __put_user() call. (This was on x86_64, but some other
> architectures, e.g. arm64, have something similar now too.)
>
> stat() instead initializes its struct on the stack and copies it to
> userspace with a single call to copy_to_user(). This turns out to be
> much faster, and changing statx to do this makes it almost as fast as
> stat:
>
> $ time ./statx_benchmark
>
> real 0m1.573s
> user 0m0.229s
> sys 0m1.344s
>
> Signed-off-by: Eric Biggers <ebiggers@xxxxxxxxxx>

Acked-by: David Howells <dhowells@xxxxxxxxxx>
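
For reference, the microbenchmark described in the quoted text could be
sketched roughly as below. This is not the original stat_benchmark or
statx_benchmark source, which was not posted; the helper names (now_sec,
time_fstat, time_statx, run_benchmark) are hypothetical. It also assumes
the glibc statx() wrapper (glibc 2.28 or later) and passes an empty path
with AT_EMPTY_PATH rather than the NULL path mentioned above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Current time on the monotonic clock, in seconds. */
static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Time `iters` fstat() calls on fd. Returns seconds, or -1.0 on failure. */
static double time_fstat(int fd, long iters)
{
	struct stat st;
	double start;

	assert(iters > 0);
	start = now_sec();
	for (long i = 0; i < iters; i++)
		if (fstat(fd, &st) != 0)
			return -1.0;
	return now_sec() - start;
}

/*
 * Time `iters` statx() calls on fd. Assumption: the fd is addressed with
 * an empty path plus AT_EMPTY_PATH. Returns seconds, or -1.0 on failure.
 */
static double time_statx(int fd, long iters)
{
	struct statx stx;
	double start;

	assert(iters > 0);
	start = now_sec();
	for (long i = 0; i < iters; i++)
		if (statx(fd, "", AT_EMPTY_PATH, STATX_BASIC_STATS, &stx) != 0)
			return -1.0;
	return now_sec() - start;
}

/* Open `path`, time both calls, and print the results. 0 on success. */
int run_benchmark(const char *path, long iters)
{
	double t_fstat, t_statx;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	t_fstat = time_fstat(fd, iters);
	t_statx = time_statx(fd, iters);
	close(fd);
	if (t_fstat < 0.0 || t_statx < 0.0)
		return -1;
	printf("fstat: %.3fs  statx: %.3fs  (%ld calls each)\n",
	       t_fstat, t_statx, iters);
	return 0;
}
```

Calling run_benchmark() from main() with a file on tmpfs and 10000000
iterations would approximate the comparison in the quoted text. Unlike
time(1), this reports only wall-clock time, so it does not reproduce
the user/sys split.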