Re: fuse scalability part 1

From: Srinivas Eeda
Date: Tue Sep 29 2015 - 02:18:46 EST


Hi Miklos,

On 09/25/2015 05:11 AM, Miklos Szeredi wrote:
> On Thu, Sep 24, 2015 at 9:17 PM, Ashish Samant <ashish.samant@xxxxxxxxxx> wrote:
>>
>> We did some performance testing without these patches and with these patches
>> (with -o clone_fd option specified). We did 2 types of tests:
>>
>> 1. Throughput test: We did some parallel dd tests to read/write to FUSE
>> based database fs on a system with 8 numa nodes and 288 cpus. The
>> performance here is almost equal to the per-numa patches we submitted a
>> while back. Please find results attached.
>
> Interesting. This means, that serving the request on a different NUMA
> node as the one where the request originated doesn't appear to make
> the performance much worse.
With the new change, spinlock contention is significantly reduced, so the latency caused by NUMA is no longer visible. Even in the earlier case, scalability was not a big problem if we bound all processes (the fuse workers and the user dd threads) to a single NUMA node. The problem was only seen when threads spread out across NUMA nodes and contended for the spinlock.
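
As an illustration of the single-node binding described above, here is a minimal sketch of how a test launcher could pin itself (and the processes it spawns) to the CPUs of one NUMA node. It is not the harness used for these tests. It assumes Linux, with the node's CPU list exposed at /sys/devices/system/node/nodeN/cpulist; the helper names parse_cpulist, node_cpus and bind_to_node are hypothetical.

```python
import os


def parse_cpulist(text):
    """Parse a kernel cpulist string such as "0-3,8-11" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list (e.g. a memory-only node)
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def node_cpus(node):
    """Return the CPU ids of one NUMA node (assumes the Linux sysfs layout)."""
    path = "/sys/devices/system/node/node%d/cpulist" % node
    with open(path) as f:
        return parse_cpulist(f.read())


def bind_to_node(node, pid=0):
    """Restrict a process (pid 0 = the caller) to the CPUs of one NUMA node."""
    os.sched_setaffinity(pid, node_cpus(node))
```

Calling bind_to_node(0) before starting the fuse daemon and the dd workers keeps them all on node 0, since CPU affinity is inherited across fork and exec. Note that this only restricts where the threads run; it does not set a memory policy, for which something like numactl --membind would still be needed.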



> Thanks,
> Miklos
