Re: [PATCH] Radix-tree pagecache for 2.5

From: Ingo Molnar (mingo@elte.hu)
Date: Fri Feb 01 2002 - 04:04:50 EST


On Fri, 1 Feb 2002, Anton Blanchard wrote:

> There were a few solutions (from davem and ingo) to allocate a larger
> hash but with the radix patch we no longer have to worry about this.

there is one big issue we forgot to consider.

in the case of radix trees it's not only search depth that gets worse with
big files. The thing i'm worried about is that the 'big pagecache lock'
gets reintroduced. If e.g. a database application puts lots of data into a
single file (multiple gigabytes - why not), then the
mapping->i_shared_lock becomes a 'big pagecache lock' again, causing
serious SMP contention even for the read() case. Benchmarks show that it's
the distribution of locks that matters on big boxes.

dbench hides this issue, because it uses many temporary files, so the
locking overhead is distributed. Would you be willing to run benchmarks
that measure the scalability of reading from one bigger file, from
multiple CPUs?
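
a minimal sketch of such a test, as an illustration only and not an
existing benchmark: it forks several readers that each pread() the same
file and reports the wall-clock time. run_readers() and read_whole_file()
are hypothetical names, processes are used instead of threads purely for
brevity, and the file should already be in the pagecache (read it once
first) so that the lookups are measured rather than the disk.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Reader body: read the whole file sequentially; returns 0 on success. */
static int read_whole_file(const char *path)
{
	char buf[65536];
	off_t off = 0;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return 1;
	while ((n = pread(fd, buf, sizeof buf, off)) > 0)
		off += n;
	close(fd);
	return n < 0;
}

/*
 * Fork nprocs readers over the same file and time them.
 * Returns 0 if every reader succeeded; *elapsed gets wall-clock seconds.
 */
static int run_readers(const char *path, int nprocs, double *elapsed)
{
	struct timespec t0, t1;
	int started = 0, failed = 0;

	assert(nprocs > 0);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < nprocs; i++) {
		pid_t pid = fork();

		if (pid == 0)
			_exit(read_whole_file(path));	/* child */
		if (pid < 0) {
			failed = 1;			/* fork failed */
			break;
		}
		started++;
	}
	/* Reap every reader that was started and check its exit status. */
	for (int i = 0; i < started; i++) {
		int status;

		if (wait(&status) < 0 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			failed = 1;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	*elapsed = (double)(t1.tv_sec - t0.tv_sec) +
		   (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	return failed;
}
```

running it with 1, 2, 4, ... readers on one multi-gigabyte file and
comparing the elapsed times would show whether read() throughput scales
with the number of CPUs.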

with hash based locking, the locking overhead is *always* distributed.

with radix trees the locking overhead is distributed only if multiple
files are used. With one big file (or a few big files), the i_shared_lock
will bounce wildly between CPUs in read() workloads, degrading
scalability just as much as the pagecache_lock degrades it now.
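
the difference can be sketched in plain userspace C. This is a toy model
under stated assumptions, not kernel code: struct mapping, NR_HASH_LOCKS
and the hash function are made up for illustration, and the real hash and
lock layout differ. It only counts how many distinct locks cover the pages
of a single file under each scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_HASH_LOCKS 256	/* hypothetical table size, power of two */

/* Hypothetical stand-in for a file's mapping; only its identity matters. */
struct mapping { int id; };

typedef unsigned (*slot_fn)(const struct mapping *, unsigned long);

/* Hash-based scheme: the lock is picked from (mapping, page index). */
static unsigned hashed_lock_slot(const struct mapping *m, unsigned long index)
{
	uintptr_t h = (uintptr_t)m / sizeof(*m) + index;

	h ^= h >> 8;		/* toy mixing step, not the kernel's hash */
	return (unsigned)(h & (NR_HASH_LOCKS - 1));
}

/* Per-mapping scheme: every page of the file shares the mapping's one lock. */
static unsigned per_mapping_lock_slot(const struct mapping *m,
				      unsigned long index)
{
	(void)m;
	(void)index;
	return 0;
}

/* Count the distinct lock slots that cover pages [0, nr_pages) of one file. */
static unsigned distinct_locks(slot_fn slot, const struct mapping *m,
			       unsigned long nr_pages)
{
	unsigned char seen[NR_HASH_LOCKS] = {0};
	unsigned count = 0;

	for (unsigned long i = 0; i < nr_pages; i++) {
		unsigned s = slot(m, i);

		assert(s < NR_HASH_LOCKS);
		if (!seen[s]) {
			seen[s] = 1;
			count++;
		}
	}
	return count;
}
```

for a file of 1024 pages, distinct_locks(hashed_lock_slot, ...) uses every
slot of the table, while distinct_locks(per_mapping_lock_slot, ...) stays
at one lock no matter how big the file gets.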

        Ingo
