Re: [PATCH] concurrent block allocation for ext2 against 2.5.64

From: Andrew Morton (akpm@digeo.com)
Date: Sat Mar 15 2003 - 01:44:13 EST


William Lee Irwin III <wli@holomorphy.com> wrote:
>
> On Fri, Mar 14, 2003 at 08:54:55PM -0800, Andrew Morton wrote:
> > > `dbench 512' will presumably do lots of IO and spend significant
> > > time in I/O wait. You should see the effects of this change more
> > > if you use fewer clients (say, 32) so it doesn't hit disk.
> >
> On Fri, Mar 14, 2003 at 09:49:10PM -0800, William Lee Irwin III wrote:
> > Throughput 226.57 MB/sec 32 procs
> > dbench 32 2>& 1 25.04s user 515.02s system 1069% cpu 50.516 total
>
> It's too light a load... here's dbench 128.

OK.

> Looks like dbench doesn't scale. It needs to learn how to spread itself
> across disks if it's not to saturate a device queue while at the same
> time generating enough cpu load to saturate cpus.

Nope. What we're trying to measure here is pure in-memory lock contention,
locked bus traffic, context switches, etc, etc. To do that we need to get
the IO system out of the picture.

One way to do that is to increase /proc/sys/vm/dirty_ratio and
dirty_background_ratio to 70% or so. You can still hit IO wait if someone
tries to truncate a file which pdflush is writing out, so increase
dirty_expire_centisecs and dirty_writeback_centisecs to 1000000000 or so...
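Something along these lines (a rough sketch; the exact values don't matter much as
long as writeback is pushed far into the future):

	# keep dirty pagecache in memory instead of writing it back
	echo 70 > /proc/sys/vm/dirty_ratio
	echo 70 > /proc/sys/vm/dirty_background_ratio
	# push periodic writeback so far out that pdflush never kicks in
	echo 1000000000 > /proc/sys/vm/dirty_expire_centisecs
	echo 1000000000 > /proc/sys/vm/dirty_writeback_centisecs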

Then, on the second run, when all the required metadata blocks are in
pagecache you should be able to get an IO-free run.
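i.e. something like this (the first run warms up the bitmaps, inodes and
directories; the second run is the one worth profiling):

	dbench 128	# populates pagecache, may hit disk
	dbench 128	# should be CPU-bound with no IO wait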

> Is there a better (publishable/open/whatever) benchmark?

I have lots of little testlets which can be mixed and matched. RAM-only
dbench will do for now. It is showing up real problems.

>
> dbench 128:
> Throughput 161.237 MB/sec 128 procs
> dbench 128 2>& 1 143.85s user 3311.10s system 1219% cpu 4:43.27 total
>
> vma samples %-age symbol name
> c0106ff4 9134179 33.7261 default_idle
> c01dc3b0 5570229 20.5669 __copy_to_user_ll
> c01dc418 1773600 6.54865 __copy_from_user_ll
> c0119058 731524 2.701 try_to_wake_up
> c0108140 686952 2.53643 .text.lock.semaphore
> c011a1bc 489415 1.80706 schedule
> c0119dac 485196 1.79149 scheduler_tick
> c011fadc 448048 1.65433 profile_hook
> c0119860 356065 1.3147 load_balance
> c0107d0c 267333 0.987072 __down
> c011c4ff 249627 0.921696 .text.lock.sched

The wakeup and .text.lock.semaphore load indicates that there is a lot
of contention for a semaphore somewhere. Still.

I'm not sure which one. It shouldn't be a directory semaphore. Might be
lock_super() in the inode allocator, but that seems unlikely.
