Re: [PATCH][2.6] Completely out of line spinlocks / x86_64

From: Marcelo Tosatti
Date: Mon Aug 09 2004 - 08:59:03 EST


On Mon, Aug 09, 2004 at 08:41:38AM -0300, Marcelo Tosatti wrote:
> On Mon, Aug 09, 2004 at 01:23:08PM +0200, Andi Kleen wrote:
> > On Sun, 8 Aug 2004 02:08:30 -0400 (EDT)
> > Zwane Mwaikambo <zwane@xxxxxxxxxxxxx> wrote:
> >
> > > >  arch/x86_64/Kconfig           |   10 ++++++++++
> > > >  arch/x86_64/lib/Makefile      |    1 +
> > > >  arch/x86_64/lib/spinlock.c    |   38 ++++++++++++++++++++++++++++++++++++++
> > > >  include/asm-x86_64/spinlock.h |   22 ++++++++++++++++++++--
> > > >  4 files changed, 69 insertions(+), 2 deletions(-)
> > >
> > > Index: linux-2.6.8-rc3-mm1-amd64/arch/x86_64/Kconfig
> > > ===================================================================
> > > RCS file: /home/cvsroot/linux-2.6.8-rc3-mm1/arch/x86_64/Kconfig,v
> > > retrieving revision 1.1.1.1
> > > diff -u -p -B -r1.1.1.1 Kconfig
> > > --- linux-2.6.8-rc3-mm1-amd64/arch/x86_64/Kconfig 5 Aug 2004 16:37:48 -0000 1.1.1.1
> > > +++ linux-2.6.8-rc3-mm1-amd64/arch/x86_64/Kconfig 7 Aug 2004 22:47:30 -0000
> > > @@ -438,6 +438,16 @@ config DEBUG_SPINLOCK
> > > best used in conjunction with the NMI watchdog so that spinlock
> > > deadlocks are also debuggable.
> > >
> > > +config COOL_SPINLOCK
> > > + bool "Completely out of line spinlocks"
> > > + depends on SMP
> > > + default y
> > > + help
> > > + Say Y here to build spinlocks which have common text for contended
> > > + and uncontended paths. This reduces kernel text size by at least
> > > + 50k on most configurations, plus there is the additional benefit
> > > + of better cache utilisation.
> >
> > I think the 50k number is wrong. I took a look at it and the big
> > difference is only seen when you enable interrupts during spinning, which
> > we didn't do before. If you compare it to the old implementation the
> > difference is much less.
> >
> > I don't really like the config option. Either it's a good idea
> > then it should be done by default without option or it should not be done at all.
> >
> > Did you do any lock intensive benchmarks that could show a slowdown?
>
> Out of curiosity, also, have you run any lock intensive benchmarks to get some
> numbers on the increased cacheline hits due to uninlining?
>
> I think you can measure the hits/misses precisely with Mikael's perfcounters.

Hi Zwane,

Just saw your bonnie++ results (I should have read the whole thread before
replying). They look great, except for a slight reduction in sequential output:

out-of-line spinlocks:
Version  @version@  ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
stp2-000         2G  7018  99 64560  36 21694  16  6789  97 43729  14 340.6   1
stp2-000         2G  7055  99 64836  39 21899  16  6752  97 44827  17 330.8   2
stp2-000         2G  7023  99 64525  38 22987  17  6704  96 44777  14 337.3   1

mainline:
Version  @version@  ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
stp2-000         2G  7048  99 64912  38 22510  17  6732  96 43900  14 332.0   1
stp2-000         2G  7018  99 63821  39 21732  16  6787  97 44889  17 326.7   2
stp2-000         2G  7063  99 63834  38 22361  17  6738  97 43310  14 338.3   1
------Sequential Create------ --------Random Create--------

Probably just noise, but I still think it's worth mentioning.
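
For anyone who wants to check the "noise" guess, here is a rough sketch in
Python that averages the three runs per sequential-output column and compares
the two kernels. The K/sec figures are copied from the tables above; the
dictionary layout and the relative_change() helper are only illustrative
names, not part of bonnie++ or of the patch, and three runs per kernel is far
too few for a real significance test.

```python
from statistics import mean, stdev

# Sequential-output throughput in K/sec, copied from the bonnie++ runs above.
out_of_line = {
    "per_chr": [7018, 7055, 7023],
    "block":   [64560, 64836, 64525],
    "rewrite": [21694, 21899, 22987],
}
mainline = {
    "per_chr": [7048, 7018, 7063],
    "block":   [64912, 63821, 63834],
    "rewrite": [22510, 21732, 22361],
}


def relative_change(patched, baseline):
    """Percent change of mean(patched) relative to mean(baseline)."""
    base = mean(baseline)
    return 100.0 * (mean(patched) - base) / base


for column in out_of_line:
    delta = relative_change(out_of_line[column], mainline[column])
    # Run-to-run spread of the baseline, as a crude yardstick for noise.
    spread = stdev(mainline[column])
    print(f"{column:8s} change {delta:+.2f}%  mainline stdev {spread:.0f} K/sec")
```

On these numbers every column moves by less than 1%, and each difference in
means is smaller than the run-to-run standard deviation of the mainline runs.
The per-character and rewrite averages dip slightly, while the block average
is actually a little higher with the patch.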
