Re: Lockmeter Analysis of 2 DDs

From: Peter Wong (wpeter@us.ibm.com)
Date: Fri Sep 07 2001 - 10:47:37 EST


I have done some lockmeter analysis on dd under two kernels:
(1) 2.4.5 base kernel and (2) 2.4.5 base kernel + Jens Axboe's
zero-bounce highmem I/O patch + my IPS patch. The data indicate that
the io_request_lock is very hot, especially in case (2).

System Configurations:
  Red Hat 6.2, 2.4.5 kernel, 4-way 500MHz PIII, 2.5GB RAM,
  2MB L2 Cache, 50 disks with 5 ServeRAID 4H controllers

The script to run 10 dds (each reads 2500 * 64KB = ~160MB from one raw device; all ten run in parallel):

dd if=/dev/raw/raw1 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw4 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw7 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw10 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw13 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw16 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw19 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw22 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw25 of=/dev/null bs=65536 count=2500 &
dd if=/dev/raw/raw28 of=/dev/null bs=65536 count=2500 &

(1) Under 2.4.5 Base (38 seconds)

SPINLOCKS               HOLD                      WAIT
  UTIL   CON   MEAN(    MAX )   MEAN(    MAX )(% CPU)     TOTAL NOWAIT  SPIN RJECT

emergency_lock
 23.8%    0%  2.8us(   54us)     0us                    3200001   100%    0%    0%

global_bh_lock
 24.6% 97.8%   80us( 4385ms)     0us                     114825   2.2%    0% 97.8%

io_request_lock
 27.6% 11.5%  1.4us(   64us)   5.8us(  115us)( 3.4%)    7633079  88.5% 11.5%    0%

rmqueue+0x2c
  6.7% 13.6%  0.8us(  9.5us)   2.0us(   15us)(0.58%)    3200862  86.4% 13.6%    0%

Estimated CPU consumption per lock (hold utilization + spin-wait
%CPU * 4 CPUs):

emergency_lock  23.8% + 0.00% * 4 = 0.24
global_bh_lock  24.6% + 0.00% * 4 = 0.25
io_request_lock 27.6% + 3.40% * 4 = 0.41
rmqueue+0x2c     6.7% + 0.58% * 4 = 0.09
================================================
                             Sum = 0.99 CPUs
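
As a sanity check on that arithmetic, a throwaway user-space snippet
(lock_cpus is just an illustrative name, not part of lockmeter):

#include <stdio.h>

/* Estimate of CPUs consumed by a lock: hold utilization plus
 * spin-wait %CPU scaled by the number of CPUs that can spin. */
static double lock_cpus(double hold_util, double spin_pct_cpu, int ncpus)
{
        return hold_util + spin_pct_cpu * ncpus;
}

int main(void)
{
        printf("io_request_lock: %.2f CPUs\n", lock_cpus(0.276, 0.034, 4));
        printf("rmqueue+0x2c:    %.2f CPUs\n", lock_cpus(0.067, 0.0058, 4));
        return 0;
}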

(2) Under 2.4.5 + zero-bounce highmem I/O & IPS patches (22 seconds)

SPINLOCKS               HOLD                      WAIT
  UTIL   CON   MEAN(    MAX )   MEAN(    MAX )(% CPU)     TOTAL NOWAIT  SPIN RJECT

global_bh_lock
 35.8%  8.9%  242us( 2968us)     0us                      32543  91.1%    0%  8.9%

io_request_lock
 57.7% 59.4%  1.8us(  118us)    10us(  192us)(47.9%)    6914223  40.6% 59.4%    0%

global_bh_lock  35.8% +  0.00% * 4 = 0.36
io_request_lock 57.7% + 47.90% * 4 = 2.49
================================================
                             Sum = 2.85 CPUs

     Indeed, the io_request_lock becomes very hot once the bounce
buffers are eliminated. Is anyone working on a patch that splits this
global lock into per-device queue locks? We understand that getting
such a patch into 2.4 is unlikely, but it would be nice to have it
available against 2.4 for experimental purposes.
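
As a rough sketch of the idea (the names below are purely
illustrative, not from any existing patch), each device's request
queue would carry its own spinlock, so queueing a request on one
controller no longer serializes against every other block device:

#include <linux/spinlock.h>

/* Illustrative sketch only -- not from an actual patch.  Each
 * device's request queue gets a private lock instead of sharing
 * the single global io_request_lock. */
struct my_request_queue {
        spinlock_t queue_lock;          /* protects this queue only */
        /* ... request list, elevator state, etc. ... */
};

static void my_queue_request(struct my_request_queue *q)
{
        unsigned long flags;

        /* Lock just the target device's queue; requests for other
         * devices can be queued concurrently on other CPUs. */
        spin_lock_irqsave(&q->queue_lock, flags);
        /* ... merge or add the request on q's list ... */
        spin_unlock_irqrestore(&q->queue_lock, flags);
}

Drivers and the interrupt-time completion path would then take the
same per-queue lock rather than io_request_lock.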

Wai Yee Peter Wong
IBM Linux Technology Center, Performance Analysis
email: wpeter@us.ibm.com
Office: (512) 838-9272, T/L 678-9272; Fax: (512) 838-4663
