Performance of 2.4.17-based Kernel vs 2.5.26-based Kernel Under Database Workload

From: Peter Wong (
Date: Thu Aug 22 2002 - 17:28:47 EST

I have compared the performance of 2.4.17 kernel+patches against that
of 2.5.26 kernel+patches under a very heavy database workload. A
100 GB database is used and stored on raw devices. The workload
consists of a sequence of highly complex queries, and is processed
with a 8-way 700 HMz Pentium III Xeon machine, 4 GB RAM and 2 MB L2
cache. Six SCSI adapters are used with 120 disks, each of which has
a capacity of 9.1 GB and a rotational speed of 10K RPM.

Details of the kernels:

The 2.4.17+ kernel consists of:
  - 2.4.17 (
  - bounce buffer patch (Jens Axboe)
  - IPS patch (Peter Wong)
  - io_request_lock patch (Jonathan Lahr)
  - rawvary patch (Badari Pulavarty)
  - changes to TASK_UNMAPPED_BASE and PAGE_OFFSET to provide
    more room for the database bufferpool

The 2.5.26+ kernel consists of
  - 2.5.26 (
  - direct I/O patch (Andrew Morton, Badari Pulavarty ported it
                      from 2.5.31)
  - changes to exec.c (a fix needed to run the benchmark)
  - changes to TASK_UNMAPPED_BASE to provide more room for the
    database bufferpool

Based upon the throughput of the workload, there is a 8% improvement
of 2.5.26+ over 2.4.17+, which indicates that the new 2.5 code
performs better than the 2.4 code. Indeed, the bounce buffer patch,
the removal of io_request_lock, and efficient handling of large I/O
via the bio struct are already incorporated into the 2.5 kernel.

I have not collected lockmeter and kernprof data on 2.5.26+ yet.
However, I have collected them on the 2.5.25-based kernel. Note
that the 2.5.25-based kernel achieves about the same performance
level as the 2.5.26+ kernel.

The lockmeter tool indicates no hot locks, and in fact, there is
almost no lock contention on the system. By examining one query
which scans a ~75 GB table and performs simple comparisons, the
lock spin time is close to 0%. The top lock is used inside the
IPS interrupt handler routine. The following is a clip of the
lockmeter result showing the *TOTAL* and do_ipsinstr+0x24.


       2.0% 5.3us(8532us) 9.5us( 414us)(0.07%) 65591728 *TOTAL*

15.3% 0.74% 63us( 162us) 10us( 202us)(0.00%) 1326034 do_ipsintr

All of the other complex queries show a similar lockmeter result.

Using the kernprof tool to examine the same query, do_ipsintr is also
at the top of the list, but it only consumes a small percentage of
the total time. The following is a clip of the kernprof result showing
the top functions.

        TOTAL_SAMPLES 3137055
        USER [c0125ef0]: 2159334 (68.8%)
        default_idle [c0105310]: 759032 (24.2%)
        do_ipsintr [c0213810]: 76169 ( 2.4%)
        do_softirq [c011b930]: 46244 ( 1.5%)
        scsi_dispatch_cmd [c01f58b0]: 12937 ( 0.4%)

All of the other complex queries show a similar kernprof result for
the top functions.


Peter Wai Yee Wong
IBM Linux Technology Center, Performance Analysis

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Fri Aug 23 2002 - 22:00:26 EST