Adventures with bonnie and linux-2.3.7

BOSZORMENYI Zoltan (zboszor@mol.hu)
Tue, 22 Jun 1999 07:24:26 +0100


Hi,

I tested 2.3.7 with bonnie against 2.2.10. I only tested 2 of my 3 IDE
hd's,
2.3.7 does not support fat, and my /dev/hda contains only one Windows
partition. The kernel was compiled UP, the machine is a 6x86MX/PR200
(166MHz)
with 64 MB SDRAM, Intel TX mainboard. IDE BM DMA or PIIX support was
compiled in,
but as my / is on a SCSI disk, the whole IDE subsystem was in modules.

First, the results of my /dev/hdc (/usr/local, 4K ext2),
a QUANTUM FIREBALL_TM1700A.

2.2.10 w/o BM DMA (after boot, BM DMA isn't activated on 2nd IDE
interface):

-------Sequential Output-------- ---Sequential Input--
--Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block---
--Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec
%CPU
100 1271 97.7 5259 89.4 2002 51.0 1208 95.8 4752 86.3
121.9 9.2

w/ BM DMA (after hdparm -d1):
-------Sequential Output-------- ---Sequential Input--
--Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block---
--Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec
%CPU
100 1693 99.4 7677 15.7 2881 12.0 1563 92.3 6132 9.2
130.9 1.8

2.3.7 /w BM DMA (activated on boot, finally automatically :) ):
-------Sequential Output-------- ---Sequential Input--
--Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block---
--Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec
%CPU
100 1577 92.1 5925 10.9 2831 6.1 1440 84.6 6089 10.1
119.9 1.7

And the results of my /dev/hdb (/usr, 4K ext2), a QUANTUM
FIREBALL_TM2110A.

2.2.10 w/ BM DMA:
-------Sequential Output-------- ---Sequential Input--
--Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block---
--Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec
%CPU
100 1690 99.3 7958 16.8 2741 10.6 1557 92.0 5891 8.7
120.9 1.5

2.3.7 w/ BM DMA:
-------Sequential Output-------- ---Sequential Input--
--Random--
-Per Char- --Block--- -Rewrite-- -Per Char- --Block---
--Seeks---
Machine MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec
%CPU
100 1556 90.8 6006 11.6 2818 6.6 1431 84.3 5946 8.6
112.4 1.7

The first test without DMA was made because I wanted to see what
it can do with raw CPU power and it is good for comparison.

The DMA tests show something interesting, 2.2.10 with DMA is faster in
some operations:

1.7GB hdc gives ~7.6 MB/sec with 2.2.10 and ~5.9 MB/sec with 2.3.7,
2.1GB hdb gives ~7.9 MB/sec with 2.2.10 and ~6 MB/sec with 2.3.7.

The difference is too large to be taken as a statistical error.

The other operations give almost the same throughput results but 2.3.7
does it with less CPU time, in one case with 50% less! I did these
tests some times, and the results were just like the above with some
(~1 - 2%) difference

The above was the end of the story. :) I had strange experiences with
2.3.7. After compilation, I rebooted with it and it gave oops when it
attempted "swapon", and fsck found uncorrectable errors.
Two reboots later (first with a 2.2.10, fsck found two correctable
errors:
it deleted /boot/System.map and /dev/log ; then with 2.3.7), these did
not appeared but running bonnie gave an oops and I was unable to compile
ksymoops, egcs-1.1.2 gave an internal error. I saved the oops, here it
is
disassembled with ksymoops, I compiled it another reboot later under
2.2.10,
ksymoops was from the 2.3.7 tree:

************************************************************************
Unable to handle kernel paging request at virtual address 00002548
current->tss.cr3 = 013c9000, %cr3 = 013c9000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c011d7e8>]
EFLAGS: 00010202
eax: 00002500 ebx: c0334e00 ecx: 00000afd edx: 00008000
esi: 00000000 edi: c13cc000 ebp: 000000ff esp: c13cdedc
ds: 0018 es: 0018 ss: 0018
Process bonnie (pid: 739, process nr: 43, stackpage=c13cd000)

Stack: c0334e00 0000001e 00000006 00000013 00000000 00c03fff c0123b1a
00000006
00000013 00000001 00000013 00000013 c0123c57 00000013 00000000
033fc000
c0124538 00000013 00000000 033fc000 00002000 00000000 c032d540
00001000
Call Trace: [<c0123b1a>] [<c0123c57>] [<c0124538>] [<c01200c9>]
[<c013e0ad>] [<c
013e06c>] [<c01274d5>]
[<c0109e38>]
Code: ff 48 48 8b 13 8b 4b 04 39 98 88 00 00 00 75 06 89 90 88 00
Warning: trailing garbage ignored on Code: line
Text: 'Code: ff 48 48 8b 13 8b 4b 04 39 98 88 00 00 00 75 06 89 90 88
00 '
Garbage: ' '

>>EIP: c011d7e8 <remove_page_from_inode_queue+8/3c>
Trace: c0123b1a <do_try_to_free_pages+26/78>
Trace: c0123c57 <try_to_free_pages+23/3c>
Trace: c0124538 <__get_free_pages+6c/1c4>
Trace: c01200c9 <generic_file_write+141/2c8>
Trace: c013e0ad <ext2_file_write+1d/98>
Trace: c013e06c <ext2_write_one_page+0/24>
Trace: c01274d5 <sys_write+a9/c0>
Trace: c0109e38 <system_call+34/38>
Code: c011d7e8 <remove_page_from_inode_queue+8/3c> 00000000 <_EIP>:
<===
Code: c011d7e8 <remove_page_from_inode_queue+8/3c> 0: ff 48
48
decl 0x48(%eax) <===
Code: c011d7eb <remove_page_from_inode_queue+b/3c> 3: 8b
13
movl (%ebx),%edx
Code: c011d7ed <remove_page_from_inode_queue+d/3c> 5: 8b 4b
04
movl 0x4(%ebx),%ecx
Code: c011d7f0 <remove_page_from_inode_queue+10/3c> 8: 39 98 88
00 00 0
0 cmpl %ebx,0x88(%eax)
Code: c011d7f6 <remove_page_from_inode_queue+16/3c> e: 75
06
jne c011d7fe <remove_page_from_inode_queue+1e/3c>
Code: c011d7f8 <remove_page_from_inode_queue+18/3c> 10: 89 90 88
00 00 0
0 movl %edx,0x88(%eax)
***************************************************************************
Another reboot later (2.3.7) bonnie was able to complete its task, and I
was
able to recompile ksymoops and linux-2.3.7 under itself.

Then (after the above mentioned kernel compile and without a reboot) I
tried
to test my /dev/sda (a Quantum LPS540S, an old async 540MB disk, LPS may
mean
Low Performance Scsi :)) but at first it gave a
"drastic I/O error (getc(3)): Input/output error" and an oops when
I wanted to run it second time. Here it is:

***************************************************************************
kernel BUG at buffer.c:1236!
invalid operand: 0000
CPU: 0
EIP: 0010:[<c01296ec>]
EFLAGS: 00010282
eax: 0000001d ebx: 00000000 ecx: c01c9ac8 edx: c3bb6000
esi: c031a958 edi: c293dba0 ebp: c240de70 esp: c240de28
ds: 0018 es: 0018 ss: 0018
Process bonnie (pid: 4551, process nr: 51, stackpage=c240d000)
Stack: 000004d4 c01d0e40 c293dba0 00000000 c016bb41 c293dba0 00000000
00000001
c240ded0 00000004 00000002 00000286 00000003 00000020 000000ff
c0128fe9
ffffffff fffffffe 00000003 c016bc7d 00000008 00000000 c293dba0
c293dc00
Call Trace: [<c016bb41>] [<c0128fe9>] [<c016bc7d>] [<c012a549>]
[<c012a549>] [<c
011e079>] [<c011e091>]
[<c011efad>] [<c011f255>] [<c011f1a0>] [<c0127415>] [<c0109e38>]
[<c01000
2b>]
Code: 0f 0b 83 c4 0c c7 46 34 00 00 00 00 0f b3 5e 18 19 c0 85 c0
Warning: trailing garbage ignored on Code: line
Text: 'Code: 0f 0b 83 c4 0c c7 46 34 00 00 00 00 0f b3 5e 18 19 c0 85
c0 '
Garbage: ' '

>>EIP: c01296ec <end_buffer_io_async+e4/130>
Trace: c016bb41 <make_request+475/480>
Trace: c0128fe9 <__brelse+55/6c>
Trace: c016bc7d <ll_rw_block+131/190>
Trace: c012a549 <block_read_full_page+189/1e0>
Trace: c012a549 <block_read_full_page+189/1e0>
Trace: c011e079 <try_to_read_ahead+81/b0>
Trace: c011e091 <try_to_read_ahead+99/b0>
Trace: c011efad <do_generic_file_read+631/824>
Code: c01296ec <end_buffer_io_async+e4/130> 00000000 <_EIP>: <===
Code: c01296ec <end_buffer_io_async+e4/130> 0: 0f
0b
ud2a <===
Code: c01296ee <end_buffer_io_async+e6/130> 2: 83 c4
0c
addl $0xc,%esp
Code: c01296f1 <end_buffer_io_async+e9/130> 5: c7 46 34 00 00
00 00
movl $0x0,0x34(%esi)
Code: c01296f8 <end_buffer_io_async+f0/130> c: 0f b3 5e
18
btrl %ebx,0x18(%esi)
Code: c01296fc <end_buffer_io_async+f4/130> 10: 19
c0
sbbl %eax,%eax
Code: c01296fe <end_buffer_io_async+f6/130> 12: 85
c0
testl %eax,%eax
***************************************************************************

Then, after a clean reboot, I attempted again bonnie on my /dev/sda.
Now I got two oopses, and one on "dmesg | more" after that.
The problem seems serious, since even "rm core" coredumped after them.
:)
Here are they:

***************************************************************************
Unable to handle kernel paging request at virtual address 82082110
current->tss.cr3 = 00101000, %cr3 = 00101000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c011dc34>]
EFLAGS: 00010282
eax: 82082104 ebx: c0334038 ecx: 00002f0a edx: 82082104
esi: 00000000 edi: c0094000 ebp: 00000100 esp: c0095fb4
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 4, process nr: 4, stackpage=c0095000)
Stack: 00000030 c0094000 00000f00 c0123b1a 00000006 00000030 00000f00
c01a51ee
c00941cd c0123bbe 00000030 00000f00 c0003fb4 c0106000 00000e00
c01089bf
00000000 00000f00 c01dbfd8
Call Trace: [<c0123b1a>] [<c01a51ee>] [<c0123bbe>] [<c0106000>]
[<c01089bf>]
Code: 66 8b 70 0c 53 e8 5a ca 00 00 83 c4 04 85 c0 0f 85 17 01 00
Warning: trailing garbage ignored on Code: line
Text: 'Code: 66 8b 70 0c 53 e8 5a ca 00 00 83 c4 04 85 c0 0f 85 17 01
00 '
Garbage: ' '

>>EIP: c011dc34 <shrink_mmap+e0/27c>
Trace: c0123b1a <do_try_to_free_pages+26/78>
Trace: c01a51ee <tvecs+28ae/49e0>
Trace: c0123bbe <kswapd+52/c8>
Trace: c0106000 <get_options+0/70>
Trace: c01089bf <kernel_thread+23/30>
Code: c011dc34 <shrink_mmap+e0/27c> 00000000 <_EIP>: <===
Code: c011dc34 <shrink_mmap+e0/27c> 0: 66 8b 70
0c
movw 0xc(%eax),%si <===
Code: c011dc38 <shrink_mmap+e4/27c> 4:
53
pushl %ebx
Code: c011dc39 <shrink_mmap+e5/27c> 5: e8 5a ca 00
00
call c012a698 <try_to_free_buffers+0/8c>
Code: c011dc3e <shrink_mmap+ea/27c> a: 83 c4
04
addl $0x4,%esp
Code: c011dc41 <shrink_mmap+ed/27c> d: 85
c0
testl %eax,%eax
Code: c011dc43 <shrink_mmap+ef/27c> f: 0f 85 17 01 00
00
jne c011dd60 <shrink_mmap+20c/27c>

invalid operand: 0000
CPU: 0
EIP: 0010:[<c011d7ac>]
EFLAGS: 00010282
eax: 0000001c ebx: c2b486d8 ecx: 00000021 edx: 0000001a
esi: 031bb000 edi: c2ba6000 ebp: c021ed78 esp: c2ba7f14
ds: 0018 es: 0018 ss: 0018
Process bonnie (pid: 739, process nr: 43, stackpage=c2ba7000)
Stack: 00000045 c2b486d8 c011dfed c0264a38 c021ed78 c0264a38 031bb000
c021ed78
00000000 c012010d c0264a38 c2b486d8 031bb000 c021ed78 c2d85f60
ffffffea
00000000 00001000 00001000 30ad21b6 00000000 c04c1000 c021ed78
c2ba6000
Call Trace: [<c011dfed>] [<c012010d>] [<c013e0ad>] [<c013e06c>]
[<c01274d5>] [<c
0109e38>] [<c010002b>]
Code: 0f 0b 83 c4 0c 5b c3 90 8b 4c 24 04 8b 51 2c 85 d2 74 19 8b
Warning: trailing garbage ignored on Code: line
Text: 'Code: 0f 0b 83 c4 0c 5b c3 90 8b 4c 24 04 8b 51 2c 85 d2 74 19
8b '
Garbage: ' '

>>EIP: c011d7ac <__add_page_to_hash_queue+3c/44>
Trace: c011dfed <add_to_page_cache_unique+7d/88>
Trace: c012010d <generic_file_write+185/2c8>
Trace: c013e0ad <ext2_file_write+1d/98>
Trace: c013e06c <ext2_write_one_page+0/24>
Trace: c01274d5 <sys_write+a9/c0>
Trace: c0109e38 <system_call+34/38>
Trace: c010002b <startup_32+2b/11e>
Code: c011d7ac <__add_page_to_hash_queue+3c/44> 00000000 <_EIP>: <===
Code: c011d7ac <__add_page_to_hash_queue+3c/44> 0: 0f
0b
ud2a <===
Code: c011d7ae <__add_page_to_hash_queue+3e/44> 2: 83 c4
0c
addl $0xc,%esp
Code: c011d7b1 <__add_page_to_hash_queue+41/44> 5:
5b
popl %ebx
Code: c011d7b2 <__add_page_to_hash_queue+42/44> 6:
c3
ret
Code: c011d7b3 <__add_page_to_hash_queue+43/44> 7:
90
nop
Code: c011d7b4 <remove_page_from_hash_queue+0/2c> 8: 8b 4c 24
04
movl 0x4(%esp,1),%ecx
Code: c011d7b8 <remove_page_from_hash_queue+4/2c> c: 8b 51
2c
movl 0x2c(%ecx),%edx
Code: c011d7bb <remove_page_from_hash_queue+7/2c> f: 85
d2
testl %edx,%edx
Code: c011d7bd <remove_page_from_hash_queue+9/2c> 11: 74
19
je c011d7d8 <remove_page_from_hash_queue+24/2c>
Code: c011d7bf <remove_page_from_hash_queue+b/2c> 13: 8b
00
movl (%eax),%eax

invalid operand: 0000
CPU: 0
EIP: 0010:[<c011d7ac>]
EFLAGS: 00010282
eax: 0000001c ebx: c2b7d918 ecx: 00000013 edx: 00000028
esi: 00009000 edi: c2ba6000 ebp: c02078f0 esp: c2ba7eb0
ds: 0018 es: 0018 ss: 0018
Process dmesg (pid: 740, process nr: 43, stackpage=c2ba7000)
Stack: 00000045 c2b7d918 c011dfed c0265838 c02078f0 c2b6d020 c0501000
c2b7d918
00009000 c011e079 c0265838 c2b7d918 00009000 c02078f0 00000007
00009000
00000000 00000000 c0265838 c011f70f c2b6d020 00009000 00000000
c2c70d00
Call Trace: [<c011dfed>] [<c011e079>] [<c011f70f>] [<c011c0df>]
[<c011c20e>] [<c
010f238>] [<c0109f3d>]
Code: 0f 0b 83 c4 0c 5b c3 90 8b 4c 24 04 8b 51 2c 85 d2 74 19 8b
Warning: trailing garbage ignored on Code: line
Text: 'Code: 0f 0b 83 c4 0c 5b c3 90 8b 4c 24 04 8b 51 2c 85 d2 74 19
8b '
Garbage: ' '

>>EIP: c011d7ac <__add_page_to_hash_queue+3c/44>
Trace: c011dfed <add_to_page_cache_unique+7d/88>
Trace: c011e079 <try_to_read_ahead+81/b0>
Trace: c011f70f <filemap_nopage+1fb/388>
Trace: c011c0df <do_no_page+4b/b4>
Trace: c011c20e <handle_mm_fault+c6/160>
Trace: c010f238 <do_page_fault+160/428>
Trace: c0109f3d <error_code+2d/34>
Code: c011d7ac <__add_page_to_hash_queue+3c/44> 00000000 <_EIP>: <===
Code: c011d7ac <__add_page_to_hash_queue+3c/44> 0: 0f
0b
ud2a <===
Code: c011d7ae <__add_page_to_hash_queue+3e/44> 2: 83 c4
0c
addl $0xc,%esp
Code: c011d7b1 <__add_page_to_hash_queue+41/44> 5:
5b
popl %ebx
Code: c011d7b2 <__add_page_to_hash_queue+42/44> 6:
c3
ret
Code: c011d7b3 <__add_page_to_hash_queue+43/44> 7:
90
nop
Code: c011d7b4 <remove_page_from_hash_queue+0/2c> 8: 8b 4c 24
04
movl 0x4(%esp,1),%ecx
Code: c011d7b8 <remove_page_from_hash_queue+4/2c> c: 8b 51
2c
movl 0x2c(%ecx),%edx
Code: c011d7bb <remove_page_from_hash_queue+7/2c> f: 85
d2
testl %edx,%edx
Code: c011d7bd <remove_page_from_hash_queue+9/2c> 11: 74
19
je c011d7d8 <remove_page_from_hash_queue+24/2c>
Code: c011d7bf <remove_page_from_hash_queue+b/2c> 13: 8b
00
movl (%eax),%eax
***************************************************************************

The two "invalid operand" oops started with a "kernel BUG at
filemap.c:69!"
error message.

I hope this report helps to kill all those big BUGs. :)

Regards,
Zoltan Boszormenyi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/