multiple bread()s causes oops

From: Matt_Domsch@Dell.com
Date: Mon Oct 09 2000 - 15:35:27 EST


> I'm still trying to read physical sectors in some partitioning code in
> linux/fs/partitions, and have made progress. Thanks for the pointers on
> set_blocksize(), that helps.
>
My ultimate goal is to read blocks 1, 2-34, n, n-32..n-1, where n is the
number of hardware sectors on the disk. However, when I sequentially read
blocks, I get the oopses below. (kernel 2.4.0-test9-final on i686 SMP and
UP, IA-64 SMP fails similarly.)

dev is a SCSI disk, /dev/sdd or the like.
set_blocksize(dev, 512)
bread(dev, 1, 512) - succeeds. - Use the data to figure out how many more
blocks to read, and where they are.
bread(dev, 2, 512) - succeeds.
bread(dev, 3, 512) - succeeds.
bread(dev, 4, 512) - succeeds (most of the time).
bread(dev, 5, 512) - oops, kernel dereferences NULL.

If I insert a delay between the breads {while (jiffies < j + HZ)
schedule();}, I can read more blocks, but it fails after 9-10 blocks rather
than 4-5. Or, if I insert a delay in bread() between calling ll_rw_block()
and wait_on_block(), it fails after 9-10 blocks.

It seems that the I/O completion is actually causing the NULL dereference.
It's neither in ll_rw_block() or wait_on_block() when it bombs, from what I
can tell.

I'm not acquiring any locks in my code before calling bread(). Should I?
I've tried getting the BKL first, no difference.

This has tripped me up for about a week now, so I'd really appreciate any
pointers.

Thanks,
Matt Domsch
Dell Enterprise Systems Group
Linux Development Team

SMP oops.txt
Unable to handle kernel NULL pointer dereference at virtual address 00000000
00000000
*pde = 37b11001
Oops: 0000
CPU: 0
EIP: 0010:[<00000000>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 00000000 ebx: c211dba0 ecx: 00000000 edx: c029bfa8
esi: 00000000 edi: 00000001 ebp: 0000001f esp: c029bf60
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c029b000)
Stack: c010d4c0 0000001f 00000000 c029bfa8 c029bfa8 c02d8be0 0000001f
c211dba0
       c010d6c1 0000001f c029bfa8 c211dba0 00000000 c029a000 c0109bf0
c029a000
       ffffe000 c010bbf8 c029a000 c029a000 00000000 c0109bf0 c029a000
ffffe000
Call Trace: [<c010d4c0>] [<c010d6c1>] [<c0109bf0>] [<c010bbf8>] [<c0109bf0>]
[<c0109c1e>] [<c0109ca2>]
Code: Bad EIP value.

>>EIP; 00000000 Before first symbol
Trace; c010d4c0 <handle_IRQ_event+60/90>
Trace; c010d6c1 <do_IRQ+a1/100>
Trace; c0109bf0 <default_idle+0/40>
Trace; c010bbf8 <ret_from_intr+0/20>
Trace; c0109bf0 <default_idle+0/40>
Trace; c0109c1e <default_idle+2e/40>
Trace; c0109ca2 <cpu_idle+52/70>

UP oops.txt
ksymoops 2.3.4 on i686 2.4.0-test9. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.0-test9/ (default)
     -m /root/linux/System.map (specified)

Warning (expand_objects): object /lib/aic7xxx.o for module aic7xxx has
changed since load
Unable to handle kernel paging request at virtual address fffffffc
c016d49b
*pde = 00001063
Oops: 0000
CPU: 0
EIP: 0010:[<c016d49b>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: ffffffff ebx: ffffffff ecx: 00000000 edx: 00000000
esi: ffffffff edi: 0000021c ebp: c1ef52e0 esp: f7adb77c
ds: 0018 es: 0018 ss: 0018
Process insmod (pid: 7, stackpage=f7adb000)
Stack: 00000001 c1ef52e0 00000000 00000000 c025cc00 c016d68c c1ef52e0
f7fba870
       00000002 00000000 c02ae580 00000000 c01204da c1ef52e0 c02c6600
c011d6cc
       c02c6600 c011d5d8 00000001 00000001 c02ae5a0 0000000e c011d4dd
c02ae5a0
Call Trace: [<c016d68c>] [<c01204da>] [<c011d6cc>] [<c011d5d8>] [<c011d4dd>]
[<c010bc81>] [<c010a8a0>]
       [<c0119e9a>] [<c014e8e7>] [<c021bd24>] [<c014ea8c>] [<c021bede>]
[<c0119ea9>] [<c014ebb4>] [<c014ed6b>]
       [<c014edf7>] [<c014ee79>] [<c014f14d>] [<f881abcd>] [<f881a5a0>]
[<f881ad58>] [<f881b053>] [<c019b7e1>]
       [<c01a2cec>] [<c0117751>] [<c0250040>] [<c01209c0>] [<c0120cd3>]
[<c0120876>] [<c0120a84>] [<c011d6cc>]
       [<c011d5d8>] [<c011d4dd>] [<c010bc81>] [<c010a8a0>] [<c0119ea9>]
[<c0132040>] [<c0217440>] [<c014f371>]
       [<c014f383>] [<c014e5f0>] [<c0119ea9>] [<c014e04c>] [<c01a69fa>]
[<c014e134>] [<c01a710a>] [<f8825260>]
       [<c019cf30>] [<f8804000>] [<f8804050>] [<f881e257>] [<f8825260>]
[<f8825260>] [<c011a7d2>] [<f8804048>]
       [<c010a7f7>]
Code: 8b 0c 82 89 d8 31 cf 8b 4d 18 01 c8 21 f0 8b 0c 82 89 d8 31

>>EIP; c016d49b <add_entropy_words+5b/d0> <=====
Trace; c016d68c <batch_entropy_process+3c/c0>
Trace; c01204da <tqueue_bh+3a/50>
Trace; c011d6cc <bh_action+1c/60>
Trace; c011d5d8 <tasklet_hi_action+38/60>
Trace; c011d4dd <do_softirq+5d/80>
Trace; c010bc81 <do_IRQ+a1/b0>
Trace; c010a8a0 <ret_from_intr+0/20>
Trace; c0119e9a <printk+15a/180>
Trace; c014e8e7 <unparse_buffer_head+27/140>
Trace; c021bd24 <tvecs+7e40/fba8>
Trace; c014ea8c <my_bread+8c/c0> (this is my function, bread + extra
printks)
Trace; c021bede <tvecs+7ffa/fba8>
Trace; c0119ea9 <printk+169/180>
Trace; c014ebb4 <ReadLBA+f4/170> (this is my function, calls my_bread())
Trace; c014ed6b <ReadGuidPartitionEntries+4b/70> (These next 4 are my
functions too, they don't touch the disk at all).
Trace; c014edf7 <ReadGuidPartitionTableHeader+67/a0>
Trace; c014ee79 <IsGuidPartitionTableValid+49/1b0>
Trace; c014f14d <add_gpt_partitions+9d/250>
Trace; f881abcd <[aic7xxx]aic7xxx_build_negotiation_cmnd+1cd/1e0>
Trace; f881a5a0 <[aic7xxx]aic7xxx_negotiation_complete+0/460>
Trace; f881ad58 <[aic7xxx]aic7xxx_buildscb+178/340>
Trace; f881b053 <[aic7xxx]aic7xxx_queue+133/140>
Trace; c019b7e1 <scsi_dispatch_cmd+1b1/230>
Trace; c01a2cec <scsi_request_fn+2ec/330>
Trace; c0117751 <schedule+281/410>
Trace; c0250040 <llc_oui+121e/143e>
Trace; c01209c0 <update_process_times+20/80>
Trace; c0120cd3 <do_timer+23/90>
Trace; c0120876 <update_wall_time+16/50>
Trace; c0120a84 <timer_bh+24/250>
Trace; c011d6cc <bh_action+1c/60>
Trace; c011d5d8 <tasklet_hi_action+38/60>
Trace; c011d4dd <do_softirq+5d/80>
Trace; c010bc81 <do_IRQ+a1/b0>
Trace; c010a8a0 <ret_from_intr+0/20>
Trace; c0119ea9 <printk+169/180>
Trace; c0132040 <set_blocksize+1a0/1e0>
Trace; c0217440 <tvecs+355c/fba8>
Trace; c014f371 <efi_partition+71/a0> (this is my function, called by
msdos_partition())
Trace; c014f383 <efi_partition+83/a0>
Trace; c014e5f0 <msdos_partition+200/3b0>
Trace; c0119ea9 <printk+169/180>
Trace; c014e04c <check_partition+8c/d0>
Trace; c01a69fa <sd_init_onedisk+71a/730>
Trace; c014e134 <grok_partitions+64/b0>
Trace; c01a710a <sd_finish+14a/1e0>
Trace; f8825260 <[aic7xxx]driver_template+0/6b>
Trace; c019cf30 <scsi_register_host+280/2b0>
Trace; f8804000 <_end+38527350/385273b0>
Trace; f8804050 <_end+385273a0/385273b0>
Trace; f881e257 <[aic7xxx]init_this_scsi_driver+17/40>
Trace; f8825260 <[aic7xxx]driver_template+0/6b>
Trace; f8825260 <[aic7xxx]driver_template+0/6b>
Trace; c011a7d2 <sys_init_module+422/4a0>
Trace; f8804048 <_end+38527398/385273b0>
Trace; c010a7f7 <system_call+33/38>
Code; c016d49b <add_entropy_words+5b/d0>
00000000 <_EIP>:
Code; c016d49b <add_entropy_words+5b/d0> <=====
   0: 8b 0c 82 mov (%edx,%eax,4),%ecx <=====
Code; c016d49e <add_entropy_words+5e/d0>
   3: 89 d8 mov %ebx,%eax
Code; c016d4a0 <add_entropy_words+60/d0>
   5: 31 cf xor %ecx,%edi
Code; c016d4a2 <add_entropy_words+62/d0>
   7: 8b 4d 18 mov 0x18(%ebp),%ecx
Code; c016d4a5 <add_entropy_words+65/d0>
   a: 01 c8 add %ecx,%eax
Code; c016d4a7 <add_entropy_words+67/d0>
   c: 21 f0 and %esi,%eax
Code; c016d4a9 <add_entropy_words+69/d0>
   e: 8b 0c 82 mov (%edx,%eax,4),%ecx
Code; c016d4ac <add_entropy_words+6c/d0>
  11: 89 d8 mov %ebx,%eax
Code; c016d4ae <add_entropy_words+6e/d0>
  13: 31 00 xor %eax,(%eax)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Oct 15 2000 - 21:00:13 EST