Re: Linux 2.6.35-rc2

From: Tejun Heo
Date: Sun Jun 06 2010 - 11:49:23 EST


Hello, Torsten, Linus.

On 06/06/2010 04:19 PM, Linus Torvalds wrote:
>> [ 90.040053] general protection fault: 0000 [#1] SMP
>> [ 90.045062] last sysfs file: /sys/devices/pci0000:00/0000:00:06.0/0000:05:06.0/resource
>> [ 90.050007] CPU 0
>> [ 90.050007] Modules linked in: sg
>> [ 90.050007]
>> [ 90.050007] Pid: 335, comm: kblockd/0 Not tainted 2.6.35-rc2 #1 KFN5-D SLI/KFN5-D SLI
>> [ 90.050007] RIP: 0010:[<ffffffff8135aa64>] [<ffffffff8135aa64>] ata_find_dev+0x24/0x90
>> [ 90.050007] RSP: 0018:ffff88007ffdbda0 EFLAGS: 00010082
>> [ 90.050007] RAX: 0720072007200720 RBX: ffff88007ffc7000 RCX: 0720072007202558
>> [ 90.050007] RDX: ffff880007009e38 RSI: 0000000000000000 RDI: ffff880007008000
>> [ 90.050007] RBP: ffff880006cef700 R08: 0000000000000001 R09: 0000000000000008
>> [ 90.050007] R10: 0000000000000000 R11: ffff88000723edb0 R12: ffff88007f3a3800
>> [ 90.050007] R13: ffff880007008000 R14: ffffffff81340f80 R15: ffff88007ffc7138
>> [ 90.050007] FS: 00007f558bc58700(0000) GS:ffff880001c00000(0000) knlGS:0000000000000000
>> [ 90.050007] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 90.050007] CR2: 00007fffa9653000 CR3: 0000000006429000 CR4: 00000000000006f0
>> [ 90.050007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 90.050007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
...
>> (gdb) list *0xffffffff8135aa64
>> 0xffffffff8135aa64 is in ata_find_dev (include/linux/libata.h:1201).
>> 1196 return ap->nr_pmp_links != 0;
>> 1197 }
>> 1198
>> 1199 static inline int ata_is_host_link(const struct ata_link *link)
>> 1200 {
>> 1201 return link == &link->ap->link || link == link->ap->slave_link;
>> 1202 }
>> 1203 #else /* CONFIG_SATA_PMP */
>> 1204 static inline bool sata_pmp_supported(struct ata_port *ap)
>> 1205 {

Hmmm... that's really odd. An ata_port contains one ata_link
structure in ap->link which is initialized by ata_link_init() during
ata_port_alloc(), so ap->link.ap == ap should always hold.
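
To make the invariant concrete, here is a minimal userspace sketch. The
sketch_* structures and helpers are hypothetical, stripped-down stand-ins
and not the real libata definitions; only the embedded link, the back
pointer, and the host-link test quoted above are modelled.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_port;

/* Hypothetical stand-in for struct ata_link: only the back pointer. */
struct sketch_link {
	struct sketch_port *ap;
};

/* Hypothetical stand-in for struct ata_port: the link is embedded. */
struct sketch_port {
	struct sketch_link link;
	struct sketch_link *slave_link;
};

/* Models what link initialization does for the back pointer. */
void sketch_link_init(struct sketch_port *ap, struct sketch_link *link)
{
	link->ap = ap;
}

/* Models port allocation: the embedded link points back at its port. */
void sketch_port_init(struct sketch_port *ap)
{
	ap->slave_link = NULL;
	sketch_link_init(ap, &ap->link);
	assert(ap->link.ap == ap);
}

/* The invariant that appears to be violated in the oops. */
int sketch_port_invariant_holds(const struct sketch_port *ap)
{
	return ap->link.ap == ap;
}

/* Same shape as the ata_is_host_link() line in the gdb listing. */
int sketch_is_host_link(const struct sketch_link *link)
{
	return link == &link->ap->link || link == link->ap->slave_link;
}
```

If link->ap is overwritten after initialization, the first comparison only
computes an address, but the second one has to load slave_link through the
bad pointer.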

In the above case, ap->link.ap contains 0x0720072007200720 (RAX)
instead of the proper address 0xffff880007008000 (RDI), and
ata_is_host_link() oopses trying to dereference
ap->link.ap->slave_link.
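
The register values can be cross-checked with a little arithmetic. The
sketch below assumes 48-bit virtual addressing on x86-64 and uses
hypothetical sketch_* helper names. It checks that RAX is not a canonical
address, which would explain a general protection fault rather than a page
fault, and that RDX - RDI and RCX - RAX give the same offset, which is
consistent with RDX being &ap->link and RCX being &link->ap->link.

```c
#include <assert.h>
#include <stdint.h>

/* Register values copied from the oops quoted above. */
#define OOPS_RAX 0x0720072007200720ULL
#define OOPS_RCX 0x0720072007202558ULL
#define OOPS_RDX 0xffff880007009e38ULL
#define OOPS_RDI 0xffff880007008000ULL

/*
 * Assumption: 48-bit virtual addresses. An address is canonical when
 * bits 63..47 are all zeros or all ones.
 */
int sketch_is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffffULL;
}

/* Distance from a base pointer to a member address. */
uint64_t sketch_offset(uint64_t base, uint64_t member)
{
	return member - base;
}

/* Runs the checks described in the lead-in. */
void sketch_check_oops_registers(void)
{
	assert(sketch_is_canonical(OOPS_RDI));
	assert(!sketch_is_canonical(OOPS_RAX));
	assert(sketch_offset(OOPS_RDI, OOPS_RDX) ==
	       sketch_offset(OOPS_RAX, OOPS_RCX));
}
```

Both differences come out to 0x1e38, so the compiler appears to have
computed &link->ap->link from the bogus pointer without faulting and only
trapped on the later load.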

Odd, I can't think of any change which could cause such a difference
in behavior. Where would 0x0720072007200720 come from? That's a
rather strange value to be there and it doesn't seem to be a magic
value. I'll see whether I can reproduce the problem. Can you please
try without KMS just in case?

Thanks.

--
tejun