Re: [PATCH] firewire: net: fix panic in fwnet_write_complete

From: Stefan Richter
Date: Mon Jan 18 2010 - 19:44:46 EST


Stefan Richter wrote:
> The fix is to add checks in the tx soft IRQ and in the tasklet to
> determine who of these two is the last referrer to the transaction
> object. Then handle the cleanup of the object by the last referrer
> rather than assuming that the tasklet is always the last one.
...
> ÐÐÑÑ, could you give this a try?
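
For illustration only, here is a rough userspace sketch of the "last
referrer cleans up" idea quoted above. It is not the patch itself and not
firewire-net code: struct tx_obj, tx_obj_new() and tx_obj_put() are
hypothetical names, and a C11 atomic counter stands in for whatever
checks the real soft IRQ and tasklet paths perform.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transmit transaction object. */
struct tx_obj {
	atomic_int refs;   /* paths that may still touch the object */
	int *freed_flag;   /* demo only: set to 1 when the object is freed */
};

/* Create an object referenced by two paths (e.g. soft IRQ and tasklet). */
struct tx_obj *tx_obj_new(int *freed_flag)
{
	struct tx_obj *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	atomic_init(&t->refs, 2);
	t->freed_flag = freed_flag;
	return t;
}

/*
 * Drop one reference. Only the caller that drops the last reference
 * frees the object, so neither path has to assume it runs last.
 * Returns true if this call freed the object.
 */
bool tx_obj_put(struct tx_obj *t)
{
	/* atomic_fetch_sub() returns the value held before the subtraction */
	if (atomic_fetch_sub(&t->refs, 1) != 1)
		return false;

	*t->freed_flag = 1;
	free(t);
	return true;
}
```

Each path calls tx_obj_put() once when it is done with the object; the
order in which the two paths finish no longer matters.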

My own testing on a dual core box --- peered with another Linux box
which ran the older eth1394 --- worked OK so far for transfers of
massive files (> 4 GiB) back and forth via FTP and ssh running on a text
console.

But in my first attempt to use FTP on X11 --- i.e. with more concurrent
interrupt sources --- the firewire-net box crashed very soon. In that
test I used Dolphin of KDE as FTP client, and the crash already happened
after Dolphin had loaded and displayed the remote home directory and was
peeking into files for preview data. I got the following trace:

------------[ cut here ]------------
kernel BUG at mm/slab.c:2885!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC

last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:03:03.0/fw0/units
Modules linked in: firewire_net firewire_ohci firewire_core netconsole
nfs lockd sunrpc i915 drm_kms_helper drm i2c_algo_bit snd_hda_codec_idt
snd_hda_intel snd_hda_codec applesmc led_class input_polldev snd_pcm rtc
i2c_i801 snd_timer sg sky2 snd video backlight snd_page_alloc thermal
output button


Pid: 4267, comm: kio_thumbnail Not tainted 2.6.33-rc4 #2
Mac-F4208EC8/Macmini1,1
EIP: 0060:[<c1080e4f>] EFLAGS: 00010006 CPU: 0
EIP is at cache_free_debugcheck+0x1e8/0x2e8
EAX: f8efddca EBX: f2008000 ECX: 65727661 EDX: 00c38e0a
ESI: d84156c5 EDI: f707fe40 EBP: f07fddc0 ESP: f07fdd80
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process kio_thumbnail (pid: 4267, ti=f07fc000 task=f079b6e8
task.ti=f07fc000)
Stack:
 f2008e20 00000000 0000000c c11c6f96 00640100 f2008000 635688c0 d84156c5
 f2008ec0 00000082 f2008e18 00000000 00000000 f715ff30 f707fe40 f2008e20
 f07fddd8 c108137c 00000282 f2008e20 f2008e3c 00000060 f07fdde4 c11c6f96

Call Trace:
? __kfree_skb+0x6e/0x71
? kmem_cache_free+0x56/0xb0
? __kfree_skb+0x6e/0x71
? kfree_skb+0x2b/0x2d
? unix_stream_recvmsg+0x3c3/0x48d
? file_read_actor+0x74/0xcc
? sock_aio_read+0xf2/0x107
? do_sync_read+0x89/0xc7


This was all that made it out via netconsole. This was on 2.6.33-rc4
and I think I am going back to 2.6.32.y to eliminate unrelated .33-rc
flakiness from my testing.
--
Stefan Richter
-=====-==-=- ---= =--==
http://arcgraph.de/sr/