2.4.1 loopback bug

From: John Langford (l_k_account@crush.hunch.net)
Date: Tue Feb 13 2001 - 14:22:05 EST


There seems to be a heisenbug embedded in the 2.4 loopback driver. The
symptom is that at some point a kernel call just fails to return. I've
tried to reduce this to the simplest possible example and came up with the
following trace for a generic 2.4.1 kernel.

[root@crush jl]# ls -l bigrandom
-rw-r--r-- 1 jl 500 2097152000 Feb 6 12:16 bigrandom
[root@crush jl]# /sbin/losetup /dev/loop1 bigrandom
Feb 6 14:46:24 crush kernel: loop: enabling 8 loop devices
[root@crush jl]# strace /sbin/mke2fs /dev/loop1
lseek(3, 1745338368, SEEK_SET) = 1745338368
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,32768) = 32768
lseek(3, 1745371136, SEEK_SET) = 1745371136
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,32768

This last write never returned, the mke2fs process became unkillable, and
load went to '4'. There were no interesting messages in the log. The
thing which makes this bug nasty is that it surfaces at a random
time. I've had it happen on a file size of 1GB, but it only reliably
happens on a 2GB file.

Is anyone working on this? Or do you have suggestions for debugging it?

-John

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Feb 15 2001 - 21:00:22 EST