Hi Juan,
I'm running 2.4.0-test1-ac10 on a 2-CPU Alpha, and I get a BUG() call
from discard_buffer() when an unlink(2) and a sync(2) race such that
sync_buffers() waits for a locked buffer in the code below after
discard_buffer() appears to have locked it. My guess is that
discard_buffer() got in and locked the buffer again after the I/O
completed, before sync_buffers() was scheduled again?
sync_buffers()
...
	if (buffer_locked(bh)) {
		/* Buffer is locked; skip it unless wait is
		 * requested AND pass > 0.
		 */
		if (!wait || !pass) {
			retry = 1;
			continue;
		}
		atomic_inc(&bh->b_count);
		spin_unlock(&lru_list_lock);
		wait_on_buffer(bh);
		atomic_dec(&bh->b_count);
		goto repeat;
	}
Should discard_buffer() unlock the buffer, give up its locks and the
CPU, and try again if the buffer is busy?
The sync() task is in state TASK_RUNNING. The buffer contents and
backtraces of the two tasks are below.
Thanks for the help!
Anne
struct buffer_head {
  b_next = 0x0,
  b_blocknr = 0x24671,
  b_size = 0x1000,
  b_list = 0x1,
  b_dev = 0x810,
  b_count = {
    counter = 0x2
  },
  b_rdev = 0x810,
  b_state = 0x4,
  b_flushtime = 0x1c8a345,
  b_next_free = 0xfffffc001bfa8540,
  b_prev_free = 0xfffffc00331b08c0,
  b_this_page = 0xfffffc001bfa8540,
  b_reqnext = 0x0,
  b_pprev = 0x0,
  b_data = 0xfffffc00244a0000 "",
  b_page = 0xfffffc00016ea9e0,
  b_end_io = 0xfffffc000035e5a0 <end_buffer_io_sync>,
  b_dev_id = 0x0,
  b_rsector = 0x123388,
  b_wait = {
    lock = {
      lock = 0x0,
      on_cpu = 0xffffffff,
      line_no = 0x0,
      previous = 0x0,
      task = 0x0,
      base_file = 0xfffffc00005302cb "none"
    },
    task_list = {
      next = 0xfffffc007d91be58,
      prev = 0xfffffc007d91be58
    },
    __magic = 0xfffffc001bfa8dd8,
    __creator = 0xfffffc000035f220
  },
  b_kiobuf = 0x0
}
PID: 608 TASK: fffffc007d89c000 CPU: 0 COMMAND: "usex"
#0 [fffffc007d89fbd0] crash_save_current_state at fffffc000031e8bc
#1 [fffffc007d89fbe0] panic at fffffc0000328bc4
#2 [fffffc007d89fc70] die_if_kernel at fffffc00003113fc
#3 [fffffc007d89fca0] do_entIF at fffffc000031151c
EFRAME: fffffc007d89fcb0 R24: fffffc00005342b8
R0: 0000000000000007 R25: 0000000000000001
R1: 0000000000000002 R26: fffffc000035fb84
<discard_buffer+356>
R2: 0000000000000001 R27: fffffc000031d260
R3: 0000000000000000 R28: 0000000000000000
R4: 0000000004000000 HAE: 0000000000000000
R5: 0000000000000008 TRAP_A0: 0000000000000001
R6: 0000000000000001 TRAP_A1: fffffc0000534ca1
R7: 0000000000000020 TRAP_A2: 000000000000051d
R8: fffffc007d89c000 PS: 0000000000000000
R19: 0000000000000000 PC: fffffc000035fbac
<discard_buffer+396>
R20: fffffc000056ca10 GP: fffffc00005993a0
R21: fffffc007ffdb150 R16: fffffc0000575e90
R22: 0000000000000001 R17: fffffc0000534ca1
R23: fffffc00005ac050 R18: 000000000000051d
#4 [fffffc007d89fd98] discard_buffer at fffffc000035fbac
#5 [fffffc007d89fdd8] block_destroy_buffers at fffffc000035fe48
#6 [fffffc007d89fdf8] truncate_all_inode_pages at fffffc00003477bc
#7 [fffffc007d89fe38] iput at fffffc000037c144
#8 [fffffc007d89fe68] d_delete at fffffc0000379cac
#9 [fffffc007d89fe78] vfs_unlink at fffffc0000370428
#10 [fffffc007d89feb8] sys_unlink at fffffc0000370604
#11 [fffffc007d89ff18] entSys at fffffc0000310ca0
PID: 601 TASK: fffffc007d930000 CPU: 0 COMMAND: "usex"
#0 [fffffc007d933db8] schedule at fffffc000032401c
#1 [fffffc007d933e08] __wait_on_buffer at fffffc000035cf78
#2 [fffffc007d933ea8] sync_buffers at fffffc000035d270
#3 [fffffc007d933ef8] fsync_dev at fffffc000035d56c
#4 [fffffc007d933f08] sys_sync at fffffc000035d598
#5 [fffffc007d933f18] entSys at fffffc0000310ca0
This archive was generated by hypermail 2b29 : Thu Jun 15 2000 - 21:00:16 EST