NTFS directory read lockup

Stanislav Meduna (stano@trillian.eunet.sk)
Sun, 17 Jan 1999 11:40:39 +0100 (CET)


Hello,

there is probably one more problem with NTFS.

I ran a script that does a large find through the whole
tree, including mounted NTFS partitions. After some time
the process blocked. I tracked the block down to one
specific directory, not unusual in any way compared to
other directories on the fs (compressed, 8 levels deep,
path length ca. 52, 35 files inside). Any operation that
tried to list the contents of the directory blocked the
process in the kernel: kill -9 had no effect, there
were complaints during reboot, etc.
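
A process that ignores kill -9 is usually in uninterruptible sleep,
shown as state D by ps. A minimal sketch of how that could be checked
from /proc/<pid>/stat follows; the function name and the sample line
are made up for illustration and are not taken from the affected
machine.

```python
def process_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain
    # spaces or ')', so split after the last closing parenthesis.
    rest = stat_line[stat_line.rfind(")") + 1:].split()
    return rest[0]

# Hypothetical sample line; a real one would be read from
# /proc/<pid>/stat for the pid of the blocked find or ls.
sample = "5282 (find) D 5281 5282 5000 0"
print(process_state(sample))  # state D would mean uninterruptible sleep
```

On a live system the line would come from open("/proc/%d/stat" % pid).read().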

No messages were produced by the kernel at the time
of blocking (but I probably overlooked an oops
that happened long before).

After the reboot the problem disappeared. I ran
the same script several more times. One find failed
(probably with the oops described below). The next time
the block happened again, this time in another directory,
but relatively near the previous one.

With debugging enabled via echo 4095 > /proc/sys/fs/ntfs,
nothing is reported during the blocking 'ls', although
output does appear for directories that work.

The find produced mysterious oopses, which
probably indicate a problem, but I don't see what
sys_newlstat has to do with cleanup_module.
The 'ls' produces no such oops.

Jan 17 10:51:20 trillian kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Jan 17 10:51:20 trillian kernel: current->tss.cr3 = 03d4d000, %cr3 = 03d4d000
Jan 17 10:51:20 trillian kernel: *pde = 00000000
Jan 17 10:51:20 trillian kernel: Oops: 0002
Jan 17 10:51:20 trillian kernel: CPU: 0
Jan 17 10:51:20 trillian kernel: EIP: 0010:[<c6825100>]
Jan 17 10:51:20 trillian kernel: EFLAGS: 00010202
Jan 17 10:51:20 trillian kernel: eax: 00000038 ebx: c3909ee4 ecx: 0000000e edx: 00000038
Jan 17 10:51:20 trillian kernel: esi: c0dada80 edi: 00000000 ebp: 00000038 esp: c3909e34
Jan 17 10:51:20 trillian kernel: ds: 0018 es: 0018 ss: 0018
Jan 17 10:51:20 trillian kernel: Process find (pid: 5282, process nr: 53, stackpage=c3909000)
Jan 17 10:51:20 trillian kernel: Stack: c6825bfd 00000000 c0dada80 00000038 c3909ee4 c185ed28 c68269b1 c3909ee4
Jan 17 10:51:20 trillian kernel: c0dada80 00000038 c3909ee4 c185ed28 c3909f28 c18bc800 00000000 00000003
Jan 17 10:51:20 trillian kernel: c185ed28 c6825be4 c6826004 c1bdc280 c682bd60 00000004 c3909ee4 00000800
Jan 17 10:51:20 trillian kernel: Call Trace: [<c6825bfd>] [<c68269b1>] [<c6825be4>] [<c6826004>] [<c682bd60>] [<c6826bf2>] [<c6828c37>]
Jan 17 10:51:20 trillian kernel: [<c682bd60>] [<c6825be4>] [<c6828cac>] [<c6824943>] [<c012cffe>] [<c0128a33>] [<c0128c10>] [<c0128cf0>]
Jan 17 10:51:20 trillian kernel: [<c0126e26>] [<c0108804>] [<c010002b>]
Jan 17 10:51:20 trillian kernel: Code: f3 a5 a8 02 74 02 66 a5 a8 01 74 01 a4 5e 5f c3 57 56 8b 7c
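
For reading the report above: on i386 the number after "Oops:" is the
page-fault error code, printed in hex. Assuming the usual bit layout
(bit 0: protection fault vs. page not present, bit 1: write vs. read,
bit 2: user vs. kernel mode), it can be decoded as in this minimal
sketch. The function name is made up for illustration.

```python
def decode_oops_code(code):
    """Decode an i386 page-fault error code as printed after 'Oops:'."""
    return {
        "cause": "protection fault" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }

# The value is printed in hex, so parse it as base 16.
info = decode_oops_code(int("0002", 16))
print(info)
```

For 0002 this gives a kernel-mode write to a page that is not present.
Together with edi = 00000000 and Code bytes starting with f3 a5
(repz movsl), that would be consistent with a copy to a NULL
destination. eax = 0x38 and ecx = 0xe (56 bytes, 14 longwords) suggest
the copy faulted on its first word, but that is an inference from the
registers, not something the log states.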

>>EIP: c6825100 <cleanup_module+307c/10fc8>
Trace: c6825bfd <cleanup_module+3b79/10fc8>
Trace: c68269b1 <cleanup_module+492d/10fc8>
Trace: c6825be4 <cleanup_module+3b60/10fc8>
Trace: c6826004 <cleanup_module+3f80/10fc8>
Trace: c682bd60 <cleanup_module+9cdc/10fc8>
Trace: c6826bf2 <cleanup_module+4b6e/10fc8>
Trace: c6828c37 <cleanup_module+6bb3/10fc8>
Trace: c682bd60 <cleanup_module+9cdc/10fc8>
Trace: c0126e26 <sys_newlstat+e/64>
Code: c6825100 <cleanup_module+307c/10fc8> 00000000 <_EIP>:
Code: c6825100 <cleanup_module+307c/10fc8> 0: f3 a5 repz movsl %ds:(%esi),%es:(%edi)
Code: c6825102 <cleanup_module+307e/10fc8> 2: a8 02 testb $0x2,%al
Code: c6825104 <cleanup_module+3080/10fc8> 4: 74 02 je 8 <_EIP+0x8> c6825108 <cleanup_module+3084/10fc8>
Code: c6825106 <cleanup_module+3082/10fc8> 6: 66 a5 movsw %ds:(%esi),%es:(%edi)
Code: c6825108 <cleanup_module+3084/10fc8> 8: a8 01 testb $0x1,%al
Code: c682510a <cleanup_module+3086/10fc8> a: 74 01 je d <_EIP+0xd> c682510d <cleanup_module+3089/10fc8>
Code: c682510c <cleanup_module+3088/10fc8> c: a4 movsb %ds:(%esi),%es:(%edi)
Code: c682510d <cleanup_module+3089/10fc8> d: 5e popl %esi
Code: c682510e <cleanup_module+308a/10fc8> e: 5f popl %edi
Code: c682510f <cleanup_module+308b/10fc8> f: c3 ret
Code: c6825110 <cleanup_module+308c/10fc8> 10: 57 pushl %edi
Code: c6825111 <cleanup_module+308d/10fc8> 11: 56 pushl %esi
Code: c6825112 <cleanup_module+308e/10fc8> 12: 8b 7c 00 00 movl 0x0(%eax,%eax,1),%edi
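
On the cleanup_module question: a symbol decoder of this kind maps each
address to the nearest preceding symbol it knows about. If a module
exports very few symbols, every address inside it may be reported as
cleanup_module+offset, so the name need not mean that the cleanup code
ran. A minimal sketch of such a lookup, with a two-entry table derived
from the offsets in the output above (the table and the resolve name
are illustrative, not actual ksymoops code):

```python
import bisect

# Symbol start addresses derived from the decoded output above
# (address minus reported offset). A real table would come from
# /proc/ksyms or System.map and hold many more entries.
SYMBOLS = sorted([
    (0xC0126E18, "sys_newlstat"),    # c0126e26 - 0xe
    (0xC6822084, "cleanup_module"),  # c6825100 - 0x307c
])
STARTS = [addr for addr, _ in SYMBOLS]

def resolve(addr):
    """Return (symbol, offset) for the nearest symbol at or below addr."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None  # address lies below every known symbol
    start, name = SYMBOLS[i]
    return name, addr - start

for a in (0xC6825100, 0xC6825BFD, 0xC0126E26):
    name, off = resolve(a)
    print(f"{a:08x} <{name}+{off:x}>")
```

With only cleanup_module known for the module, both module addresses
resolve to it with large offsets, much as in the trace. The faulting
code could then be any NTFS function located after that symbol.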

The environment is 2.2.0-pre6, NTFS as module, kmod, egcs 1.0.3.

This looks like an oops leaving some directory information
in a locked state, but I am certainly no expert in Linux
filesystem code. Do you have any ideas what else I could try
to help find out what happens here?

Regards

-- 
				Stano
