Re: 2.6.23.1: mdadm/raid5 hung/d-state

From: Justin Piszcz
Date: Sun Nov 04 2007 - 08:42:24 EST




On Sun, 4 Nov 2007, BERTRAND Joël wrote:

Justin Piszcz wrote:
# ps auxww | grep D
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 273 0.0 0.0 0 0 ? D Oct21 14:40 [pdflush]
root 274 0.0 0.0 0 0 ? D Oct21 13:00 [pdflush]

After several days/weeks, this is the second time this has happened, while doing regular file I/O (decompressing a file), everything on the device went into D-state.

Same observation here (kernel 2.6.23). I can see this bug when I try to synchronize a raid1 volume over iSCSI (each element is a raid5 volume), or sometimes only with a 1,5 TB raid5 volume. When this bug occurs, md subsystem eats 100% of one CPU and pdflush remains in D state too. What is your architecture ? I use two 32-threads T1000 (sparc64), and I'm trying to determine if this bug is arch specific.

Regards,

JKB


Using x86_64 here (Q6600/Intel DG965WH).

Justin.