Re: 2.6.23.1: mdadm/raid5 hung/d-state

From: Justin Piszcz
Date: Mon Nov 05 2007 - 13:36:23 EST




On Mon, 5 Nov 2007, Dan Williams wrote:

On 11/4/07, Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx> wrote:


On Mon, 5 Nov 2007, Neil Brown wrote:

On Sunday November 4, jpiszcz@xxxxxxxxxxxxxxx wrote:
# ps auxww | grep D
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 273 0.0 0.0 0 0 ? D Oct21 14:40 [pdflush]
root 274 0.0 0.0 0 0 ? D Oct21 13:00 [pdflush]

This is the second time this has happened after several days/weeks:
while doing regular file I/O (decompressing a file), everything on the
device went into D-state.
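
A note on the command above: `grep D` matches any line containing a capital D, including the header line and unrelated commands. A narrower filter, sketched here under the assumption that the output uses the standard `ps aux` column layout (STAT is the 8th field), keeps only the header and the rows whose state starts with D. The function name `filter_dstate` is made up for this illustration.

```shell
# Keep the header line plus any row whose STAT column (8th field in
# `ps aux` output) begins with D, i.e. uninterruptible sleep.
# filter_dstate is a hypothetical helper name, not an existing tool.
filter_dstate() {
    awk 'NR == 1 || $8 ~ /^D/'
}

# Usage on a live system:
#   ps auxww | filter_dstate
```

Matching on the STAT field rather than on the whole line avoids false hits from user names or command lines that happen to contain a D.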

At a guess (I haven't looked closely) I'd say it is the bug that was
meant to be fixed by

commit 4ae3f847e49e3787eca91bced31f8fd328d50496

except that patch applied badly and needed to be fixed with
the following patch (not in git yet).
These have been sent to stable@ and should be in the queue for 2.6.23.2.


Ah, thanks Neil. I will update as soon as it is released.


Are you seeing the same "md thread takes 100% of the CPU" that Joël is
reporting?


Yes, in another e-mail I posted the top output with md3_raid5 at 100%.

Justin.