Re: [1/6] 2.6.21-rc4: known regressions

From: Linus Torvalds
Date: Thu Mar 22 2007 - 11:25:35 EST




On Thu, 22 Mar 2007, Nick Piggin wrote:
>
> Nothing sleeps on PageUptodate, so I don't think that could explain it.

Good point. I forget that we just test "uptodate", but then always sleep
on "locked".

> The fs: fix __block_write_full_page error case buffer submission patch
> does change the locking, but I'd be really suprised if that was the
> problem, because it changes locking to match the regular non-error path
> submission.

I'd agree, except something clearly has changed ;^)

> > Alternatively, maybe it really is an _io_ problem (and the buffer-head thing
> > is just a red herring, and it could happen to other IO, it's just that
> > metadata IO uses buffer heads), and it's the scheduler changes since
> > 2.6.20..
>
> I see what you mean. Could it be an ext3 or jbd change I wonder?

jbd hasn't changed since 2.6.20, and the ext3 changes are mostly
things like const'ness fixes. And others were things like changing
"journal_current_handle()" into "ext3_journal_current_handle()", which
looked exciting considering that the hung processes were waiting for the
journal, but the fact is, that's just an inline function that just calls
the old function, so..

But interestingly, there *is* a "EA block reference count racing fix"
that does move a lock_buffer()/unlock_buffer() to cover a bigger area. It
looks "obviously correct", but maybe there's a deadlock possibility there
with ext3_forget() or something?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/