Re: [discuss] Re: 2.6.16-rc5-mm3: spinlock bad magic on CPU#0 onAMD64

From: Andrew Morton
Date: Sun Mar 12 2006 - 18:24:17 EST


Andi Kleen <ak@xxxxxxx> wrote:
>
> > Still. It seems that what's happened is that we took a pagefault while
> > reiserfs had a transaction open. The fault is against a mmapped ext3 file
> > and we ended up in the recently-reworked ext3_get_block() which tests
> > journal_current_handle() to work out whether we're in a write or a read.
> > oops. The presence of reiserfs journal_info makes it decide it's a write,
> > not a read so it starts treating a reiserfs journal_info as an ext3 one.
> >
> > The code used to work OK because it was only for direct-IO, which doesn't
> > get recurred into like this. But it got used for regular I/O in -mm.
>
> Oops. Can this happen in more situations?

I don't _think_ so, but it's pretty scary. The code's been this way for a
while.

Typical scenario:

reiserfs_journal_start - sets current->journal_info
copy_from_user
pagefault
ext3_readpages()

As long as ext3_readpage[s]() doesn't try to start a transaction we're OK.
And it shouldn't, if create==0.

Fortunately filemap_nopage() doesn't do atime updates. If it did, things
would get messy.

We do have deadlock possibilities in there - filemap_nopage() does
lock_page() inside journal_start(), whereas generic_file_write() does
journal_start() inside lock_page(). Chris Mason and I have stared
unhappily at that a few times. Hard to fix.

But I don't _think_ we have any more journal_start() recursions like this -
ext3 tends to get pretty noisy if it detects that in unexpected places.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/