Re: 2.6.19 file content corruption on ext3

From: Nick Piggin
Date: Mon Dec 18 2006 - 05:50:19 EST


Andrew Morton wrote:
On Mon, 18 Dec 2006 18:22:42 +1100
Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:

>>Yes I could believe it the corruption is caused by something else
>>completely.
>
>
> Think so. We do have a problem here, but only on threaded apps, I believe.
> rtorrent doesn't appear to be threaded, and the bug is hit on non-preempt
> UP.

I think (see below) that it does not apply only to threaded apps. But
it would need one of SMP or PREEMPT to trigger.


After try_to_free_buffers detaches the buffers from the page, a
pagefault can come in, and mark the pte writeable, then set_page_dirty
(which finds no buffers, so only sets PG_dirty).

The page can now get dirtied through this mapping.

try_to_free_buffers then goes on to clean the page and ptes.


try_to_free_buffers() isn't called against a page which doesn't have
buffers. It'll oops.

Sure. But I think the race exists... I'll try spelling it out in
the conventional way:

try_to_free_buffers()
drop_buffers() (succeeds)

** preempt here or run right-hand thread on 2nd CPU in SMP **

do_no_page()
set_page_dirty()

[now modify the page via this mapping
(from this process or a concurrent thread)]


clear_page_dirty() (clears PG_dirty + pte dirty, oops)


--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/