Re: [PATCH 3/4] jbd: abort when failed to log metadata buffers (rebased)

From: Hidehiro Kawai
Date: Tue May 20 2008 - 21:33:37 EST


Hi,

Jan Kara wrote:
>
> On Fri 16-05-08 19:26:57, Hidehiro Kawai wrote:
>
>>Jan Kara wrote:
>>
>>
>>>On Wed 14-05-08 13:49:51, Hidehiro Kawai wrote:
>>>
>>>
>>>>Subject: [PATCH 3/4] jbd: abort when failed to log metadata buffers
>>>>
>>>>If we fail to write metadata buffers to the journal space but
>>>>succeed in writing the commit record, stale data can be written
>>>>back to the filesystem as metadata in the recovery phase.
>>>>
>>>>To avoid this, when we fail to write out metadata buffers,
>>>>abort the journal before writing the commit record.
>>>>
>>>>Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@xxxxxxxxxxx>
>>>>---
>>>>fs/jbd/commit.c | 3 +++
>>>>1 file changed, 3 insertions(+)
>>>>
>>>>Index: linux-2.6.26-rc2/fs/jbd/commit.c
>>>>===================================================================
>>>>--- linux-2.6.26-rc2.orig/fs/jbd/commit.c
>>>>+++ linux-2.6.26-rc2/fs/jbd/commit.c
>>>>@@ -703,6 +703,9 @@ wait_for_iobuf:
>>>> 		__brelse(bh);
>>>> 	}
>>>>
>>>>+	if (err)
>>>>+		journal_abort(journal, err);
>>>>+
>>>> 	J_ASSERT (commit_transaction->t_shadow_list == NULL);
>>>
>>> Shouldn't this rather go further down, just before
>>>journal_write_commit_record()? We should also abort if writing revoke
>>>records etc. failed, shouldn't we?
>>
>>Unlike metadata blocks, each revoke block has a descriptor with the
>>sequence number of the committing transaction. If we fail to write
>>a revoke block, there should be an old control block, metadata block,
>>or zero-filled block where we tried to write the revoke block.
>>In the recovery process, this old invalid block is detected by
>>checking its magic number and sequence number, and the transaction
>>is then ignored even if we succeeded in writing the commit record.
>>So I think we don't need to check for errors just after writing
>>revoke records.
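
The check described above can be sketched roughly as follows. This is a
userspace illustration only, not the code in fs/jbd/recovery.c. The
header layout (big-endian magic, block type, sequence) and the magic
value are assumed to match jbd's journal_header_t and JFS_MAGIC_NUMBER;
every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to equal jbd's JFS_MAGIC_NUMBER; treat as illustrative. */
#define SKETCH_MAGIC 0xc03b3998U

/* Assumed on-disk header layout: three big-endian 32-bit fields. */
#define SKETCH_OFF_MAGIC     0
#define SKETCH_OFF_BLOCKTYPE 4
#define SKETCH_OFF_SEQUENCE  8

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t sketch_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a 32-bit value into a byte buffer in big-endian order. */
static void sketch_put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/*
 * Return 1 if the block carries the journal magic and the sequence
 * number of the transaction being replayed, 0 otherwise. A stale
 * control block (older sequence), a metadata block, or a zero-filled
 * block left behind by a failed write is expected to fail this test.
 */
static int sketch_block_valid(const unsigned char *block, uint32_t seq)
{
	assert(block != NULL);
	if (sketch_get_be32(block + SKETCH_OFF_MAGIC) != SKETCH_MAGIC)
		return 0;
	return sketch_get_be32(block + SKETCH_OFF_SEQUENCE) == seq;
}
```

With a block that was never overwritten because the revoke write
failed, sketch_block_valid() returns 0 for the committing sequence
number, which is why the transaction would be ignored on replay.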
>
> Yes, I agree that skipping such a check will not cause data corruption, but
> I still think that if we fail to properly commit a transaction, we
> should detect the error and abort the journal...

I see. I'll move the abort to just before
journal_write_commit_record() in the next version.
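
A minimal sketch of the intended ordering, as a userspace model rather
than the actual patch: errors from both the metadata writes and the
revoke writes are checked at a single point, and the journal is aborted
before the commit record would be written. All sketch_* names are
hypothetical, and skipping the commit record on an aborted journal is
an assumption of this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the journal state relevant here. */
struct sketch_journal {
	bool aborted;        /* set once the journal has been aborted */
	int  abort_errno;    /* error that caused the abort */
	bool commit_written; /* true if a commit record was written */
};

/* Model of aborting the journal: record the first error only. */
static void sketch_abort(struct sketch_journal *j, int err)
{
	if (!j->aborted) {
		j->aborted = true;
		j->abort_errno = err;
	}
}

/*
 * Model of the tail of a commit. metadata_err and revoke_err are the
 * results of the earlier write phases (0 on success, negative errno on
 * failure). The single check sits just before the commit record.
 */
static int sketch_finish_commit(struct sketch_journal *j,
				int metadata_err, int revoke_err)
{
	int err;

	assert(j != NULL);
	err = metadata_err ? metadata_err : revoke_err;

	if (err)
		sketch_abort(j, err);

	/* Assumption: no commit record is written once aborted. */
	if (!j->aborted)
		j->commit_written = true;

	return err;
}
```

In this model a failed revoke write alone is enough to abort, so no
commit record is ever paired with an incompletely logged transaction.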

Thanks,
--
Hidehiro Kawai
Hitachi, Systems Development Laboratory
Linux Technology Center
