Re: [SUGGESTION]: drop virtual merge accounting in I/O requests

From: Andi Kleen
Date: Tue Jul 15 2008 - 10:06:29 EST


Mikulas Patocka wrote:
> On Tue, 15 Jul 2008, Andi Kleen wrote:
>
>> Mikulas Patocka wrote:
>>>>> BTW. what should the block device driver do when it receives a mapping
>>>>> error? (if it aborts the request and it was a write request, there
>>>>> will be data corruption).
>>>>
>>>> I'm not sure how an aborted request can corrupt data on disk.
>>>
>>> Writes are done by an async daemon and no one checks their
>>> completion status. If there are three writes, to the directory, inode
>>> table and inode bitmap, and one of these writes fails, there's no code
>>> to undo the other two. So the filesystem will be corrupted on write
>>> failure.
>>
>> Normally journaling in ordered mode takes care of that. The transaction
>> is not committed until all earlier data has been successfully written.
>
> And if there was a write error, then what happens? Retry? Blocking of any
> further updates?

The file system is remounted read-only and the transaction is not
committed. The journal is then replayed on the next mount. A journal
write error is handled similarly.
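
To make the ordering concrete, here is a toy userspace model of that
behaviour. It is only a sketch, not jbd or any real file system code:
struct toy_fs, toy_write_block() and toy_commit() are hypothetical names
invented for illustration, and the "device error" is just a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_fs {
    bool read_only;   /* set once an I/O error has been seen */
    int  committed;   /* number of transactions that got a commit record */
};

/* Pretend to write one block; 'fail' simulates a device error. */
static int toy_write_block(bool fail)
{
    return fail ? -1 : 0;
}

/*
 * Ordered-mode style commit: write every block of the transaction first,
 * and write the commit record only if all of them succeeded.
 * Returns 0 on commit, -1 if the transaction was abandoned.
 */
static int toy_commit(struct toy_fs *fs, const bool *fail, int nblocks)
{
    assert(fs != NULL && fail != NULL);

    if (fs->read_only)
        return -1;                  /* no further updates after an error */

    for (int i = 0; i < nblocks; i++) {
        if (toy_write_block(fail[i]) != 0) {
            fs->read_only = true;   /* stop modifying the file system */
            return -1;              /* commit record is never written */
        }
    }

    fs->committed++;                /* stands in for the commit record */
    return 0;
}
```

In this model a transaction whose blocks all succeed returns 0, one with a
failed block returns -1 and flips read_only, and every later call is
refused until the (not modelled) remount and journal replay.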

>
>> And even the other file systems typically go read-only
>> on an I/O error to prevent further corruption.
>
> There is no interface by which a filesystem could query whether a buffer
> marked with mark_buffer_dirty was not written. Or is there?

For journaled metadata at least, the file system usually checks
synchronously (e.g. by using sync_dirty_buffer() and then handling the
commit only once all I/O has completed successfully). For normal data it
is handled by the normal VFS functions.
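
As a rough userspace analogue of "write, wait, and look at the result"
(not the in-kernel sync_dirty_buffer() path itself), the same pattern can
be sketched with write() and fsync(). The helper name write_and_sync() is
made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Write the whole buffer, then wait for it to reach stable storage.
 * Returns 0 on success or a negative errno value, so the caller can
 * decide whether it is safe to go on (e.g. to write a commit record).
 */
static int write_and_sync(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, just retry */
            return -errno;          /* real write error */
        }
        if (n == 0)
            return -EIO;            /* no progress, treat as an error */
        p += n;
        len -= (size_t)n;
    }

    if (fsync(fd) != 0)
        return -errno;              /* error reported at flush time */

    return 0;
}
```

A caller would only proceed to the next step when the return value is 0;
any negative value means the data cannot be assumed to be on disk.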

-Andi

