Re: Question regarding concurrent accesses through block device and fs

From: Nick Piggin
Date: Sun Mar 01 2009 - 10:33:39 EST


On Monday 02 March 2009 01:42:55 Francis Moreau wrote:
> [ Sorry for taking so long to answer, but I was away, I'm slow, and there
> is a lot of complex code to dig through! ]
>
> Nick Piggin <nickpiggin@xxxxxxxxxxxx> writes:
> > On Saturday 21 February 2009 01:10:24 Francis Moreau wrote:
>
> [...]
>
> >> - looking at unmap_underlying_metadata(), there's no code that deals
> >> specifically with metadata buffers. It gets the buffer and unmaps it,
> >> whatever type of data it contains.
> >
> > That's why I say it only really works for buffer cache used by the same
> > filesystem that is now known to be unused.
>
> Hmm, I still don't understand what you mean by this; sorry for being slow.

OK, the "buffercache", the cache of block device contents, is normally
thought of as metadata when it is being used by the filesystem (e.g.
usually via bread() etc.), or as data when it is being read/written from
userspace via /dev/<blockdevice>.

In the former case, the buffer.c/filesystem code together know when a
metadata buffer is unused (because the filesystem has deallocated it),
so unmap_underlying_metadata() will work there.

And it is insane to have a mounted filesystem and have userspace working
on the same block device at the same time, so unmap_underlying_metadata()
doesn't have to care about that case. (IIRC some filesystem tools can do
this, but there are obviously a lot of tricks to it.)
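
To make the bookkeeping above a bit more concrete, here is a rough userspace
toy model. It is not the kernel code: every toy_* name is made up for
illustration, and it only models the idea that a stale buffercache alias for
a block the filesystem has already deallocated gets its dirty state dropped
when that block is reused, so the flusher never writes it out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBLOCKS 8

/* Hypothetical stand-in for a buffer_head: one cached block of the device. */
struct toy_buffer {
    bool cached;   /* an alias for this block exists in the buffercache */
    bool dirty;    /* cached contents still need to go to disk */
};

struct toy_cache {
    struct toy_buffer buf[TOY_NBLOCKS];
};

/* The filesystem dirties a metadata block through the buffercache. */
static void toy_mark_metadata_dirty(struct toy_cache *c, size_t block)
{
    assert(block < TOY_NBLOCKS);
    c->buf[block].cached = true;
    c->buf[block].dirty = true;
}

/*
 * Called when the filesystem reuses `block` for something new. The
 * filesystem has already deallocated the old metadata, so the old alias
 * is known to be unused and its dirty state can simply be dropped.
 * Returns true if a stale alias was found.
 */
static bool toy_unmap_underlying(struct toy_cache *c, size_t block)
{
    assert(block < TOY_NBLOCKS);
    if (!c->buf[block].cached)
        return false;
    c->buf[block].dirty = false;
    return true;
}

/* Flusher: clean every dirty cached block and report how many were written. */
static size_t toy_flush(struct toy_cache *c)
{
    size_t written = 0;

    for (size_t i = 0; i < TOY_NBLOCKS; i++) {
        if (c->buf[i].cached && c->buf[i].dirty) {
            c->buf[i].dirty = false;
            written++;
        }
    }
    return written;
}
```

If two blocks are dirtied and one of them is then "unmapped", a following
toy_flush() writes only the other one. Note the model has no notion of a
userspace writer on /dev/<blockdevice>, which matches the assumption that
nobody sane does that on a mounted filesystem.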


> >> What am I missing ?
> >
> > That we might complete the write of the new buffer before the
> > old buffer is finished writing out?
>
> Ah yes, actually I realize that I don't know where and when the inode
> blocks are actually written to the disk!
>
> It seems that write_inode(), called after the data are committed to the
> disk, only marks the inode buffers as dirty but performs no IO (at
> least it looks so for ext2 when its 'do_sync' parameter is 0, which is
> the case when this method is called by write_inode()).
>
> Could you enlighten me one more time?

It depends on the filesystem. Many just use the buffercache as a
writeback cache for their metadata, and are happy to let the
dirty page flushers write it out whenever the flushers get to it (or
when explicit sync instructions are given).

Most of the time, these filesystems don't really know or care when
exactly their metadata is under writeback.
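
As a minimal sketch of that writeback-cache behaviour (again a userspace toy,
not kernel code; the wb_* names and sizes are invented for illustration):
updating metadata only changes the cached copy and sets a dirty flag, and the
"disk" copy changes only when a separate flush pass runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define WB_NBLOCKS 4
#define WB_BLOCK_SIZE 16

/* Hypothetical model: cached copies, on-disk copies, and dirty flags. */
struct wb_cache {
    char cached[WB_NBLOCKS][WB_BLOCK_SIZE];
    char disk[WB_NBLOCKS][WB_BLOCK_SIZE];
    bool dirty[WB_NBLOCKS];
    size_t io_count;   /* number of block writes issued to "disk" */
};

/*
 * Filesystem side: modify a metadata block. This only touches the cache
 * and marks the block dirty; no IO happens here.
 */
static bool wb_update(struct wb_cache *c, size_t block, const char *data)
{
    assert(c != NULL && data != NULL);
    if (block >= WB_NBLOCKS)
        return false;
    snprintf(c->cached[block], WB_BLOCK_SIZE, "%s", data);
    c->dirty[block] = true;
    return true;
}

/*
 * Flusher side (or an explicit sync): write out every dirty block.
 * Returns the number of blocks written in this pass.
 */
static size_t wb_flush(struct wb_cache *c)
{
    size_t written = 0;

    assert(c != NULL);
    for (size_t i = 0; i < WB_NBLOCKS; i++) {
        if (!c->dirty[i])
            continue;
        memcpy(c->disk[i], c->cached[i], WB_BLOCK_SIZE);
        c->dirty[i] = false;
        c->io_count++;
        written++;
    }
    return written;
}
```

After wb_update() the io_count is still zero and the disk copy is unchanged;
only wb_flush() moves the data. The caller of wb_update() never learns when
that happens, which is the point made above.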
