Re: Q about pagecache data never written to disk

From: Andrew Morton
Date: Sun Sep 05 2004 - 05:55:47 EST


Andrey Savochkin <saw@xxxxxxxxxxxxx> wrote:
>
> Let's suppose an mmap'ed (SHARED, RW) file has a hole.
> AFAICS, we allow to dirty the file pages without allocating the space for the
> hole - filemap_nopage just "reads" the page filling it with zeroes, and
> nothing is done about the on-disk data until writepage.
>
> So, if the page can't be written to disk (no space), the dirty data just
> stays in the pagecache. The data can be read or seen via mmap, but it isn't
> and never will be on disk. The pagecache stays unsynchronized with the on-disk
> content forever.

The kernel will make one attempt to write the data to disk. If that write
hits ENOSPC, the page is not redirtied (i.e. the data can be lost).

When that write hits ENOSPC an error flag is set in the address_space and
that will be returned from a subsequent msync(). The application will then
need to do something about it.

If your application doesn't msync() the memory then it doesn't care about
its data anyway. If your application _does_ msync the pages then we
reliably report errors.
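
From the application side, that contract can be sketched roughly as below.
This is only an illustration: dirty_hole_and_sync() is a made-up helper name,
the file is assumed to be sparse after ftruncate(), and the exact errno seen
on a full filesystem depends on the kernel and the filesystem.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a file that is one big hole, dirty it
 * through a MAP_SHARED mapping, then msync() and report the result.
 * Returns 0 on success or an errno value (e.g. ENOSPC) on failure.
 */
int dirty_hole_and_sync(const char *path, size_t len)
{
    int err = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return errno;

    /* Extend the file without writing data; on filesystems with
     * sparse-file support no blocks are allocated yet. */
    if (ftruncate(fd, (off_t)len) < 0) {
        err = errno;
        close(fd);
        return err;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        err = errno;
        close(fd);
        return err;
    }

    /* Dirty the pagecache pages; nothing is forced to disk here. */
    memset(map, 'x', len);

    /* Synchronous writeback: this is where a writeback error such as
     * ENOSPC is expected to be reported to the application. */
    if (msync(map, len, MS_SYNC) < 0)
        err = errno;

    munmap(map, len);
    close(fd);
    return err;
}
```

A caller that gets a non-zero return has to assume the mapped data did not
reach disk and decide what to do about it (free space and retry, write the
data elsewhere, or fail).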

> Is it the intended behavior?
> Shouldn't we call the filesystem to fill the hole at the moment of the first
> write access?

That would be a retrograde step - it would be nice to move in the other
direction: perform disk allocation at writeback time rather than at write()
time, even for regular write() data. To do that we (probably) need space
reservation APIs. And yes, we perhaps could reserve space in the
filesystem when that page is first written to.

But then what would we do if there's no space? SIGBUS? SIGSEGV?
Inappropriate. SIGENOSPC?
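
An application that wants the no-space error before it touches the mapping
can allocate the blocks itself. The sketch below uses posix_fallocate() for
that; it is a userspace workaround and not the in-kernel reservation API
discussed above, and reserve_then_write() is a made-up helper name. The
failure arrives as an ordinary return value, so no signal is involved.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate disk space for the whole range first,
 * then write through a MAP_SHARED mapping and msync().
 * Returns 0 on success or an errno value on failure.
 */
int reserve_then_write(const char *path, size_t len)
{
    int err = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return errno;

    /* posix_fallocate() returns the error number directly and does
     * not set errno. ENOSPC shows up here, before any page is dirtied. */
    err = posix_fallocate(fd, 0, (off_t)len);
    if (err != 0) {
        close(fd);
        return err;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        err = errno;
        close(fd);
        return err;
    }

    /* The range is already allocated, so writeback should not need
     * new blocks for these pages. */
    memset(map, 'x', len);

    /* Still check msync(): I/O errors other than ENOSPC remain possible. */
    if (msync(map, len, MS_SYNC) < 0)
        err = errno;

    munmap(map, len);
    close(fd);
    return err;
}
```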
