Re: [RFC][PATCH] mm: fix page_mkclean() vs non-linear vmas

From: Miklos Szeredi
Date: Thu Mar 08 2007 - 06:22:50 EST


> Partial revert of commit: 204ec841fbea3e5138168edbc3a76d46747cc987
>
> Non-linear vmas aren't properly handled by page_mkclean() and fixing that
> would result in linear scans of all related non-linear vmas per page_mkclean()
> invocation.
>
> This is deemed too costly, hence re-instate the msync scan for non-linear vmas.
>
> However this can lead to double IO:
>
> - pages get instanciated with RO mapping
> - page takes write fault, and gets marked with PG_dirty
> - page gets tagged for writeout and calls page_mkclean()
> - page_mkclean() fails to find the dirty pte (and clean it)
> - writeout happens and PG_dirty gets cleared.
> - user calls msync, the dirty pte is found and the page marked with PG_dirty
> - the page gets writen out _again_ even though its not re-dirtied.
>
> To minimize this reset the protection when creating a nonlinear vma.
>
> I'm not at all happy with this, but plain disallowing
> remap_file_pages on bdis without BDI_CAP_NO_WRITEBACK seems to
> offend some people, hence restrict it to root only.

Root only for !BDI_CAP_NO_WRITEBACK mappings doesn't make sense
because:

- just encourages insecure applications

- there are no current users that want this and presumable no future
uses either

- it's a maintenance burden: I'll have to layer the m/ctime update
patch on top of this

- the only pro for this has been that Nick thinks it cool ;)

I think the proper way to deal with this is to

- allow BDI_CAP_NO_WRITEBACK (tmpfs/ramfs) uses, makes database
people happy

- for !BDI_CAP_NO_WRITEBACK emulate using do_mmap_pgoff(), should be
trivial, no userspace ABI breakage

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/