Re: [PATCH v7 0/8] iov_iter: Improve page extraction (ref, pin or just list)

From: Matthew Wilcox
Date: Mon Jan 23 2023 - 13:05:15 EST


On Mon, Jan 23, 2023 at 05:19:51PM +0000, David Howells wrote:
> Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> > > Wouldn't that potentially make someone's entire malloc() heap NOCOW if
> > > they did a single DIO to/from it?
> >
> > Yes. Would that be an actual problem for any real application?
>
> Without auditing all applications that do direct I/O writes, it's hard to
> say - but a big database engine (Oracle, for example) forking off a process
> could cause a massive slowdown, as fork suddenly has to copy a huge amount
> of malloc'd data unnecessarily[*].
>
> [*] I'm making wild assumptions about how Oracle's DB engine works.

Yes. The cache is shared between all Oracle processes, so it's not COWed.
Indeed (as the mshare patches show), what Oracle wants is _more_ sharing
between the processes, not _less_.

> > > Also you only mention DIO read - but what about "start DIO write; fork();
> > > touch buffer" in the parent - now the write buffer belongs to the child
> > > and they can affect the parent's write.
> >
> > I'm struggling to see the problem here. If the child hasn't exec'd, the
> > parent and child are still in the same security domain. The parent
> > could have modified the buffer before calling fork().
>
> It could still inadvertently change the data its parent set to write out. The
> child *shouldn't* be able to change the parent's in-progress write. The most
> obvious problem would be in something that does DIO from a stack buffer, I
> think.

If it's a problem, then O_DIRECT writes can also set the NOCOW flag.
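
For reference, the userspace expectation behind the "child shouldn't be able
to change the parent's in-progress write" objection can be sketched roughly as
below. This is only an illustration, not code from the patch set:
child_store_is_private() and BUF_SIZE are made-up names, and no DIO is issued
at all. It just checks the ordinary COW guarantee that a forked child's stores
to a private buffer are not seen by the parent; the open question in this
thread is whether that still holds for pages pinned by an in-flight O_DIRECT
write.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BUF_SIZE 4096	/* hypothetical buffer size, one typical page */

/*
 * Hypothetical helper (an assumption for illustration only).
 *
 * Returns 1 if the parent's buffer is unchanged after a forked child
 * overwrites its own view of that buffer, 0 if the parent observes the
 * child's stores, and -1 on setup failure.
 */
int child_store_is_private(void)
{
	char *buf = malloc(BUF_SIZE);
	pid_t pid;
	int status;
	int ok = 1;

	if (!buf)
		return -1;

	/* Data the parent intends to write out. */
	memset(buf, 'P', BUF_SIZE);

	pid = fork();
	if (pid < 0) {
		free(buf);
		return -1;
	}

	if (pid == 0) {
		/* Child: touch the buffer; COW should give it a private copy. */
		memset(buf, 'C', BUF_SIZE);
		_exit(0);
	}

	/* Parent: wait for the child to finish scribbling. */
	if (waitpid(pid, &status, 0) < 0) {
		free(buf);
		return -1;
	}

	/* The parent's view should still hold only the original data. */
	for (size_t i = 0; i < BUF_SIZE; i++) {
		if (buf[i] != 'P') {
			ok = 0;
			break;
		}
	}

	free(buf);
	return ok;
}
```

A caller would expect child_store_is_private() to return 1. Reproducing the
actual case being argued about would additionally need an asynchronous
O_DIRECT write in flight across the fork(), which this sketch does not attempt.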