Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE

From: Andy Lutomirski
Date: Thu Aug 26 2021 - 13:49:39 EST


On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
> On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> > I’ll bite. How about we attack this in the opposite direction: remove the deny write mechanism entirely.
>
> I think that would be ok, except I can see somebody relying on it.
>
> It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.

Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY. Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.

Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write. So, in a multithreaded program, one thread does:

fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
write(fd, some stuff);

<--- problem is here

close(fd);
execve("some exefile");

Another thread does:

fork();
execve("something else");

In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails. Whoops. See, for example:

https://github.com/golang/go/issues/22315

I propose we get rid of deny_write_access() completely to solve this.

Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.

(OFD locks seem like they might have the same problem. Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)

>
> But you are right that we have removed parts of it over time (no more
> MAP_DENYWRITE, no more uselib()) so that what we have today is a
> fairly weak form of what we used to do.
>
> And nobody really complained when we weakened it, so maybe removing it
> entirely might be acceptable.
>
> Linus
>