Re: [PATCH v6] close_range.2: new page documenting close_range(2)

From: Stephen Kitt
Date: Thu Jan 28 2021 - 18:08:36 EST


Hello Michael,

On Thu, 28 Jan 2021 21:50:23 +0100, "Michael Kerrisk (man-pages)"
<mtk.manpages@xxxxxxxxx> wrote:
> Thanks for your patch revision. I've merged it, and have
> done some light editing, but I still have a question:
>
> On 1/23/21 5:11 PM, Stephen Kitt wrote:
>
> [...]
>
> > +.SH ERRORS
>
> > +.TP
> > +.B EMFILE
> > +The per-process limit on the number of open file descriptors has been reached
> > +(see the description of
> > +.B RLIMIT_NOFILE
> > +in
> > +.BR getrlimit (2)).
>
> I think there was already a question about this error, but
> I still have a doubt.
>
> A glance at the code tells me that indeed EMFILE can occur.
> But how can the reason be because the limit on the number
> of open file descriptors has been reached? I mean: no new
> FDs are being opened, so how can we go over the limit. I think
> the cause of this error is something else, but what is it?

Here’s how I understand the code path that can lead to EMFILE:

* in __close_range(), if CLOSE_RANGE_UNSHARE is set, call unshare_fd() with
  CLONE_FILES to clone the fd table
* unshare_fd() calls dup_fd()
* dup_fd() allocates a new fdtable, and if the resulting fdtable ends up
  being too small to hold the number of fds calculated by
  sane_fdtable_size(), it fails with EMFILE (condensed excerpt below)
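
For reference, here’s the shape of that last step, condensed from my
reading of fs/file.c — not a verbatim quote, so caveat lector:

	open_files = sane_fdtable_size(old_fdt, max_fds);

	while (unlikely(open_files > new_fdt->max_fds)) {
		...
		new_fdt = alloc_fdtable(open_files - 1);
		if (!new_fdt) {
			*errorp = -ENOMEM;
			goto out_release;
		}

		/* alloc_fdtable() clamps its allocation to sysctl_nr_open,
		 * so the table it hands back can be smaller than asked: */
		if (unlikely(new_fdt->max_fds < open_files)) {
			__free_fdtable(new_fdt);
			*errorp = -EMFILE;	/* the only EMFILE here */
			goto out_release;
		}
		...
	}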

I suspect that, given that we’re starting with a valid fdtable, the only way
this can happen is if there’s a race with sysctl_nr_open being reduced.
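
If that’s right, it should be reproducible without even racing: lower
fs/nr_open first, then call close_range(). Something like the sketch below
ought to show it — untested, needs root and a >= 5.9 kernel; the fd numbers
are arbitrary, and glibc >= 2.34 has a close_range() wrapper if you’d
rather not use the raw syscall:

	#define _GNU_SOURCE
	#include <err.h>
	#include <errno.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/resource.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	#ifndef __NR_close_range
	#define __NR_close_range 436		/* asm-generic syscall table */
	#endif
	#ifndef CLOSE_RANGE_UNSHARE
	#define CLOSE_RANGE_UNSHARE (1U << 1)	/* <linux/close_range.h> */
	#endif

	int main(void)
	{
		struct rlimit rl = { 8192, 8192 };
		FILE *f;

		if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
			err(1, "setrlimit");

		/* Occupy a high fd so the table grows well past
		 * NR_OPEN_DEFAULT. */
		if (dup2(0, 4000) == -1)
			err(1, "dup2");

		/* Shrink the system-wide ceiling below what the current
		 * table needs. */
		f = fopen("/proc/sys/fs/nr_open", "w");
		if (f == NULL || fprintf(f, "1024") < 0 || fclose(f) == EOF)
			err(1, "lowering /proc/sys/fs/nr_open");

		/* max_fd (100) is below our current maximum, so
		 * __close_range() keeps max_unshare_fds at NR_OPEN_MAX and
		 * dup_fd() has to size the copy for fd 4000 — which
		 * alloc_fdtable() now caps at 1024. */
		if (syscall(__NR_close_range, 50, 100, CLOSE_RANGE_UNSHARE) == -1)
			printf("close_range: %s\n", strerror(errno));
			/* expecting EMFILE, "Too many open files" */

		return 0;
	}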

Incidentally, isn’t this comment in file.c somewhat misleading?

/*
* If the requested range is greater than the current maximum,
* we're closing everything so only copy all file descriptors
* beneath the lowest file descriptor.
*/

As I understand it, dup_fd() will always copy any open file descriptor
anyway; it won’t stop at max_unshare_fds if that’s lower than the number of
open fds (thanks to sane_fdtable_size())...

Regards,

Stephen
