Re: Linux 4.9-rc6

From: Linus Torvalds
Date: Sun Nov 20 2016 - 18:27:30 EST


On Sun, Nov 20, 2016 at 2:27 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
>
> Hosts with ~100,000 threads have an issue with /proc/vmallocinfo
>
> It can take about 800 usec to skip over ~100,000 struct vmap_area
> in s_start(), while holding vmap_area_lock spinlock, and therefore
> blocking fork()/pthread_create().
>
> I presume we cannot switch to the rbtree (vmap_area_root)
> for /proc/vmallocinfo, because this file is seekable, right?

Well, the good news is that the file is root-only anyway, which means
that at least it won't have the issue that a lot of other /proc files
have had - namely being opened by random user programs or libraries.

Which means that the users of it are likely fairly limited.

Which in turn means that we can probably afford to play more games
with it. Including, for example, possibly marking it non-seekable.

Or even just limit the maximum number of entries we are willing to walk.

Or we could decide that that file shouldn't be a seq_file at all, use
the old "one page buffer" approach that was so common for /proc files,
and make the position encode the vmalloc address in it (make the low
PAGE_SHIFT bits be the offset in the line), and then we *could* just
look things up using the rbtree.
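
To make the position-encoding idea concrete, here is a rough userspace
sketch, not kernel code. It assumes 4K pages, assumes area start
addresses are page-aligned, and assumes one output line always fits in
a page. Every name in it (pos_encode(), area_lower_bound(), the SKETCH_*
macros) is made up for illustration, and a sorted array stands in for
vmap_area_root.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4K pages. The real kernel would use PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_OFF_MASK   (SKETCH_PAGE_SIZE - 1)

/*
 * Pack a page-aligned area address and a byte offset into that area's
 * output line into a single file position. The low bits of the address
 * are zero, so they are free to carry the offset.
 */
static uint64_t pos_encode(uint64_t area_addr, uint64_t line_off)
{
	return (area_addr & ~SKETCH_OFF_MASK) | (line_off & SKETCH_OFF_MASK);
}

/* Recover the area address from a file position. */
static uint64_t pos_addr(uint64_t pos)
{
	return pos & ~SKETCH_OFF_MASK;
}

/* Recover the offset into the area's output line. */
static uint64_t pos_line_off(uint64_t pos)
{
	return pos & SKETCH_OFF_MASK;
}

/*
 * Stand-in for the tree lookup: return the index of the first area
 * whose start address is >= addr, or n if there is none. This is
 * O(log n) instead of a linear walk from the head of the list.
 */
static size_t area_lower_bound(const uint64_t *starts, size_t n,
			       uint64_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (starts[mid] < addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

A read at a given position would then decode the address, find the
first area at or above it, format that one line into the page buffer,
and copy out starting at the decoded line offset. Since the lookup is
"first area at or above", an area freed between two reads is simply
skipped. One thing this glosses over: loff_t is signed, so real kernel
addresses would likely have to be stored relative to the start of the
vmalloc range rather than as-is.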

Al, do you have any clever ideas?

Linus