Re: [PATCH 0/2] tracing: Have trace_pid_list be a sparse array

From: Eugene Syromyatnikov
Date: Fri Sep 24 2021 - 06:51:31 EST


On Fri, Sep 24, 2021 at 6:07 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Thu, 23 Sep 2021 23:35:47 -0400
> Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> > The pid_mask will start out with 1024 entries for the first 10 MSB bits.
> > This will cost 4K for 32 bit architectures and 8K for 64 bit. Each of
> > these will have a 1024 array to store the next 10 bits of the pid (another
> > 4 or 8K). These will hold an 512 byte bitmask (which will cover the LSB 12
> > bits or 4096 bits).
>
> Thinking about this more, I should adjust this split.
>
> Currently, this is a 10,10,12 split, but since the upper chunks are
> arrays of pointers, and the lower chunk is a bitmask, and that pids
> tend to be close together, I should make the lower split bigger.
>
> As a 4K page is 32768 bits (2^15), the lower split should be 15 bits,
> not 12. This will probably allocate better.
>
> Of course 32 - 15 is 17. So maybe to keep it simple, by having the two
> upper chunks still the same in size, I could have it be 14 bits for the
> lower (2048 bytes). And since pid_max can only be 2^30, we don't even
> need to cover the full 32 bits.
>
> 30 - 14 = 16 = 8 * 2.
>
> Then I can make the upper chunks cover 8 bits (arrays of 256 pointers)
> and the lower chunks cove 14 bits. This would coincidentally make all
> chunks 2K in size (on 64 bit machines).
>
> Hmm, in that case, I can make the lower and upper chunks use the same
> memory and not separate them. They are unions after all. But that may
> be unfair for 32 bit machines, as the upper chunks are only going to be
> 1K in size for them (256 * 4).

Note that there is only one top-level chunk (so its size doesn't
matter much), (usually) about one middle-tier chunk (except for some
heavy cases, since pids are allocated linearly), and quite some
lower-tier bitset leaves. So I'd optimise towards smaller leaves at
the expense of middle-tier (and especially top-tier) chunk size
(especially considering the fact that in the kernel, buddy allocator
is used), like 12-8-12 or something like that, but I have no factual
basis for arguing about specific split. Also, I cannot resist from
noticing that this reminds me an awful lot of XArray and [1]. Maybe,
some wrapper around XArray would do?

[1] https://gitlab.com/strace/strace/-/raw/master/src/trie.h

>
> -- Steve
>



--
Eugene Syromyatnikov
mailto:evgsyr@xxxxxxxxx
xmpp:esyr@jabber.{ru|org}