Re: [PATCH v3 01/10] sched: Provide sparsemask, a reduced contention bitmap

From: Omar Sandoval
Date: Tue Nov 27 2018 - 20:19:10 EST


On Tue, Nov 27, 2018 at 10:16:56AM -0500, Steven Sistare wrote:
> On 11/9/2018 7:50 AM, Steve Sistare wrote:
> > From: Steve Sistare <steve.sistare@xxxxxxxxxx>
> >
> > Provide struct sparsemask and functions to manipulate it. A sparsemask is
> > a sparse bitmap. It reduces cache contention vs the usual bitmap when many
> > threads concurrently set, clear, and visit elements, by reducing the number
> > of significant bits per cacheline. For each 64 byte chunk of the mask,
> > only the first K bits of the first word are used, and the remaining bits
> > are ignored, where K is a creation time parameter. Thus a sparsemask that
> > can represent a set of N elements is approximately (N/K * 64) bytes in
> > size.
> >
> > Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
> > ---
> > include/linux/sparsemask.h | 260 +++++++++++++++++++++++++++++++++++++++++++++
> > lib/Makefile | 2 +-
> > lib/sparsemask.c | 142 +++++++++++++++++++++++++
> > 3 files changed, 403 insertions(+), 1 deletion(-)
> > create mode 100644 include/linux/sparsemask.h
> > create mode 100644 lib/sparsemask.c
>
> Hi Peter and Ingo,
> I need your opinion: would you prefer that I keep the new sparsemask type,
> or fold it into the existing sbitmap type? There is some overlap between the
> two, but mostly in trivial one line functions. The main differences are:

Adding Jens and myself.

> * sparsemask defines iterators that allow an inline loop body, like cpumask,
> whereas the sbitmap iterator forces us to define a callback function for
> the body, which is awkward.
>
> * sparsemask is slightly more efficient. The struct and variable length
> bitmap are allocated contiguously,

That just means you have the pointer indirection elsewhere :) The users
of sbitmap embed it in whatever structure they have.

> and sbitmap uses an extra field "depth"
> per bitmap cacheline.

The depth field is memory which would otherwise be unused, and it's only
used for sbitmap_get(), so it doesn't have any cost if you're using it
like a cpumask.

> * The order of arguments is different for the sparsemask accessors and
> sbitmap accessors. sparsemask mimics cpumask which is used extensively
> in the sched code.
>
> * Much of the sbitmap code supports queueing, sleeping, and waking on bit
> allocation, which is N/A for scheduler load load balancing. However, we
> can call the basic functions which do not use queueing.
>
> I could add the sparsemask iterators to sbitmap (90 lines), and define
> a thin layer to change the argument order to mimic cpumask, but that
> essentially recreates sparsemask.

We only use sbitmap_for_each_set() in a few places. Maybe a for_each()
style macro would be cleaner for those users, too, in which case I
wouldn't be opposed to changing it. The cpumask argument order thing is
a annoying, though.

> Also, pushing sparsemask into sbitmap would limit our freedom to evolve the
> type to meet the future needs of sched, as sbitmap has its own maintainer,
> and is used by drivers, so changes to its API and ABI will be frowned upon.

It's a generic data structure, so of course Jens and I have no problem
with changing it to meet more needs :) Personally, I'd prefer to only
have one datastructure for this, but I suppose it depends on whether
Peter and Ingo think the argument order is important enough.

> FWIW, here is the amount of code involved:
>
> include/linux/sbitmap.h
> 250 lines basic operations
> 284 lines for queueing
> ---
> 534 lines total
>
> lib/sbitmap.c
> 201 lines basic operations
> 380 lines for queueing
> ---
> 581 lines total
>
> include/linux/sparsemask.h
> 260 lines total
> https://lkml.org/lkml/2018/11/9/1176
>
> lib/sparsemask.c
> 142 lines total
> https://lkml.org/lkml/2018/11/9/1176
>
> - Steve