Re: [PATCH 00/10] rcu: Cleanup RCU tree initialization

From: Paul E. McKenney
Date: Tue Mar 10 2015 - 10:57:22 EST


On Tue, Mar 10, 2015 at 02:33:37PM +0000, Alexander Gordeev wrote:
> On Mon, Mar 09, 2015 at 02:35:42PM -0700, Paul E. McKenney wrote:
> > On Mon, Mar 09, 2015 at 09:36:52AM +0000, Alexander Gordeev wrote:
> > > On Mon, Mar 09, 2015 at 09:34:04AM +0100, Alexander Gordeev wrote:
> > > > Hi Paul,
> > > >
> > > > Here is a cleanup of RCU tree initialization, rebased on the linux-rcu
> > > > rcu/next repo, as you requested. Please note the extra patch #10, which
> > > > was not present in the first post.
> > >
> > > Paul,
> > >
> > > Please ignore patch #10 for now. I failed to notice that
> > > rcu_node::grpnum is used in tracing, so the patch is incomplete. I am
> > > not sure what the trailing spaces in
> > > seq_printf(m, "%lx/%lx->%lx %c%c>%c %d:%d ^%d ", ....) are needed for,
> > > so I am not sure whether the "^%d" part should be removed (possibly
> > > with the trailing spaces) or replaced with three spaces.
> >
> > OK, dropping this one for the moment.
> >
> > The original use of ->grpnum was for manual debugging purposes. Yes, you
> > can get the same information out of ->grpmask, but the number is easier
> > to read. And on the debugfs trace information, ->grpnum is printed,
> > but ->grpmask is not.
> >
> > The trailing spaces on the seq_printf() allow the rcu_node data to be
> > printed on a single line, while still allowing the eye to pick out
> > where one rcu_node structure's data ends and the next one begins.
> >
> > So here are the choices, as far as I can see:
> >
> > 1. Leave ->grpnum as is.
> >
> > 2. Remove ->grpnum, but regenerate it in print_one_rcu_state(),
> >    for example, by counting the number of rcu_node structures
> >    since the last ->level change.
> >
> > 3. Drop ->grpnum and also remove it from the debugfs tracing.
> >    The reader can rely on the ->grplo and ->grphi fields to
> >    work out where this rcu_node structure fits in, but we
> >    lose the visual indication of any bugs in computing these
> >    quantities.
> >
> > 4. Drop ->grpnum and replace it with ->grpmask. This seems a
> >    bit obtuse to me.
> >
> > 5. Redesign print_one_rcu_state()'s output from scratch.
> >
> > #1 has certain advantages from a laziness viewpoint. #2 would open up
> > some space in the rcu_node structure, but space really isn't an issue
> > for that structure given that huge systems have only 257 of them and
> > the really small systems use Tiny RCU instead. #3 might be OK, but I
> > am not really convinced. #4 seems a bit ugly. I am not signing up
> > for #5, in part because not all that many people use RCU's debugfs
> > output, so I don't see the point in investing the time.
> >
> > But what did you have in mind?
>
> I probably should have marked this patch as an RFC. Given your summary,
> #1 seems to be the best choice.
>
> However, I do indeed have something else in mind. What is the reason for
> having 'grpnum' and 'level' as u8 while, say, 'grplo' and 'grphi' are int?
> IOW, do we conserve memory for this structure or not?

The ->grplo and ->grphi fields must hold a CPU number. Since CPU numbers
are int elsewhere, they are int here. I considered making them a short,
but there are systems uncomfortably close to the limit. There have
been 4096-CPU systems for quite some time, and I have heard rumors of
16384-CPU systems. A limit of 32768 seems uncomfortably tight, especially
given that memory footprint is at best a minor requirement for Tree RCU.
Tiny RCU is of course another story -- memory savings is Job One there.
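
To put rough numbers on the above, here is a minimal userspace sketch
(not kernel code). The helper names are hypothetical, the 257-node count
is simply the figure quoted earlier in this thread, and the byte savings
assume the usual 4-byte int and 2-byte short and ignore any structure
padding.

```c
#include <limits.h>
#include <stddef.h>

/* Node count for huge systems, as quoted earlier in this thread. */
#define RCU_NODES_HUGE 257

/* Number of CPUs a non-negative short can index: 0..SHRT_MAX. */
static long short_cpu_limit(void)
{
	return (long)SHRT_MAX + 1;
}

/* How many more times can ncpus double and still fit in a short? */
static int doublings_left(long ncpus)
{
	int n = 0;

	while (ncpus * 2 <= short_cpu_limit()) {
		ncpus *= 2;
		n++;
	}
	return n;
}

/*
 * Total bytes saved across nr_nodes structures by shrinking two
 * fields (grplo and grphi) from int to short, ignoring padding.
 */
static size_t bytes_saved(size_t nr_nodes)
{
	return nr_nodes * 2 * (sizeof(int) - sizeof(short));
}
```

Under those assumptions, doublings_left(16384) leaves a single doubling
of headroom, while bytes_saved(RCU_NODES_HUGE) comes to about 1KB for
the whole tree.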

And yes, I do owe the community a writeup of the requirements on RCU.

Thanx, Paul
