Re: [PATCH 22/23] sysctl arm: Remove binary sysctl support

From: Andi Kleen
Date: Mon Nov 09 2009 - 07:04:51 EST


On Mon, Nov 09, 2009 at 03:45:06AM -0800, Eric W. Biederman wrote:
> Andi Kleen <andi@xxxxxxxxxxxxxx> writes:
>
> > ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:
> >>
> >> The glibc pthread code that uses sysctl has no problems if sys_sysctl
> >> is gone. It both falls back to reading /proc/sys and it just controls
> >> an optimization and the code works with either result. Been there,
> >> done that.
> >
> > /proc/sys is much slower than sysctl though. So you made program startup
> > slower.
>
> Not much slower, but slower. I just measured it in a case that favors
> sysctl and the ratio is about 5:2. Or sysctl is about 2.5x faster.
> About 49usec for open/read/close on proc and 19usec for sysctl.
> In my emulation it is a bit slower than that.

That's not good.

>
> > Also I agree with Arjan that breaking such a common ABI is not
> > really a good idea. But I think it's enough to only handle
> > common sysctls that are actually used, which are very few.
>
> Well I haven't broken anything at this point. I am simply edging
> us to the point when we are close to being able to forget about
> sys_sysctl for good.

I think the all-or-nothing choice you have right now is a bad trade-off
because it breaks an established interface used by lots of code (glibc).

You should have three states

a) all
b) common ones used by glibc and perhaps a few others only
c) none

I suspect most users would use (b). In fact (c) might be redundant
if (b) is cheap enough (which it should be).

> As for the rest the common number of sysctls with glibc > 2.8 is
> exactly 0. Which makes compiling out sys_sysctl support sane.
> Especially since we have been throwing a warning for years if
> anyone uses any of the others.

Yes, but people ignore the warning. Perhaps we should make it a WARN()
and track it with kerneloops?

>
> > It would be better to simply keep the commonly used binary sysctls
> > as emulation around always (commonly = used by glibc and perhaps
> > added by user printk feedback) That's very cheap because it's just
> > a simple translation and can be done internally cheaper than going
> > through the VFS with a bazillion of locks.
>
> A micro optimization for code that does not exist. That is a bad
> trade off.

Hmm? There's plenty of glibc code that uses the binary sysctl.
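
To make the "simple translation" idea concrete, here is a rough sketch of
a lookup table from a binary sysctl name to the matching /proc/sys entry.
This is user-space illustration only, not kernel code and not a proposed
patch. The DEMO_* constants are local copies that I believe mirror
CTL_KERN, KERN_OSTYPE, KERN_OSRELEASE and KERN_VERSION in
<linux/sysctl.h>, and the choice of entries is an assumption, not a list
of what glibc actually uses.

```c
#include <stddef.h>
#include <string.h>

/* Local illustrative copies; believed to match <linux/sysctl.h>. */
enum { DEMO_CTL_KERN = 1 };
enum {
	DEMO_KERN_OSTYPE    = 1,
	DEMO_KERN_OSRELEASE = 2,
	DEMO_KERN_VERSION   = 4,
};

/* Hypothetical table entry: a two-level binary name and its /proc path. */
struct demo_sysctl_map {
	int name[2];
	const char *proc_path;
};

/* The small "common" set; the entries are assumptions for illustration. */
static const struct demo_sysctl_map demo_common_sysctls[] = {
	{ { DEMO_CTL_KERN, DEMO_KERN_OSTYPE },    "/proc/sys/kernel/ostype" },
	{ { DEMO_CTL_KERN, DEMO_KERN_OSRELEASE }, "/proc/sys/kernel/osrelease" },
	{ { DEMO_CTL_KERN, DEMO_KERN_VERSION },   "/proc/sys/kernel/version" },
};

/*
 * Translate a binary sysctl name into a /proc/sys path.
 * Returns NULL when the name is not in the common set, which is where
 * a "common ones only" configuration would refuse the request.
 */
static const char *demo_translate(const int *name, int nlen)
{
	size_t i;

	if (name == NULL || nlen != 2)
		return NULL;

	for (i = 0; i < sizeof(demo_common_sysctls) /
			sizeof(demo_common_sysctls[0]); i++) {
		if (memcmp(demo_common_sysctls[i].name, name,
			   sizeof(demo_common_sysctls[i].name)) == 0)
			return demo_common_sysctls[i].proc_path;
	}
	return NULL;
}
```

A table like this stays tiny as long as only a handful of names are kept,
which is the point of handling just the common ones.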


> Further it is my intention to optimize /proc/sys when I get the
> chance now that we don't have all of the old sysctl baggage holding
> back the code.

The VFS will always be comparatively slow, I suspect; I'm not sure
you can optimize it that much compared to a fast custom path
(handling the kernel version sysctl in particular should be really fast).

> Ultimately what drives me most is that people are still accidentally
> adding binary sysctls, which no one uses or tests. For a recent
> example see:

Yes, I agree new binary sysctls should just be deprecated right now.

-Andi