Re: [PATCH v2 0/8] scheduler tinification

From: Nicolas Pitre
Date: Thu Jun 08 2017 - 16:16:12 EST

Next message: Mauro Carvalho Chehab: "Re: [RFC 00/10] V4L2 explicit synchronization support"
Previous message: Krister Johansen: "Re: [PATCH tip/core/rcu 45/88] rcu: Add memory barriers for NOCB leader wakeup"
In reply to: Alan Cox: "Re: [PATCH v2 0/8] scheduler tinification"
Next in thread: Ingo Molnar: "Re: [PATCH v2 0/8] scheduler tinification"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, 8 Jun 2017, Ingo Molnar wrote:

>
> Also, let me make it clear at the outset that we do care about RAM footprint all
> the time, and I've applied countless data structure and .text reducing patches to
> the kernel. But there's a cost/benefit analysis to be made, and this series fails
> that test in my view, because it increases the complexity of an already complex
> code base:
>
> * Nicolas Pitre <nicolas.pitre@xxxxxxxxxx> wrote:
>
> > Most IOT targets are so small that people are rewriting new operating systems
> > from scratch for them. Lots of fragmentation already exists.
>
> Let me offer a speculative if somewhat cynical prediction: 90% of those ghastly
> IOT hardware hacks won't survive the market. The remaining 10% will be successful
> financially, despite being ghastly hardware hacks and will eventually, in the next
> iteration or so, get a proper OS.

Your prediction is based on a false premise. There is simply no money to
be made with IoT hardware, especially in the low end. Those little
devices will be given away for free because it is in the service
subscription that the money is. So the hardware has to, and will be,
extremely cheap to produce. If a serious bug turns up in one of those
devices, my own cynical prediction is that no one will bother with field
upgradability and they will ask you to throw the device away instead
while they ship you a replacement (field upgradability implies at least
twice the flash memory size and that comes with a cost so some will
gamble that obsolescence will happen before a serious bug turns up).

> As users ask for more features the hardware capabilities will increase
> dramatically and home-grown microcontroller derived code plus minimal OSes will be
> replaced by a 'real' OS. Because both developers and users will demand IPv6
> compatibility, or Bluetooth connectivity, or storage support, or any random range
> of features we have in the Linux kernel.

The "Cloud" is taking care of most of that. For the rest, your cellphone
or IoT gateway will take over. IPv6 stacks are already used in tiny
microcontrollers with as low as 32KB of RAM.

> With the stroke of a pen from the CFO: "yes, we can spend more on our next
> hardware design!" the problem goes away, overnight, and nobody will look back at
> the hardware hack that had only 1MB of RAM.

Of course hobbyists can already get a Raspberry Pi Zero and run a full
featured Linux distro on it... for a mere 5 bucks. That comes with 512MB
of RAM so my patches certainly don't make a difference there.

But it's not that simple. First there is a fundamental constraint
which is power consumption. If you want your device to run for months
(some will hope years) from the same tiny battery then you just cannot
afford SDRAM. So we're talking static RAM here. And to keep costs down
because you want to give away your thingies by the millions for free it
usually means single-chip designs with on-chip sub-megabyte static RAM.
And in that field the 256KB mark is located towards the high end of the
spectrum. Many IPv6-capable chips available today have less than that.

And the thing is: people already manage to do an awful lot of stuff in
such a constrained device. Some probably did a good job of it, but most
of them likely suck and we don't know about their bugs because we have
no idea what's running inside.

And because it is rather easy to write a new OS from scratch for such a
small environment (and who didn't dream of writing his own OS, right?)
then about every company in that field did so. That's not counting most
Open Source ones which usually are close to single-person projects. So
you get a lot of fragmentation, very very little peer review, and no
incentive for proper maintenance because the cost saving simply isn't
significant enough.

It is just like asteroids. Some of them collapse to form bigger objects
like planets, while others have too weak a gravitational field to gather
more matter. My vision is about leveraging the Linux gravitational power
to bring the tiny embedded space together because, on its own, the tiny
embedded space simply does not have enough community power to actually
organize itself.

Of course there are important parts of Linux that couldn't be reused as
is in such a setup, but yet many other things still can be reused with
either some modifications or a tiny parallel subsystem substitution.
Technically, it is always possible to find ways to make it low on
maintenance and beneficial to the wider community. But first and
foremost you have to agree with the fundamental principle of gathering
more people around a common codebase to make it better for everyone and
not suggest that they stick to themselves. If you agree to that then we
can move back to a technical discussion.

> > [...] We're talking about systems with less than one megabyte of RAM, sometimes
> > much less.
>
> Two data points:
>
> Firstly, by the time any Linux kernel change I commit today gets to a typical
> distro it's at least 0.5-1 years, 2 years for it to get widely used by hardware
> shops - 5 years to get used by enterprises. More latency in more conservative
> places.

Don't forget that you are also merging patches today from the Android
folks that have been deployed into actual products years ago. So the
enterprise distro comparison simply has no commonalities here.

> Secondly, I don't see Moore's Law reversing:
>
> http://nerdfever.com/wp-content/uploads/2015/06/2015-06_Moravec_MIPS.png
>
> If you combine those two time frames, the consequence of this:
>
> Even taking the 1MB size at face value (which I don't: a networking enabled system
> can probably not function very well with just 1MB of RAM) - the RAM-starved 1 MB
> system today will effectively be a 2 MB system in 2 years.

As surprising as it might be, IPv6 stacks requiring only a few dozen
kilobytes of memory do exist. Not so surprisingly though, some people
think that the existing stacks simply suck and they are rewriting yet
another one ... because they think their own will be better of course.

So there *is* still a huge market for sub-megabyte systems. I was also
counting on Moore's law so that by the time Linux actually has the
ability to be tailored for such systems then typical SRAM in those
10-cent microcontrollers will be 512KB instead of 128 or 32.

> You can already fit a mostly full Linux system into 32 MB just fine, i.e. the
> problem has solved itself just by waiting a bit or by increasing the hardware
> capabilities a bit.

You just can't procure SDRAM chips smaller than 32MB on the market
anymore. That's why Linux didn't get any pressure to fit in smaller than
that for quite a while. But I've heard of some people having use cases
for thousands if not millions of Linux VMs on a single server and
they're looking at 10MB VMs or smaller for their application.

> But the kernel complexity you introduce with this series stays with us! It will be
> an additional cost added to many scheduler commits going forward. It's an added
> cost for all the other usecases.

OK, let's talk about that a bit. How is sched/core.c with its 7387
lines not overly complex already? How is my moving of rt related code to
rt.c and dl related code to dl.c not helping things? Isn't it easier to
understand the 3500 lines of code in futex.c when half of it i.e. the PI
specific code is split into a separate file? I ask you.

If you want to pick only those patches for now then please be my guest.
At least the first two patches of the series should be mergeable without
even a doubt.

As to the actual complexity I'm introducing... this is just about not
compiling some files in and stubbing calls to them out. Isn't that a
sign of good isolation when you can stub the dl class out with only 9
insertions and 6 deletions to sched/core.c? I'm not saying the
complexity is nonexistent here, but just the _ability_ to remove a
scheduler class enforces code abstractions which should be a good thing
maintenance-wise, no?
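
To illustrate the "not compiling some files in and stubbing calls out"
pattern, here is a minimal standalone sketch. It is not code from the
series: DEMO_SCHED_DL, struct demo_task, demo_dl_pick(), demo_pick_next()
and demo_example() are all hypothetical names, and the macro merely stands
in for a Kconfig option. The point it tries to show is that the core code
calls the optional class unconditionally, while a static inline stub takes
over when the class is compiled out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical build switch standing in for a Kconfig option.
 * Build with -DDEMO_SCHED_DL=1 to compile the class in.
 */
#ifndef DEMO_SCHED_DL
#define DEMO_SCHED_DL 0
#endif

struct demo_task {
	int prio;	/* lower value runs first in the fallback policy */
	int is_dl;	/* nonzero if the task belongs to the optional class */
};

#if DEMO_SCHED_DL
/* Optional class compiled in: return the index of the first dl task, or -1. */
static int demo_dl_pick(const struct demo_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].is_dl)
			return (int)i;
	return -1;
}
#else
/* Optional class compiled out: the stub always reports "no task". */
static inline int demo_dl_pick(const struct demo_task *tasks, size_t n)
{
	(void)tasks;
	(void)n;
	return -1;
}
#endif

/* Core code calls the class unconditionally; no #ifdef is needed here. */
static int demo_pick_next(const struct demo_task *tasks, size_t n)
{
	int best = -1;
	int idx;

	assert(tasks != NULL || n == 0);

	idx = demo_dl_pick(tasks, n);
	if (idx >= 0)
		return idx;

	/* Fallback policy: pick the task with the lowest prio value. */
	for (size_t i = 0; i < n; i++)
		if (best < 0 || tasks[i].prio < tasks[best].prio)
			best = (int)i;
	return best;
}

/* Fixed example: task 1 is a dl task, task 2 has the lowest prio value. */
static int demo_example(void)
{
	static const struct demo_task tasks[] = {
		{ .prio = 5, .is_dl = 0 },
		{ .prio = 9, .is_dl = 1 },
		{ .prio = 3, .is_dl = 0 },
	};

	return demo_pick_next(tasks, sizeof(tasks) / sizeof(tasks[0]));
}
```

In this sketch, demo_example() returns 2 in the default build (class
stubbed out) and 1 when built with -DDEMO_SCHED_DL=1. The caller is the
same in both configurations, which is the kind of isolation being argued
for above.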

> Also, it's not like 20k .text savings will magically enable Linux to fit into 1MB
> of RAM - it won't. The smallest still practical more or less generic Linux system
> in existence today is around 16 MB. You can shrink it more, but the effort
> increases exponentially once you go below a natural minimum size.

Again, I'm not after a tiny-and-generic Linux target. I'm after a
tiny-and-heavily-tailored Linux subset that shares the same ABI and API
as the generic Linux. Once you start compiling out pieces of the core
kernel, it obviously isn't generic anymore, but the potential for size
reduction becomes much bigger.

Anyway... as I said, you have to agree with the high level goal and
principle of leveraging the Linux codebase to gather the tiny embedded
people around it. The tiny embedded community simply will never take
hold otherwise. If we cannot agree on that then any other point of
discussion is moot. In which case I'll simply drop this project entirely
and move on.

Nicolas
