Re: kdbus: to merge or not to merge?

From: Andy Lutomirski
Date: Tue Aug 04 2015 - 10:47:45 EST


On Tue, Aug 4, 2015 at 7:09 AM, David Herrmann <dh.herrmann@xxxxxxxxx> wrote:
> Hi
>
> On Tue, Aug 4, 2015 at 3:46 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> On Tue, Aug 4, 2015 at 1:58 AM, David Herrmann <dh.herrmann@xxxxxxxxx> wrote:
>>>
>>> You lack a call to sd_bus_unref() here.
>>
>> I assume it was intentional. Why would Andy talk about "scaling" otherwise?

It was actually an error. Since the user version worked fine (at
least for as long as I ran it) and the kernel version didn't (it
killed X and left a blinking cursor, with no visible log messages even
when run from a text console, and no obvious OOM recovery after a long
wait), I assumed it was a kdbus issue or an issue with other kdbus
clients.

I'll play with it more today.

>>
>> And the worry was why the kdbus version killed the machine, but the
>> userspace version did not. That's a rather big difference, and not a
>> good one.
>
> Neither test 'kills' the machine:
>
> * The userspace version will be killed by the OOM killer after about
> 20s running (depending how much memory you have).

Not on my system. Maybe I have too much memory?

>
> * The kernel version runs for 1024 iterations (maximum kdbus
> connections per user) and then produces errors.
>
> In fact, the kernel version is even more stable than the user-space
> version, and bails out much earlier. Run it on a VT and everything
> works just fine.


On my system, everything died as described above.

>
> The only issue you get with kdbus is the compat-bus-daemon, which
> assert()s as a side-effect of accept4() failing. In other words, the
> compat bus-daemon gets ENFILE if you open that many connections, then
> assert()s and thus kills all other proxy connections. This has the
> side effect that Xorg loses access to your graphics device and your
> screen 'freezes'. NetworkManager also bails out and drops network
> connections.

Ah, interesting.

>
> This is a bug in the proxy (which is already fixed).

Should I expect to see it in Rawhide soon?

Anyway, the broadcasts that I intended to exercise were
KDBUS_ITEM_ID_REMOVE. Those appear to be broadcast to everyone,
irrespective of "policy", so long as the "match" thingy allows it. As
far as I can tell, that's the default behavior (i.e. receivers accept
KDBUS_DST_ID_BROADCAST), but even if it's not default, we'll still
fail to scale as long as the number of receivers accepting
KDBUS_DST_ID_BROADCAST grows as systems become more kdbus-integrated.

The bloom filter thing won't help at all according to the docs: bloom
filters don't apply to kernel-generated notifications.

So yes, as far as I can tell, kdbus really does track object lifetime
by broadcasting every single destruction event to every single
receiver (subject to the caveats above) and poking the data into every
receiver's tmpfs space (via kdbus_bus_broadcast ->
kdbus_conn_entry_insert -> lots of other stuff -> vfs_iter_write). At
that point, there's well over a gigabyte of tmpfs space that can be
scribbled on (and thus committed, and thus needs to be read) by rogue
broadcasters even on Rawhide, and Rawhide has barely started
converting kdbus clients from the proxy to kdbus directly.

IIUC, once gdbus switches over to using kdbus directly, with current
buffer sizing, the average laptop will have more kdbus pool tmpfs
space mapped than total RAM. I still don't see how this will work
well.

I guess my test didn't exercise what I meant it to. I wrote it,
userspace survived (on my system) and kdbus didn't. Apparently I blew
up the bus proxy, not the pool mechanism. Next time I'll try to
better characterize exactly what it is I'm doing to my poor VM...

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/