Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")

From: Lukas Bulwahn
Date: Mon Jan 10 2022 - 11:20:10 EST


On Mon, Jan 10, 2022 at 3:02 PM Thorsten Leemhuis
<regressions@xxxxxxxxxxxxx> wrote:
>
>
> On 09.01.22 22:20, Jakub Kicinski wrote:
> > On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
> >> Dear Rao and David,
> >>
> >>
> >> In our syzkaller instance running on linux-next,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> >> observing a memory leak in prepare_creds,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> >> for quite some time.
> >>
> >> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> >> So, it is in mainline, was released and has not been fixed in
> >> linux-next yet.
> >>
> >> As syzkaller also provides a reproducer, we bisected this memory leak
> >> to be introduced with commit 314001f0bf92 ("af_unix: Add OOB
> >> support").
> >>
> >> We also tested that reverting this commit on torvalds' current tree
> >> made the memory leak with the reproducer go away.
> >>
> >> Could you please have a look how your commit introduces this memory
> >> leak? We will gladly support testing your fix in case help is needed.
> >
> > Let's test the regression/bug report tracking bot :)
> >
> > #regzbot introduced: 314001f0bf92
>
> Great, thx for trying, you only made a small mistake: it lacked a caret
> (^) before the "introduced", which would have told regzbot that the
> parent mail (the one you quoted) is the one containing the report (which
> later is linked in patch descriptions of fixes and allows regzbot to
> connect things). That's why regzbot now thinks you reported the issue
> and looks out for patches and commits that link to your mail. :-/
>
> Don't worry, I just added it properly and now mark this as duplicate:
>
> #regzbot dup-of:
> https://lore.kernel.org/lkml/CAKXUXMzZkQvHJ35nwVhcJe%2BDrtEXGw%2BeKGVD04=xRJkVUC2sPA@xxxxxxxxxxxxxx/
>
> Thx again for trying.
>
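
As a side note on the caret Thorsten describes above: the sketch below
only illustrates the difference between "#regzbot introduced:" and
"#regzbot ^introduced:" as explained in his mail. It is not regzbot's
actual parser; the function name parse_regzbot_line and the returned
dictionary keys are hypothetical, and it assumes the command and its
argument sit on a single line (a wrapped argument, as in the dup-of
command quoted above, would come back with an empty argument).

```python
import re

# Hypothetical helper for illustration only; NOT part of regzbot.
# Assumption: a command looks like "#regzbot [^]<name>: <argument>".
_CMD = re.compile(r"^#regzbot\s+(?P<caret>\^?)(?P<cmd>[a-z-]+):\s*(?P<arg>.*)$")


def parse_regzbot_line(line):
    """Parse one mail line; return a dict, or None if it is no command."""
    # Drop mail quoting such as "> > " in front of the command.
    text = line.lstrip("> ").rstrip()
    match = _CMD.match(text)
    if match is None:
        return None
    return {
        "command": match["cmd"],
        # With the caret, the report is the parent (quoted) mail;
        # without it, the mail carrying the command counts as the report.
        "refers_to_parent": match["caret"] == "^",
        "argument": match["arg"],
    }


if __name__ == "__main__":
    print(parse_regzbot_line("> > #regzbot introduced: 314001f0bf92"))
    print(parse_regzbot_line("#regzbot ^introduced: 314001f0bf92"))
```

The only difference between the two example lines is the
refers_to_parent flag, which is exactly the detail that decided whose
mail regzbot treated as the report here.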

Thorsten, Jakub, formally this may or may not be a "regression", as
Thorsten defines it:

It's a regression if some application or practical use case running fine on one
Linux kernel works worse or not at all with a newer version compiled using a
similar configuration.

The af_unix functionality without oob support works before
314001f0bf92 ("af_unix: Add OOB support").
The af_unix functionality without oob support works after 314001f0bf92
("af_unix: Add OOB support").
The af_unix functionality with oob support, which exists only since
314001f0bf92 ("af_unix: Add OOB support"), makes a memory leak visible;
we do not know whether this feature causes the leak or merely exposes it.
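
To make concrete what "af_unix with oob support" means from userspace,
here is a minimal sketch. It is not the syzkaller reproducer and does
not demonstrate the leak; it only sends one out-of-band byte over an
AF_UNIX stream socket pair. The function name try_unix_oob is made up
for this mail, and the sketch assumes Python 3 on Linux. On a kernel
without the feature, the MSG_OOB send is expected to fail, which the
code reports as None.

```python
import socket


def try_unix_oob(payload=b"!"):
    """Send one out-of-band byte over an AF_UNIX stream socket pair.

    Returns the byte read back with MSG_OOB, or None if the kernel
    rejects MSG_OOB on AF_UNIX sockets (or no OOB byte is readable).
    """
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Only the first byte is sent, flagged as out-of-band data.
        a.send(payload[:1], socket.MSG_OOB)
        # MSG_DONTWAIT keeps the sketch from blocking if nothing arrives.
        return b.recv(1, socket.MSG_OOB | socket.MSG_DONTWAIT)
    except OSError:
        return None
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    result = try_unix_oob()
    if result is None:
        print("MSG_OOB on AF_UNIX is not usable on this kernel")
    else:
        print("received out-of-band byte:", result)
```

Running this on a kernel before and after the commit shows the first
two cases above: ordinary af_unix traffic is unaffected either way, and
only the MSG_OOB path is new.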

Now, if we disable oob support, we get a kernel without an observable
memory leak. However, oob support is enabled by default, and this makes
the memory leak visible. So, if oob support is turned into a
non-default option, or if nobody ever made use of oob support before,
this really does not count as a regression at all. The oob support was
not available before; now it works but leaks a bit of memory. That is
potentially a bug, but not a regression. Of course, maybe oob support
is also just intended to make this memory leak observable, who knows?
Then, it is not even a bug, but a feature.

Thorsten's database is still quite empty, so let us keep tracking the
progress with regzbot. I guess we cannot mark "issues" in regzbot as
either a true regression or a bug (an issue that appears with a new
feature).

Also, this reproducer is automatically generated, so it barely
qualifies as "some application or practical use case"; at best, it is
some derived "stress test program" or "micro benchmark".

The syzbot CI and kernel CI databases are also planning to track such
things (once all the databases and interfaces work smoothly), so in the
long term, issues such as this one would not qualify for regzbot. For
now, many things in these pipelines are still manual: triggering,
investigating and informing the involved developers all take manual
effort, and so does tracking, for which regzbot is probably the right
new tool for now.

We will learn what should go into regzbot's tracker and what should
not as we move on in the community: information from other systems
(syzbot, kernel CI, kernel test robot etc.) and their reports is also
still difficult to add, find, track, bisect etc.

Lukas