Re: [PATCHv3 bpf-next 0/9] mm/bpf/perf: Store build id in file object

From: Al Viro
Date: Fri Mar 17 2023 - 17:21:39 EST


On Fri, Mar 17, 2023 at 09:14:03PM +0000, Al Viro wrote:
> On Fri, Mar 17, 2023 at 09:33:17AM -0700, Andrii Nakryiko wrote:
>
> > > But build IDs are _generally_ available. The only problem (AIUI)
> > > is when you're trying to examine the contents of one container from
> > > another container. And to solve that problem, you're imposing a cost
> > > on everybody else with (so far) pretty vague justifications. I really
> > > don't like to see you growing struct file for this (nor struct inode,
> > > nor struct vm_area_struct). It's all quite unsatisfactory and I don't
> > > have a good suggestion.
> >
> > There is a lot of profiling, observability and debugging tooling built
> > using BPF. And when capturing stack traces from BPF programs, if the
> > build ID note is not physically present in memory, fetching it from
> > the BPF program might fail in NMI (and other non-faultable contexts).
> > This patch set is about making sure we can always fetch the build ID,
> > even from the most restrictive environments. It's guarded by Kconfig to
> > avoid adding 8 bytes of overhead to struct file for environments where
> > this might be unacceptable, giving users and distros a choice.
>
> Lovely. As an exercise you might want to collect the stats on the
> number of struct file instances on the system vs. the number of files
> that happen to be ELF objects and are currently mmapped anywhere.
> That does depend upon the load, obviously, but it's not hard to collect -
> you already have more than enough hooks inserted in the relevant places.
> That might give a better appreciation of the reactions...

One possibility would be a bit stolen from inode flags + hash keyed by
struct inode address (middle bits make for a decent hash function);
inode eviction would check that bit and kick the corresponding thing
from hash if the bit is set.

Associating that thing with inode => hash lookup/insert + set the bit.
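
Roughly, as a userspace sketch of the above (not kernel code: the
struct inode here is a stand-in, and I_HAS_BUILD_ID, build_id_attach(),
build_id_lookup() and inode_evict() are all made-up names for
illustration; the 20-byte ID size and the shift by 6 are assumptions,
and locking is left out entirely, which a real version could not do):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUILD_ID_SIZE   20              /* assumption: SHA-1 sized build ID */
#define HASH_BITS       8
#define HASH_SIZE       (1u << HASH_BITS)
#define I_HAS_BUILD_ID  0x1u            /* hypothetical bit stolen from inode flags */

/* Stand-in for the real struct inode; only the flags word matters here. */
struct inode {
	unsigned int i_flags;
};

struct build_id_entry {
	const struct inode *key;            /* hash is keyed by inode address */
	unsigned char id[BUILD_ID_SIZE];
	struct build_id_entry *next;        /* bucket chain */
};

static struct build_id_entry *table[HASH_SIZE];

/* Drop the low (alignment) bits and use the middle bits of the address. */
static unsigned int hash_inode(const struct inode *inode)
{
	return (unsigned int)(((uintptr_t)inode >> 6) & (HASH_SIZE - 1));
}

/* Cheap for the common case: no bit set means no hash lookup at all. */
static const unsigned char *build_id_lookup(const struct inode *inode)
{
	struct build_id_entry *e;

	assert(inode != NULL);
	if (!(inode->i_flags & I_HAS_BUILD_ID))
		return NULL;
	for (e = table[hash_inode(inode)]; e; e = e->next)
		if (e->key == inode)
			return e->id;
	return NULL;
}

/* Associate a build ID with an inode: hash lookup/insert + set the bit. */
static int build_id_attach(struct inode *inode, const unsigned char *id)
{
	struct build_id_entry *e;
	unsigned int h = hash_inode(inode);

	if (build_id_lookup(inode))
		return 0;                       /* already associated */
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->key = inode;
	memcpy(e->id, id, BUILD_ID_SIZE);
	e->next = table[h];
	table[h] = e;
	inode->i_flags |= I_HAS_BUILD_ID;
	return 0;
}

/* Eviction checks the bit and kicks the entry out of the hash if set. */
static void inode_evict(struct inode *inode)
{
	struct build_id_entry **pp, *e;

	if (!(inode->i_flags & I_HAS_BUILD_ID))
		return;
	for (pp = &table[hash_inode(inode)]; *pp; pp = &(*pp)->next) {
		if ((*pp)->key == inode) {
			e = *pp;
			*pp = e->next;
			free(e);
			break;
		}
	}
	inode->i_flags &= ~I_HAS_BUILD_ID;
}
```

In this sketch an inode that never had anything attached pays one flag
test on eviction and nothing else, and neither struct file nor struct
inode grows.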