Re: plumbers session on profiling?

From: Bill Wendling
Date: Fri Jul 01 2022 - 14:57:47 EST


On Fri, Jul 1, 2022 at 4:49 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Jul 01, 2022 at 03:17:54AM -0700, Bill Wendling wrote:
> > On Fri, Jul 1, 2022 at 2:02 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, Jun 28, 2022 at 07:08:48PM +0200, Jose E. Marchesi wrote:
> > > >
> > > > [Added linux-toolchains@vger in CC]
> > > >
> > > > It would be interesting to have some discussion in the Toolchains track
> > > > on building the kernel with PGO/FDO. I have seen a rise in interest in
> > > > the topic in several companies, but it would make very little sense if
> > > > no kernel hacker is interested in participating... anybody?
> > >
> > > I know there's been a lot of work in this area, but none of it seems to
> > > have trickled down to be easy enough for me to use it.
> >
> > We use an instrumented kernel to collect the data we need. It gives us
> > the best payoff, because the profiling data is more fine-grained and
> > accurate. (PGO does much more than make inlining decisions.)
> >
> > If I recall correctly, you previously suggested using sampling data.
> > (Correct?) Is there a document or article that outlines that process?
>
> IIRC Google has LBR sample driven PGO somewhere as well. ISTR that being
> the whole motivation for that gruesome Zen3 BRS hack.
>
> Google got me this: https://research.google.com/pubs/archive/45290.pdf
>
Right. However, there's a chicken-and-egg issue with AutoFDO for the
production kernel. The sampling profile has to come from a kernel that
is already running, but we can't release a kernel to production that
hasn't been compiled with PGO/FDO. We could only release it in a test
environment, in which case we could use AutoFDO. However, the paper
says that AutoFDO reaches only ~90% of the gain of instrumented FDO.
It lists some reasons for this, but nonetheless I suspect that the
delta would be too large for us to release the kernel.

As for LBR, that will work with Intel/AMD, but I thought that LBR
doesn't exist for Arm processors (my knowledge could be out of date on
this).

What would make PGO (sample-based or instrumented) easy enough for you
to use? Which key elements are missing?

-bw