Re: [PATCH] perf cgroups: Don't rotate events for cgroups unnecessarily

From: Peter Zijlstra
Date: Fri Aug 23 2019 - 09:03:42 EST


On Fri, Aug 23, 2019 at 06:26:34PM +0530, Ganapatrao Kulkarni wrote:
> On Fri, Aug 23, 2019 at 5:29 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Fri, Aug 23, 2019 at 04:13:46PM +0530, Ganapatrao Kulkarni wrote:
> >
> > > We are seeing a regression with our uncore perf driver (Marvell's
> > > ThunderX2, ARM64 server platform) on 5.3-rc1.
> > > After bisecting, it turned out that this patch causes the issue.
> >
> > Funnily enough; the email you replied to didn't contain a patch.
>
> Hmm, sorry, not sure why the patch was clipped off; I see it in my inbox.

Your email is in a random spot of the discussion for me. At least it was
fairly easy to find the related patch.

> > > Test case:
> > > Load the module and run perf with more than 4 events (we have 4
> > > counters, so event multiplexing takes place for more than 4 events),
> > > then unload the module.
> > > With this test sequence, the system hangs (soft lockup) after 2
> > > or 3 iterations. The same test runs for hours on 5.2.
> > >
> > > while [ 1 ]
> > > do
> > > rmmod thunderx2_pmu
> > > modprobe thunderx2_pmu
> > > perf stat -a -e \
> > > uncore_dmc_0/cnt_cycles/,\
> > > uncore_dmc_0/data_transfers/,\
> > > uncore_dmc_0/read_txns/,\
> > > uncore_dmc_0/config=0xE/,\
> > > uncore_dmc_0/write_txns/ sleep 1
> > > sleep 2
> > > done
> >
> > Can you reproduce without the module load+unload? I don't think people
> > routinely unload modules.
>
> The issue won't happen if the module is not unloaded/reloaded.
> IMHO, this could be a potential bug!

Does the softlockup give a useful stacktrace? I don't have a thunderx2
so I cannot reproduce.