Re: [rfc patch] sched/topology: fix domain reconstruction memory leakage

From: Mike Galbraith
Date: Mon Aug 21 2017 - 09:31:04 EST


On Mon, 2017-08-21 at 10:16 +0200, Peter Zijlstra wrote:
> On Sat, Aug 19, 2017 at 08:10:49AM +0200, Mike Galbraith wrote:
> > Greetings,
> >
> > While beating on cpu hotplug with the shiny new topology fixes
> > backported, my memory poor 8 socket box fairly quickly leaked itself to
> > death, 0c0e776a9b0f being the culprit. With the below applied, box
> > took a severe beating overnight without a whimper.
> >
> > I'm wondering (ergo rfc) if free_sched_groups() shouldn't be renamed to
> > put_sched_groups() instead, with overlapping domains taking a group
> > reference as well so they can put both sg/sgc rather than put
> > one free the other. Those places that want an explicit free can pass
> > free to only explicitly free sg (or use two functions). Minimalist
> > approach works (minus signs, yay), but could perhaps use some "pretty".
> >
> > sched/topology: fix domain reconstruction memory leakage
>
> I was sitting on this one:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=sched/core&id=c63d18dd6ea59eec5cba857835f788943ff9f0d5

The comment in the patch reads better to me like so:

@@ -345,15 +346,12 @@ static void free_sched_groups(struct sch
 static void destroy_sched_domain(struct sched_domain *sd)
 {
 	/*
-	 * If its an overlapping domain it has private groups, iterate and
-	 * nuke them all.
+	 * A normal sched domain may have multiple group references, an
+	 * overlapping domain, having private groups, only one. Iterate,
+	 * dropping group/capacity references, freeing where none remain.
 	 */
-	if (sd->flags & SD_OVERLAP) {
-		free_sched_groups(sd->groups, 1);
-	} else if (atomic_dec_and_test(&sd->groups->ref)) {
-		kfree(sd->groups->sgc);
-		kfree(sd->groups);
-	}
+	free_sched_groups(sd->groups, 1);
+
 	if (sd->shared && atomic_dec_and_test(&sd->shared->ref))
 		kfree(sd->shared);
 	kfree(sd);
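
For anyone following along, the "iterate, drop references, free where
none remain" idea in that comment can be sketched in userspace C. This is
only an illustration, not the kernel code: struct group, struct capacity,
make_ring(), get_groups(), put_groups() and the live_objects counter are
all hypothetical stand-ins, and C11 atomics are assumed in place of the
kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for sched_group_capacity / sched_group. */
struct capacity {
	atomic_int ref;
};

struct group {
	struct group *next;	/* circular list */
	atomic_int ref;
	struct capacity *cap;
};

/* Allocations not yet freed; demo bookkeeping only, not thread safe. */
static int live_objects;

/* Build a ring of n groups, each with its own capacity, all refs at 0. */
static struct group *make_ring(int n)
{
	struct group *first = NULL, *last = NULL;

	assert(n > 0);
	for (int i = 0; i < n; i++) {
		struct group *g = malloc(sizeof(*g));
		struct capacity *c = malloc(sizeof(*c));

		assert(g && c);
		atomic_init(&g->ref, 0);
		atomic_init(&c->ref, 0);
		g->cap = c;
		live_objects += 2;
		if (!first)
			first = g;
		else
			last->next = g;
		last = g;
	}
	last->next = first;
	return first;
}

/* A holder takes one reference on every group and its capacity. */
static void get_groups(struct group *g)
{
	struct group *first = g;

	do {
		atomic_fetch_add(&g->ref, 1);
		atomic_fetch_add(&g->cap->ref, 1);
		g = g->next;
	} while (g != first);
}

/* Drop one reference on each group/capacity, freeing at zero. */
static void put_groups(struct group *g)
{
	struct group *first = g, *next;

	if (!g)
		return;
	do {
		next = g->next;	/* read before g may be freed */
		/* atomic_fetch_sub() returns the old value: 1 means last ref */
		if (atomic_fetch_sub(&g->cap->ref, 1) == 1) {
			free(g->cap);
			live_objects--;
		}
		if (atomic_fetch_sub(&g->ref, 1) == 1) {
			free(g);
			live_objects--;
		}
		g = next;
	} while (g != first);
}
```

With two holders calling get_groups() on the same ring, the first
put_groups() frees nothing and the second frees every group and capacity.
A holder with private groups has only its own reference, so its single
put frees everything, which is why one unconditional call can cover both
cases.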