Re: [BUG] Something goes wrong with timer statistics.

From: Eric Dumazet
Date: Wed May 30 2007 - 08:45:00 EST


On Wed, 30 May 2007 13:38:25 +0200
Björn Steinbrink <B.Steinbrink@xxxxxx> wrote:

> On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote:
> > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > > Hi Ian,
> > >
> > > (Thomas "The Wizard of Time" added to CC :))
> >
> > Added more wizards :)
> >
> > > On 29/05/07, Ian Kumlien <iank@xxxxxxxxxxxx> wrote:
> > > > Hi,
> > > >
> > > > As the daystar sets, i try to play some with my new would be
> > > > firewall/server, but since this will be running for quite some time i
> > > > have been experimenting with powertop to find out what i can do to limit
> > > > it's power usage.
> > > >
> > > > But, if i run powertop for too long or a few times to many... this
> > > > happens:
> > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> >
> > This was reported before. It's incredibly hard to reproduce.
>
> OK, second try, this time with a patch. In timer_stats_update_stats,
> input is allocated on the stack, so it is uninitialized, in particular
> the "next" field is random. Now in tstat_lookup, the new entry "curr" is
> initialized with the values from "input" (passed as "entry") and "next"
> is set to NULL _after_ it is added to the list, so if a second CPU is
> running the fastpath lookup while we're inserting the new element, it
> might get the garbage value in "next". The patch below fixes that.
>
> Björn
>
>
>
> Initialize the "next" field of a timer stats entry before it is inserted
> into the list to avoid a race with the fastpath lookup.
>
> Signed-off-by: Björn Steinbrink <B.Steinbrink@xxxxxx>
> ---
> diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
> index 868f1bc..5bc8f91 100644
> --- a/kernel/time/timer_stats.c
> +++ b/kernel/time/timer_stats.c
> @@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
> if (curr) {
> *curr = *entry;
> curr->count = 0;
> + curr->next = NULL;
> memcpy(curr->comm, comm, TASK_COMM_LEN);
> if (prev)
> prev->next = curr;
> else
> *head = curr;
> - curr->next = NULL;
> }
> out_unlock:
> spin_unlock(&table_lock);
>

Your analysis might be right, not the fix.

You *cannot* assume curr->next = NULL will be done before insert.

You probably also need a memory barrier.

curr->next = NULL;
memcpy(curr->comm, comm, TASK_COMM_LEN);

smp_mb(); /* commit all writes to curr before insert */

if (prev)
prev->next = curr;
else
*head = curr;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/