Re: 2.6.32.21 - uptime related crashes?

From: Ingo Molnar
Date: Thu Jul 21 2011 - 03:25:20 EST



* john stultz <johnstul@xxxxxxxxxx> wrote:

> On Fri, 2011-07-15 at 12:01 +0200, Peter Zijlstra wrote:
> > On Thu, 2011-07-14 at 17:35 -0700, john stultz wrote:
> > >
> > > Peter/Ingo: Can you take a look at the above and let me know if you find
> > > it too disagreeable?
> >
> > +static unsigned long long __cycles_2_ns(unsigned long long cyc)
> > +{
> > +	unsigned long long ns = 0;
> > +	struct x86_sched_clock_data *data;
> > +	int cpu = smp_processor_id();
> > +
> > +	rcu_read_lock();
> > +	data = rcu_dereference(per_cpu(cpu_sched_clock_data, cpu));
> > +
> > +	if (unlikely(!data))
> > +		goto out;
> > +
> > +	ns = ((cyc - data->base_cycles) * data->mult) >> CYC2NS_SCALE_FACTOR;
> > +	ns += data->accumulated_ns;
> > +out:
> > +	rcu_read_unlock();
> > +	return ns;
> > +}
> >
> > The way I read that we're still not wrapping properly if freq scaling
> > 'never' happens.
>
> Right, this doesn't address the mult overflow behavior. As I mentioned
> in the patch, the rework allows for solving that in the future using
> a (possibly very rare) timer that would accumulate cycles to ns.
>
> This rework just really addresses the multiplication overflow->negative
> roll under that currently occurs with the cyc2ns_offset value.
>
> > Because then we're wrapping on accumulated_ns + 2^54.
> >
> > Something like resetting base, and adding ns to accumulated_ns and
> > returning the latter would make more sense.
>
> Although we have to update the base_cycles and accumulated_ns
> atomically, so it's probably not something to do in the sched_clock path.
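
For reference, a minimal userspace sketch of the wrap being discussed,
and of the "reset base, fold ns into accumulated_ns" idea. It assumes
CYC2NS_SCALE_FACTOR is 10 (which is what makes the product wrap at
2^54 ns). struct clock_data, cycles_to_ns() and rebase() are
hypothetical stand-ins, not the names in the patch. It is
single-threaded, so it says nothing about how to update base_cycles
and accumulated_ns atomically, which is the hard part noted above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of CYC2NS_SCALE_FACTOR; 64 - 10 gives the 2^54 limit. */
#define SCALE_SHIFT 10

/* Hypothetical mirror of the fields used by the quoted __cycles_2_ns(). */
struct clock_data {
	uint64_t base_cycles;
	uint64_t mult;
	uint64_t accumulated_ns;
};

/* Same arithmetic as the quoted function, minus RCU and per-CPU lookup. */
static uint64_t cycles_to_ns(const struct clock_data *d, uint64_t cyc)
{
	/* The 64-bit product wraps once the delta in ns reaches 2^54. */
	uint64_t ns = ((cyc - d->base_cycles) * d->mult) >> SCALE_SHIFT;

	return ns + d->accumulated_ns;
}

/*
 * Fold the elapsed time into accumulated_ns and move the base forward,
 * so the next delta starts near zero. Any sub-ns remainder is dropped.
 */
static void rebase(struct clock_data *d, uint64_t cyc)
{
	assert(cyc >= d->base_cycles);
	d->accumulated_ns = cycles_to_ns(d, cyc);
	d->base_cycles = cyc;
}
```

With mult = 1024 (one cycle per ns) the unrebased conversion falls back
to accumulated_ns once the delta reaches 2^54 cycles. Calling rebase()
any time before that point keeps the result monotonic, which is what a
rare periodic timer would do.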

Ping, what's going on with this bug? Systems are crashing so we need
a quick fix ASAP ...

Thanks,

Ingo