Re: BUG - function tracing with breakpoints
From: Steven Rostedt
Date: Fri May 25 2012 - 21:37:40 EST
On Fri, 2012-05-25 at 16:51 -0400, Steven Rostedt wrote:
> On Fri, 2012-05-25 at 14:46 -0400, Steven Rostedt wrote:
> > On Fri, 2012-05-25 at 10:40 -0700, H. Peter Anvin wrote:
> > > On 05/25/2012 08:29 AM, Steven Rostedt wrote:
> > > >
> > > > This would make sense for this bug, as if modifying_ftrace_code was not
> > > > seen by other CPUs, it wouldn't go into the ftrace_int3_handler() path.
> > > > That would cause this issue. But the bug remains after the smp_mb()'s
> > > > were put in place. Although it behaves a little differently now. Maybe
> > > > there's something else I missed?
> > > >
> > >
> > > Perhaps you should make the modifying_ftrace_code modification atomic...
> > > it seems odd to have it not be atomic when it is clearly accessed across
> > > CPUs that way.
> >
> > I guess I can make it atomic. Not really a big deal as this (and soon
> > one other place) is the only place that changes its value.
> >
> > I've found another place that may be causing harm, and I'm currently
> > working on fixing it. Hopefully after that's done, this problem will go
> > away.
>
OK, here's another clue.
I've added tons of debug output and have been filtering which functions
get traced. I even added a 64-entry buffer that records which spots are
being hit by the breakpoint (it always dies while only the breakpoints
are in place; it never makes it to the part that updates the instruction).
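
For readers following along, a buffer like the one described can be sketched roughly as below. This is only an illustration in userspace C11, not the actual debug patch: the names bp_log, bp_log_next, bp_log_record() and bp_log_snapshot() are hypothetical, and only the 64-entry size comes from the message above. It assumes the snapshot is taken after recording has stopped (for example, once the machine has already crashed).

```c
#include <stdatomic.h>

#define BP_LOG_SIZE 64	/* power of two, so the index can be masked */

/* Hypothetical debug buffer: the last BP_LOG_SIZE breakpoint addresses. */
static unsigned long bp_log[BP_LOG_SIZE];
static atomic_uint bp_log_next;	/* total number of hits recorded so far */

/* Record one breakpoint hit; the oldest entry is overwritten when full. */
static void bp_log_record(unsigned long ip)
{
	/* Each caller claims a unique slot, even when CPUs race. */
	unsigned int slot = atomic_fetch_add(&bp_log_next, 1);

	bp_log[slot & (BP_LOG_SIZE - 1)] = ip;
}

/*
 * Copy the recorded addresses into out[], oldest first, and return how
 * many were copied. Assumes no recording is going on concurrently.
 */
static unsigned int bp_log_snapshot(unsigned long *out)
{
	unsigned int next = atomic_load(&bp_log_next);
	unsigned int count = next < BP_LOG_SIZE ? next : BP_LOG_SIZE;
	unsigned int i;

	for (i = 0; i < count; i++)
		out[i] = bp_log[(next - count + i) & (BP_LOG_SIZE - 1)];

	return count;
}
```

The handler would call bp_log_record() with the address that trapped, and the crash path would dump the result of bp_log_snapshot(), which gives a list in the same spirit as the one below. A fixed-size, overwrite-oldest buffer keeps the recording cheap enough to run from a breakpoint handler.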
Basically it always crashes with this path:
[ 45.050159] [ffffffff8109c1a9 ktime_get+0x19/0xe0]
[ 45.050159] [ffffffff810a3921 update_ts_time_stats+0x11/0xa0]
[ 45.050159] [ffffffff8104ef29 ns_to_timeval+0x9/0x40]
[ 45.050159] [ffffffff8104eea9 ns_to_timespec+0x9/0x80]
[ 45.050159] [ffffffff8104ef29 ns_to_timeval+0x9/0x40]
[ 45.050159] [ffffffff8104eea9 ns_to_timespec+0x9/0x80]
[ 45.050159] [ffffffff814bda45 __cpufreq_driver_getavg+0x15/0x80]
[ 45.050159] [ffffffff814bd735 cpufreq_cpu_get+0x15/0xd0]
[ 45.050159] [ffffffff810b6aad try_module_get+0x1d/0x140]
[ 45.050159] [ffffffff814c387c cpufreq_get_measured_perf+0xc/0xa0]
[ 45.050159] [ffffffff810b2dbd smp_call_function_single+0x1d/0x1c0]
[ 45.050159] [ffffffff814c391a read_measured_perf_ctrs+0xa/0x70]
[ 45.050159] [ffffffff814bcfe5 cpufreq_cpu_put+0x5/0x30]
[ 45.050159] [ffffffff810b67ad module_put+0x1d/0x130]
[ 45.050159] [ffffffff814bcf55 __cpufreq_driver_target+0x15/0xa0]
[ 45.050159] [ffffffff814c3ef2 acpi_cpufreq_target+0x12/0x380]
[ 45.050159] [ffffffff814c21b2 cpufreq_frequency_table_target+0x12/0x1a0]
[ 45.050159] [ffffffff814beced cpufreq_notify_transition+0x1d/0x2c0]
[ 45.050159] [ffffffff81074365 srcu_notifier_call_chain+0x5/0x20]
[ 45.050159] [ffffffff810742bd __srcu_notifier_call_chain+0x1d/0xc0]
[ 45.050159] [ffffffff810736d8 __srcu_read_lock+0x8/0x70]
[ 45.050159] [ffffffff81616072 notifier_call_chain.isra.1+0x12/0xb0]
[ 45.050159] [ffffffff81010efa save_stack_trace+0xa/0x50]
I see it always hitting the cpufreq code just before the crash.
Actually, it never finishes the cpufreq code. Something about this code
causes issues with breakpoints.
/me continues the hunt.
-- Steve