Re: [PATCH 3/5] hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf counters

From: Frederic Weisbecker
Date: Sun Sep 13 2009 - 23:41:48 EST


On Sat, Sep 12, 2009 at 12:09:40AM +0200, Jan Kiszka wrote:
> Frederic Weisbecker wrote:
> > This patch rebases the implementation of the breakpoints API on top of
> > perf counter instances.
> >
> > The core breakpoint API has changed a bit:
> >
> > - register_kernel_hw_breakpoint() now takes a cpu as a parameter. For
> > now it doesn't support all cpu wide breakpoints but this may be
> > implemented soon.
> >
> > - unregister_kernel_hw_breakpoint() and unregister_user_hw_breakpoint()
> > have been unified in a single unregister_hw_breakpoint()
> >
> > Each breakpoint now matches a perf counter, which handles the
> > register scheduling, thread/cpu attachment, etc.
> >
> > The new layering is now made as follows:
> >
> > ptrace kgdb ftrace perf syscall
> > \ | / /
> > \ | / /
>
> kgdb doesn't fit here as it requires nmi-safe services.
>
> I don't think you want to make the whole stack nmi-safe but rather
> provide a separate interface that allows kgdb to announce to the kernel
> when it uses some slot. Those slots should simply be excluded from
> hardware updates. That's roughly the logic we use in KVM for guest
> debugging: when the host starts to use debug registers for that purpose,
> the guest's setting will not affect the real hardware anymore.



I don't quite understand what must be NMI-safe here. Is it when
we request a breakpoint or when we hit one?



> Still on my wishlist for KVM is a cheap & easy way to obtain the current
> register content or to refresh it in hardware. It's not yet clear to me
> where to hook this in the given design. It looks like this information
> can be scattered over the current thread and some perf counters.


With this design approach, the debug registers are no longer stored
in the thread structure. In fact, they are no longer stored at all.

That is mainly because a breakpoint is no longer assigned to a
specific address register. The register is chosen when the counter
is enabled. And the counter is often toggled on/off, depending on
whether we start or stop profiling the desired context. That context
can be a single task, in which case the counter is enabled when the
task is scheduled in and disabled when it is scheduled out.
And between two sched atoms, the register used for a breakpoint
can be different.
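
To make that concrete, here is a rough userspace sketch of the idea, not
the code from the patch. The names (struct bp_counter, bp_sched_in(),
bp_sched_out(), slots[]) are made up for illustration; the only hardware
fact assumed is that x86 has four address registers, DR0..DR3.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4	/* x86 address registers DR0..DR3 */

/* Hypothetical stand-in for a breakpoint counter, not the kernel struct. */
struct bp_counter {
	unsigned long addr;	/* address to watch */
	int slot;		/* address register in use, -1 while scheduled out */
};

/* Which counter currently owns each address register (NULL means free). */
static struct bp_counter *slots[NR_SLOTS];

/*
 * Enable the counter: take the first free address register.
 * Returns the register index, or -1 if all of them are busy.
 */
static int bp_sched_in(struct bp_counter *bp)
{
	assert(bp->slot == -1);	/* must not already be scheduled in */

	for (int i = 0; i < NR_SLOTS; i++) {
		if (!slots[i]) {
			slots[i] = bp;
			bp->slot = i;
			return i;
		}
	}
	return -1;
}

/* Disable the counter: release its register so another counter can use it. */
static void bp_sched_out(struct bp_counter *bp)
{
	if (bp->slot >= 0) {
		slots[bp->slot] = NULL;
		bp->slot = -1;
	}
}
```

If two counters are scheduled out and then scheduled back in the opposite
order, each one ends up in the register the other had before, which is
why nothing per-register can be cached in the thread structure.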

The arch information about the breakpoints (len/type/addr) is stored
in the counter structure, and the contents of the address/control
registers are now computed dynamically.
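
As an illustration of what "computed dynamically" can mean on x86, the
sketch below builds the DR7 control bits for one breakpoint from its
len/type once a register has been picked. It is an assumption-laden
example, not the helper used in the patch: dr7_encode() and the enum
names are hypothetical, and only the DR7 bit layout (enable bits in the
low byte, one 4-bit R/W + LEN field per register starting at bit 16)
comes from the architecture manuals.

```c
#include <stdint.h>

/* R/W field encodings of DR7 (hypothetical enum names). */
enum bp_type { BP_EXEC = 0x0, BP_WRITE = 0x1, BP_RW = 0x3 };

/* LEN field encodings of DR7 (hypothetical enum names). */
enum bp_len { BP_LEN_1 = 0x0, BP_LEN_2 = 0x1, BP_LEN_8 = 0x2, BP_LEN_4 = 0x3 };

/*
 * Compute the DR7 bits for one breakpoint living in address register
 * 'slot' (0..3). The caller would OR the results of all active
 * breakpoints together before writing DR7.
 */
static uint32_t dr7_encode(int slot, enum bp_type type, enum bp_len len)
{
	/* 4-bit per-register field: LEN in the upper two bits, R/W in the lower two. */
	uint32_t ctrl = ((uint32_t)len << 2) | (uint32_t)type;
	uint32_t dr7 = ctrl << (16 + 4 * slot);

	/* Global enable bit for this register (G0..G3 sit at bits 1, 3, 5, 7). */
	dr7 |= UINT32_C(1) << (2 * slot + 1);

	return dr7;
}
```

Since 'slot' is only known at enable time, the same counter yields a
different DR7 value depending on which register it was given.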

For your needs, the control basically has to go through perfcounters.
When you switch from host to guest, the counter must be scheduled out.
And in the reverse direction, it must be scheduled in.
Perf will then take care of the rest by itself.
