Re: Performance Events hangs with Intel P4 system

From: Cyrill Gorcunov
Date: Fri May 14 2010 - 09:53:02 EST


On Fri, May 14, 2010 at 01:56:55PM +0200, Ingo Molnar wrote:
>
> * Lin Ming <ming.m.lin@xxxxxxxxx> wrote:
>
> > p4_event_bind::cntr is "unsigned char".
> > But p4_next_cntr has return type of "int".
> > So the explicit conversion is needed to get the correct result.
>
> > @@ -780,7 +780,7 @@ static int p4_pmu_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign
> > if (unlikely(escr_idx == -1))
> > goto done;
> >
> > - if (hwc->idx != -1 && !p4_should_swap_ts(hwc->config, cpu)) {
> > + if (hwc->idx != (unsigned char)-1 && !p4_should_swap_ts(hwc->config, cpu)) {
>
> That cast is _extremely_ ugly. Please fix the signedness of the types instead.
>
> Ingo
>

Ingo, what about this one? Jaswinder, could you give it a shot (untested)?

-- Cyrill
---
[PATCH -tip/master] x86,perf: P4 PMU - fix counters allocation logic and sign issue

Jaswinder reported a GP fault:
|
| Message from syslogd@ht at May 14 09:39:32 ...
| kernel:[ 314.908612] EIP: [<c100ccca>]
| x86_perf_event_set_period+0x19d/0x1b2 SS:ESP 0068:edac3d70
|

Ming has narrowed it down to a comparison issue between signed and unsigned
values. As a result the event index reaches the value 255, which in turn
leads to a GP fault.

It was also found that p4_next_cntr has broken logic: it should return a
counter index only if that counter has not yet been borrowed by another event.

Reported-by: Jaswinder Singh Rajput <jaswinderlinux@xxxxxxxxx>
Reported-by: Lin Ming <ming.m.lin@xxxxxxxxx>
Bisected-by: Lin Ming <ming.m.lin@xxxxxxxxx>
CC: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
CC: Ingo Molnar <mingo@xxxxxxx>
CC: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Signed-off-by: Cyrill Gorcunov <gorcunov@xxxxxxxxxx>
---
arch/x86/kernel/cpu/perf_event_p4.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
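
A note on the sign issue, as a minimal userspace sketch rather than kernel
code. The helper names below are hypothetical and exist only for
illustration; the sketch assumes 8-bit chars. It shows why a -1 stored in an
"unsigned char" never compares equal to -1 after integer promotion, and why
it turns into index 255 instead. "signed char" is used explicitly here
because the signedness of plain "char" is implementation-defined.

```c
#include <stdbool.h>

/* Hypothetical helpers for illustration; not part of perf_event_p4.c. */

/* An unsigned char holding "-1" is 255; promotion to int keeps it 255. */
bool equals_minus_one_unsigned(unsigned char idx)
{
	int promoted = idx;	/* 255, never -1 */

	return promoted == -1;
}

/* A signed char holding -1 is still -1 after promotion to int. */
bool equals_minus_one_signed(signed char idx)
{
	int promoted = idx;	/* sign-extended: -1 */

	return promoted == -1;
}

/* The value that ends up being used as an index in the unsigned case. */
int as_index(unsigned char idx)
{
	return idx;
}
```

With these definitions, equals_minus_one_unsigned((unsigned char)-1) is
false while equals_minus_one_signed(-1) is true, and
as_index((unsigned char)-1) yields 255, which matches the out-of-range index
described in the changelog.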

Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
=====================================================================
--- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
@@ -18,7 +18,7 @@
 struct p4_event_bind {
 	unsigned int opcode; /* Event code and ESCR selector */
 	unsigned int escr_msr[2]; /* ESCR MSR for this event */
-	unsigned char cntr[2][P4_CNTR_LIMIT]; /* counter index (offset), -1 on abscence */
+	char cntr[2][P4_CNTR_LIMIT]; /* counter index (offset), -1 on abscence */
 };

 struct p4_cache_event_bind {
@@ -747,11 +747,11 @@ static int p4_get_escr_idx(unsigned int
 static int p4_next_cntr(int thread, unsigned long *used_mask,
 			struct p4_event_bind *bind)
 {
-	int i = 0, j;
+	int i, j;

 	for (i = 0; i < P4_CNTR_LIMIT; i++) {
-		j = bind->cntr[thread][i++];
-		if (j == -1 || !test_bit(j, used_mask))
+		j = (int)bind->cntr[thread][i];
+		if (j != -1 && !test_bit(j, used_mask))
 			return j;
 	}

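For reference, the corrected selection loop can be sketched in userspace as
follows. This is a simplified illustration under stated assumptions, not the
kernel function: struct bind_sketch, bit_is_set() and next_cntr_sketch() are
hypothetical names, CNTR_LIMIT is an assumed stand-in for P4_CNTR_LIMIT, the
per-thread dimension is dropped, and the trailing "return -1" is an
assumption about code outside the hunk. The old loop advanced i twice per
iteration (skipping entries) and returned on j == -1, that is, on an absent
counter; the new condition returns only a present counter that is not yet in
use.

```c
#include <stdbool.h>

#define CNTR_LIMIT 3	/* assumed stand-in for P4_CNTR_LIMIT */

/* Hypothetical simplified binding: counter indices, -1 means absent. */
struct bind_sketch {
	signed char cntr[CNTR_LIMIT];
};

/* Userspace stand-in for the kernel's test_bit() on a single word. */
bool bit_is_set(int nr, unsigned long mask)
{
	return (mask >> nr) & 1UL;
}

/* Return the first present counter not set in used_mask, or -1 if none. */
int next_cntr_sketch(const struct bind_sketch *bind, unsigned long used_mask)
{
	int i, j;

	for (i = 0; i < CNTR_LIMIT; i++) {
		j = bind->cntr[i];
		if (j != -1 && !bit_is_set(j, used_mask))
			return j;
	}

	return -1;	/* assumed: no free counter for this event */
}
```

For a binding of { 4, 5, -1 }, an empty mask gives counter 4, a mask with
bit 4 set gives counter 5, and a mask with bits 4 and 5 set gives -1.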