Re: Tracing of power:power_start events doesn't work

From: Ronny TschÃter
Date: Fri Jun 11 2010 - 06:27:11 EST


On 12/05/2010 10:57, Robert SchÃne wrote:
On Wed, den 05.05.2010-05-05, 07:33 -0700 schrieb Arjan van de Ven:
On 5/5/2010 7:31, Steven Rostedt wrote:
On Wed, 2010-05-05 at 16:11 +0200, Ronny Tschïter wrote:

/*
* Include the apic definitions for x86 to have the APIC timer related
defines
@@ -796,6 +797,18 @@ static int acpi_idle_bm_check(void)
*/
static inline void acpi_idle_do_entry(struct acpi_processor_cx *cx)
{
+ switch (cx->type) {
+ case ACPI_STATE_C1:
+ trace_power_start(POWER_CSTATE, 1);
+ break;
+ case ACPI_STATE_C2:
+ trace_power_start(POWER_CSTATE, 2);
+ break;
+ case ACPI_STATE_C3:
+ trace_power_start(POWER_CSTATE, 3);
+ break;
+ }
+
Depending on gcc, the above can bloat the code since each call to
trace_power_start() is a macro expanded. Try to call it just once.
Perhaps one of the following:
the code is also incorrect fundamentally.
You need to pass in the mwait value or equivalent; the ACPI STATE type is
pretty much useless random garbage and should completely be ignored.
You're correct for your case (switching to c-state via monitor/mwait).
But the current implementation is broken too. It works only if your
kernel uses processor_idle AND monitor/mwait. For all other cases it
fails - may it be on Ronnys system, that uses IO port based C-states or
my system, that uses cpu_idle (not processor_idle).

I'd suggest, that it should work for IO port based switches and could be
included in the syntactic block of that case.
Afaik, halt means that the processor goes to C1. So in this block, there
could be a power_start event too with 1 as new state. (Maybe there are
problems with AMDs C1E which would be reported as C1.)

Robert
Unfortunately the bug still exists in kernel 2.6.34. On my system I'm not able to trace power:power_start events, because they are not reported. Regarding to the preceding comments I moved my code snippet into the IO port based C-state part of acpi_idle_do_entry(). Therefore it should not interfere with the other two C-state calls in this function.
Furthermore a lot of other code sequences in processor_idle.c use the ACPI STATE type in if- or switch-statements. So I don't think that ACPI STATE type is useless garbage.

Finally here is my new patch:


diff --git a/old/processor_idle.c b/new/processor_idle.c
index 5939e7f..5818522 100644
--- a/old/processor_idle.c
+++ b/new/processor_idle.c
@@ -43,6 +43,7 @@
#include <linux/clockchips.h>
#include <linux/cpuidle.h>
#include <linux/irqflags.h>
+#include <trace/events/power.h>

/*
* Include the apic definitions for x86 to have the APIC timer related defines
@@ -807,6 +808,10 @@ static inline void acpi_idle_do_entry(struct acpi_processor_cx *cx)
} else {
int unused;
/* IO port based C-state */
+ trace_power_start(POWER_CSTATE,
+ cx->type == ACPI_STATE_C1 ? 1 :
+ cx->type == ACPI_STATE_C2 ? 2 :
+ 3);
inb(cx->address);
/* Dummy wait op - must do something useless after P_LVL2 read
because chipsets cannot guarantee that STPCLK# signal
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/