Re: [PATCH] RTC: Add an alarm disable quirk

From: John Stultz
Date: Mon Jul 22 2013 - 18:17:46 EST


On 07/22/2013 02:27 PM, Borislav Petkov wrote:
On Mon, Jul 22, 2013 at 02:15:57PM -0700, John Stultz wrote:
On 07/22/2013 02:12 PM, Borislav Petkov wrote:
On Mon, Jul 22, 2013 at 01:59:01PM -0700, John Stultz wrote:
So did this work some of the time, but not all? Or was the behavior
totally unchanged with this?
Yep, some of the time. The first couple of runs it worked and I was
euphoric and then it rebooted and I almost threw the box out the window
:-)
I can understand your frustration. :)

But it's interesting that it sort of worked, no? The bit you discovered
earlier with the dump_stack debugging call, where we're actually
disabling the irq twice was interesting.

If you use the debugging patch with this change, does it show any
difference in logic between the working cases and the instant-reboot
case?
Ok, I'm kinda confused with so many experiments I did, what we actually
want to try:

Do we want to use the filter thingy:

---
diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index be06d7150de5..bb265f1651e7 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -304,6 +304,9 @@ static void cmos_irq_enable(struct cmos_rtc *cmos, unsigned char mask)
 	rtc_control = CMOS_READ(RTC_CONTROL);
 	cmos_checkintr(cmos, rtc_control);

+	if (rtc_control == mask)
+		return;
+
 	rtc_control |= mask;
 	CMOS_WRITE(rtc_control, RTC_CONTROL);
 	hpet_set_rtc_irq_bit(mask);
@@ -316,6 +319,10 @@ static void cmos_irq_disable(struct cmos_rtc *cmos, unsigned char mask)
 	unsigned char rtc_control;

 	rtc_control = CMOS_READ(RTC_CONTROL);
+
+	if (!(rtc_control & mask))
+		return;
+
 	rtc_control &= ~mask;
 	CMOS_WRITE(rtc_control, RTC_CONTROL);
 	hpet_mask_rtc_irq_bit(mask);
--

and also add dump_stack to see what calls cmos_irq_disable?

In the run I had, the first call came from rtc_dev_ioctl so I'm guessing
userspace and the following one was rtc workqueue rtc_timer_do_work.
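
For reference, the effect of the filter on a double disable can be modeled in
userspace. This is only a sketch under stated assumptions: fake_rtc_control,
fake_cmos_write() and the model_* functions are hypothetical stand-ins, not
rtc-cmos code, and the enable-side test here checks only the masked bits,
which differs from the `rtc_control == mask` comparison in the patch above.
The RTC_*IE values are the ones from <linux/mc146818rtc.h>.

```c
#include <assert.h>
#include <stdio.h>

/* Control-register bits, values as in the kernel's <linux/mc146818rtc.h>. */
#define RTC_PIE 0x40 /* periodic interrupt enable */
#define RTC_AIE 0x20 /* alarm interrupt enable */
#define RTC_UIE 0x10 /* update-finished interrupt enable */

/* Hypothetical stand-ins for the hardware register and CMOS_WRITE(). */
static unsigned char fake_rtc_control;
static int fake_write_count;

static void fake_cmos_write(unsigned char val)
{
	fake_rtc_control = val;
	fake_write_count++;
	printf("write RTC_CONTROL = 0x%02x\n", val);
}

/* Enable: skip the write when every requested bit is already set. */
static void model_irq_enable(unsigned char mask)
{
	unsigned char rtc_control = fake_rtc_control;

	if ((rtc_control & mask) == mask)
		return;

	fake_cmos_write(rtc_control | mask);
}

/* Disable: skip the write when none of the requested bits is set. */
static void model_irq_disable(unsigned char mask)
{
	unsigned char rtc_control = fake_rtc_control;

	if (!(rtc_control & mask))
		return;

	fake_cmos_write(rtc_control & ~mask);
}

/* Enable the alarm once, then disable it twice; returns the write count. */
static int demo_double_disable(void)
{
	fake_rtc_control = 0;
	fake_write_count = 0;

	model_irq_enable(RTC_AIE);
	model_irq_disable(RTC_AIE); /* first caller, e.g. the ioctl path */
	model_irq_disable(RTC_AIE); /* second caller: filtered, no write */

	assert(fake_rtc_control == 0);
	return fake_write_count;
}
```

Calling demo_double_disable() from a main() performs two register writes
rather than three, since the second disable returns early. Whether that
redundant write matters on the broken box is exactly what the logs would
have to show.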

Just let me know what exactly we want to try and I'll do it tomorrow, on
a clear head and not half asleep now :-)

So we probably want to do the following:
* Add your printk debug messages in rtc_cmos_read/write
* Add dump_stack to cmos_irq_disable
* Then on two machines (one working normally, the other broken): Boot the systems & shut them down after 5 minutes.
* Send out the full logs for both.
* Add the filtering logic above to the broken machine.
* Boot the system, shut it down after 5 minutes. Do this repeatedly until you get a failure (instant reboot) and a pass (stays off)
* Send out the full logs.
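
To illustrate what the instrumentation in the steps above is meant to capture,
here is a rough userspace sketch that records which caller issued each disable
and whether the register actually changed. Everything in it is hypothetical
(the trace buffer, the model_* functions, the fake register); in the kernel
the equivalent information would come from dump_stack() and the printk debug
messages, and dump_stack() prints a full backtrace rather than just the
immediate caller recorded here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RTC_AIE   0x20 /* alarm interrupt enable, as in <linux/mc146818rtc.h> */
#define TRACE_MAX 16

/* One record per disable call: who called, and the register before/after. */
struct trace_entry {
	const char *caller;
	unsigned char before;
	unsigned char after;
};

static struct trace_entry trace_log[TRACE_MAX];
static int trace_len;
static unsigned char fake_rtc_control; /* hypothetical stand-in register */

static void traced_irq_disable(const char *caller, unsigned char mask)
{
	unsigned char before = fake_rtc_control;

	assert(caller != NULL);
	fake_rtc_control = (unsigned char)(before & ~mask);

	if (trace_len < TRACE_MAX) {
		trace_log[trace_len].caller = caller;
		trace_log[trace_len].before = before;
		trace_log[trace_len].after = fake_rtc_control;
		trace_len++;
	}
	printf("disable from %s: 0x%02x -> 0x%02x\n",
	       caller, before, fake_rtc_control);
}

/* Capture the calling function's name, loosely like a one-frame backtrace. */
#define model_irq_disable(mask) traced_irq_disable(__func__, (mask))

/* Hypothetical models of the two call paths seen in the earlier run. */
static void model_rtc_dev_ioctl(void)     { model_irq_disable(RTC_AIE); }
static void model_rtc_timer_do_work(void) { model_irq_disable(RTC_AIE); }

/* Count the disables that left the register unchanged. */
static int count_redundant_disables(void)
{
	int i, n = 0;

	for (i = 0; i < trace_len; i++)
		if (trace_log[i].before == trace_log[i].after)
			n++;
	return n;
}

/* Alarm enabled, then disabled from both paths; returns redundant count. */
static int demo_trace(void)
{
	trace_len = 0;
	fake_rtc_control = RTC_AIE;

	model_rtc_dev_ioctl();
	model_rtc_timer_do_work();
	return count_redundant_disables();
}
```

Comparing such a trace between the working and the broken machine (same
callers, same order, same register values?) is the kind of diffing the full
logs are intended to allow.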

Hopefully from this we can sort out what exactly is going on.

Thanks again for your interest in hunting this down.

thanks
-john