Re: is minimum udelay() not respected in preemptible SMPkernel-2.6.23?

From: Andrew Morton
Date: Wed Nov 07 2007 - 15:31:43 EST


> On Wed, 7 Nov 2007 19:21:52 +0200 Marin Mitov <mitov@xxxxxxxxxxx> wrote:
> Hi all,
>
> I have written a linux device driver for a frame grabber I use in my
> every day experimental work.
>
> In my device driver I have to write to a MMIO register, wait for a while
> (using udelay(65)) for data being written to an internal register (i2c?)
> and test a flag (in another MMIO register) if the operation has completed.
> (The hardware guarantees that the operation has completed in less than
> 65 usec). If the flag is not reset I write a message via printk.
> After switching to the kernel-2.6.23 (compiled as PREEMPTIBLE SMP i686)
> (AMD dual core) I see this message in dmesg output sometime.
>
> Testing with rdtscll() before and after udelay(65) shows the expected
> delay of 65 usec (after dividing by CPU frequency) when all is OK, but
> gives a big value (in the tenths msec range) when the error message
> shows itself in dmesg.
>
> Bracketing udelay(65) by:
>
> local_irq_disable();
> udelay(65);
> local_irq_enable();
>
> as well as by
>
> preempt_disable();
> udelay(65);
> preempt_enable();
>
> leads to message disappearing.
>
> I believe the hardware is working correctly, so if the flag is not reset
> I think udelay(65) returns prematurely (the flag clears some time latter)
> And it does not matter if I use udelay(65) or udelay(100).
>
> What could be the reason for such a behavior?
> Is this a bug in udelay() due to preemption?
> (udelay() being preempted and migrated to another processor)
>
> All my previous kernels used were SMP (but not PREEMPTIBLE)
>
> My kernel is compiled with:
> CONFIG_PREEMPT=y
> CONFIG_IRQBALANCE=y
> CONFIG_HPET_TIMER=y
>
> And I have this line in dmesg:
> Time: acpi_pm clocksource has been installed.
> Switched to high resolution mode on CPU 0
> Switched to high resolution mode on CPU 1
>
> The south bridge is: VIA VT8237 (Asus A8V Delux)
>
> Thank you in advance for your help in understanding where
> the problem is coming from.
>

Ow. Yes, from my reading delay_tsc() can return early (or after
heat-death-of-the-universe) if the TSCs are offset and if preemption
migrates the calling task between CPUs.

I suppose a lameo fix would be to disable preemption in delay_tsc().
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/