Re: Too much error in __const_udelay() ?

From: Dominik Brodowski
Date: Sat Jun 05 2004 - 10:26:56 EST


Hi,

> However I've started to see some problems w/ 2.6 and USB on x440/x445s,
> both of which use the 100Mhz cyclone time source. Further digging has
> pointed to the fact that certain important udelay()s in the USB
> subsystem aren't actually waiting long enough.

Certain? AFAICS _no_ call to a delay routine actually passed a big enough
argument. Or am I missing something? Also, __ndelay seems to be affected
as well: it returns zero for 550 nsec even for the TSC variant in your
test.c.

> So I'm no math wiz. What's the proper fix here?

Below are three changes I'd like to discuss. I'll build a fresh kernel with
all three changes enabled + PM_TIMER soon.

Change 1:

Move the multiplication with HZ up into the mull instruction:

unsigned long __const_udelay(unsigned long xloops)
{
int d0;
__asm__("mull %0"
:"=d" (xloops), "=&a" (d0)
:"1" (xloops),"0" (LPJ * HZ));
return __delay(xloops);
}

1 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 99
1 usec: LPJ: 1500000 __udelay: 1000 vs my_udelay: 1499

2 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 199
2 usec: LPJ: 1500000 __udelay: 2000 vs my_udelay: 2999

5 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 499
5 usec: LPJ: 1500000 __udelay: 7000 vs my_udelay: 7498

10 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 999
10 usec: LPJ: 1500000 __udelay: 14000 vs my_udelay: 14996

20 usec: LPJ: 100000 __udelay: 1000 vs my_udelay: 1999
20 usec: LPJ: 1500000 __udelay: 29000 vs my_udelay: 29993

50 usec: LPJ: 100000 __udelay: 4000 vs my_udelay: 4998
50 usec: LPJ: 1500000 __udelay: 74000 vs my_udelay: 74983

100 usec: LPJ: 100000 __udelay: 9000 vs my_udelay: 9997
100 usec: LPJ: 1500000 __udelay: 149000 vs my_udelay: 149966

20000 usec: LPJ: 100000 __udelay: 1999000 vs my_udelay: 1999549
20000 usec: LPJ: 1500000 __udelay: 29993000 vs my_udelay: 29993243


Change 2:

Round up in __udelay. While it can be argued that some time is also
spent in the delay functions, it's better to spend _at least_ the specified
time sleeping, in my humble opinion.


- return __const_udelay2(usecs * 0x000010c6); /* 2**32 / 1000000 */
+ return __const_udelay2(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up)*/

1 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 100
1 usec: LPJ: 1500000 __udelay: 1000 vs my_udelay: 1500

2 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 200
2 usec: LPJ: 1500000 __udelay: 2000 vs my_udelay: 3000

5 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 500
5 usec: LPJ: 1500000 __udelay: 7000 vs my_udelay: 7500

10 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 1000
10 usec: LPJ: 1500000 __udelay: 14000 vs my_udelay: 15000

20 usec: LPJ: 100000 __udelay: 1000 vs my_udelay: 2000
20 usec: LPJ: 1500000 __udelay: 29000 vs my_udelay: 30000

50 usec: LPJ: 100000 __udelay: 4000 vs my_udelay: 5000
50 usec: LPJ: 1500000 __udelay: 74000 vs my_udelay: 75000

100 usec: LPJ: 100000 __udelay: 9000 vs my_udelay: 10000
100 usec: LPJ: 1500000 __udelay: 149000 vs my_udelay: 150001

20000 usec: LPJ: 100000 __udelay: 1999000 vs my_udelay: 2000015
20000 usec: LPJ: 1500000 __udelay: 29993000 vs my_udelay: 30000228


Change 3:

Asserting at least 1 loop is spent: in really small ndelay() calls to
low-mhz timers, this might be better.

return __delay(xloops ? xloops : 1);

Before:
1 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 0
2 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 0
5 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 0
10 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 1
20 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 2
50 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 5

After:
1 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 1
2 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 1
5 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 1
10 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 1
20 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 2
50 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 5



Dominik
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/