Re: [linux-usb-devel] Re: serious 2.6 bug in USB subsystem?

From: David Brownell
Date: Mon Mar 08 2004 - 17:02:32 EST


David Mosberger wrote:

consistency = coherency + ordering

That's one of the things I meant when I said the docs weren't
quite there yet ... the dma_*() API docs don't address how
to achieve the ordering, or even mention that it's an issue.

Plus, there are definitions of coherency that include event
ordering; it's not clear which definition is intended.


Anyway, see the attached patch. It goes on top of 2.6.4-bk2
and passes my unlink tests (the ones that haven't been run
much on recent kernels!) on several different OHCI hosts.

Other issues may be lurking, but this seems to fix a handful
of nasties that I could reproduce.

- Dave

Fixes for some recent OHCI instability.

- Never munge ed->hwTailP once it's set up, except when appending
to the queue.

- Don't clear control/bulk "current ED" registers when taking
entries off the queue. The chip may still be using the value.

- Revert part of previous bk1 patch, which removed a fast path
that makes a visible difference in some common operations.
Removing it just slowed timings, and didn't fix anything.

- Make the "unlink during submit" path behave better (SMP).


--- 1.50/drivers/usb/host/ohci-q.c Tue Mar 2 22:48:07 2004
+++ edited/drivers/usb/host/ohci-q.c Mon Mar 8 13:15:08 2004
@@ -282,7 +282,6 @@
if (!ed->hwNextED) {
ohci->hc_control &= ~OHCI_CTRL_CLE;
writel (ohci->hc_control, &ohci->regs->control);
- writel (0, &ohci->regs->ed_controlcurrent);
// post those pci writes
(void) readl (&ohci->regs->control);
}
@@ -306,7 +305,6 @@
if (!ed->hwNextED) {
ohci->hc_control &= ~OHCI_CTRL_BLE;
writel (ohci->hc_control, &ohci->regs->control);
- writel (0, &ohci->regs->ed_bulkcurrent);
// post those pci writes
(void) readl (&ohci->regs->control);
}
@@ -331,6 +329,19 @@
periodic_unlink (ohci, ed);
break;
}
+
+ /* NOTE: Except for a couple of exceptionally clean unlink cases
+ * (like unlinking the only c/b ED, with no TDs) HCs may still be
+ * caching this operational ED (or its address). Safe unlinking
+ * involves not marking it ED_IDLE till INTR_SF; we always do that
+ * if td_list isn't empty. Otherwise the race is small; but ...
+ */
+ if (ed->state == ED_OPER) {
+ ed->state = ED_IDLE;
+ ed->hwINFO &= ~(ED_SKIP | ED_DEQUEUE);
+ ed->hwHeadP &= ~ED_H;
+ wmb ();
+ }
}


@@ -794,8 +805,6 @@
next->next_dl_td = rev;
rev = next;

- if (ed->hwTailP == cpu_to_le32 (next->td_dma))
- ed->hwTailP = next->hwNextTD;
ed->hwHeadP = next->hwNextTD | toggle;
}

@@ -941,12 +950,7 @@
continue;
}

- /* patch pointers hc uses ... tail, if we're removing
- * an otherwise active td, and whatever td pointer
- * points to this td
- */
- if (ed->hwTailP == cpu_to_le32 (td->td_dma))
- ed->hwTailP = td->hwNextTD;
+ /* patch pointer hc uses */
savebits = *prev & ~cpu_to_le32 (TD_MASK);
*prev = td->hwNextTD | savebits;

@@ -1041,7 +1045,7 @@

/* clean schedule: unlink EDs that are no longer busy */
if (list_empty (&ed->td_list))
- start_ed_unlink (ohci, ed);
+ ed_deschedule (ohci, ed);
/* ... reenabling halted EDs only after fault cleanup */
else if ((ed->hwINFO & (ED_SKIP | ED_DEQUEUE)) == ED_SKIP) {
td = list_entry (ed->td_list.next, struct td, td_list);
--- 1.57/drivers/usb/host/ohci-hcd.c Tue Mar 2 22:48:07 2004
+++ edited/drivers/usb/host/ohci-hcd.c Mon Mar 8 12:39:24 2004
@@ -233,7 +233,7 @@
spin_lock (&urb->lock);
if (urb->status != -EINPROGRESS) {
spin_unlock (&urb->lock);
-
+ urb->hcpriv = urb_priv;
finish_urb (ohci, urb, 0);
retval = 0;
goto fail;