Re: [PATCH] Document Linux's memory barriers [try #2]

From: Paul Mackerras
Date: Wed Mar 08 2006 - 22:42:56 EST


Linus Torvalds writes:

> x86 mmiowb would have to be a real op too if there were any multi-pathed
> PCI buses out there for x86, methinks.

Not if the manufacturers wanted to be able to run existing standard
x86 operating systems on it, surely.

I presume that on x86 the PCI host bridges and caches are all part of
the coherence domain, and that the rule about stores being observed in
order applies to what the PCI host bridge can see as much as it does
to any other agent in the coherence domain. And if I have understood
you correctly, the store ordering rule applies both to stores to
regular cacheable memory and stores to noncacheable nonprefetchable
MMIO registers without distinction.

If that is so, then I don't see how the writel's can get out of order.
Put another way, we expect spinlock regions to order stores to regular
memory, and AFAICS the x86 ordering rules mean that the same guarantee
should apply to stores to MMIO registers. (It's entirely possible
that I don't fully understand the x86 memory ordering rules, of
course. :)

> Basically, the issue boils down to one thing: no "normal" barrier will
> _ever_ show up on the bus on x86 (ie ia64, afaik). That, together with any

A spin_lock does show up on the bus, doesn't it?

> Would I want the hard-to-think-about IO ordering on a regular desktop
> platform? No.

In fact I think that mmiowb can actually be useful on PPC, if we can
be sure that all the drivers we care about will use it correctly.

If we can have the following rules:

* If you have stores to regular memory, followed by an MMIO store, and
you want the device to see the stores to regular memory at the point
where it receives the MMIO store, then you need a wmb() between the
stores to regular memory and the MMIO store.

* If you have PIO or MMIO accesses, and you need to ensure the
PIO/MMIO accesses don't get reordered with respect to PIO/MMIO
accesses on another CPU, put the accesses inside a spin-locked
region, and put an mmiowb() between the last access and the
spin_unlock.

* smp_wmb() doesn't necessarily do any ordering of MMIO accesses
vs. other accesses, and in that sense it is weaker than wmb().
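
To make the first two rules concrete, here is a rough sketch of the
call ordering a driver would follow. It is a userspace mock-up, not
kernel code: wmb(), mmiowb(), writel(), spin_lock() and spin_unlock()
below are stand-ins that only record the order in which they are
called, and struct fake_dev, post_descriptor() and program_regs() are
hypothetical names invented for illustration. It shows where the
barriers sit in program order and nothing about what the hardware does.

```c
#include <assert.h>
#include <string.h>

/* Event trace: each stand-in primitive appends its name here. */
#define MAX_EVENTS 16
static const char *trace[MAX_EVENTS];
static int ntrace;

static void record(const char *ev)
{
	assert(ntrace < MAX_EVENTS);
	trace[ntrace++] = ev;
}

/* Returns 1 if the i-th recorded event has the given name. */
static int trace_is(int i, const char *ev)
{
	return i < ntrace && strcmp(trace[i], ev) == 0;
}

/* Stand-ins for the kernel primitives (assumptions, not the real ones). */
static void wmb(void)            { record("wmb"); }
static void mmiowb(void)         { record("mmiowb"); }
static void spin_lock(int *l)    { *l = 1; record("spin_lock"); }
static void spin_unlock(int *l)  { *l = 0; record("spin_unlock"); }
static void writel(unsigned int v, volatile unsigned int *reg)
{
	*reg = v;
	record("writel");
}

/* Hypothetical device: a descriptor in regular memory plus two
 * fields standing in for MMIO registers. */
struct fake_dev {
	int lock;
	unsigned int desc[2];
	volatile unsigned int ctrl;
	volatile unsigned int doorbell;
};

/* Rule 1: stores to regular memory, then wmb(), then the MMIO store
 * that tells the device to go and look at that memory. */
static void post_descriptor(struct fake_dev *dev,
			    unsigned int addr, unsigned int len)
{
	dev->desc[0] = addr;
	dev->desc[1] = len;
	record("mem_store");
	wmb();
	writel(1, &dev->doorbell);
}

/* Rule 2: MMIO accesses inside a spin-locked region, with mmiowb()
 * between the last access and the spin_unlock. */
static void program_regs(struct fake_dev *dev,
			 unsigned int a, unsigned int b)
{
	spin_lock(&dev->lock);
	writel(a, &dev->ctrl);
	writel(b, &dev->doorbell);
	mmiowb();
	spin_unlock(&dev->lock);
}
```

Calling post_descriptor() and then program_regs() on a zeroed
struct fake_dev leaves the trace as: mem_store, wmb, writel,
spin_lock, writel, writel, mmiowb, spin_unlock.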

... then I can remove the sync from write*, which would be nice, and
make mmiowb() be a sync. I wonder how long we're going to spend
chasing driver bugs after that, though. :)

Paul.