Re: PCI resource problems caused by improper address rounding

From: Linus Torvalds
Date: Thu Dec 20 2007 - 20:03:22 EST

On Thu, 20 Dec 2007, Richard Henderson wrote:
>
> This breaks in odd cases where the amount of memory in the system
> is not a nice round number. Like throwing two 128MB sticks into
> a system that already has 2gb. A 512MB allocation will get placed
> back at 2gb, on top of the end of ram.

No, no, you misunderstand.

The kernel *always* takes known memory allocations into account. The
"minimum PCI starting allocation" value is not there to protect memory we
know about: the resource management already does that!

So if you have real memory of 2GB+128MB, and you want a 512MB allocation,
then yes, maybe the "preferred starting point" would be rounded back down
to 2GB, but the resource allocator would still take known resources into
account, and skip that address as being a conflict, and then try the next
address that suits the alignment requirements, and try to see if there's a
big enough hole at the 2.5GB mark.

So it would all work fine.
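
To make the skipping behaviour above concrete, here is a minimal sketch of
the search. It is not the kernel's actual allocate_resource() code; the
names struct res, align_up() and find_hole() are hypothetical, and the
known-resource list is assumed to be sorted and non-overlapping.

```c
#include <stdint.h>

/* Hypothetical stand-in for a known resource; the range is inclusive. */
struct res {
	uint64_t start;
	uint64_t end;
};

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Return the lowest aligned address >= min where [addr, addr + size - 1]
 * overlaps none of the known resources and stays at or below max.
 * known[] is assumed sorted by start and non-overlapping.
 * Returns 0 if no such hole exists.
 */
static uint64_t find_hole(const struct res *known, int n,
			  uint64_t min, uint64_t max,
			  uint64_t size, uint64_t align)
{
	uint64_t addr = align_up(min, align);
	int i;

	for (i = 0; i < n; i++) {
		if (known[i].end < addr)
			continue;	/* entirely below the candidate */
		if (known[i].start > addr + size - 1)
			break;		/* the hole before it is big enough */
		/* Conflict: retry at the next aligned address past it. */
		addr = align_up(known[i].end + 1, align);
	}

	if (addr + size - 1 > max)
		return 0;
	return addr;
}
```

With RAM registered as 0 to 2GB+128MB-1, a preferred minimum of 0x80000000
(2GB), and a 512MB request aligned to 512MB, find_hole() rejects 2GB as a
conflict and returns 0xA0000000, the 2.5GB mark. The "min" value only picks
where the search starts; it never overrides a known resource.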

The reason we have that "min" parameter is not because of those _known_
resources, it's exactly because we have been bitten too many times by
BIOSes that lay out magic undocumented resources in memory that we simply
don't know about, because they aren't standard BAR resources, but some
other special magic stuff. Things like the special ACPI areas that the
northbridge recognizes, which aren't exposed as regular BARs but only as
magic registers hidden in some undocumented NB register space.

So the reason we have those PCIBIOS_MIN_IO/MEM things is not because we'd
trample on top of memory without them, it's because we might trample the
BIOS resources that it never told us about!

Quite often, that's things like stolen RAM that doesn't show up in the
e820 tables (it *should* show up as "reserved", but BIOS writers are
generally incompetent drug-addicts picked up from the streets, who just
randomly change BIOS tables until Windows boots on the machine), or the
aforementioned magic IO registers for some special motherboard resource.

> In order to get this kind of thing to work, you'd have to have a hard
> and a soft minimum.

We do have that "hard limit" - the resource management keeps track of all
the resources it already knows about. The "soft limit" is exactly that
PCIBIOS_MIN_MEM (which on a PC is that "pci_mem_start" variable). It's
just a hint, but it's a pretty important one, exactly because we've been
burned so many times by crap firmware and undocumented memory and MMIO
ranges.

Linus