Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)

From: H. Peter Anvin
Date: Wed Jun 22 2011 - 16:34:08 EST

On 06/22/2011 01:30 PM, Stefan Assmann wrote:
> On 22.06.2011 20:15, H. Peter Anvin wrote:
>> On 06/22/2011 04:18 AM, Stefan Assmann wrote:
>>> The idea is to allow the user to specify RAM addresses that shouldn't be
>>> touched by the OS, because they are broken in some way. Not all machines have
>>> hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
>>> use bitmasks to mask address patterns with the new "badram" kernel command line
>>> parameter.
>>> Memtest86 has an option to generate these patterns since v2.3 so the only thing
>>> for the user to do should be:
>>> - run Memtest86
>>> - note down the pattern
>>> - add badram=<pattern> to the kernel command line
>> We already support the equivalent functionality with
>> memmap=<address>$<length> for those with only a few ranges... this has
>> been supported for ages, literally. For those with a lot of ranges,
>> like Google, the command line is insufficient.
> Right, I think this has been discussed a while ago. I see two advantages
> in this approach. First, it allows memory exclusion to be broken down to
> the page level with a pattern of non-consecutive pages. So if every
> other page were considered bad, that would be a bit tough to deal with
> using memmap.
> Secondly, patterns can be easily generated by running Memtest86 and thus
> easily fed to the kernel on the command line, making it much more feasible
> for the average user to take advantage of it.

How common are nontrivial patterns on real hardware? This would be
interesting to hear from Google or another large user.

If they are common, we should probably introduce this as another
linked-list data structure; we can allow it to be preprocessed from the
command line if need be.
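
To make the idea concrete, here is a rough userspace sketch (not kernel
code, and not taken from the patch set) of expanding one address/mask
pattern into a linked list of bad page ranges. Everything named here is
hypothetical: struct bad_range, struct bad_list, bad_list_add_pfn() and
bad_list_from_pattern() are made up for illustration. It also assumes 4 KiB
pages and assumes the pattern semantics are "an address is bad if it matches
the fault address on every bit set in the mask"; the patch itself may define
this differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical node: one run of consecutive bad page frames. */
struct bad_range {
    uint64_t start_pfn;      /* first bad page frame number */
    uint64_t nr_pages;       /* length of the run in pages */
    struct bad_range *next;
};

/* Hypothetical list head; initialise with {0}. */
struct bad_list {
    struct bad_range *head;
    struct bad_range *tail;
    size_t count;            /* number of ranges, not pages */
};

/*
 * Append one bad pfn, merging it into the tail range when contiguous.
 * Returns 0 on success and -1 on allocation failure, so the caller can
 * refuse to continue instead of silently truncating the list.
 */
static int bad_list_add_pfn(struct bad_list *list, uint64_t pfn)
{
    assert(list != NULL);

    if (list->tail &&
        list->tail->start_pfn + list->tail->nr_pages == pfn) {
        list->tail->nr_pages++;
        return 0;
    }

    struct bad_range *r = malloc(sizeof(*r));
    if (!r)
        return -1;

    r->start_pfn = pfn;
    r->nr_pages = 1;
    r->next = NULL;
    if (list->tail)
        list->tail->next = r;
    else
        list->head = r;
    list->tail = r;
    list->count++;
    return 0;
}

/*
 * Expand one (fault, mask) pattern over pfns [0, max_pfn).
 * Assumed semantics: a page is bad if its address matches 'fault' on
 * every mask bit at or above PAGE_SHIFT. A naive linear scan, for
 * illustration only.
 */
static int bad_list_from_pattern(struct bad_list *list, uint64_t fault,
                                 uint64_t mask, uint64_t max_pfn)
{
    uint64_t fault_pfn = fault >> PAGE_SHIFT;
    uint64_t mask_pfn = mask >> PAGE_SHIFT;

    for (uint64_t pfn = 0; pfn < max_pfn; pfn++) {
        if (((pfn ^ fault_pfn) & mask_pfn) == 0 &&
            bad_list_add_pfn(list, pfn) != 0)
            return -1;
    }
    return 0;
}
```

With fault 0x1000 and mask 0x1000, every other page matches, so each bad
page becomes its own single-page range; that is the case that is awkward to
spell out with memmap. Note that an allocation failure is reported to the
caller rather than quietly dropping the remaining entries.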

I have to say I agree with Google's point that truncating the list is
unacceptable: that would mean running in a known-bad configuration,
and even a hard crash would be better.
