Re: Workaround for Intel MPS errata

From: Jon Mason
Date: Thu Sep 29 2011 - 22:51:34 EST


On Thu, Sep 29, 2011 at 9:21 PM, Jesse Brandeburg
<jesse.brandeburg@xxxxxxxxx> wrote:
> On Thu, Sep 29, 2011 at 5:16 PM, Jon Mason <mason@xxxxxxxx> wrote:
>> Hey Avi,
>> Can you try this patch?  It should resolve the issue you are seeing.
>>
>> Thanks,
>> Jon
>>
>>    PCI: Workaround for Intel MPS errata
>>
>>    Intel 5000 and 5100 series memory controllers have a known issue if read
>>    completion coalescing is enabled (the default setting) and the PCI-E
>>    Maximum Payload Size is set to 256B.  To work around this issue, disable
>>    read completion coalescing if the MPS is 256B.
>
> Hey Jon, glad I could help out by pointing this erratum out on IRC
> today.  The patch looks mostly fine, with one nit, see below.

Sorry, I thought I gave you and Ben props in the commit log. That
will be corrected in the version I push (assuming it fixes the issue).
You really saved my rear :)

>> +       /* Disable read completion coalescing to allow an MPS of 256 */
>> +       if (mps == 256) {
>> +               int err;
>> +               u16 rcc;
>> +
>> +               /* Intel errata specifies bits to change but does not say what
>> +                * they are.  Keeping them magical until such time as the
>> +                * registers and values can be explained.
>> +                */
>> +               err = pci_read_config_word(dev, 0x48, &rcc);
>> +               if (err) {
>> +                       dev_err(&dev->dev, "Error attempting to read the read "
>> +                               "completion coalescing register.\n");
>> +                       return;
>> +               }
>> +
>> +               rcc &= ~(1 << 10);
>> +
>> +               err = pci_write_config_word(dev, 0x48, rcc);
>> +               if (err) {
>> +                       dev_err(&dev->dev, "Error attempting to read the read "
>
> this should be "to write the read "

Good catch.

>
>> +                               "completion coalescing register.\n");
>> +                       return;
>> +               }
>
> do you have to do anything to change the bit back to 1 if something
> sets the mps back to 128?

The only time this would really be called is at boot time.  It's
possible that someone could hotplug an adapter with an MPS of 256,
then remove it, leaving coalescing disabled (and thereby reducing
performance).  The problem is that nothing is called to reset the MPS
when a device is hot-removed, and it seems like overkill to add a
callback only for this.  However, the code should document that the
hole exists.
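
For illustration only, here is a rough userspace sketch of how the
fixed-up hunk and the "hole" comment might read together.  It is not the
patch itself: struct fake_dev, cfg_read_word(), cfg_write_word() and
disable_rcc_for_mps() are made-up stand-ins for the kernel's config
space accessors, and RCC_MAGIC_BIT is just a name for the bit the patch
clears (the erratum does not say what it means).

```c
#include <stdint.h>
#include <stdio.h>

#define RCC_REG_OFFSET 0x48        /* register offset used in the patch */
#define RCC_MAGIC_BIT  (1u << 10)  /* bit the patch clears; meaning undocumented */

/* Hypothetical stand-in for a device's config space (not a kernel type). */
struct fake_dev {
	uint8_t cfg[256];
};

/* Hypothetical accessors: return 0 on success, nonzero on error. */
static int cfg_read_word(const struct fake_dev *dev, unsigned int off,
			 uint16_t *val)
{
	if (off + 1 >= sizeof(dev->cfg))
		return -1;
	/* PCI config space is little-endian. */
	*val = (uint16_t)(dev->cfg[off] | (dev->cfg[off + 1] << 8));
	return 0;
}

static int cfg_write_word(struct fake_dev *dev, unsigned int off,
			  uint16_t val)
{
	if (off + 1 >= sizeof(dev->cfg))
		return -1;
	dev->cfg[off] = (uint8_t)(val & 0xff);
	dev->cfg[off + 1] = (uint8_t)(val >> 8);
	return 0;
}

/* Clear the magic bit when the MPS is 256; do nothing otherwise. */
static int disable_rcc_for_mps(struct fake_dev *dev, int mps)
{
	uint16_t rcc;

	if (mps != 256)
		return 0;

	if (cfg_read_word(dev, RCC_REG_OFFSET, &rcc)) {
		fprintf(stderr, "Error attempting to read the read "
			"completion coalescing register.\n");
		return -1;
	}

	rcc &= (uint16_t)~RCC_MAGIC_BIT;

	if (cfg_write_word(dev, RCC_REG_OFFSET, rcc)) {
		fprintf(stderr, "Error attempting to write the read "
			"completion coalescing register.\n");
		return -1;
	}

	/*
	 * Known hole: nothing sets the bit again if the MPS later drops
	 * back to 128 (e.g. a 256B adapter is hot-removed), so read
	 * completion coalescing stays disabled until the next boot.
	 */
	return 0;
}
```

Calling disable_rcc_for_mps() with an MPS of 128 leaves the register
alone; with 256 it clears only bit 10 and keeps every other bit.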

Thanks,
Jon
