Re: Question Regarding ERMS memcpy

From: Logan Gunthorpe
Date: Mon Mar 06 2017 - 02:02:57 EST



On Sun, Mar 05, 2017 at 11:19:42AM -0800, Linus Torvalds wrote:
>> But it is *not* the right thing to use on IO memory, because the CPU
>> only does the magic cacheline access optimizations on cacheable
>> memory!

Yes, and actually this is where I started. I thought my memcpy was using
byte accesses on purpose and that I needed to write a patch adding a
separate IO memcpy, because byte accesses over the PCI bus are obviously
far from ideal. However, once I found that my system wasn't using that
implementation intentionally, that was no longer my focus.

So, I have no way to test this, but it sounds like any Ivy Bridge system
using the ERMS version of memcpy could have the same slow PCI memcpy
performance I've been seeing (unless the microcode fixes it up?). So it
sounds like it would be a good idea to revert the change Linus is
talking about.
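
To make the concern concrete, here is a rough, untested-on-hardware
sketch of a copy that sticks to 32-bit accesses wherever alignment
allows instead of relying on the CPU to widen a byte-wise string copy.
The name io_copy_sketch() is made up for illustration and is not an
existing kernel function; plain volatile pointers stand in for a real
IO mapping, and actual driver code would go through the kernel's IO
accessors rather than dereferencing pointers like this.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): copy n bytes, using 32-bit
 * accesses for the bulk of the transfer when both pointers can be
 * 4-byte aligned, and byte accesses only for the head and tail.
 * Assumes a build without strict-aliasing optimizations, as the
 * kernel does, or buffers without a declared type.
 */
static void io_copy_sketch(volatile void *dst, const volatile void *src,
			   size_t n)
{
	volatile uint8_t *d = dst;
	const volatile uint8_t *s = src;

	/* Head: byte copies until the destination is 4-byte aligned. */
	while (n && ((uintptr_t)d & 3)) {
		*d++ = *s++;
		n--;
	}

	/* Bulk: 32-bit copies, only if the source is now aligned too. */
	if (((uintptr_t)s & 3) == 0) {
		while (n >= 4) {
			*(volatile uint32_t *)d = *(const volatile uint32_t *)s;
			d += 4;
			s += 4;
			n -= 4;
		}
	}

	/* Tail (or everything, if the source was misaligned): bytes. */
	while (n) {
		*d++ = *s++;
		n--;
	}
}
```

The volatile qualifiers are only there to keep the compiler from
merging or splitting the accesses; whether 32-bit is the right width
for a given device is a separate question this sketch doesn't answer.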

>> So I think we should re-introduce that old "__inline_memcpy()" as that
>> special "safe memcpy" thing. Not just for KMEMCHECK, and not just for
>> 64-bit.

On 05/03/17 12:54 PM, Borislav Petkov wrote:
> Logan, wanna give that a try, see if it takes care of your issue?

Well, honestly, my issue was solved by fixing my kernel config. I have no
idea why I had optimize-for-size enabled in there in the first place.

Thanks,

Logan