Re: [RFC Patch] use MTRR for write combining if PAT is not available

From: Thomas Schlichter
Date: Mon Oct 12 2009 - 15:57:35 EST


Thomas Hellstrom wrote:
> Hi!
>
> One problem with this patch is that it conflicts with the way graphics
> drivers traditionally handles
> the situation, namely
>
> 1) Set up mtrr
> 2) Map. If fallback to uncached minus we will still have write-combined
> access.
>
> I think mtrr-add used in this fashion will typically fail due to the
> alignment constraints. In particular,
> for set_memory_wc() the typical usage pattern is a large number of pages
> in a fragmented physical address space.

Yes, maybe this patch changes the current behavior too little. Indeed, if
setting up the MTRR entries fails, it simply behaves as it does today, and
userspace does not see that write combining is not correctly enabled.

> So if we were to fix the problem with libpciaccess in the kernel, I
> think the best option would be to fail the user-space mapping when we
> can't make it write-combined.

One way to do this would be the attached patch. It simply returns an error if
PAT is not available. It does not even try to use MTRR on its own. But maybe
even better would be to combine both patches into something like this:
1. try to use PAT
2. if this fails try to set up MTRR
3. if this also fails, return error
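
To make the order concrete, here is a rough sketch of the combined logic
that compiles in userspace. The names wc_ops, request_wc, stub_ok and
stub_fail are made up for illustration and appear in neither patch; real
code would call into the PAT and MTRR code instead of these hooks, and I
assume each hook returns 0 on success and a negative value on failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical hooks standing in for the real PAT and MTRR code paths. */
struct wc_ops {
	bool pat_enabled;		/* mirrors the kernel's pat_enabled flag */
	int (*map_wc_pat)(void);	/* hypothetical: map with _PAGE_CACHE_WC */
	int (*add_wc_mtrr)(void);	/* hypothetical: add a WC MTRR entry */
};

enum wc_result { WC_VIA_PAT, WC_VIA_MTRR };

/* Stand-in hooks, only there so the sketch can be tried out. */
static int stub_ok(void)   { return 0; }
static int stub_fail(void) { return -1; }

/* Returns 0 and sets *how on success, -EINVAL if WC cannot be provided. */
static int request_wc(const struct wc_ops *ops, enum wc_result *how)
{
	/* 1. try to use PAT */
	if (ops->pat_enabled && ops->map_wc_pat() == 0) {
		*how = WC_VIA_PAT;
		return 0;
	}

	/* 2. if this fails, try to set up MTRR */
	if (ops->add_wc_mtrr() == 0) {
		*how = WC_VIA_MTRR;
		return 0;
	}

	/* 3. if this also fails, return an error instead of mapping uncached */
	return -EINVAL;
}
```

The important point is step 3: the caller gets an error and never a silent
uncached mapping, so it can decide for itself what to do next.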

Kind regards,
Thomas
From afb48e1a1ef035c4580a5ce59a956b54a56a5c18 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@xxxxxx>
Date: Thu, 8 Oct 2009 00:42:47 +0200
Subject: [PATCH] Do not mmap/ioremap uncached when WC is requested

X.org uses libpciaccess which tries to mmap with write combining enabled via
/sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, we
fall back to uncached mmap. Then libpciaccess thinks it succeeded mapping
with write combining enabled and does not set up suitable MTRR entries. ;-(

So instead of silently falling back to uncached mapping, we had better fail. In
this case libpciaccess mmaps via /sys/bus/pci/devices/*/resource0 and correctly
sets up MTRR entries.

Additionally, modify ioremap_wc and set_memory_wc to match this behavior.

Signed-off-by: Thomas Schlichter <thomas.schlichter@xxxxxx>
---
 arch/x86/mm/ioremap.c  |   10 +++++-----
 arch/x86/mm/pageattr.c |    2 +-
 arch/x86/pci/i386.c    |    6 ++++++
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 334e63c..293581e 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -268,11 +268,11 @@ EXPORT_SYMBOL(ioremap_nocache);
  */
 void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
-					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
+	if (!pat_enabled)
+		return NULL;
+
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
+				__builtin_return_address(0));
 }
 EXPORT_SYMBOL(ioremap_wc);

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index dd38bfb..b1287a9 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1011,7 +1011,7 @@ int set_memory_wc(unsigned long addr, int numpages)
 	int ret;

 	if (!pat_enabled)
-		return set_memory_uc(addr, numpages);
+		return -EINVAL;

 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 		_PAGE_CACHE_WC, NULL);
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index b22d13b..cf63f9c 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -281,6 +281,12 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 	if (mmap_state == pci_mmap_io)
 		return -EINVAL;

+	/* We cannot mmap write combining (WC) without PAT enabled.
+	 * So better fail and let the user map without WC and use MTRR.
+	 */
+	if (!pat_enabled && write_combine)
+		return -EINVAL;
+
 	prot = pgprot_val(vma->vm_page_prot);
 	if (pat_enabled && write_combine)
 		prot |= _PAGE_CACHE_WC;
--
1.6.4.4
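
For reference, the userspace behavior this patch relies on could look
roughly like the sketch below. This is not libpciaccess code: map_resource,
map_bar0 and the need_mtrr flag are hypothetical names, and I assume the
caller passes the sysfs directory of the device and the length of BAR 0.
With the patch applied, the mmap of resource0_wc fails when PAT is
disabled, so the fallback path is taken and the caller learns that it has
to set up an MTRR entry itself.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of one sysfs resource file; returns NULL on any failure. */
static void *map_resource(const char *path, size_t len)
{
	void *p;
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return NULL;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: try resource0_wc first, then fall back to
 * resource0. *need_mtrr is set to true only when the plain mapping was
 * used, i.e. when the caller must add an MTRR entry to get WC.
 */
static void *map_bar0(const char *devdir, size_t len, bool *need_mtrr)
{
	char path[512];
	void *p;

	snprintf(path, sizeof(path), "%s/resource0_wc", devdir);
	p = map_resource(path, len);
	if (p) {
		*need_mtrr = false;	/* the kernel provided WC via PAT */
		return p;
	}

	snprintf(path, sizeof(path), "%s/resource0", devdir);
	p = map_resource(path, len);
	*need_mtrr = (p != NULL);
	return p;
}
```

Without the patch, the first map_resource() call succeeds even when PAT is
disabled, so need_mtrr stays false although the mapping is really uncached.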