Re: [PATCH] disable CPU side GART accesses
From: Bob Montgomery
Date: Thu Oct 16 2008 - 13:00:47 EST
On Thu, 2008-10-16 at 00:22 +0000, Yinghai Lu wrote:
> Ingo Molnar wrote:
> > (Cc:-ed the GART folks.)
> > * Bob Montgomery <bob.montgomery@xxxxxx> wrote:
> >> This patch prevents improper access of the GART aperture from kdump
> >> kernels running on AMD systems.
> >> This patch changes the initialization of the GART to set the
> >> DISGARTCPU bit in the GART Aperture Control Register
> >> (AMD64_GARTAPERTURECTL). Setting the bit prevents requests from the
> >> CPUs from accessing the GART. In other words, CPU memory accesses to
> >> the aperture address range will not cause the GART to perform an
> >> address translation. The aperture area is currently being unmapped at
> >> the kernel level with set_memory_np() in gart_iommu_init to prevent
> >> accesses from the CPU, but that kernel level unmapping is not in
> >> effect in the kexec'd kdump kernel. By disabling the CPU-side
> >> accesses within the GART, which does persist through the kexec of the
> >> kdump kernel, the kdump kernel is prevented from interacting with the
> >> GART during accesses to the dump memory areas which include the
> >> address range of the GART aperture.
> >> Although the patch can be applied
> >> to the kdump kernel, it is not exercised there because the kdump
> >> kernel doesn't attempt to initialize the GART, since it typically runs
> >> in less than 4 GB of memory.
> What about the part of the aperture that is not used by the IOMMU in the GART?
> The code only sets np on the iommu window.
I think you are not seeing the difference between kexec and kdump. The
kdump kernel runs out of a pre-allocated area of memory that was taken
away from the original kernel, for example:
01000000-08ffffff : Crash kernel
The kdump kernel does not try to re-use any of the original kernel's
memory; it only wants to copy it to a dump file. Because the kdump kernel
runs in such a small area of memory, it ignores the GART during
initialization: it doesn't need IOMMU translation to do IO to its own
memory in the Crash kernel area.
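To put a number on "small", here is a rough user-space sketch (not
kernel code) that works out the size of a region from a /proc/iomem
style line like the one above. The helper name iomem_region_size is
made up for illustration.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: parse one /proc/iomem
 * style line of the form "start-end : name" (hex addresses) and return
 * the size of the region in bytes, or 0 if the line does not parse.
 */
uint64_t iomem_region_size(const char *line)
{
	unsigned long long start, end;

	/* Both addresses are hexadecimal without a 0x prefix. */
	if (sscanf(line, "%llx-%llx", &start, &end) != 2 || end < start)
		return 0;

	/* The end address is inclusive, hence the +1. */
	return (uint64_t)(end - start + 1);
}
```

For "01000000-08ffffff : Crash kernel" this gives 0x08000000 bytes,
i.e. 128 MB, which is far below the 4 GB mark where the GART IOMMU
would normally be needed.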
The problem occurs when the copy operation reads from the GART aperture
(iommu window) and wakes up the GART translation hardware. This patch
stops that by telling the GART to ignore addresses that come from the
CPU and to only translate addresses from the IO side.
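As a rough sketch of the idea (this is not the patch itself), the
change to the control register value looks something like the code
below. The bit positions are my assumption of what the kernel's GART
header defines and should be checked against it, and the helper name
gart_ctl_block_cpu is made up for illustration. In the kernel the value
would be read from and written back to AMD64_GARTAPERTURECTL in PCI
config space, which is left out here.

```c
#include <stdint.h>

/*
 * Assumed bit positions in the GART Aperture Control Register
 * (AMD64_GARTAPERTURECTL). Verify against the kernel's gart.h.
 */
#define GARTEN      (1u << 0)	/* enable GART translation */
#define DISGARTCPU  (1u << 4)	/* ignore requests coming from the CPUs */
#define DISGARTIO   (1u << 5)	/* ignore requests coming from the IO side */

/*
 * Hypothetical helper: given the current register value, return the
 * value that keeps the GART enabled for IO while telling it to ignore
 * CPU accesses to the aperture.
 */
uint32_t gart_ctl_block_cpu(uint32_t ctl)
{
	ctl |= GARTEN | DISGARTCPU;	/* translate, but not for the CPUs */
	ctl &= ~DISGARTIO;		/* IO-side translation stays on */
	return ctl;
}
```

Because this state lives in the GART hardware rather than in the
original kernel's page tables, it is still in effect when the kdump
kernel starts reading the aperture range.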
> Also, the following patch should already fix the problem with kexec/kdump.
> That patch has been in mainline since 2.6.25-rc1.
This problem was confirmed and then fixed by my patch on a 2.6.27 kernel
and independently on a 2.6.27-rc8 kernel by Chandru. Both of those kernels
are newer than 2.6.25-rc1 and so already include your commit, so it seems
that your patch does not fix this kdump-related problem.
> commit aaf230424204864e2833dcc1da23e2cb0b9f39cd
> Author: Yinghai Lu <Yinghai.Lu@xxxxxxx>
> Date: Wed Jan 30 13:33:09 2008 +0100
> x86: disable the GART early, 64-bit
> For K8 system: 4G RAM with memory hole remapping enabled, or more than
> 4G RAM installed.
> when try to use kexec second kernel, and the first doesn't include
> gart_shutdown. the second kernel could have different aper position than
> the first kernel. and second kernel could use that hole as RAM that is
> still used by GART set by the first kernel. esp. when try to kexec
> 2.6.24 with sparse mem enable from previous kernel (from RHEL 5 or SLES
> 10). the new kernel will use aper by GART (set by first kernel) for
> vmemmap. and after new kernel setting one new GART. the position will be
> real RAM. the _mapcount set is lost.
> Bad page state in process 'swapper'
> page:ffffe2000e600020 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
> Trying to fix it up, but a reboot is needed
> Pid: 0, comm: swapper Not tainted 2.6.24-rc7-smp-gcdf71a10-dirty #13
> Call Trace:
> [<ffffffff8026401f>] bad_page+0x63/0x8d
> [<ffffffff80264169>] __free_pages_ok+0x7c/0x2a5
> [<ffffffff80ba75d1>] free_all_bootmem_core+0xd0/0x198
> [<ffffffff80ba3a42>] numa_free_all_bootmem+0x3b/0x76
> [<ffffffff80ba3461>] mem_init+0x3b/0x152
> [<ffffffff80b959d3>] start_kernel+0x236/0x2c2
> [<ffffffff80b9511a>] _sinittext+0x11a/0x121
> [ffffe2000e600000-ffffe2000e7fffff] PMD ->ffff81001c200000 on node 0
> phys addr is : 0x1c200000
> RHEL 5.1 kernel -53 said:
> PCI-DMA: aperture base @ 1c000000 size 65536 KB
> new kernel said:
> Mapping aperture over 65536 KB of RAM @ 3c000000
> So could try to disable that GART if possible.