Re: [PATCH] x86/kexec: Exclude GART aperture from vmcore

From: Baoquan He
Date: Sun Nov 12 2017 - 05:24:37 EST


On 11/12/17 at 04:04pm, Baoquan He wrote:
> On 11/07/17 at 04:34pm, Jiri Bohac wrote:
> > On Tue, Nov 07, 2017 at 02:42:12PM +0100, Jiri Bohac wrote:
> > > On Tue, Nov 07, 2017 at 07:39:56PM +0800, Baoquan He wrote:
> > > > don't worry about the user space kexec utility either.
> > >
> > > What's the problem with the userspace kexec? The bug is in
> > > reading /proc/vmcore by makedumpfile. kexec would only operate
> > > within the preallocated crashkernel area, right?
> >
> > Right, I see it (without -s, the kexec userspace creates the ELF
> > header that the second kernel later uses for /proc/vmcore).
>
> Yes, I meant this. In the kernel, you can define global variables to
> store the start and end addresses of the GART aperture, while in user
> space there's no way to know them. And the non '-s' kexec is still
> used by most people.
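
To make the kernel-side idea concrete, here is a minimal sketch of
recording the aperture in globals and testing a page frame number
against it. Every name in it (gart_aper_start_pfn, gart_aper_nr_pages,
gart_record_aperture, pfn_in_gart_aperture) is hypothetical and not
taken from the patch or from the kernel; it only illustrates the
bookkeeping, not where it would be hooked in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical globals: filled in wherever the aperture gets set up. */
static uint64_t gart_aper_start_pfn;  /* first page frame of the aperture */
static uint64_t gart_aper_nr_pages;   /* 0 means no aperture was recorded */

/* Remember the aperture, converting byte addresses to page frames. */
static void gart_record_aperture(uint64_t base, uint64_t size,
                                 unsigned int page_shift)
{
    gart_aper_start_pfn = base >> page_shift;
    gart_aper_nr_pages = size >> page_shift;
}

/* True if pfn falls inside the recorded aperture. */
static bool pfn_in_gart_aperture(uint64_t pfn)
{
    return pfn >= gart_aper_start_pfn &&
           pfn - gart_aper_start_pfn < gart_aper_nr_pages;
}
```

A /proc/vmcore reader in the kernel could consult such a check before
touching a page. The user space kexec has nothing equivalent to ask,
which is the problem described above.
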
>
> I roughly went through the AGP 3.0 doc and the GART code; the root
> cause of this issue should be:
>
> On an AMD system, GART needs to be enabled in the BIOS in principle.
> The firmware then arranges a hole in the system address space for the
> GART aperture mapping, 64MB by default, usually below and close to 4G.
> GART stands for Graphics Address Remapping Table; each of its entries
> can be used to refer to an address region in the 64MB aperture for
> IOMMU usage.
>
> However, on your AMD test system, GART IOMMU support is not enabled in
> the BIOS settings. In that case the current kernel implementation
> finds a region which is occupied by system RAM and configures its
> starting address and size into the GART config registers
> AMD64_GARTAPERTURECTL and AMD64_GARTAPERTUREBASE. This happens in the
> first kernel. I believe that in the kdump kernel, since only the
> reserved crashkernel region is taken as available system RAM, the rest
> of the original RAM space is seen as a hole. So the kdump kernel still
> uses the 1st kernel's aperture region for GART; since it has already
> been set in the GART registers, the kdump kernel treats it as a hole
> that the BIOS has reserved for the GART aperture.
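
For reference, a small sketch of how the aperture base and size can be
decoded from the two register values. The bit layout (size order in
bits [3:1] of AMD64_GARTAPERTURECTL, physical address bits [39:25] in
bits [14:0] of AMD64_GARTAPERTUREBASE) is how I read
arch/x86/kernel/aperture_64.c, so please treat it as an assumption to
be checked against the code. The helper names are hypothetical, and
the raw register values are passed in as plain integers instead of
being read from PCI config space.

```c
#include <stdint.h>

/*
 * Assumed layout: bits [3:1] of the aperture control value hold an
 * order, and the aperture size is 32MB << order.
 */
static uint64_t gart_aperture_size(uint32_t ctl)
{
    uint32_t order = (ctl >> 1) & 7;

    return (32ULL << 20) << order;
}

/*
 * Assumed layout: bits [14:0] of the aperture base value hold
 * physical address bits [39:25].
 */
static uint64_t gart_aperture_base(uint32_t base_reg)
{
    return (uint64_t)(base_reg & 0x7fff) << 25;
}
```

With an order of 1 this gives the 64MB default mentioned above. A
kdump kernel that finds these registers already programmed has no way
to tell whether the range is a firmware hole or RAM the first kernel
picked.
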
>
> Now the problem is that the pages reserved for the GART aperture have
> been added into the mm subsystem. GART is located on the North Bridge.
> When the CPU tries to access these pages, the access is checked
> against the North Bridge chip first, and a hardware error occurs,
> since that region has been set in the GART registers, which are
> located in the NB.
>
> Solution:
> 1) Remove the code which supports GART IOMMU when it's not enabled in
> the BIOS. This has already been done for the new generation of
> hardware IOMMUs like the Intel VT-d IOMMU and the AMD-Vi IOMMU. We
> should not make GART IOMMU an exception.
>
> 2) Remove those pages from the mm subsystem, since the CPU can't see
> them any more even though they have been added into the mm subsystem.
>
> 3) Remove the aperture region from /proc/iomem so that pages in that
> region can't be seen by the kdump kernel. This is easier, but just a
> workaround.
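
As an illustration of what option 3 amounts to for whoever consumes
the RAM ranges, here is a minimal sketch of cutting one region out of
a list of ranges. struct mem_range and exclude_range() are
hypothetical names written for this mail, not existing kernel or
kexec-tools code, and the ranges are treated as inclusive on both
ends.

```c
#include <stddef.h>
#include <stdint.h>

struct mem_range {
    uint64_t start;  /* inclusive */
    uint64_t end;    /* inclusive */
};

/*
 * Copy in[0..n) to out, with [ex_start, ex_end] cut out of every range
 * that overlaps it. Returns the number of ranges written, or -1 if out
 * has fewer than the required number of slots.
 */
static int exclude_range(const struct mem_range *in, size_t n,
                         uint64_t ex_start, uint64_t ex_end,
                         struct mem_range *out, size_t out_cap)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t s = in[i].start, e = in[i].end;

        if (ex_end < s || ex_start > e) {
            /* No overlap: keep the range unchanged. */
            if (count == out_cap)
                return -1;
            out[count++] = in[i];
            continue;
        }
        if (s < ex_start) {
            /* Keep the part below the excluded region. */
            if (count == out_cap)
                return -1;
            out[count++] = (struct mem_range){ s, ex_start - 1 };
        }
        if (e > ex_end) {
            /* Keep the part above the excluded region. */
            if (count == out_cap)
                return -1;
            out[count++] = (struct mem_range){ ex_end + 1, e };
        }
    }
    return (int)count;
}
```

A RAM range that contains the aperture is split in two, so the output
array may need one more slot than the input. The pages themselves stay
in the first kernel's mm subsystem, which is why this is only a
workaround.
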
>
> Hi Yinghai, Joerg, and Bjorn
>
> I found patches you contributed to GART IOMMU. Do you have any
> suggestions about this issue, or any comments on these 3 options?
>
> I personally prefer the 1st one.
>
> Thanks
> Baoquan
>
> >
> > No idea how to fix that nicely...
> >
> > --
> > Jiri Bohac <jbohac@xxxxxxx>
> > SUSE Labs, Prague, Czechia
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@xxxxxxxxxxxxxxxxxxx
> > http://lists.infradead.org/mailman/listinfo/kexec