Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

From: Guenter Roeck
Date: Thu May 21 2020 - 19:02:17 EST


On 5/20/20 12:51 PM, Mike Rapoport wrote:
> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>> On 5/20/20 10:03 AM, Mike Rapoport wrote:
>>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
>>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>>>> Now that the page table allocator can free page table allocations
>>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>>>> to avoid needlessly wasting memory.
>>>>>>>
>>>>>>> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
>>>>>>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>>>>>> Signed-off-by: Will Deacon <will@xxxxxxxxxx>
>>>>>>
>>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>>>> does reverting the rest of the series.
>>>>>>
>>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
>>>>> the results here as well as soon as I have it.
>>>>
>>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>>>> issues. However, linux-next is a different story, where I don't get very far
>>>> at all:
>>>>
>>>> BUG: Bad page state in process swapper pfn:005b4
>>>
>>> This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
>>> regions rather that check each PFN"). The commit sha is valid for
>>> v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)
>>>
>>> As it seems, sparc32 never registered the memory occupied by the kernel
>>> image with memblock_add() and it only reserves this memory with
>>> meblock_reserve().
>>>
>>> I don't know what would happen on real HW, but with
>>>
>>> qemu-system-sparc -kernel /path/to/kernel
>>>
>>> the memory occupied by the kernel is reserved in openbios and removed
>>> from mem.available. The prom setup code in the kernel used mem.available
>>> to set up the memory banks and essentially there is a hole for the
>>> memory occupied by the kernel.
>>>
>>> Later in bootmem_init() this memory is memblock_reserve()d.
>>>
>>> Before the problematic commit, memmap initialization would call
>>> __init_single_page() for the pages in that hole, the
>>> free_low_memory_core_early() would mark them as resrved and everything
>>> would be Ok.
>>>
>>> After the change in memmap initialization, the hole is skipped and the
>>> page structs for it are not inited. And when they are passed from
>>> memblock to page allocator as reserved it gets confused.
>>>
>>> Simply registering the memory occupied by the kernel with memblock_add()
>>> resolves this issue, at least for qemu-system-arm and I cannot see how
>>> it can harm any other setup.
>>>
>>> If all that makes sense I'll send a proper patch :)
>>>
>>> diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
>>> index 906eda1158b4..3cb3dffcbcdc 100644
>>> --- a/arch/sparc/mm/init_32.c
>>> +++ b/arch/sparc/mm/init_32.c
>>> @@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
>>> /* Reserve the kernel text/data/bss. */
>>> size = (start_pfn << PAGE_SHIFT) - phys_base;
>>> memblock_reserve(phys_base, size);
>>> + memblock_add(phys_base, size);
>>>
>>> size = memblock_phys_mem_size() - memblock_reserved_size();
>>> *pages_avail = (size >> PAGE_SHIFT) - high_pages;
>>>
>>>> Will
>>>
>>
>> With above patch applied on top of Ira's patch, I get:
>>
>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>> [f0067a64 :
>> do_raw_spin_lock+0xa8/0xd8 ]
>> [f00d5034 :
>> copy_page_range+0x328/0x804 ]
>> [f0025be4 :
>> dup_mm+0x334/0x434 ]
>> [f0027124 :
>> copy_process+0x1224/0x12b0 ]
>> [f0027344 :
>> _do_fork+0x54/0x30c ]
>> [f0027670 :
>> do_fork+0x5c/0x6c ]
>> [f000de44 :
>> sparc_do_fork+0x18/0x38 ]
>> [f000b7f4 :
>> do_syscall+0x34/0x40 ]
>> [5010cd4c :
>> 0x5010cd4c ]
>>
>> Looks like yet another problem.
>
> I've checked the patch above on top of the mmots which already has Ira's
> patches and it booted fine. I've used sparc32_defconfig to build the
> kernel and qemu-system-sparc with default machine and CPU.
>

Try sparc32_defconfig+SMP.

>> I can not revert c03584e30534 because it results in a compile failure.
>
> Here's the "revert" of c03584e30534:
>

Same problem (spinlock recursion) after applying it.

Guenter