[RFC] Question about TLB flush while set Stage-2 huge pages

From: Zheng Xiang
Date: Mon Mar 11 2019 - 12:31:40 EST


Hi all,

When a page is merged into a transparent huge page, KVM invalidates Stage-2 for
the base address of the huge page, along with the whole of Stage-1.
However, this only invalidates the first page within the huge page; the other
pages are not invalidated, see below:

+---------------+--------------+
|abcde       2MB-Page          |
+---------------+--------------+

TLB before setting new pmd:
+---------------+--------------+
|      VA       |   PAGESIZE   |
+---------------+--------------+
|       a       |     4KB      |
+---------------+--------------+
|       b       |     4KB      |
+---------------+--------------+
|       c       |     4KB      |
+---------------+--------------+
|       d       |     4KB      |
+---------------+--------------+

TLB after setting new pmd:
+---------------+--------------+
|      VA       |   PAGESIZE   |
+---------------+--------------+
|       a       |     2MB      |
+---------------+--------------+
|       b       |     4KB      |
+---------------+--------------+
|       c       |     4KB      |
+---------------+--------------+
|       d       |     4KB      |
+---------------+--------------+

When the VM accesses address *b*, it hits the stale 4KB TLB entry, which can result in a TLB conflict abort or other exceptions.
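
To make the scenario above concrete, here is a small user-space toy model
(not kernel code). It assumes a TLB that simply caches (IPA, size) pairs and
an invalidate-by-address operation that drops entries covering that address.
All names (toy_tlb, tlb_fill, tlb_inval_ipa, tlb_hits, hits_on_b_after_merge)
and the base address are made up for illustration; real hardware behaviour
is more involved.

```c
/* Toy model only: not kernel code. All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K     0x1000ULL
#define SZ_2M     0x200000ULL
#define TLB_SLOTS 16

struct tlb_entry {
	bool valid;
	uint64_t ipa;   /* base address covered by the entry */
	uint64_t size;  /* SZ_4K or SZ_2M */
};

struct toy_tlb {
	struct tlb_entry slot[TLB_SLOTS];
};

/* Cache a translation in the first free slot. */
static void tlb_fill(struct toy_tlb *tlb, uint64_t ipa, uint64_t size)
{
	for (size_t i = 0; i < TLB_SLOTS; i++) {
		if (!tlb->slot[i].valid) {
			tlb->slot[i] = (struct tlb_entry){ true, ipa & ~(size - 1), size };
			return;
		}
	}
	assert(!"toy TLB is full");
}

/* Invalidate by address: drop every entry that covers 'ipa'. */
static void tlb_inval_ipa(struct toy_tlb *tlb, uint64_t ipa)
{
	for (size_t i = 0; i < TLB_SLOTS; i++) {
		struct tlb_entry *e = &tlb->slot[i];

		if (e->valid && ipa >= e->ipa && ipa - e->ipa < e->size)
			e->valid = false;
	}
}

/* Count valid entries translating 'ipa'; more than one models a conflict. */
static int tlb_hits(const struct toy_tlb *tlb, uint64_t ipa)
{
	int hits = 0;

	for (size_t i = 0; i < TLB_SLOTS; i++) {
		const struct tlb_entry *e = &tlb->slot[i];

		if (e->valid && ipa >= e->ipa && ipa - e->ipa < e->size)
			hits++;
	}
	return hits;
}

/* Replay the merge and return the number of entries that match page b. */
static int hits_on_b_after_merge(bool flush_whole_range)
{
	struct toy_tlb tlb = { 0 };
	const uint64_t base = 0x40000000ULL;    /* arbitrary 2MB-aligned address */

	for (uint64_t i = 0; i < 4; i++)        /* 4KB entries for a, b, c, d */
		tlb_fill(&tlb, base + i * SZ_4K, SZ_4K);

	if (flush_whole_range) {
		for (uint64_t off = 0; off < SZ_2M; off += SZ_4K)
			tlb_inval_ipa(&tlb, base + off);
	} else {
		tlb_inval_ipa(&tlb, base);      /* only the entry for a is dropped */
	}

	tlb_fill(&tlb, base, SZ_2M);            /* new 2MB mapping gets cached */
	return tlb_hits(&tlb, base + SZ_4K);    /* look up page b */
}
```

In this model, hits_on_b_after_merge(false) finds two matching entries for
*b* (the stale 4KB one and the new 2MB one), while hits_on_b_after_merge(true)
finds only the 2MB entry.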

For example, we need to keep track of the VM's dirty pages while the VM is in live migration.
KVM sets the memslot READONLY and splits the huge pages.
After live migration is canceled and aborted, the pages are merged back into a THP.
Later accesses to these pages, whose stale TLB entries are still READONLY, will cause level-3 Permission Faults until those entries are invalidated.

So should we invalidate the TLB entries for all related pages (e.g. a, b, c, d), as __flush_tlb_range() does?
Or could we call __kvm_tlb_flush_vmid() to invalidate all TLB entries?
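
As a rough way to compare the two options, the sketch below only counts
how many maintenance operations each approach would issue for one 2MB block
with a 4KB granule. It is a user-space illustration, not the kernel
implementation; struct flush_stats, flush_block_by_range() and
flush_block_by_vmid() are hypothetical names, and no real TLB instruction is
executed.

```c
/* Counting sketch only: not kernel code. All names here are hypothetical. */
#include <assert.h>
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL

/* Counters standing in for issued TLB maintenance operations. */
struct flush_stats {
	unsigned long by_ipa;   /* per-page invalidations */
	unsigned long by_vmid;  /* whole-VMID invalidations */
};

/* Option 1: walk the 2MB block one 4KB page at a time. */
static void flush_block_by_range(struct flush_stats *st, uint64_t base)
{
	assert((base & (SZ_2M - 1)) == 0);      /* block must be 2MB-aligned */

	for (uint64_t ipa = base; ipa < base + SZ_2M; ipa += SZ_4K)
		st->by_ipa++;   /* a real implementation would invalidate 'ipa' here */
}

/* Option 2: one operation that drops every entry tagged with the VMID. */
static void flush_block_by_vmid(struct flush_stats *st)
{
	st->by_vmid++;
}
```

Under these assumptions the range walk issues 2MB / 4KB = 512 per-page
operations, whereas the VMID-wide flush is a single operation that also
discards entries unrelated to the merged block.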



--

Thanks,
Xiang