.. SPDX-License-Identifier: GPL-2.0

==============
5-level paging
==============

Overview
========
Original x86-64 was limited by 4-level paging to 256 TiB of virtual address
space and 64 TiB of physical address space. We are already bumping into
this limit: some vendors offer servers with 64 TiB of memory today.

To overcome the limitation, upcoming hardware will introduce support for
5-level paging. It is a straightforward extension of the current page
table structure, adding one more layer of translation.

It bumps the limits to 128 PiB of virtual address space and 4 PiB of
physical address space. This "ought to be enough for anybody" ©.

QEMU 2.9 and later support 5-level paging.

The virtual memory layout for 5-level paging is described in
Documentation/arch/x86/x86_64/mm.rst


Enabling 5-level paging
=======================
CONFIG_X86_5LEVEL=y enables the feature.

A kernel built with CONFIG_X86_5LEVEL=y is still able to boot on 4-level
hardware. In this case the additional page table level, p4d, is folded at
runtime.

User-space and large virtual address space
==========================================
On x86, 5-level paging enables 56-bit userspace virtual address space.
Not all user space is ready to handle wide addresses. It is known that
at least some JIT compilers use the higher bits in pointers to encode their
own information. With 5-level paging this encoding collides with valid
pointers and leads to crashes.

To mitigate this, the kernel does not allocate virtual address space
above 47-bit by default.

But userspace can ask for allocation from the full address space by
specifying a hint address (with or without MAP_FIXED) above 47-bit.

If the hint address is set above 47-bit but MAP_FIXED is not specified, we
try to find an unmapped area at the specified address. If it is already
occupied, we look for an unmapped area in the *full* address space, rather
than in the 47-bit window.

A high hint address only affects the allocation in question, not
any future mmap()s.

Specifying a high hint address on an older kernel or on a machine without
5-level paging support is safe. The hint will be ignored and the kernel will
fall back to allocation from the 47-bit address space.

This approach makes it easy to make an application's memory allocator aware
of the large address space without manually tracking allocated virtual
address space.

One important case we need to handle here is the interaction with MPX.
MPX (without the MAWA extension) cannot handle addresses above 47-bit, so we
need to make sure that MPX cannot be enabled if we already have a VMA above
the boundary, and forbid creating such VMAs once MPX is enabled.
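
The hint-address opt-in described in this section can be sketched from
userspace as follows. This is a minimal illustration, assuming a 64-bit
Linux system; ``map_high()``, ``is_above_47bit()`` and ``ADDR_47BIT_LIMIT``
are hypothetical names introduced here, and the hint value ``1 << 48`` is
just one arbitrary address above the 47-bit boundary.

```c
#define _DEFAULT_SOURCE	/* expose MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* First address above the default 47-bit user-space window. */
#define ADDR_47BIT_LIMIT (UINT64_C(1) << 47)

/*
 * Hypothetical helper: map len bytes of anonymous memory, passing a hint
 * above the 47-bit boundary to ask for the full address space.
 * MAP_FIXED is not set, so the hint is only a request and the kernel may
 * place the mapping elsewhere. Returns MAP_FAILED on error, like mmap().
 */
void *map_high(size_t len)
{
	void *hint = (void *)(uintptr_t)(UINT64_C(1) << 48);

	return mmap(hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Hypothetical helper: did the address land above the 47-bit window? */
int is_above_47bit(const void *p)
{
	return (uint64_t)(uintptr_t)p >= ADDR_47BIT_LIMIT;
}
```

On a kernel or machine without 5-level paging, ``map_high()`` is still
expected to succeed, since the hint is ignored; ``is_above_47bit()`` would
then report 0 for the returned address. Only this one mapping is affected:
later mmap() calls without a high hint stay within the 47-bit window.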