.. SPDX-License-Identifier: GPL-2.0

================
Page Table Check
================

Introduction
============

Page table check helps harden the kernel by ensuring that some types of
memory corruption are prevented.

Page table check performs extra verifications at the time when new pages become
accessible from userspace by getting their page table entries (PTEs, PMDs,
etc.) added into the table.

For most detected corruptions, the kernel is crashed. There is a small
performance and memory overhead associated with the page table check. Therefore,
it is disabled by default, but can be optionally enabled on systems where the
extra hardening outweighs the performance costs. Also, because page table check
is synchronous, it can help with debugging double map memory corruption issues
by crashing the kernel at the time the wrong mapping occurs instead of later,
which is often the case with memory corruption bugs.

It can also be used to check page table entry flags and to emit warnings when
illegal combinations of entry flags are detected. Currently, userfaultfd is the
only user of this facility; it sanity checks the wr-protect bit against any
writable flags. In this case, an illegal flag combination does not cause data
corruption immediately, but it makes read-only data writable, leading to
corruption when the page content is later modified.

Double mapping detection logic
==============================

+-------------------+-------------------+-------------------+------------------+
| Current Mapping   | New mapping       | Permissions       | Rule             |
+===================+===================+===================+==================+
| Anonymous         | Anonymous         | Read              | Allow            |
+-------------------+-------------------+-------------------+------------------+
| Anonymous         | Anonymous         | Read / Write      | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Anonymous         | Named             | Any               | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Named             | Anonymous         | Any               | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Named             | Named             | Any               | Allow            |
+-------------------+-------------------+-------------------+------------------+

Enabling Page Table Check
=========================

To enable page table check:

- Build the kernel with PAGE_TABLE_CHECK=y.
  Note that it can only be enabled on platforms where
  ARCH_SUPPORTS_PAGE_TABLE_CHECK is available.

- Boot with the 'page_table_check=on' kernel parameter.

Optionally, build the kernel with PAGE_TABLE_CHECK_ENFORCED in order to have
page table check enabled without the extra kernel parameter.

Implementation notes
====================

We specifically decided not to use VMA information in order to avoid relying on
MM states (except for limited "struct page" info). The page table check is a
state machine, separate from Linux-MM, that verifies that user accessible
pages are not falsely shared.

PAGE_TABLE_CHECK depends on EXCLUSIVE_SYSTEM_RAM. The reason is that without
EXCLUSIVE_SYSTEM_RAM, users are allowed to map arbitrary physical memory
regions into userspace via /dev/mem. At the same time, pages may change
their properties (e.g., from anonymous pages to named pages) while they are
still mapped in userspace, leading to "corruption" detected by the
page table check.

Even with EXCLUSIVE_SYSTEM_RAM, I/O pages may still be allowed to be mapped via
/dev/mem. However, these pages are always considered as named pages, so they
won't break the logic used in the page table check.
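
As an illustration only, the rules in the "Double mapping detection logic"
table can be modelled with two per-page counters. This is a minimal userspace
sketch, not the kernel implementation: ``PageState``, ``DoubleMapError`` and
``check_new_mapping`` are hypothetical names, raising an exception stands in
for crashing the kernel, and the "Permissions" column is assumed to describe
the new mapping being added.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    """Hypothetical per-page bookkeeping: user mappings of each kind."""
    anon_maps: int = 0
    named_maps: int = 0


class DoubleMapError(Exception):
    """Stands in for the point where the real check would crash the kernel."""


def check_new_mapping(page: PageState, anonymous: bool, writable: bool) -> None:
    """Apply the rule table to one new user mapping, then record it."""
    if anonymous:
        # Named -> Anonymous: prohibited for any permissions.
        if page.named_maps:
            raise DoubleMapError("anonymous mapping of a page mapped as named")
        # Anonymous -> Anonymous: allowed only if the new mapping is read-only.
        if page.anon_maps and writable:
            raise DoubleMapError("writable mapping of an already mapped anonymous page")
        page.anon_maps += 1
    else:
        # Anonymous -> Named: prohibited for any permissions.
        if page.anon_maps:
            raise DoubleMapError("named mapping of a page mapped as anonymous")
        # Named -> Named: always allowed.
        page.named_maps += 1
```

In this model a page that is not yet mapped accepts any first mapping; only
the second and later mappings are checked against what is already recorded.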