TOMOYO Linux Cross Reference
Linux/Documentation/mm/memory-model.rst

Differences between /Documentation/mm/memory-model.rst (Version linux-6.12-rc7) and /Documentation/mm/memory-model.rst (Version linux-6.1.116)


  1 .. SPDX-License-Identifier: GPL-2.0                 1 .. SPDX-License-Identifier: GPL-2.0
  2                                                     2 
                                                   >>   3 .. _physical_memory_model:
                                                   >>   4 
  3 =====================                               5 =====================
  4 Physical Memory Model                               6 Physical Memory Model
  5 =====================                               7 =====================
  6                                                     8 
  7 Physical memory in a system may be addressed i      9 Physical memory in a system may be addressed in different ways. The
  8 simplest case is when the physical memory star     10 simplest case is when the physical memory starts at address 0 and
  9 spans a contiguous range up to the maximal add     11 spans a contiguous range up to the maximal address. It could be,
 10 however, that this range contains small holes      12 however, that this range contains small holes that are not accessible
 11 for the CPU. Then there could be several conti     13 for the CPU. Then there could be several contiguous ranges at
 12 completely distinct addresses. And, don't forg     14 completely distinct addresses. And, don't forget about NUMA, where
 13 different memory banks are attached to differe     15 different memory banks are attached to different CPUs.
 14                                                    16 
 15 Linux abstracts this diversity using one of th     17 Linux abstracts this diversity using one of the two memory models:
 16 FLATMEM and SPARSEMEM. Each architecture defin     18 FLATMEM and SPARSEMEM. Each architecture defines what
 17 memory models it supports, what the default me     19 memory models it supports, what the default memory model is and
 18 whether it is possible to manually override th     20 whether it is possible to manually override that default.
 19                                                    21 
 20 All the memory models track the status of phys     22 All the memory models track the status of physical page frames using
 21 struct page arranged in one or more arrays.        23 struct page arranged in one or more arrays.
 22                                                    24 
 23 Regardless of the selected memory model, there     25 Regardless of the selected memory model, there exists a one-to-one
 24 mapping between the physical page frame number     26 mapping between the physical page frame number (PFN) and the
 25 corresponding `struct page`.                       27 corresponding `struct page`.
 26                                                    28 
 27 Each memory model defines :c:func:`pfn_to_page     29 Each memory model defines :c:func:`pfn_to_page` and :c:func:`page_to_pfn`
 28 helpers that allow the conversion from PFN to      30 helpers that allow the conversion from PFN to `struct page` and vice
 29 versa.                                             31 versa.
 30                                                    32 
 31 FLATMEM                                            33 FLATMEM
 32 =======                                            34 =======
 33                                                    35 
 34 The simplest memory model is FLATMEM. This mod     36 The simplest memory model is FLATMEM. This model is suitable for
 35 non-NUMA systems with contiguous, or mostly co     37 non-NUMA systems with contiguous, or mostly contiguous, physical
 36 memory.                                            38 memory.
 37                                                    39 
 38 In the FLATMEM memory model, there is a global     40 In the FLATMEM memory model, there is a global `mem_map` array that
 39 maps the entire physical memory. For most arch     41 maps the entire physical memory. For most architectures, the holes
 40 have entries in the `mem_map` array. The `stru     42 have entries in the `mem_map` array. The `struct page` objects
 41 corresponding to the holes are never fully ini     43 corresponding to the holes are never fully initialized.
 42                                                    44 
 43 To allocate the `mem_map` array, architecture      45 To allocate the `mem_map` array, architecture specific setup code should
 44 call the :c:func:`free_area_init` function. Ye     46 call the :c:func:`free_area_init` function. Yet, the mappings array is not
 45 usable until the call to :c:func:`memblock_fre     47 usable until the call to :c:func:`memblock_free_all` that hands all the
 46 memory to the page allocator.                      48 memory to the page allocator.
 47                                                    49 
 48 An architecture may free parts of the `mem_map     50 An architecture may free parts of the `mem_map` array that do not cover the
 49 actual physical pages. In such a case, the arc     51 actual physical pages. In such a case, the architecture specific
 50 :c:func:`pfn_valid` implementation should take     52 :c:func:`pfn_valid` implementation should take the holes in the
 51 `mem_map` into account.                            53 `mem_map` into account.
 52                                                    54 
 53 With FLATMEM, the conversion between a PFN and     55 With FLATMEM, the conversion between a PFN and the `struct page` is
 54 straightforward: `PFN - ARCH_PFN_OFFSET` is an     56 straightforward: `PFN - ARCH_PFN_OFFSET` is an index to the
 55 `mem_map` array.                                   57 `mem_map` array.
 56                                                    58 
 57 The `ARCH_PFN_OFFSET` defines the first page f     59 The `ARCH_PFN_OFFSET` defines the first page frame number for
 58 systems with physical memory starting at an ad     60 systems with physical memory starting at an address different from 0.
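
The FLATMEM conversion described above can be sketched in userspace C. This is a minimal illustration, not the kernel's code: the one-field `struct page`, the value chosen for `ARCH_PFN_OFFSET`, the `NR_FRAMES` constant and the `flat_*` function names are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; the real structure is much larger. */
struct page {
	unsigned long flags;
};

/* Assumed for illustration: RAM starts at PFN 0x80000 and spans 16 frames. */
#define ARCH_PFN_OFFSET 0x80000UL
#define NR_FRAMES       16UL

/* One flat array that covers every page frame, holes included. */
static struct page mem_map[NR_FRAMES];

/* PFN -> struct page: subtract the offset and index the flat array. */
static struct page *flat_pfn_to_page(unsigned long pfn)
{
	assert(pfn >= ARCH_PFN_OFFSET && pfn - ARCH_PFN_OFFSET < NR_FRAMES);
	return mem_map + (pfn - ARCH_PFN_OFFSET);
}

/* struct page -> PFN: the array index plus the offset. */
static unsigned long flat_page_to_pfn(const struct page *page)
{
	return (unsigned long)(page - mem_map) + ARCH_PFN_OFFSET;
}
```

Both directions are plain pointer arithmetic, which is why FLATMEM is cheap as long as the holes are small enough that the wasted `mem_map` entries do not matter.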
 59                                                    61 
 60 SPARSEMEM                                          62 SPARSEMEM
 61 =========                                          63 =========
 62                                                    64 
 63 SPARSEMEM is the most versatile memory model a     65 SPARSEMEM is the most versatile memory model available in Linux and it
 64 is the only memory model that supports several     66 is the only memory model that supports several advanced features such
 65 as hot-plug and hot-remove of the physical mem     67 as hot-plug and hot-remove of the physical memory, alternative memory
 66 maps for non-volatile memory devices and defer     68 maps for non-volatile memory devices and deferred initialization of
 67 the memory map for larger systems.                 69 the memory map for larger systems.
 68                                                    70 
 69 The SPARSEMEM model presents the physical memo     71 The SPARSEMEM model presents the physical memory as a collection of
 70 sections. A section is represented with struct     72 sections. A section is represented with struct mem_section
 71 that contains `section_mem_map` that is, logic     73 that contains `section_mem_map` that is, logically, a pointer to an
 72 array of struct pages. However, it is stored w     74 array of struct pages. However, it is stored with some other magic
 73 that aids the sections management. The section     75 that aids the sections management. The section size and maximal number
 74 of sections are specified using `SECTION_SIZE_     76 of sections are specified using `SECTION_SIZE_BITS` and
 75 `MAX_PHYSMEM_BITS` constants defined by each a     77 `MAX_PHYSMEM_BITS` constants defined by each architecture that
 76 supports SPARSEMEM. While `MAX_PHYSMEM_BITS` i     78 supports SPARSEMEM. While `MAX_PHYSMEM_BITS` is an actual width of a
 77 physical address that an architecture supports     79 physical address that an architecture supports, the
 78 `SECTION_SIZE_BITS` is an arbitrary value.         80 `SECTION_SIZE_BITS` is an arbitrary value.
 79                                                    81 
 80 The maximal number of sections is denoted `NR_     82 The maximal number of sections is denoted `NR_MEM_SECTIONS` and
 81 defined as                                         83 defined as
 82                                                    84 
 83 .. math::                                          85 .. math::
 84                                                    86 
 85    NR\_MEM\_SECTIONS = 2 ^ {(MAX\_PHYSMEM\_BIT     87    NR\_MEM\_SECTIONS = 2 ^ {(MAX\_PHYSMEM\_BITS - SECTION\_SIZE\_BITS)}
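
As a worked instance of the formula, the sketch below evaluates it for caller-supplied bit widths. The function name `nr_mem_sections` and the sample widths used in the note that follows are assumptions for illustration; each architecture defines its own constants.

```c
#include <assert.h>
#include <limits.h>

/*
 * Number of sections for the given address width and section size,
 * i.e. 2 ^ (max_physmem_bits - section_size_bits).
 */
static unsigned long nr_mem_sections(unsigned int max_physmem_bits,
				     unsigned int section_size_bits)
{
	unsigned int shift;

	assert(max_physmem_bits >= section_size_bits);
	shift = max_physmem_bits - section_size_bits;
	/* The result must fit in an unsigned long. */
	assert(shift < sizeof(unsigned long) * CHAR_BIT);
	return 1UL << shift;
}
```

For example, with a 46-bit physical address and 27-bit (128 MiB) sections the exponent is 19, so `nr_mem_sections(46, 27)` yields 524288 sections.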
 86                                                    88 
 87 The `mem_section` objects are arranged in a tw     89 The `mem_section` objects are arranged in a two-dimensional array
 88 called `mem_sections`. The size and placement      90 called `mem_sections`. The size and placement of this array depend
 89 on `CONFIG_SPARSEMEM_EXTREME` and the maximal      91 on `CONFIG_SPARSEMEM_EXTREME` and the maximal possible number of
 90 sections:                                          92 sections:
 91                                                    93 
 92 * When `CONFIG_SPARSEMEM_EXTREME` is disabled,     94 * When `CONFIG_SPARSEMEM_EXTREME` is disabled, the `mem_sections`
 93   array is static and has `NR_MEM_SECTIONS` ro     95   array is static and has `NR_MEM_SECTIONS` rows. Each row holds a
 94   single `mem_section` object.                     96   single `mem_section` object.
 95 * When `CONFIG_SPARSEMEM_EXTREME` is enabled,      97 * When `CONFIG_SPARSEMEM_EXTREME` is enabled, the `mem_sections`
 96   array is dynamically allocated. Each row con     98   array is dynamically allocated. Each row contains PAGE_SIZE worth of
 97   `mem_section` objects and the number of rows     99   `mem_section` objects and the number of rows is calculated to fit
 98   all the memory sections.                        100   all the memory sections.
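
The row and column arithmetic for the `CONFIG_SPARSEMEM_EXTREME` case might look like the sketch below. The page size, the section count, the fixed 16-byte `struct mem_section` stand-in and the helper names are assumptions for illustration; the macro names are modelled on kernel conventions but are not taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE       4096UL       /* assumed page size */
#define NR_MEM_SECTIONS (1UL << 19)  /* assumed: 46 - 27 bits */

/* Stand-in with a fixed 16-byte size; the kernel structure differs. */
struct mem_section {
	uint64_t section_mem_map;
	uint64_t usage;
};

/* Each row holds PAGE_SIZE worth of mem_section objects. */
#define SECTIONS_PER_ROOT (PAGE_SIZE / sizeof(struct mem_section))

/* Enough rows to fit all sections, rounding up. */
#define NR_SECTION_ROOTS \
	((NR_MEM_SECTIONS + SECTIONS_PER_ROOT - 1) / SECTIONS_PER_ROOT)

/* Row of the two-dimensional array that holds section number nr. */
static unsigned long section_nr_to_root(unsigned long nr)
{
	assert(nr < NR_MEM_SECTIONS);
	return nr / SECTIONS_PER_ROOT;
}

/* Position of section number nr inside its row. */
static unsigned long section_nr_to_offset(unsigned long nr)
{
	assert(nr < NR_MEM_SECTIONS);
	return nr % SECTIONS_PER_ROOT;
}
```

With these sample values a row holds 256 objects and 2048 rows are needed. Because the rows are allocated dynamically, only the rows that actually contain present sections have to be backed by memory.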
 99                                                   101 
100 The architecture setup code should call sparse    102 The architecture setup code should call sparse_init() to
101 initialize the memory sections and the memory     103 initialize the memory sections and the memory maps.
102                                                   104 
103 With SPARSEMEM there are two possible ways to     105 With SPARSEMEM there are two possible ways to convert a PFN to the
104 corresponding `struct page` - a "classic spars    106 corresponding `struct page` - a "classic sparse" and "sparse
105 vmemmap". The selection is made at build time     107 vmemmap". The selection is made at build time and it is determined by
106 the value of `CONFIG_SPARSEMEM_VMEMMAP`.          108 the value of `CONFIG_SPARSEMEM_VMEMMAP`.
107                                                   109 
108 The classic sparse encodes the section number     110 The classic sparse encodes the section number of a page in page->flags
109 and uses high bits of a PFN to access the sect    111 and uses high bits of a PFN to access the section that maps that page
110 frame. Inside a section, the PFN is the index     112 frame. Inside a section, the PFN is the index to the array of pages.
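
A toy model of the classic sparse lookup is sketched below. It is a simplification under stated assumptions: sections are only 16 pages, `flags` holds nothing but the section number, `section_mem_map` is a plain pointer without the extra encoding the text calls "magic", and the low PFN bits are masked explicitly to index the per-section array. All function names and sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 16  /* toy value: 16 pages per section */
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION (1UL << PFN_SECTION_SHIFT)
#define NR_SECTIONS       4UL

struct page {
	unsigned long flags;  /* here: only the section number */
};

struct mem_section {
	struct page *section_mem_map;  /* NULL when the section is absent */
};

/* Only sections 1 and 3 are populated; 0 and 2 are holes. */
static struct page section1_pages[PAGES_PER_SECTION];
static struct page section3_pages[PAGES_PER_SECTION];
static struct mem_section sections[NR_SECTIONS] = {
	[1] = { section1_pages },
	[3] = { section3_pages },
};

/* Record each page's section number in its flags. */
static void toy_sparse_init(void)
{
	for (unsigned long nr = 0; nr < NR_SECTIONS; nr++) {
		if (!sections[nr].section_mem_map)
			continue;
		for (unsigned long i = 0; i < PAGES_PER_SECTION; i++)
			sections[nr].section_mem_map[i].flags = nr;
	}
}

/* High PFN bits select the section, low bits index its page array. */
static struct page *sparse_pfn_to_page(unsigned long pfn)
{
	unsigned long nr = pfn >> PFN_SECTION_SHIFT;

	if (nr >= NR_SECTIONS || !sections[nr].section_mem_map)
		return NULL;
	return &sections[nr].section_mem_map[pfn & (PAGES_PER_SECTION - 1)];
}

/* The section number in flags tells us which array the page lives in. */
static unsigned long sparse_page_to_pfn(const struct page *page)
{
	unsigned long nr = page->flags;

	assert(nr < NR_SECTIONS && sections[nr].section_mem_map);
	return (nr << PFN_SECTION_SHIFT) +
	       (unsigned long)(page - sections[nr].section_mem_map);
}
```

Call `toy_sparse_init()` once before using `sparse_page_to_pfn()`. The reverse lookup needs the section number stored in the page because, unlike FLATMEM, there is no single array from which the index can be derived.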
111                                                   113 
112 The sparse vmemmap uses a virtually mapped mem    114 The sparse vmemmap uses a virtually mapped memory map to optimize
113 pfn_to_page and page_to_pfn operations. There     115 pfn_to_page and page_to_pfn operations. There is a global `struct
114 page *vmemmap` pointer that points to a virtua    116 page *vmemmap` pointer that points to a virtually contiguous array of
115 `struct page` objects. A PFN is an index to th    117 `struct page` objects. A PFN is an index to that array and the
116 offset of the `struct page` from `vmemmap` is     118 offset of the `struct page` from `vmemmap` is the PFN of that
117 page.                                             119 page.
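
With a virtually contiguous map, both conversions reduce to one addition or subtraction. The sketch below imitates that in userspace; the ordinary static array standing in for the reserved virtual range, the `VMEMMAP_FRAMES` bound and the `vmemmap_*` function names are assumptions for illustration.

```c
#include <assert.h>

struct page {
	unsigned long flags;
};

/*
 * Stand-in for the reserved virtual range. In the kernel only the parts
 * that describe present memory are backed by physical pages; here the
 * whole array simply exists.
 */
#define VMEMMAP_FRAMES 64UL
static struct page vmemmap_backing[VMEMMAP_FRAMES];
static struct page *const vmemmap = vmemmap_backing;

/* PFN -> struct page: the PFN is the index into the array. */
static struct page *vmemmap_pfn_to_page(unsigned long pfn)
{
	assert(pfn < VMEMMAP_FRAMES);
	return vmemmap + pfn;
}

/* struct page -> PFN: the offset of the page from vmemmap. */
static unsigned long vmemmap_page_to_pfn(const struct page *page)
{
	return (unsigned long)(page - vmemmap);
}
```

Compared with the classic sparse lookup there is no section table access and no section number to decode from the page, which is the optimization the paragraph above refers to.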
118                                                   120 
119 To use vmemmap, an architecture has to reserve    121 To use vmemmap, an architecture has to reserve a range of virtual
120 addresses that will map the physical pages con    122 addresses that will map the physical pages containing the memory
121 map and make sure that `vmemmap` points to tha    123 map and make sure that `vmemmap` points to that range. In addition,
122 the architecture should implement :c:func:`vme    124 the architecture should implement :c:func:`vmemmap_populate` method
123 that will allocate the physical memory and cre    125 that will allocate the physical memory and create page tables for the
124 virtual memory map. If an architecture does no    126 virtual memory map. If an architecture does not have any special
125 requirements for the vmemmap mappings, it can     127 requirements for the vmemmap mappings, it can use default
126 :c:func:`vmemmap_populate_basepages` provided     128 :c:func:`vmemmap_populate_basepages` provided by the generic memory
127 management.                                       129 management.
128                                                   130 
129 The virtually mapped memory map allows storing    131 The virtually mapped memory map allows storing `struct page` objects
130 for persistent memory devices in pre-allocated    132 for persistent memory devices in pre-allocated storage on those
131 devices. This storage is represented with stru    133 devices. This storage is represented with struct vmem_altmap
132 that is eventually passed to vmemmap_populate(    134 that is eventually passed to vmemmap_populate() through a long chain
133 of function calls. The vmemmap_populate() impl    135 of function calls. The vmemmap_populate() implementation may use the
134 `vmem_altmap` along with :c:func:`vmemmap_allo    136 `vmem_altmap` along with :c:func:`vmemmap_alloc_block_buf` helper to
135 allocate memory map on the persistent memory d    137 allocate memory map on the persistent memory device.
136                                                   138 
137 ZONE_DEVICE                                       139 ZONE_DEVICE
138 ===========                                       140 ===========
139 The `ZONE_DEVICE` facility builds upon `SPARSE    141 The `ZONE_DEVICE` facility builds upon `SPARSEMEM_VMEMMAP` to offer
140 `struct page` `mem_map` services for device dr    142 `struct page` `mem_map` services for device driver identified physical
141 address ranges. The "device" aspect of `ZONE_D    143 address ranges. The "device" aspect of `ZONE_DEVICE` relates to the fact
142 that the page objects for these address ranges    144 that the page objects for these address ranges are never marked online,
143 and that a reference must be taken against the    145 and that a reference must be taken against the device, not just the page
144 to keep the memory pinned for active use. `ZON    146 to keep the memory pinned for active use. `ZONE_DEVICE`, via
145 :c:func:`devm_memremap_pages`, performs just e    147 :c:func:`devm_memremap_pages`, performs just enough memory hotplug to
146 turn on :c:func:`pfn_to_page`, :c:func:`page_t    148 turn on :c:func:`pfn_to_page`, :c:func:`page_to_pfn`, and
147 :c:func:`get_user_pages` service for the given    149 :c:func:`get_user_pages` service for the given range of pfns. Since the
148 page reference count never drops below 1 the p    150 page reference count never drops below 1 the page is never tracked as
149 free memory and the page's `struct list_head l    151 free memory and the page's `struct list_head lru` space is repurposed
150 for back referencing to the host device / driv    152 for back referencing to the host device / driver that mapped the memory.
151                                                   153 
152 While `SPARSEMEM` presents memory as a collect    154 While `SPARSEMEM` presents memory as a collection of sections,
153 optionally collected into memory blocks, `ZONE    155 optionally collected into memory blocks, `ZONE_DEVICE` users have a need
154 for smaller granularity of populating the `mem    156 for smaller granularity of populating the `mem_map`. Given that
155 `ZONE_DEVICE` memory is never marked online it    157 `ZONE_DEVICE` memory is never marked online it is subsequently never
156 subject to its memory ranges being exposed thr    158 subject to its memory ranges being exposed through the sysfs memory
157 hotplug API on memory block boundaries. The im    159 hotplug API on memory block boundaries. The implementation relies on
158 this lack of user-API constraint to allow sub-    160 this lack of user-API constraint to allow sub-section sized memory
159 ranges to be specified to :c:func:`arch_add_me    161 ranges to be specified to :c:func:`arch_add_memory`, the top-half of
160 memory hotplug. Sub-section support allows for    162 memory hotplug. Sub-section support allows for 2MB as the cross-arch
161 common alignment granularity for :c:func:`devm    163 common alignment granularity for :c:func:`devm_memremap_pages`.
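
A driver-side sanity check for the 2MB granularity could be sketched as follows. The `SUBSECTION_SHIFT` and `SUBSECTION_SIZE` names and the helper `range_is_subsection_aligned` are assumptions for illustration and are not an interface described by this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SUBSECTION_SHIFT 21  /* 2MB, the granularity named in the text */
#define SUBSECTION_SIZE  ((uint64_t)1 << SUBSECTION_SHIFT)

/*
 * Return true when the physical range [start, start + size) is non-empty
 * and both its start and its length are multiples of 2MB.
 */
static bool range_is_subsection_aligned(uint64_t start, uint64_t size)
{
	assert(start + size >= start);  /* the range must not wrap around */
	return size != 0 &&
	       (start & (SUBSECTION_SIZE - 1)) == 0 &&
	       (size & (SUBSECTION_SIZE - 1)) == 0;
}
```

For example, a 2MB range starting at 0x40000000 passes the check, while the same range shifted by 1MB does not.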
162                                                   164 
163 The users of `ZONE_DEVICE` are:                   165 The users of `ZONE_DEVICE` are:
164                                                   166 
165 * pmem: Map platform persistent memory to be u    167 * pmem: Map platform persistent memory to be used as a direct-I/O target
166   via DAX mappings.                               168   via DAX mappings.
167                                                   169 
168 * hmm: Extend `ZONE_DEVICE` with `->page_fault    170 * hmm: Extend `ZONE_DEVICE` with `->page_fault()` and `->page_free()`
169   event callbacks to allow a device-driver to     171   event callbacks to allow a device-driver to coordinate memory management
170   events related to device-memory, typically G    172   events related to device-memory, typically GPU memory. See
171   Documentation/mm/hmm.rst.                       173   Documentation/mm/hmm.rst.
172                                                   174 
173 * p2pdma: Create `struct page` objects to allo    175 * p2pdma: Create `struct page` objects to allow peer devices in a
174   PCI/-E topology to coordinate direct-DMA ope    176   PCI/-E topology to coordinate direct-DMA operations between themselves,
175   i.e. bypass host memory.                        177   i.e. bypass host memory.
                                                      
