TOMOYO Linux Cross Reference
Linux/Documentation/admin-guide/mm/concepts.rst

Differences between /Documentation/admin-guide/mm/concepts.rst (Version linux-6.12-rc7) and /Documentation/admin-guide/mm/concepts.rst (Version linux-4.15.18)


=================
Concepts overview
=================

The memory management in Linux is a complex sy
years and included more and more functionality
systems from MMU-less microcontrollers to supe
management for systems without an MMU is calle
definitely deserves a dedicated document, whic
eventually written. Yet, although some of the
here we assume that an MMU is available and a
address to a physical address.

.. contents:: :local:

Virtual Memory Primer
=====================

The physical memory in a computer system is a
even for systems that support memory hotplug t
the amount of memory that can be installed. Th
necessarily contiguous; it might be accessible
address ranges. Besides, different CPU archite
different implementations of the same architec
of how these address ranges are defined.

All this makes dealing directly with physical
to avoid this complexity a concept of virtual

The virtual memory abstracts the details of ph
application software, allows to keep only need
physical memory (demand paging) and provides a
protection and controlled sharing of data betw

With virtual memory, each and every memory acc
address. When the CPU decodes an instruction t
writes) from (or to) the system memory, it tra
address encoded in that instruction to a `phys
memory controller can understand.

The physical system memory is divided into pag
size of each page is architecture specific. So
selection of the page size from several suppor
selection is performed at the kernel build tim
appropriate kernel configuration option.

Each physical memory page can be mapped as one
pages. These mappings are described by page ta
translation from a virtual address used by pro
memory address. The page tables are organized

The tables at the lowest level of the hierarch
addresses of actual pages used by the software
levels contain physical addresses of the pages
levels. The pointer to the top level page tabl
register. When the CPU performs the address tr
register to access the top level page table. T
virtual address are used to index an entry in
table. That entry is then used to access the n
hierarchy with the next bits of the virtual ad
that level page table. The lowest bits in the
the offset inside the actual page.
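
As a rough illustration of the walk described above, the sketch below
splits a virtual address into per-level table indices and a page offset.
The layout is an assumption made for the example (four levels, 9 index
bits per level, 4 KiB pages, similar to x86-64 with 4-level paging);
other architectures and kernel configurations differ, and the helper
name ``split_virtual_address`` is hypothetical.

```python
PAGE_SHIFT = 12      # assumption: 4 KiB pages, so 12 offset bits
BITS_PER_LEVEL = 9   # assumption: 512 entries per page table
LEVELS = 4           # assumption: four-level hierarchy


def split_virtual_address(vaddr):
    """Return ([top-level index, ..., lowest-level index], page offset)."""
    # The lowest bits select the byte inside the page.
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    indices = []
    # Walk from the top level down; each level consumes the next group of bits.
    for level in range(LEVELS - 1, -1, -1):
        shift = PAGE_SHIFT + level * BITS_PER_LEVEL
        indices.append((vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1))
    return indices, offset
```

Each index would select one entry in the table for its level, and that
entry would point to the table used at the next level down.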

Huge Pages
==========

The address translation requires several memor
accesses are slow relatively to CPU speed. To
processor cycles on the address translation, C
such translations called Translation Lookaside
TLB). Usually TLB is pretty scarce resource an
large memory working set will experience perfo
TLB misses.

Many modern CPU architectures allow mapping of
directly by the higher levels in the page tabl
it is possible to map 2M and even 1G pages usi
and the third level page tables. In Linux such
`huge`. Usage of huge pages significantly redu
improves TLB hit-rate and thus improves overal
There are two mechanisms in Linux that enable
memory with the huge pages. The first one is `
hugetlbfs. It is a pseudo filesystem that uses
store. For the files created in this filesyste
the memory and mapped using huge pages. The hu
Documentation/admin-guide/mm/hugetlbpage.rst.

Another, more recent, mechanism that enables u
called `Transparent HugePages`, or THP. Unlike
requires users and/or system administrators to
the system memory should and can be mapped by
manages such mappings transparently to the use
name. See Documentation/admin-guide/mm/transhu
about THP.
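
As a small, hedged example of inspecting THP from user space, the sketch
below reads the sysfs file ``/sys/kernel/mm/transparent_hugepage/enabled``,
where the active choice is shown in square brackets. The helper names
are hypothetical, and the file is absent on kernels built without THP
support, in which case the function returns ``None``.

```python
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) choice, e.g. 'madvise'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the file cannot be read."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None
```

Calling ``current_thp_mode()`` on a running system reports which of the
listed choices is currently in effect.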

Zones
=====

Often hardware poses restrictions on how diffe
ranges can be accessed. In some cases, devices
all the addressable memory. In other cases, th
memory exceeds the maximal addressable size of
special actions are required to access portion
groups memory pages into `zones` according to
usage. For example, ZONE_DMA will contain memo
devices for DMA, ZONE_HIGHMEM will contain mem
permanently mapped into kernel's address space
contain normally addressed pages.

The actual layout of the memory zones is hardw
architectures define all zones, and requiremen
for different platforms.

Nodes
=====

Many multi-processor machines are NUMA - Non-U
systems. In such systems the memory is arrange
different access latency depending on the "dis
processor. Each bank is referred to as a `node
constructs an independent memory management su
own set of zones, lists of free and used pages
counters. You can find more details about NUMA
Documentation/mm/numa.rst` and in
Documentation/admin-guide/mm/numa_memory_polic

Page cache
==========

The physical memory is volatile and the common
into the memory is to read it from files. When
data is put into the `page cache` to avoid exp
the subsequent reads. Similarly, when one writ
is placed in the page cache and eventually get
storage device. The written pages are marked a
decides to reuse them for other purposes, it m
the file contents on the device with the updat
Anonymous Memory
================

The `anonymous memory` or `anonymous mappings`
is not backed by a filesystem. Such mappings a
for program's stack and heap or by explicit ca
call. Usually, the anonymous mappings only def
that the program is allowed to access. The rea
in creation of a page table entry that referen
page filled with zeroes. When the program perf
physical page will be allocated to hold the wr
will be marked dirty and if the kernel decides
the dirty page will be swapped out.
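
The behaviour described above can be seen from user space with a short
sketch that creates a private anonymous mapping through Python's
``mmap`` module (the flags used here are available on Unix-like systems
only). Reads of untouched memory return zeroes, and the first write is
what causes a physical page to be allocated; that allocation happens
inside the kernel and is not directly visible in this example.

```python
import mmap

LENGTH = 4 * mmap.PAGESIZE

# fileno -1 requests memory that is not backed by any file.
with mmap.mmap(-1, LENGTH, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS) as region:
    untouched = region[:16]     # reading untouched memory yields zero bytes
    region[0:5] = b"hello"      # first write to the page
    after_write = region[:5]    # the written data is now visible
```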

Reclaim
=======

Throughout the system lifetime, a physical pag
different types of data. It can be kernel inte
DMA'able buffers for device drivers use, data
memory allocated by user space processes etc.

Depending on the page usage it is treated diff
memory management. The pages that can be freed
because they cache the data available elsewher
hard disk, or because they can be swapped out,
disk, are called `reclaimable`. The most notab
reclaimable pages are page cache and anonymous

In most cases, the pages holding internal kern
buffers cannot be repurposed, and they remain
their user. Such pages are called `unreclaimab
circumstances, even pages occupied with kernel
reclaimed. For instance, in-memory caches of f
be re-read from the storage device and therefo
discard them from the main memory when system
pressure.

The process of freeing the reclaimable physica
repurposing them is called (surprise!) `reclai
pages either asynchronously or synchronously,
of the system. When the system is not loaded,
and allocation requests will be satisfied imme
pages supply. As the load increases, the amoun
down and when it reaches a certain threshold (
allocation request will awaken the ``kswapd``
asynchronously scan memory pages and either ju
they contain is available elsewhere, or evict
device (remember those dirty pages?). As memor
more and reaches another threshold - min water
will trigger `direct reclaim`. In this case al
until enough memory pages are reclaimed to sat
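
The two-threshold behaviour described above can be summarised with a toy
decision function. This is a deliberately simplified model, not the
kernel's allocator logic: the function name, the parameter name
``low_watermark`` for the first threshold, and the exact comparisons are
assumptions made for the sketch.

```python
def reclaim_action(free_pages, low_watermark, min_watermark):
    """Toy model: which path an allocation request would take."""
    if min_watermark > low_watermark:
        raise ValueError("min watermark must not exceed low watermark")
    if free_pages <= min_watermark:
        # The request stalls until enough pages have been reclaimed.
        return "direct reclaim"
    if free_pages <= low_watermark:
        # Reclaim proceeds asynchronously in the background.
        return "wake kswapd"
    return "allocate from free pages"
```

For example, with a low watermark of 50 pages and a min watermark of 20,
a request arriving when 40 pages are free would wake ``kswapd``, while
one arriving with only 10 pages free would enter direct reclaim.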

Compaction
==========

As the system runs, tasks allocate and free th
fragmented. Although with virtual memory it is
scattered physical pages as virtually contiguo
necessary to allocate large physically contigu
need may arise, for instance, when a device dr
buffer for DMA, or when THP allocates a huge p
addresses the fragmentation issue. This mechan
from the lower part of a memory zone to free p
of the zone. When a compaction scan is finishe
together at the beginning of the zone and allo
physically contiguous areas become possible.
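
The effect of a compaction scan can be illustrated with a toy model in
which a zone is a list of page slots. The sketch assumes that every used
page is movable, and it uses two cursors that walk towards each other;
the function names are hypothetical and the real implementation is
considerably more involved.

```python
def compact(zone):
    """zone: list where True marks a used (movable) page, False a free one."""
    zone = list(zone)
    migrate = 0              # scans upward looking for used pages
    free = len(zone) - 1     # scans downward looking for free pages
    while True:
        while migrate < free and not zone[migrate]:
            migrate += 1
        while migrate < free and zone[free]:
            free -= 1
        if migrate >= free:
            break            # the two scanners met, the scan is finished
        # Move the used page from the lower part into the free upper slot.
        zone[free], zone[migrate] = True, False
    return zone


def largest_free_run(zone):
    """Length of the longest run of physically adjacent free pages."""
    best = run = 0
    for used in zone:
        run = 0 if used else run + 1
        best = max(best, run)
    return best
```

Comparing ``largest_free_run`` before and after ``compact`` shows how
scattered free pages become one contiguous area at the beginning of the
zone.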

Like reclaim, the compaction may happen asynch
daemon or synchronously as a result of a memor

OOM killer
==========

It is possible that on a loaded machine memory
kernel will be unable to reclaim enough memory
order to save the rest of the system, it invok

The `OOM killer` selects a task to sacrifice f
system health. The selected task is killed in
enough memory will be freed to continue normal
                                                      
