Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com>

=============
What is NUMA?
=============

This question can be answered from a couple of perspectives: the
hardware view and the Linux software view.

From the hardware perspective, a NUMA system is a computer platform that
comprises multiple components or assemblies, each of which may contain 0
or more CPUs, local memory, and/or IO buses.  For brevity and to
disambiguate the hardware view of these physical components/assemblies
from the software abstraction thereof, we'll call the components/assemblies
'cells' in this document.

Each of the 'cells' may be viewed as an SMP [symmetric multi-processor] subset
of the system--although some components necessary for a stand-alone SMP system
may not be populated on any given cell.  The cells of the NUMA system are
connected together with some sort of system interconnect--e.g., crossbars and
point-to-point links are common types of NUMA system interconnects.  Both of
these types of interconnects can be aggregated to create NUMA platforms with
cells at multiple distances from other cells.

For Linux, the NUMA platforms of interest are primarily what is known as Cache
Coherent NUMA or ccNUMA systems.  With ccNUMA systems, all memory is visible
to and accessible from any CPU attached to any cell, and cache coherency
is handled in hardware by the processor caches and/or the system interconnect.

Memory access time and effective memory bandwidth vary depending on how far
away the cell containing the CPU or IO bus making the memory access is from the
cell containing the target memory.  For example, access to memory by CPUs
attached to the same cell will experience faster access times and higher
bandwidths than accesses to memory on other, remote cells.  NUMA platforms
can have cells at multiple remote distances from any given cell.

Platform vendors don't build NUMA systems just to make software developers'
lives interesting.  Rather, this architecture is a means to provide scalable
memory bandwidth.  However, to achieve scalable memory bandwidth, system and
application software must arrange for a large majority of the memory references
[cache misses] to be to "local" memory--memory on the same cell, if any--or
to the closest cell with memory.

This leads to the Linux software view of a NUMA system:

Linux divides the system's hardware resources into multiple software
abstractions called "nodes".  Linux maps the nodes onto the physical cells
of the hardware platform, abstracting away some of the details for some
architectures.  As with physical cells, software nodes may contain 0 or more
CPUs, memory and/or IO buses.  And, again, memory accesses to memory on
"closer" nodes--nodes that map to closer cells--will generally experience
faster access times and higher effective bandwidths than accesses to more
remote cells.

For some architectures, such as x86, Linux will "hide" any node representing a
physical cell that has no memory attached, and reassign any CPUs attached to
that cell to a node representing a cell that does have memory.  Thus, on
these architectures, one cannot assume that all CPUs that Linux associates with
a given node will see the same local memory access times and bandwidth.

In addition, for some architectures--again, x86 is an example--Linux supports
the emulation of additional nodes.  For NUMA emulation, Linux will carve up
the existing nodes--or the system memory for non-NUMA platforms--into multiple
nodes.  Each emulated node will manage a fraction of the underlying cells'
physical memory.  NUMA emulation is useful for testing NUMA kernel and
application features on non-NUMA platforms, and as a sort of memory resource
management mechanism when used together with cpusets.
[see Documentation/admin-guide/cgroup-v1/cpusets.rst]
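
As an illustrative (not normative) example of how emulated nodes are typically
requested on x86, the numa=fake= kernel command line parameter splits the
machine into a given number of emulated nodes; the exact syntax accepted
varies between kernel versions::

        numa=fake=4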

For each node with memory, Linux constructs an independent memory management
subsystem, complete with its own free page lists, in-use page lists, usage
statistics and locks to mediate access.  In addition, Linux constructs for
each memory zone [one or more of DMA, DMA32, NORMAL, HIGH_MEMORY, MOVABLE]
an ordered "zonelist".  A zonelist specifies the zones/nodes to visit when a
selected zone/node cannot satisfy the allocation request.  This situation,
when a zone has no available memory to satisfy a request, is called
"overflow" or "fallback".

Because some nodes contain multiple zones containing different types of
memory, Linux must decide whether to order the zonelists such that allocations
fall back to the same zone type on a different node, or to a different zone
type on the same node.  This is an important consideration because some zones,
such as DMA or DMA32, represent relatively scarce resources.  Linux chooses
a default node-ordered zonelist.  This means it tries to fall back to other
zones from the same node before using remote nodes, which are ordered by NUMA
distance.
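
As a concrete illustration (the zones actually present depend on the
architecture and memory layout, so this layout is hypothetical), consider a
two-node machine on which node 0 has NORMAL and DMA32 zones while node 1 has
only a NORMAL zone.  With node ordering, a normal kernel allocation that
originates on node 0 consults the zones roughly in this order::

        node 0 ZONE_NORMAL  ->  node 0 ZONE_DMA32  ->  node 1 ZONE_NORMAL

That is, all usable zones of the local node are tried before any zone of a
remote node.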

By default, Linux will attempt to satisfy memory allocation requests from the
node to which the CPU that executes the request is assigned.  Specifically,
Linux will attempt to allocate from the first node in the appropriate zonelist
for the node where the request originates.  This is called "local allocation."
If the "local" node cannot satisfy the request, the kernel will examine other
nodes' zones in the selected zonelist looking for the first zone in the list
that can satisfy the request.
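
A minimal kernel-code sketch of the difference between the default local
allocation and an explicit node preference (illustrative only; nid is a
hypothetical node id, but alloc_pages() and alloc_pages_node() are real
kernel interfaces)::

        #include <linux/gfp.h>

        /* Default "local allocation": the preferred node is the node of the
         * CPU executing this code; that node's zonelist governs fallback. */
        struct page *local = alloc_pages(GFP_KERNEL, 0);

        /* Prefer an explicit node instead; fallback through that node's
         * zonelist is still permitted if it cannot satisfy the request. */
        int nid = 1;    /* hypothetical target node */
        struct page *remote = alloc_pages_node(nid, GFP_KERNEL, 0);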

Local allocation will tend to keep subsequent access to the allocated memory
"local" to the underlying physical resources and off the system interconnect--
as long as the task on whose behalf the kernel allocated some memory does not
later migrate away from that memory.  The Linux scheduler is aware of the
NUMA topology of the platform--embodied in the "scheduling domains" data
structures [see Documentation/scheduler/sched-domains.rst]--and the scheduler
attempts to minimize task migration to distant scheduling domains.  However,
the scheduler does not take a task's NUMA footprint into account directly.
Thus, under sufficient imbalance, tasks can migrate between nodes, remote
from their initial node and kernel data structures.

System administrators and application designers can restrict a task's migration
to improve NUMA locality using various CPU affinity command line interfaces,
such as taskset(1) and numactl(1), and program interfaces such as
sched_setaffinity(2).  Further, one can modify the kernel's default local
allocation behavior using Linux NUMA memory policy.
[see Documentation/admin-guide/mm/numa_memory_policy.rst]
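
A hedged userspace sketch of both mechanisms (the CPU number, node number and
error handling are hypothetical; sched_setaffinity(2) and set_mempolicy(2) are
the real interfaces, the latter declared in the libnuma numaif.h header, so
link with -lnuma)::

        #define _GNU_SOURCE
        #include <sched.h>      /* sched_setaffinity(), CPU_SET() */
        #include <numaif.h>     /* set_mempolicy(), MPOL_BIND */

        int main(void)
        {
                cpu_set_t cpus;
                unsigned long nodemask = 1UL << 0;      /* node 0 only */

                /* Pin this task to CPU 3, assumed to belong to node 0. */
                CPU_ZERO(&cpus);
                CPU_SET(3, &cpus);
                if (sched_setaffinity(0, sizeof(cpus), &cpus))
                        return 1;

                /* Restrict this task's future page allocations to node 0. */
                if (set_mempolicy(MPOL_BIND, &nodemask, 8 * sizeof(nodemask)))
                        return 1;

                /* ... memory faulted in from here on comes from node 0 ... */
                return 0;
        }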

System administrators can restrict the CPUs and nodes' memories that a
non-privileged user can specify in the scheduling or NUMA commands and
functions using control groups and CPUsets.
[see Documentation/admin-guide/cgroup-v1/cpusets.rst]

On architectures that do not hide memoryless nodes, Linux will include only
zones [nodes] with memory in the zonelists.  This means that for a memoryless
node the "local memory node"--the node of the first zone in the CPU's node's
zonelist--will not be the node itself.  Rather, it will be the node that the
kernel selected as the nearest node with memory when it built the zonelists.
So, by default, local allocations will succeed with the kernel supplying the
closest available memory.  This is a consequence of the same mechanism that
allows such allocations to fall back to other nearby nodes when a node that
does contain memory overflows.

Some kernel allocations do not want or cannot tolerate this allocation fallback
behavior.  Rather, they want to be sure they get memory from the specified node
or get notified that the node has no free memory.  This is usually the case when
a subsystem allocates per-CPU memory resources, for example.

A typical model for making such an allocation is to obtain the node id of the
node to which the "current CPU" is attached using one of the kernel's
numa_node_id() or cpu_to_node() functions and then request memory from only
the node id returned.  When such an allocation fails, the requesting subsystem
may revert to its own fallback path.  The slab kernel memory allocator is an
example of this.  Or, the subsystem may choose to disable or not to enable
itself on allocation failure.  The kernel profiling subsystem is an example of
this.
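
A minimal sketch of that model (illustrative only; the helper name and the
missing error handling are hypothetical, but numa_node_id(), kmalloc_node()
and __GFP_THISNODE are real kernel interfaces)::

        #include <linux/gfp.h>
        #include <linux/slab.h>
        #include <linux/topology.h>

        static void *alloc_strictly_on_local_node(size_t size)
        {
                int nid = numa_node_id();       /* node of the current CPU */

                /* __GFP_THISNODE forbids falling back through the zonelist
                 * to other nodes; on failure the caller sees NULL rather
                 * than silently receiving remote memory. */
                return kmalloc_node(size, GFP_KERNEL | __GFP_THISNODE, nid);
        }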

If the architecture supports--does not hide--memoryless nodes, then CPUs
attached to memoryless nodes would always incur the fallback path overhead
or some subsystems would fail to initialize if they attempted to allocate
memory exclusively from a node without memory.  To support such
architectures transparently, kernel subsystems can use the numa_mem_id()
or cpu_to_mem() function to locate the "local memory node" for the calling or
specified CPU.  Again, this is the same node from which default, "local" page
allocations will be attempted.
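
Adjusting the sketch above for such architectures is a one-line change
(again illustrative; numa_mem_id() and cpu_to_mem() are the real helpers)::

        /* Nearest node that actually has memory, rather than the CPU's own
         * (possibly memoryless) node; use cpu_to_mem(cpu) for another CPU. */
        int nid = numa_mem_id();

        void *buf = kmalloc_node(size, GFP_KERNEL | __GFP_THISNODE, nid);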
                                                      
