Error Detection And Correction (EDAC) Devices
=============================================

Main Concepts used by the EDAC subsystem
----------------------------------------

There are several things to be aware of that aren't at all obvious, like
*sockets*, *socket sets*, *banks*, *rows*, *chip-select rows*, *channels*,
etc...

These are some of the many terms that are thrown about that don't always
mean what people think they mean (Inconceivable!).  In the interest of
creating a common ground for discussion, terms and their definitions
will be established.

* Memory devices

The individual DRAM chips on a memory stick.  These devices commonly
output 4 or 8 bits each (x4, x8). Grouping several of these in parallel
provides the number of bits that the memory controller expects:
typically 72 bits, in order to provide 64 bits of data plus 8 bits of ECC.
For instance, a DIMM built from x4 devices needs 18 chips to provide
those 72 bits (16 devices for data plus 2 for ECC).

* Memory Stick

A printed circuit board that aggregates multiple memory devices in
parallel.  In general, this is the Field Replaceable Unit (FRU) that
gets replaced in the case of excessive errors. It is most often called
a DIMM (Dual Inline Memory Module).

* Memory Socket

A physical connector on the motherboard that accepts a single memory
stick. Also called a "slot" in several datasheets.

* Channel

A memory controller channel, responsible for communicating with a group
of DIMMs. Each channel has its own independent control (command) and
data bus, and can be used independently or grouped with other channels.

* Branch

It is typically the highest level of the hierarchy on a Fully-Buffered
DIMM memory controller. Typically, it contains two channels. Two
channels on the same branch can be used in single mode or in lockstep
mode. When lockstep is enabled, the cacheline is doubled, but it
generally brings some performance penalty. Also, it is generally not
possible to point to just one memory stick when an error occurs, as the
error correction code is calculated using two DIMMs instead of one. Due
to that, it is capable of correcting more errors than in single mode.

* Single-channel

The data accessed by the memory controller is contained in one DIMM
only. E.g. if the data is 64 bits wide, the data flows to the CPU using
one 64-bit parallel access. Typically used with SDR, DDR, DDR2 and DDR3
memories. FB-DIMM and RAMBUS use a different concept for channel, so
this concept doesn't apply there.

* Double-channel

The data accessed by the memory controller is interleaved across two
DIMMs, accessed at the same time. E.g. if the DIMM is 64 bits wide (72
bits with ECC), the data flows to the CPU using a 128-bit parallel
access.

* Chip-select row

This is the name of the DRAM signal used to select the DRAM ranks to be
accessed. Common chip-select rows for single channel are 64 bits, for
dual channel 128 bits. It may not be visible to the memory controller,
as some DIMM types have a memory buffer that can hide direct access to
it from the memory controller.

* Single-Ranked stick

A single-ranked stick has 1 chip-select row of memory. Motherboards
commonly drive two chip-select pins to a memory stick. A single-ranked
stick will occupy only one of those rows; the other will be unused.

.. _doubleranked:

* Double-Ranked stick

A double-ranked stick has two chip-select rows which access different
sets of memory devices.  The two rows cannot be accessed concurrently.

* Double-sided stick

**DEPRECATED TERM**, see :ref:`Double-Ranked stick <doubleranked>`.

A double-sided stick has two chip-select rows which access different sets
of memory devices. The two rows cannot be accessed concurrently.
"Double-sided" is irrespective of the memory devices being mounted on
both sides of the memory stick.

* Socket set

All of the memory sticks that are required for a single memory access,
or all of the memory sticks spanned by a chip-select row.  A single
socket set has two chip-select rows, and if double-sided sticks are used
these will occupy those chip-select rows.

* Bank

This term is avoided because it is ambiguous when one needs to
distinguish between chip-select rows and socket sets.

* High Bandwidth Memory (HBM)

HBM is a new memory type with low power consumption and ultra-wide
communication lanes. It uses vertically stacked memory chips (DRAM dies)
interconnected by microscopic wires called "through-silicon vias," or
TSVs.

Several stacks of HBM chips connect to the CPU or GPU through an ultra-fast
interconnect called the "interposer". Therefore, HBM's characteristics
are nearly indistinguishable from on-chip integrated RAM.

Memory Controllers
------------------

Most of the EDAC core is focused on Memory Controller error detection.
Memory controllers are allocated with :c:func:`edac_mc_alloc`, which
internally uses the struct ``mem_ctl_info`` to describe a memory
controller. This struct is opaque to the EDAC drivers; only the EDAC
core is allowed to touch it.

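As a rough sketch of how this API is typically used (the layer sizes and
all ``example_*`` names below are placeholders, not taken from any
specific driver), a driver probing a memory controller describes its
layout with ``struct edac_mc_layer``, allocates the ``mem_ctl_info`` and
registers it with the EDAC core::

        #include <linux/edac.h>         /* struct edac_mc_layer, layer types */
        #include "edac_module.h"        /* edac_mc_alloc() and friends       */

        /*
         * Minimal sketch of memory controller registration.  The layer
         * counts and the "example_*" names are illustrative only; a real
         * driver derives them from the hardware it probes.
         */
        static int example_mc_probe(struct device *dev)
        {
                struct edac_mc_layer layers[2];
                struct mem_ctl_info *mci;

                layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
                layers[0].size = 4;                     /* e.g. four csrows */
                layers[0].is_virt_csrow = true;
                layers[1].type = EDAC_MC_LAYER_CHANNEL;
                layers[1].size = 2;                     /* e.g. two channels */
                layers[1].is_virt_csrow = false;

                mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, 0);
                if (!mci)
                        return -ENOMEM;

                mci->pdev = dev;
                mci->mod_name = "example_edac";
                mci->ctl_name = "example_mc";

                /* Hook the controller into the EDAC core and sysfs. */
                if (edac_mc_add_mc(mci)) {
                        edac_mc_free(mci);
                        return -ENODEV;
                }

                return 0;
        }

Errors found by the driver would then be reported against this controller
with :c:func:`edac_mc_handle_error`.
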
.. kernel-doc:: include/linux/edac.h

.. kernel-doc:: drivers/edac/edac_mc.h

PCI Controllers
---------------

The EDAC subsystem provides a mechanism to handle PCI controllers by
calling :c:func:`edac_pci_alloc_ctl_info`. It uses the struct
:c:type:`edac_pci_ctl_info` to describe the PCI controllers.

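As a rough sketch, many drivers simply attach the generic PCI error
controls to one of the PCI devices they already own; the ``pdev`` pointer
and the ``example_edac`` string below are placeholders::

        /*
         * Sketch only: register the generic EDAC PCI error controls for
         * a device owned by the driver.  "pdev" and the module string
         * are placeholders.
         */
        static struct edac_pci_ctl_info *pci_ctl;

        pci_ctl = edac_pci_create_generic_ctl(&pdev->dev, "example_edac");
        if (!pci_ctl)
                pr_warn("example_edac: PCI error controls not registered\n");

        /* ... and on driver removal: */
        if (pci_ctl)
                edac_pci_release_generic_ctl(pci_ctl);
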
.. kernel-doc:: drivers/edac/edac_pci.h

EDAC Blocks
-----------

The EDAC subsystem also provides a generic mechanism to report errors on
other parts of the hardware via the :c:func:`edac_device_alloc_ctl_info`
function.

The structures :c:type:`edac_dev_sysfs_block_attribute`,
:c:type:`edac_device_block`, :c:type:`edac_device_instance` and
:c:type:`edac_device_ctl_info` provide a generic or abstract 'edac_device'
representation in sysfs.

This set of structures, and the code that implements their APIs, provides
for registering EDAC-type devices which are NOT standard memory or PCI,
such as:

- CPU caches (L1 and L2)
- DMA engines
- Core CPU switches
- Fabric switch units
- PCIe interface controllers
- other EDAC/ECC type devices that can be monitored for
  errors, etc.

It allows for a two-level hierarchy (instances and blocks).

For example, a cache could be composed of L1, L2 and L3 levels of cache.
Each CPU core would have its own L1 cache, while sharing L2 and maybe L3
caches. In such a case, those can be represented via the following sysfs
nodes::

        /sys/devices/system/edac/..

        pci/            <existing pci directory (if available)>
        mc/             <existing memory device directory>
        cpu/cpu0/..     <L1 and L2 block directory>
                /L1-cache/ce_count
                         /ue_count
                /L2-cache/ce_count
                         /ue_count
        cpu/cpu1/..     <L1 and L2 block directory>
                /L1-cache/ce_count
                         /ue_count
                /L2-cache/ce_count
                         /ue_count
        ...

        the L1 and L2 directories would be "edac_device_block's"

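As a rough sketch (all ``example_*`` names below are placeholders, and
the exact parameter list of :c:func:`edac_device_alloc_ctl_info` should
be checked against the kernel version in use), a single-instance,
single-block device such as an L2 cache ECC unit could be registered
like this::

        /*
         * Sketch only: one instance with one block, e.g. an L2 cache
         * ECC unit.  The "example_*" strings are placeholders.
         */
        static int example_l2_probe(struct device *dev)
        {
                struct edac_device_ctl_info *dci;

                dci = edac_device_alloc_ctl_info(0, "l2cache", 1, "l2", 1, 0,
                                                 NULL, 0,
                                                 edac_device_alloc_index());
                if (!dci)
                        return -ENOMEM;

                dci->dev = dev;
                dci->mod_name = "example_edac";
                dci->ctl_name = "example_l2";
                dci->dev_name = dev_name(dev);

                if (edac_device_add_device(dci)) {
                        edac_device_free_ctl_info(dci);
                        return -ENODEV;
                }

                return 0;
        }

Corrected and uncorrected errors detected by the hardware would then be
counted against the corresponding block via
:c:func:`edac_device_handle_ce` and :c:func:`edac_device_handle_ue`.
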
.. kernel-doc:: drivers/edac/edac_device.h


Heterogeneous system support
----------------------------

An AMD heterogeneous system is built by connecting the data fabrics of
both CPUs and GPUs via custom xGMI links. Thus, the data fabric on the
GPU nodes can be accessed the same way as the data fabric on CPU nodes.

The MI200 accelerators are data center GPUs. They have 2 data fabrics,
and each GPU data fabric contains four Unified Memory Controllers (UMC).
Each UMC contains eight channels. Each UMC channel controls one 128-bit
HBM2e (2GB) channel (equivalent to 8 X 2GB ranks).  This creates a total
of 4096 bits of DRAM data bus.

Whereas each UMC interfaces with a 16GB HBM stack (8-high X 2GB DRAM
dies), each UMC channel interfaces with 2GB of DRAM (represented as a
rank).

Memory controllers on AMD GPU nodes can be represented in EDAC as
follows::

        GPU DF / GPU Node -> EDAC MC
        GPU UMC           -> EDAC CSROW
        GPU UMC channel   -> EDAC CHANNEL

For example, consider a heterogeneous system with 1 AMD CPU connected to
4 MI200 (Aldebaran) GPUs using xGMI.

Some more heterogeneous hardware details:

- The CPU UMC (Unified Memory Controller) is mostly the same as the GPU UMC.
  They have chip selects (csrows) and channels. However, the layouts are
  different for performance, physical layout, or other reasons.
- CPU UMCs use 1 channel, so in this case UMC = EDAC channel. This follows
  the marketing convention that a CPU has X memory channels, etc.
- CPU UMCs use up to 4 chip selects, so UMC chip select = EDAC CSROW.
- GPU UMCs use 1 chip select, so UMC = EDAC CSROW.
- GPU UMCs use 8 channels, so UMC channel = EDAC channel.

The EDAC subsystem provides a mechanism to handle AMD heterogeneous
systems by calling system-specific ops for both CPUs and GPUs.

AMD GPU nodes are enumerated in sequential order based on the PCI
hierarchy, and the first GPU node is assumed to have a Node ID value
following those of the CPU nodes after the latter are fully populated::

        $ ls /sys/devices/system/edac/mc/
                mc0   - CPU MC node 0
                mc1  |
                mc2  |- GPU card[0] => node 0(mc1), node 1(mc2)
                mc3  |
                mc4  |- GPU card[1] => node 0(mc3), node 1(mc4)
                mc5  |
                mc6  |- GPU card[2] => node 0(mc5), node 1(mc6)
                mc7  |
                mc8  |- GPU card[3] => node 0(mc7), node 1(mc8)

For example, a heterogeneous system with one AMD CPU is connected to
four MI200 (Aldebaran) GPUs using xGMI. This topology can be represented
via the following sysfs entries::

        /sys/devices/system/edac/mc/..

        CPU                     # CPU node
        ├── mc 0

        GPU Nodes are enumerated sequentially after CPU nodes have been populated
        GPU card 1              # Each MI200 GPU has 2 nodes/mcs
        ├── mc 1          # GPU node 0 == mc1, Each MC node has 4 UMCs/CSROWs
        │   ├── csrow 0               # UMC 0
        │   │   ├── channel 0     # Each UMC has 8 channels
        │   │   ├── channel 1   # size of each channel is 2 GB, so each UMC has 16 GB
        │   │   ├── channel 2
        │   │   ├── channel 3
        │   │   ├── channel 4
        │   │   ├── channel 5
        │   │   ├── channel 6
        │   │   ├── channel 7
        │   ├── csrow 1               # UMC 1
        │   │   ├── channel 0
        │   │   ├── ..
        │   │   ├── channel 7
        │   ├── ..            ..
        │   ├── csrow 3               # UMC 3
        │   │   ├── channel 0
        │   │   ├── ..
        │   │   ├── channel 7
        │   ├── rank 0
        │   ├── ..            ..
        │   ├── rank 31               # total 32 ranks/dimms from 4 UMCs
        │
        ├── mc 2          # GPU node 1 == mc2
        │   ├── ..            # each GPU has total 64 GB

        GPU card 2
        ├── mc 3
        │   ├── ..
        ├── mc 4
        │   ├── ..

        GPU card 3
        ├── mc 5
        │   ├── ..
        ├── mc 6
        │   ├── ..

        GPU card 4
        ├── mc 7
        │   ├── ..
        ├── mc 8
        │   ├── ..
