====================================
Coherent Accelerator Interface (CXL)
====================================

Introduction
============

The coherent accelerator interface is desi
coherent connection of accelerators (FPGAs
POWER system. These devices need to adhere
Accelerator Interface Architecture (CAIA).

IBM refers to this as the Coherent Acceler
or CAPI. In the kernel it's referred to by
confusion with the ISDN CAPI subsystem.

Coherent in this context means that the ac
both access system memory directly and wit
addresses.


Hardware overview
=================

::

    POWER8/9             FPGA
       +----------+        +---------+
       |          |        |         |
       |   CPU    |        |   AFU   |
       |          |        |         |
       |          |        |         |
       |          |        |         |
       +----------+        +---------+
       |   PHB    |        |         |
       |   +------+        |   PSL   |
       |   | CAPP |<------>|         |
       +---+------+  PCIE  +---------+

The POWER8/9 chip has a Coherently Attache
unit which is part of the PCIe Host Bridge
by Linux by calls into OPAL. Linux doesn't
CAPP.

The FPGA (or coherently attached device) c
The POWER Service Layer (PSL) and the Acce
(AFU). The AFU is used to implement specif
the PSL. The PSL, among other things, prov
translation services to allow each AFU dir
memory.

The AFU is the core part of the accelerato
crypto etc function). The kernel has no kn
of the AFU. Only userspace interacts direc

The PSL provides the translation and inter
AFU needs. This is what the kernel interac
the AFU needs to read a particular effecti
that address to the PSL, the PSL then tran
data from memory and returns it to the AFU
translation miss, it interrupts the kernel
the fault. The context to which this fault
who owns that acceleration function.

- POWER8 and PSL Version 8 are compliant t
- POWER9 and PSL Version 9 are compliant t

This PSL Version 9 provides new features s

* Interaction with the nest MMU on the P9
* Native DMA support.
* Supports sending ASB_Notify messages for
* Supports Atomic operations.
* etc.

Cards with a PSL9 won't work on a POWER8 s
PSL8 won't work on a POWER9 system.

AFU Modes
=========

There are two programming modes supported
and AFU directed. AFU may support one or b

When using dedicated mode only one MMU con
this mode, only one userspace process can
time.

When using AFU directed mode, up to 16K si
be supported. This means up to 16K simulta
applications may use the accelerator (alth
support fewer). In this mode, the AFU send
with each of its requests. This tells the
associated with each operation. If the PSL
operation, the ID can also be accessed by
determine the userspace context associated


MMIO space
==========

A portion of the accelerator MMIO space ca
from the AFU to userspace. Either the whol
just a per context portion. The hardware i
the kernel can determine the offset and si
portion.


Interrupts
==========

AFUs may generate interrupts that are dest
are received by the kernel as hardware int
userspace by a read syscall documented bel

Data storage faults and error interrupts a
driver.


Work Element Descriptor (WED)
=============================

The WED is a 64-bit parameter passed to th
started. Its format is up to the AFU hence
knowledge of what it represents. Typically
effective address of a work queue or statu
and userspace can share control and status


User API
========

1. AFU character devices
^^^^^^^^^^^^^^^^^^^^^^^^

For AFUs operating in AFU directed mode, t
files will be created. /dev/cxl/afu0.0m wi
master context and /dev/cxl/afu0.0s will c
context. Master contexts have access to th
AFU provides. Slave contexts have access t
MMIO space an AFU provides.
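
As an illustration of the device naming used above (``afu0.0m`` for the
master context, ``afu0.0s`` for the slave context), the following is a
minimal sketch of a helper that builds such a path. The function name
``afu_dev_path``, the enum ``afu_dev_kind`` and the parameter names ``x``
and ``y`` are assumptions made for this example only; they are not part of
the kernel API or of libcxl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: suffix letter of the character device. */
enum afu_dev_kind {
    AFU_DEV_MASTER = 'm',   /* e.g. /dev/cxl/afu0.0m */
    AFU_DEV_SLAVE  = 's'    /* e.g. /dev/cxl/afu0.0s */
};

/*
 * Build "/dev/cxl/afu<x>.<y><suffix>" into buf.
 * x and y are the two numbers of the afuX.Y device name.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int afu_dev_path(char *buf, size_t len, unsigned int x, unsigned int y,
                 enum afu_dev_kind kind)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, "/dev/cxl/afu%u.%u%c", x, y, (char)kind);
    if (n < 0 || (size_t)n >= len)
        return -1;      /* output was truncated or failed */
    return 0;
}
```

The resulting string would then be passed to open(2); error handling for
the open itself is left out of this sketch.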

For AFUs operating in dedicated process mo
only create a single character device per
/dev/cxl/afu0.0d. This will have access to
that the AFU provides (like master context

The types described below are defined in i

The following file operations are supporte
master devices.

A userspace library libcxl is available he

    https://github.com/ibm-capi/libcxl

This provides a C interface to this kernel

open
----

Opens the device and allocates a file desc
the rest of the API.

A dedicated mode AFU only has one context
device to be opened once.

An AFU directed mode AFU can have many con
opened once for each context that is avail

When all available contexts are allocated
and return -ENOSPC.

Note:
    IRQs need to be allocated for each c
    the number of contexts that can be c
    how many times the device can be ope
    supports 2040 IRQs and 3 are used by
    left. If 1 IRQ is needed per context
    contexts can be allocated. If 4 IRQs
    then only 2037/4 = 509 contexts can


ioctl
-----

CXL_IOCTL_START_WORK:
    Starts the AFU context and associates
    process. Once this ioctl is successful
    mapped into this process is accessible
    using the same effective addresses. No
    required to map/unmap memory. The AFU
    updated as userspace allocates and fre
    returns once the AFU context is starte

    Takes a pointer to a struct cxl_ioctl_

    ::

        struct cxl_ioctl_start_work {
                __u64 flags;
                __u64 work_element_des
                __u64 amr;
                __s16 num_interrupts;
                __s16 reserved1;
                __s32 reserved2;
                __u64 reserved3;
                __u64 reserved4;
                __u64 reserved5;
                __u64 reserved6;
        };

    flags:
        Indicates which optional field
        valid.

    work_element_descriptor:
        The Work Element Descriptor (W
        defined by the AFU.
        Typically
        address pointing to an AFU spe
        describing what work to perfor

    amr:
        Authority Mask Register (AMR),
        AMR. This field is only used b
        corresponding CXL_START_WORK_A
        flags. If not specified the ke
        value of 0.

    num_interrupts:
        Number of userspace interrupts
        is only used by the kernel whe
        CXL_START_WORK_NUM_IRQS value
        If not specified the minimum n
        AFU will be allocated. The min
        obtained from sysfs.

    reserved fields:
        For ABI padding and future ext

CXL_IOCTL_GET_PROCESS_ELEMENT:
    Get the current context id, also known
    The value is returned from the kernel


mmap
----

An AFU may have an MMIO space to facilitat
AFU. If it does, the MMIO space can be acc
and contents of this area are specific to
size can be discovered via sysfs.

In AFU directed mode, master contexts are
the MMIO space and slave contexts are allo
process MMIO space associated with the con
process mode the entire MMIO space can alw

This mmap call must be done after the STAR

Care should be taken when accessing MMIO s
accesses are supported by POWER8. Also, th
with a specific endianness, so all MMIO ac
endianness (recommend endian(3) variants l
be64toh() etc). These endian issues equall
queues the WED may describe.


read
----

Reads events from the AFU. Blocks if no ev
(unless O_NONBLOCK is supplied).
Returns -
unrecoverable error or if the card is remo

read() will always return an integral numb

The buffer passed to read() must be at lea

The result of the read will be a buffer of
each event is of type struct cxl_event, of

::

    struct cxl_event {
            struct cxl_event_header he
            union {
                    struct cxl_event_a
                    struct cxl_event_d
                    struct cxl_event_a
            };
    };

The struct cxl_event_header is defined as

::

    struct cxl_event_header {
            __u16 type;
            __u16 size;
            __u16 process_element;
            __u16 reserved1;
    };

type:
    This defines the type of event. Th
    the rest of the event is structure
    described below and defined by enu

size:
    This is the size of the event in b
    struct cxl_event_header. The start
    be found at this offset from the s
    event.

process_element:
    Context ID of the event.

reserved field:
    For future extensions and padding.

If the event type is CXL_EVENT_AFU_INTERRU
structure is defined as

::

    struct cxl_event_afu_interrupt {
            __u16 flags;
            __u16 irq; /* Raised AFU i
            __u32 reserved1;
    };

flags:
    These flags indicate which optiona
    in this struct. Currently all fiel

irq:
    The IRQ number sent by the AFU.

reserved field:
    For future extensions and padding.

If the event type is CXL_EVENT_DATA_STORAG
structure is defined as

::

    struct cxl_event_data_storage {
            __u16 flags;
            __u16 reserved1;
            __u32 reserved2;
            __u64 addr;
            __u64 dsisr;
            __u64 reserved3;
    };

flags:
    These flags indicate which optiona
    this struct. Currently all fields

address:
    The address that the AFU unsuccess
    access. Valid accesses will be han
    kernel but invalid accesses will g

dsisr:
    This field gives information on th
    copy of the DSISR from the PSL har
    fault occurred.
    The form of the DS
    CAIA.

reserved fields:
    For future extensions

If the event type is CXL_EVENT_AFU_ERROR t
is defined as

::

    struct cxl_event_afu_error {
            __u16 flags;
            __u16 reserved1;
            __u32 reserved2;
            __u64 error;
    };

flags:
    These flags indicate which optiona
    this struct. Currently all fields

error:
    Error status from the AFU. Defined

reserved fields:
    For future extensions and padding


2. Card character device (powerVM guest only)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a powerVM guest, an extra character dev
card. The device is only used to write (fl
FPGA accelerator. Once the image is writte
device tree is updated and the card is res
image.

open
----

Opens the device and allocates a file desc
the rest of the API. The device can only b

ioctl
-----

CXL_IOCTL_DOWNLOAD_IMAGE / CXL_IOCTL_VALIDATE_
    Starts and controls flashing a new FPGA im
    reconfiguration is not supported (yet), so
    a copy of the PSL and AFU(s). Since an ima
    the caller may have to iterate, splitting
    chunks.

    Takes a pointer to a struct cxl_adapter_im

    ::

        struct cxl_adapter_image {
                __u64 flags;
                __u64 data;
                __u64 len_data;
                __u64 len_image;
                __u64 reserved1;
                __u64 reserved2;
                __u64 reserved3;
                __u64 reserved4;
        };

    flags:
        These flags indicate which optional fi
        this struct. Currently all fields are

    data:
        Pointer to a buffer with part of the i
        card.

    len_data:
        Size of the buffer pointed to by data.

    len_image:
        Full size of the image.
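
The chunked iteration mentioned above can be sketched as follows. This is
only an illustration under stated assumptions: ``struct image_chunk_desc``
is a local mirror of the ``struct cxl_adapter_image`` layout shown above (a
real program would use the kernel's uapi header instead), and
``send_image_in_chunks``, ``submit_chunk_fn``, ``chunk_stats`` and
``count_chunk`` are hypothetical names. The callback stands in for the
actual ioctl call so that the splitting logic can be exercised without a
card; how the kernel expects the chunks to be sequenced is not covered
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the struct cxl_adapter_image layout (assumption). */
struct image_chunk_desc {
    uint64_t flags;
    uint64_t data;       /* pointer to this chunk's buffer */
    uint64_t len_data;   /* size of this chunk in bytes */
    uint64_t len_image;  /* full size of the image in bytes */
    uint64_t reserved1;
    uint64_t reserved2;
    uint64_t reserved3;
    uint64_t reserved4;
};

/* Hypothetical callback; a real one would wrap the ioctl on the card fd. */
typedef int (*submit_chunk_fn)(const struct image_chunk_desc *desc, void *ctx);

/*
 * Split an image into pieces of at most chunk_size bytes and hand each
 * piece to submit(). Returns 0 on success, -1 on bad arguments, or the
 * first non-zero value returned by submit().
 */
int send_image_in_chunks(const void *image, size_t len_image,
                         size_t chunk_size, submit_chunk_fn submit, void *ctx)
{
    const uint8_t *p = image;
    size_t off = 0;

    if (!image || !submit || chunk_size == 0)
        return -1;

    while (off < len_image) {
        struct image_chunk_desc desc;
        size_t n = len_image - off;
        int rc;

        if (n > chunk_size)
            n = chunk_size;

        memset(&desc, 0, sizeof(desc));   /* zero flags and reserved fields */
        desc.data = (uint64_t)(uintptr_t)(p + off);
        desc.len_data = n;
        desc.len_image = len_image;
        assert(desc.len_data <= chunk_size);

        rc = submit(&desc, ctx);
        if (rc)
            return rc;
        off += n;
    }
    return 0;
}

/* Example callback that only counts chunks and bytes. */
struct chunk_stats {
    size_t chunks;
    size_t bytes;
};

int count_chunk(const struct image_chunk_desc *desc, void *ctx)
{
    struct chunk_stats *s = ctx;

    s->chunks++;
    s->bytes += (size_t)desc->len_data;
    return 0;
}
```

With a 10000-byte image and a 4096-byte chunk size, the callback is invoked
three times (4096, 4096 and 1808 bytes), and every descriptor carries the
same len_image value.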


Sysfs Class
===========

A cxl sysfs class is added under /sys/clas
enumeration and tuning of the accelerators
described in Documentation/ABI/testing/sys


Udev rules
==========

The following udev rules could be used to
most logical chardev to use in any program
dedicated, afuX.Ys for afu directed), sinc
identical for each::

    SUBSYSTEM=="cxl", ATTRS{mode}=="dedica
    SUBSYSTEM=="cxl", ATTRS{mode}=="afu_di
    KERNEL=="afu[0-9]*.[