.. SPDX-License-Identifier: GPL-2.0-only

=============
 QAIC driver
=============

The QAIC driver is the Kernel Mode Driver (KMD
accelerator products.

Interrupts
==========

IRQ Storm Mitigation
--------------------

While the AIC100 DMA Bridge hardware implement
mechanism, it is still possible for an IRQ sto
if the workload is particularly quick, and the
can drain the response FIFO as quickly as the
it, then the device will frequently transition
non-empty and generate MSIs at a rate equivale
workload's ability to process inputs. The lprn
workload is known to trigger this condition, a
MSIs per second. It has been observed that mos
for long, and will crash due to some form of w
the interrupt controller interrupting the host

To mitigate this issue, the QAIC driver implem
QAIC receives an IRQ, it disables that line. T
controller from interrupting the CPU. Then AIC
is drained, QAIC implements a "last chance" po
sleep for a time to see if the workload will g
line remains disabled during this time. If no
polling mode and reenables the IRQ line.

This mitigation in QAIC is very effective. The
generates 100k IRQs per second (per /proc/inte
IRQs over 5 minutes while keeping the host sys
workload throughput performance (within run to

Single MSI Mode
---------------

MultiMSI is not well supported on all systems;
(circa 2023). Between hypervisors masking the
large memory requirements for vIOMMUs (require
useful to be able to fall back to a single MSI

To support this fallback, we allow the case wh
allocated, and share that one MSI between MHI
when only one MSI has been configured and dire
to the interrupt normally used for MHI. Unfort
interrupt handlers for every DBC and MHI wake
arrives; however, the DBC threaded irq handler
done is detected (MHI will always start its th

If the DBC is configured to force MSI interrup
software IRQ storm mitigation mentioned above.
never disabled, allowing each new entry to the


Neural Network Control (NNC) Protocol
=====================================

The implementation of NNC is split between the
QAIC understands how to encode/decode NNC wire
protocol which require kernel space knowledge
host memory to device IOVAs). QAIC understands
all of the transactions. QAIC does not underst
passthrough transaction).

QAIC handles and enforces the required little
to the degree that it can. Since QAIC does not
passthrough transaction, it relies on the UMD

The terminate transaction is of particular use
the resources that are loaded onto a device si
occurs within NNC commands. As a result, QAIC
roll back userspace activity. To ensure that a
are fully released in the case of a process cr
terminate command to let QSM know when a user
can be released.

QSM can report a version number of the NNC pro
form of a Major number and a Minor number.

Major number updates indicate changes to the N
message format, or transactions (impacts QAIC)

Minor number updates indicate changes to the N
commands (does not impact QAIC).

uAPI
====

QAIC creates an accel device per physical PCIe
for as long as the PCIe device is known to Lin

The PCIe device may not be in the state to acc
all times. QAIC will trigger KOBJ_ONLINE/OFFLI
device can accept requests (ONLINE) and when t
requests (OFFLINE) because of a reset or other

QAIC defines a number of driver specific IOCTL

DRM_IOCTL_QAIC_MANAGE
  This IOCTL allows userspace to send a NNC re
  block until a response is received, or the r

DRM_IOCTL_QAIC_CREATE_BO
  This IOCTL allows userspace to allocate a bu
  or receive data from a workload. The call wi
  represents the allocated buffer. The BO is n
  sliced (see DRM_IOCTL_QAIC_ATTACH_SLICE_BO).

DRM_IOCTL_QAIC_MMAP_BO
  This IOCTL allows userspace to prepare an al
  userspace process.

DRM_IOCTL_QAIC_ATTACH_SLICE_BO
  This IOCTL allows userspace to slice a BO in
  to the device. Slicing is the operation of d
  get sent where to a workload. This requires
  DMA Bridge, and as such, locks the BO to a s

DRM_IOCTL_QAIC_EXECUTE_BO
  This IOCTL allows userspace to submit a set
  call is non-blocking. Success only indicates
  to the device, but does not guarantee they h

DRM_IOCTL_QAIC_PARTIAL_EXECUTE_BO
  This IOCTL operates like DRM_IOCTL_QAIC_EXEC
  to shrink the BOs sent to the device for thi
  typically has N inputs, but only a subset of
  allows userspace to indicate that only the f
  sent to the device to minimize data transfer
  recomputes the slicing, and therefore has so
  BOs can be queued to the device.

DRM_IOCTL_QAIC_WAIT_BO
  This IOCTL allows userspace to determine whe
  processed by the device. The call will block
  processed and can be re-queued to the device

DRM_IOCTL_QAIC_PERF_STATS_BO
  This IOCTL allows userspace to collect perfo
  recent execution of a BO. This allows usersp
  timeline of the BO processing for a performa

DRM_IOCTL_QAIC_DETACH_SLICE_BO
  This IOCTL allows userspace to remove the sl
  was originally provided by a call to DRM_IOC
  is the inverse of DRM_IOCTL_QAIC_ATTACH_SLIC
  DRM_IOCTL_QAIC_DETACH_SLICE_BO to be called.
  operation the BO may have new slicing inform
  to DRM_IOCTL_QAIC_ATTACH_SLICE_BO. After det
  executed until after a new attach slice oper
  and detach slice calls allows userspace to u

Userspace Client Isolation
==========================

AIC100 supports multiple clients. Multiple DBC
client, and multiple clients can each consume
may contain sensitive information therefore on
workload should be allowed to interface with t

Clients are identified by the instance associa
may only use memory they allocate, and DBCs th
workloads. Attempts to access resources assign
rejected.

Module parameters
=================

QAIC supports the following module parameters:

**datapath_polling (bool)**

Configures QAIC to use a polling thread for da
on the device interrupts. Useful for platforms
set at QAIC driver initialization. Default is

**mhi_timeout_ms (unsigned int)**

Sets the timeout value for MHI operations in m
at the time the driver detects a device. Defau

**control_resp_timeout_s (unsigned int)**

Sets the timeout value for QSM responses to NN
be set at the time the driver is sending a req
minute).

**wait_exec_default_timeout_ms (unsigned int)**

Sets the default timeout for the wait_exec ioc
set prior to the wait_exec ioctl call. A value
overrides this for that call. Default is 5000

**datapath_poll_interval_us (unsigned int)**

Sets the polling interval in microseconds (us)
Takes effect at the next polling interval. Def

**timesync_delay_ms (unsigned int)**

Sets the time interval in milliseconds (ms) be
operations. Default is 1000 (1000 ms).
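
The current values of the parameters listed above can be inspected from
userspace. The following is a minimal sketch, not part of the driver. It
assumes the module is loaded under the name ``qaic`` and that its parameters
are exported with read permission under ``/sys/module/qaic/parameters``, the
usual sysfs location for module parameters. Neither the path nor the
permissions are stated in this document, and ``read_qaic_params`` is a
hypothetical helper name.

```python
from pathlib import Path

# Parameter names taken from the "Module parameters" list above.
QAIC_PARAMS = (
    "datapath_polling",
    "mhi_timeout_ms",
    "control_resp_timeout_s",
    "wait_exec_default_timeout_ms",
    "datapath_poll_interval_us",
    "timesync_delay_ms",
)

# Assumption: the module is named "qaic" and exports its parameters via sysfs.
DEFAULT_PARAM_DIR = Path("/sys/module/qaic/parameters")


def read_qaic_params(param_dir=DEFAULT_PARAM_DIR):
    """Return {name: raw string value} for every parameter file that can be read."""
    values = {}
    for name in QAIC_PARAMS:
        try:
            values[name] = (Path(param_dir) / name).read_text().strip()
        except OSError:
            # File missing or unreadable: module not loaded, parameter not
            # exported, or insufficient permissions. Skip it.
            continue
    return values


if __name__ == "__main__":
    for name, value in read_qaic_params().items():
        print(f"{name} = {value}")
```

Values are returned as raw strings because sysfs reports each parameter as
text. Parameters that cannot be read are omitted from the result rather than
raising an error.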