==========================================
Reducing OS jitter due to per-cpu kthreads
==========================================

This document lists per-CPU kthreads in the Li
options to control their OS jitter. Note that
not listed here. To reduce OS jitter from non
them to a "housekeeping" CPU dedicated to such

References
==========

-   Documentation/core-api/irq/irq-affinit

-   Documentation/admin-guide/cgroup-v1:

-   man taskset: Using the taskset comman
    of CPUs.

-   man sched_setaffinity: Using the sche
    call to bind tasks to sets of CPUs.

-   /sys/devices/system/cpu/cpuN/online:
    writing "0" to offline and "1" to onli

-   In order to locate kernel-generated OS

        cd /sys/kernel/tracing
        echo 1 > max_graph_depth # Inc
        echo function_graph > current_
        # run workload
        cat per_cpu/cpuN/trace

kthreads
========

Name:
  ehca_comp/%u

Purpose:
  Periodically process Infiniband-related work

To reduce its OS jitter, do any of the followi

1.  Don't use eHCA Infiniband hardware, in
    that does not require per-CPU kthreads
    kthreads from being created in the fir
    work for most people, as this hardware
    relatively old and is produced in rela
2.  Do all eHCA-Infiniband-related work on
    interrupts.
3.  Rework the eHCA driver so that its per
    provisioned only on selected CPUs.


Name:
  irq/%d-%s

Purpose:
  Handle threaded interrupts.

To reduce its OS jitter, do the following:

1.  Use irq affinity to force the irq thre
    some other CPU.

Name:
  kcmtpd_ctr_%d

Purpose:
  Handle Bluetooth work.

To reduce its OS jitter, do one of the followi

1.  Don't use Bluetooth, in which case the
    created in the first place.
2.  Use irq affinity to force Bluetooth-re
    occur on some other CPU and furthermor
    Bluetooth activity on some other CPU.
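
The irq-affinity advice for the irq/%d-%s and kcmtpd_ctr_%d kthreads can be sketched in shell as follows. This is a minimal illustration, not a definitive procedure: the helper names ``cpus_to_mask`` and ``set_irq_affinity`` are hypothetical, the IRQ number in the example comment is made up, and the mask arithmetic assumes that the housekeeping CPUs are numbered 0 through 31. The ``/proc/irq/<irq>/smp_affinity`` file takes a hexadecimal CPU bitmask and must be written as root.

```shell
#!/bin/sh
# Build the hexadecimal CPU bitmask for the CPU numbers given as arguments.
# Assumption: all CPU numbers are in the range 0-31.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Hypothetical helper: restrict one IRQ to the listed CPUs.
# Usage: set_irq_affinity <irq> <cpu> [<cpu> ...]   (requires root)
set_irq_affinity() {
    irq=$1
    shift
    cpus_to_mask "$@" > "/proc/irq/$irq/smp_affinity"
}

# Example, left commented out so that sourcing this file changes nothing.
# IRQ 42 is an assumed number; look up the real one in /proc/interrupts.
# set_irq_affinity 42 0 1
```

Keeping the mask computation separate from the write makes it possible to check the mask before touching ``/proc``.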

Name:
  ksoftirqd/%u

Purpose:
  Execute softirq handlers when threaded or wh

To reduce its OS jitter, each softirq vector m
separately as follows:

TIMER_SOFTIRQ
-------------

Do all of the following:

1.  To the extent possible, keep the CPU o
    is non-idle, for example, by avoiding
    both kernel threads and interrupts to
2.  Build with CONFIG_HOTPLUG_CPU=y. Afte
    the CPU offline, then bring it back on
    recurring timers to migrate elsewhere.
    with multiple CPUs, force them all off
    first one back online. Once you have
    do not offline any other CPUs, because
    timer back onto one of the CPUs in que

NET_TX_SOFTIRQ and NET_RX_SOFTIRQ
---------------------------------

Do all of the following:

1.  Force networking interrupts onto other
2.  Initiate any network I/O on other CPUs
3.  Once your application has started, pre
    from being initiated from tasks that m
    be de-jittered. (It is OK to force th
    bring it back online before you start

BLOCK_SOFTIRQ
-------------

Do all of the following:

1.  Force block-device interrupts onto som
2.  Initiate any block I/O on other CPUs.
3.  Once your application has started, pre
    from being initiated from tasks that m
    be de-jittered. (It is OK to force th
    bring it back online before you start

IRQ_POLL_SOFTIRQ
----------------

Do all of the following:

1.  Force block-device interrupts onto som
2.  Initiate any block I/O and block-I/O p
3.  Once your application has started, pre
    from being initiated from tasks that m
    be de-jittered. (It is OK to force th
    bring it back online before you start

TASKLET_SOFTIRQ
---------------

Do one or more of the following:

1.  Avoid use of drivers that use tasklets
    calls to things like tasklet_schedule(
2.  Convert all drivers that you must use
3.  Force interrupts for drivers using tas
    and also do I/O involving these driver

SCHED_SOFTIRQ
-------------

Do all of the following:

1.  Avoid sending scheduler IPIs to the CP
    for example, ensure that at most one r
    on that CPU. If a thread that expects
    CPU awakens, the scheduler will send a
    a subsequent SCHED_SOFTIRQ.
2.  Build with CONFIG_NO_HZ_FULL=y and ensure that th
    is marked as an adaptive-ticks CPU usi
    boot parameter. This reduces the numb
    interrupts that the de-jittered CPU re
    chances of being selected to do the lo
    runs in SCHED_SOFTIRQ context.
3.  To the extent possible, keep the CPU o
    is non-idle, for example, by avoiding
    forcing both kernel threads and interr
    This further reduces the number of sch
    received by the de-jittered CPU.

HRTIMER_SOFTIRQ
---------------

Do all of the following:

1.  To the extent possible, keep the CPU o
    is non-idle. For example, avoid syste
    kernel threads and interrupts to execu
2.  Build with CONFIG_HOTPLUG_CPU=y. Once
    CPU offline, then bring it back online
    timers to migrate elsewhere. If you a
    CPUs, force them all offline before br
    back online. Once you have onlined th
    offline any other CPUs, because doing
    back onto one of the CPUs in question.

RCU_SOFTIRQ
-----------

Do at least one of the following:

1.  Offload callbacks and keep the CPU in
    adaptive-ticks state by doing all of t

    a.  Build with CONFIG_NO_HZ_FULL=y and ensure
        de-jittered is marked as an ad
        "nohz_full=" boot parameter.
        housekeeping CPUs, which can t
    b.  To the extent possible, keep t
        when it is non-idle, for examp
        calls and by forcing both kern
        to execute elsewhere.

2.  Enable RCU to do its processing remote
    doing all of the following:

    a.  Build with CONFIG_NO_HZ=y.
    b.  Ensure that the CPU goes idle
        CPUs to detect that it has pas
        state. If the kernel is built
        userspace execution also allow
        the CPU in question has passed
    c.  To the extent possible, keep t
        when it is non-idle, for examp
        calls and by forcing both kern
        to execute elsewhere.

Name:
  kworker/%u:%d%s (cpu, id, priority)

Purpose:
  Execute workqueue requests.

To reduce its OS jitter, do any of the followi

1.  Run your workload at a real-time prior
    preempting the kworker daemons.
2.  A given workqueue can be made visible
    by passing the WQ_SYSFS to that workqu
    Such a workqueue can be confined to a
    CPUs using the ``/sys/devices/virtual/
    files. The set of WQ_SYSFS workqueues
    "ls /sys/devices/virtual/workqueue".
    maintainer would like to caution peopl
    sprinkling WQ_SYSFS across all the wor
    caution is that it is easy to add WQ_S
    part of the formal user/kernel API, it
    to remove it, even if its addition was
3.  Do any of the following needed to avoi
    application cannot tolerate:

    a.  Avoid using oprofile, thus avo
        wq_sync_buffer().
    b.  Limit your CPU frequency so th
        governor is not required, poss
        special heatsinks or other coo
        correctly, and if your CPU arch
        be able to build your kernel w
        avoid the CPU-frequency govern
        on each CPU, including cs_dbs_

        WARNING: Please check your CP
        make sure that this is safe on
    c.  As of v3.18, Christoph Lameter
        commit prevents OS jitter due
        CONFIG_SMP=y systems. Before
        to entirely get rid of the OS
        decrease its frequency by writ
        /proc/sys/vm/stat_interval. T
        for an interval of one second.
        will make your virtual-memory
        slowly. Of course, you can al
        a real-time priority, thus pre
        but if your workload is CPU-bo
        However, there is an RFC patch
        (based on an earlier one from
        reduces or even eliminates vms
        workloads at https://lore.kern
    d.  If running on high-end powerpc
        CONFIG_PPC_RTAS_DAEMON=n. Thi
        daemon from running on each CP
        (This will require editing Kco
        this platform's RAS functional
        due to the rtas_event_scan() f

        WARNING: Please check your CP
        make sure that this is safe on
    e.  If running on Cell Processor,
        CBE_CPUFREQ_SPU_GOVERNOR=n to
        spu_gov_work().

        WARNING: Please check your CP
        make sure that this is safe on
    f.  If running on PowerMAC, build
        CONFIG_PMAC_RACKMETER=n to dis
        avoiding OS jitter from rackme

Name:
  rcuc/%u

Purpose:
  Execute RCU callbacks in CONFIG_RCU_BOOST=y

To reduce its OS jitter, do at least one of th

1.  Build the kernel with CONFIG_PREEMPT=n
    kthreads from being created in the fir
    the need for RCU priority boosting. T
    for workloads that do not require high
2.  Build the kernel with CONFIG_RCU_BOOST
    kthreads from being created in the fir
    is feasible only if your workload neve
    boosting, for example, if you ensure f
    CPUs that might execute within the ker
3.  Build with CONFIG_RCU_NOCB_CPU=y and b
    boot parameter offloading RCU callback
    to OS jitter. This approach prevents
    having any work to do, so that they ar
4.  Ensure that the CPU never enters the k
    avoid initiating any CPU hotplug opera
    another way of preventing any callback
    CPU, again preventing the rcuc/%u kthr
    to do.

Name:
  rcuop/%d and rcuos/%d

Purpose:
  Offload RCU callbacks from the corresponding

To reduce its OS jitter, do at least one of th

1.  Use affinity, cgroups, or other mechan
    to execute on some other CPU.
2.  Build with CONFIG_RCU_NOCB_CPU=n, whic
    kthreads from being created in the fir
    note that this will not eliminate OS j
    shift it to RCU_SOFTIRQ.
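
As an illustration of the affinity option for the rcuop/%d and rcuos/%d kthreads, the following shell sketch generates one ``taskset`` command per kthread. It is only a sketch under stated assumptions: the function names ``rcuo_pids`` and ``taskset_cmds`` are hypothetical, the default housekeeping CPU list of ``0`` is an assumed value, and the PID lookup assumes a procps-style ``ps`` that reports kthread names in its ``comm`` column.

```shell
#!/bin/sh
# Assumed housekeeping CPU list; override it in the environment as needed.
HOUSEKEEPING=${HOUSEKEEPING:-0}

# Print the PIDs of tasks whose name starts with rcuop/ or rcuos/.
rcuo_pids() {
    ps -e -o pid= -o comm= | awk '$2 ~ /^rcuo[ps]\// { print $1 }'
}

# Read PIDs from stdin and print one taskset command per PID.
# $1 is the CPU list to which each task should be bound.
taskset_cmds() {
    while read -r pid; do
        echo "taskset -pc $1 $pid"
    done
}

# Example, left commented out so that sourcing this file changes nothing.
# Review the generated commands first, then pipe them to sh as root:
# rcuo_pids | taskset_cmds "$HOUSEKEEPING"
# rcuo_pids | taskset_cmds "$HOUSEKEEPING" | sh
```

Printing the commands instead of running them directly gives a chance to inspect what will be changed before anything is applied.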