SCHED_EXT EXAMPLE SCHEDULERS
============================

# Introduction

This directory contains a number of example sc
schedulers are meant to provide examples of di
that can be built using sched_ext, and illustr
sched_ext can be used.

Some of the examples are performant, productio
the correct workload and with the correct tuni
production environment with acceptable or poss
Others are just examples that in practice, wou
performance (though they could be improved to

This README will describe these example schedu
types of workloads or scenarios they're design
not they're production ready. For more details
please see the header comment in their .bpf.c


# Compiling the examples

There are a few toolchain dependencies for com

## Toolchain dependencies

1. clang >= 16.0.0

The schedulers are BPF programs, and therefore
is actively working on adding a BPF backend co
missing some features such as BTF type tags wh
kptrs.

2. pahole >= 1.25

You may need pahole in order to generate BTF f

3. rust >= 1.70.0

Rust schedulers use features present in the r
should be able to use the stable build from ru
work, try using the rustup nightly build.

There are other requirements as well, such as
non-trivial ones.

## Compiling the kernel

In order to run a sched_ext scheduler, you'll
with the patches in this repository, and with
Kconfig options:

```
CONFIG_BPF=y
CONFIG_SCHED_CLASS_EXT=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_DEBUG_INFO_BTF=y
```

It's also recommended that you include th

```
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_PAHOLE_HAS_SPLIT_BTF=y
CONFIG_PAHOLE_HAS_BTF_TAG=y
```

There is a `Kconfig` file in this directory wh
your local `.config` file, as long as there ar
options in the file.
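
Before building, it can help to confirm that the required Kconfig options listed above are actually enabled in your `.config`. The following is a minimal sketch, not part of the kernel tree: the function name `missing_kconfig_opts` is hypothetical, and it assumes a plain-text config file in which enabled options appear as `CONFIG_FOO=y`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not shipped with sched_ext):
# report which of the required Kconfig options are not set to "y".

# The five required options named in this README.
required_opts="CONFIG_BPF CONFIG_SCHED_CLASS_EXT CONFIG_BPF_SYSCALL CONFIG_BPF_JIT CONFIG_DEBUG_INFO_BTF"

# Usage: missing_kconfig_opts /path/to/.config
# Prints one line per required option that is not "=y" in the file.
# Returns 0 if every option is enabled, 1 otherwise.
missing_kconfig_opts() {
    local config_file="$1" opt rc=0
    for opt in $required_opts; do
        # Anchor the match so CONFIG_BPF does not match CONFIG_BPF_JIT.
        if ! grep -q "^${opt}=y$" "$config_file"; then
            echo "$opt"
            rc=1
        fi
    done
    return $rc
}
```

For example, `missing_kconfig_opts .config` prints nothing and returns 0 when all five options are enabled; otherwise it lists the options that still need to be turned on.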

## Getting a vmlinux.h file

You may notice that most of the example schedu
This is a large, auto-generated header file th
defined in some vmlinux binary that was compil
[BTF](https://docs.kernel.org/bpf/btf.html) (i
options specified above).

The header file is created using `bpftool`, by
compiled with BTF as follows:

```bash
$ bpftool btf dump file /path/to/vmlinux forma
```

`bpftool` analyzes all of the BTF encodings in
header file that can be included by BPF progra
example, using vmlinux.h allows a scheduler to
in vmlinux as follows:

```c
#include "vmlinux.h"
// vmlinux.h is also implicitly included by sc
#include "scx_common.bpf.h"

/*
 * vmlinux.h provides definitions for struct t
 * struct scx_enable_args.
 */
void BPF_STRUCT_OPS(example_enable, struct tas
		    struct scx_enable_args *ar
{
	bpf_printk("Task %s enabled in example
}

// vmlinux.h provides the definition for struc
SEC(".struct_ops.link")
struct sched_ext_ops example_ops = {
	.enable	= (void *)example_enable,
	.name	= "example",
};
```

The scheduler build system will generate this
scheduler build pipeline. It looks for a vmlin
dependency order:

1. If the O= environment variable is defined,
2. If the KBUILD_OUTPUT= environment variable
   `$KBUILD_OUTPUT/vmlinux`
3. At `../../vmlinux` (i.e. at the root of the
   compiling the schedulers)
4. `/sys/kernel/btf/vmlinux`
5. `/boot/vmlinux-$(uname -r)`

In other words, if you have compiled a kernel
file will be used to generate vmlinux.h. Other
the kernel you're currently running on. This m
kernel with sched_ext support, you may not nee
all.

### Aside on CO-RE

One of the cooler features of BPF is that it s
[CO-RE](https://nakryiko.com/posts/bpf-core-re
Everywhere).
This feature allows you to refere
types defined internal to the kernel, and not
BPF program on a different kernel with the fie
example above, we print out a task name with `
relocations for that access when the program i
referencing the correct offset for the current

## Compiling the schedulers

Once you have your toolchain set up, and a vmli
a full vmlinux.h file, you can compile the sch

```bash
$ make -j$(nproc)
```

# Example schedulers

This directory contains the following example
for testing and demonstrating different aspect
useful in limited scenarios, they are not inte

For more scheduler implementations, tools and
https://github.com/sched-ext/scx.

## scx_simple

A simple scheduler that provides an example of
scx_simple can be run in either global weighte

Though very simple, in limited scenarios, this
well on single-socket systems with a unified L

## scx_qmap

Another simple, yet slightly more complex sche
a basic weighted FIFO queuing policy. It also
useful BPF features, such as sleepable per-tas
`ops.prep_enable()` callback, and using the `B
enqueue tasks. It also illustrates how core-sc

## scx_central

A "central" scheduler where scheduling decisio
This scheduler illustrates how scheduling deci
single CPU, allowing other cores to run with i
ticks, and without having to incur the overhea

The approach demonstrated by this scheduler ma
benefits from minimizing scheduling overhead a
where this could be particularly useful is run
infinite slices and no timer ticks allows the
vmexits.

## scx_flatcg

A flattened cgroup hierarchy scheduler.
This s
weight-based cgroup CPU control by flattening
layer, by compounding the active weight share
is a much more performant CPU controller, whic
cgroup trees in order to properly compute a cg

Similar to scx_simple, in limited scenarios, t
reasonably well on single-socket system
significantly lowered hierarchical scheduling


# Troubleshooting

There are a number of common issues that you m
schedulers. We'll go over some of the common o

## Build Failures

### Old version of clang

```
error: static assertion failed due to requirem
_Static_assert(SCX_DSQ_FLAG_BUILTIN,
^~~~~~~~~~~~~~~~~~~~
1 error generated.
```

This means you built the kernel or the schedul
clang than what's supported (i.e. older than 1

1. `which clang` to make sure you're using a s

2. `make fullclean` in the root path of the re
   and schedulers.

3. Rebuild the kernel, and then your example s

The schedulers are also cleaned if you invoke
directory of the tree.

### Stale kernel build / incomplete vmlinux.h

As described above, you'll need a `vmlinux.h`
vmlinux built with BTF, and with sched_ext sup
you'll see errors such as the following which
referenced in a scheduler is unknown:

```
/path/to/sched_ext/tools/sched_ext/user_exit_i

const struct scx_exit_info *ei)

^
```

In order to resolve this, please follow the st
[Getting a vmlinux.h file](#getting-a-vmlinuxh
schedulers are using a vmlinux.h file that inc

## Misc

### llvm: [OFF]

You may see the following output when building

```
Auto-detecting system features:
... clang-bpf-co-re: [
... llvm: [
... libcap: [
... libbfd: [
```

Seeing `llvm: [ OFF ]` here is not an issue. Y