SCHED_EXT EXAMPLE SCHEDULERS
============================

# Introduction

This directory contains a number of example sched_ext schedulers. These
schedulers are meant to provide examples of different types of schedulers
that can be built using sched_ext, and illustrate how various features of
sched_ext can be used.

Some of the examples are performant, production-ready schedulers. That is, for
the correct workload and with the correct tuning, they may be deployed in a
production environment with acceptable or possibly even improved performance.
Others are just examples that, in practice, would not provide acceptable
performance (though they could be improved to get there).

This README describes these example schedulers, including the types of
workloads or scenarios they're designed to accommodate, and whether or not
they're production ready. For more details on any of these schedulers,
please see the header comment in their .bpf.c file.


# Compiling the examples

There are a few toolchain dependencies for compiling the example schedulers.

## Toolchain dependencies

1. clang >= 16.0.0

The schedulers are BPF programs, and therefore must be compiled with clang. gcc
is actively working on adding a BPF backend compiler as well, but is still
missing some features such as BTF type tags which are necessary for using
kptrs.

2. pahole >= 1.25

You may need pahole in order to generate BTF from DWARF.

3. rust >= 1.70.0

Rust schedulers use features present in the rust toolchain >= 1.70.0. You
should be able to use the stable build from rustup, but if that doesn't
work, try using the rustup nightly build.

There are other requirements as well, such as make, but these are the main /
non-trivial ones.

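The version floors listed above can be checked before starting a build. The
following is a minimal sketch, not part of this directory's build system:
`version_ge` and `check_tool` are hypothetical helper names, the Rust toolchain
is assumed to be probed through `rustc`, and each tool's `--version` output is
assumed to contain a dotted version number.

```bash
# Hypothetical helpers, not part of the sched_ext build system.

# version_ge A B: succeed if dotted version A >= B, comparing numeric
# fields left to right and treating missing fields as 0.
version_ge() {
	a=$1 b=$2
	while [ -n "$a" ] || [ -n "$b" ]; do
		x=${a%%.*} y=${b%%.*}
		if [ "$x" = "$a" ]; then a=; else a=${a#*.}; fi
		if [ "$y" = "$b" ]; then b=; else b=${b#*.}; fi
		x=${x:-0} y=${y:-0}
		[ "$x" -gt "$y" ] && return 0
		[ "$x" -lt "$y" ] && return 1
	done
	return 0
}

# check_tool NAME MIN: report whether NAME is installed and at least MIN.
check_tool() {
	if ! command -v "$1" >/dev/null 2>&1; then
		echo "$1: not found (need >= $2)"
		return 1
	fi
	# Take the first dotted number on the first line that has one.
	found=$("$1" --version 2>/dev/null |
		sed -n 's/^[^0-9]*\([0-9][0-9.]*\).*/\1/p' | head -n 1)
	if version_ge "${found:-0}" "$2"; then
		echo "$1: $found (ok, need >= $2)"
	else
		echo "$1: ${found:-unknown} (too old, need >= $2)"
		return 1
	fi
}

check_tool clang 16.0.0 || true
check_tool pahole 1.25 || true
check_tool rustc 1.70.0 || true
```

Each line of output names the tool, the version that was detected (if any), and
the minimum it was compared against.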
## Compiling the kernel

In order to run a sched_ext scheduler, you'll have to run a kernel compiled
with the patches in this repository, and with a minimum set of necessary
Kconfig options:

```
CONFIG_BPF=y
CONFIG_SCHED_CLASS_EXT=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_DEBUG_INFO_BTF=y
```

It's also recommended that you include the following Kconfig options:

```
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_PAHOLE_HAS_SPLIT_BTF=y
CONFIG_PAHOLE_HAS_BTF_TAG=y
```

There is a `Kconfig` file in this directory whose contents you can append to
your local `.config` file, as long as there are no conflicts with any existing
options in the file.

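A quick way to confirm that a `.config` carries the required options listed
above is to grep for each one. This is a minimal sketch; `missing_kconfig` is a
hypothetical helper name, and it only checks the five required options, not the
recommended ones.

```bash
# missing_kconfig FILE: print each required option that FILE does not set
# to "y". Prints nothing when all of them are enabled.
missing_kconfig() {
	for opt in CONFIG_BPF CONFIG_SCHED_CLASS_EXT CONFIG_BPF_SYSCALL \
		   CONFIG_BPF_JIT CONFIG_DEBUG_INFO_BTF; do
		grep -qx "$opt=y" "$1" 2>/dev/null || echo "$opt"
	done
}

# Check the .config in the current directory, if there is one.
if [ -f .config ]; then
	missing_kconfig .config
fi
```

On a running kernel built with `CONFIG_IKCONFIG_PROC`, the same function can be
pointed at a decompressed copy of `/proc/config.gz`.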
## Getting a vmlinux.h file

You may notice that most of the example schedulers include a "vmlinux.h" file.
This is a large, auto-generated header file that contains all of the types
defined in some vmlinux binary that was compiled with
[BTF](https://docs.kernel.org/bpf/btf.html) (i.e. with the BTF-related Kconfig
options specified above).

The header file is created using `bpftool`, by passing it a vmlinux binary
compiled with BTF as follows:

```bash
$ bpftool btf dump file /path/to/vmlinux format c > vmlinux.h
```

`bpftool` analyzes all of the BTF encodings in the binary, and produces a
header file that can be included by BPF programs to access those types.  For
example, using vmlinux.h allows a scheduler to access fields defined directly
in vmlinux as follows:

```c
#include "vmlinux.h"
// vmlinux.h is also implicitly included by scx_common.bpf.h.
#include "scx_common.bpf.h"

/*
 * vmlinux.h provides definitions for struct task_struct and
 * struct scx_enable_args.
 */
void BPF_STRUCT_OPS(example_enable, struct task_struct *p,
		    struct scx_enable_args *args)
{
	bpf_printk("Task %s enabled in example scheduler", p->comm);
}

// vmlinux.h provides the definition for struct sched_ext_ops.
SEC(".struct_ops.link")
struct sched_ext_ops example_ops = {
	.enable = (void *)example_enable,
	.name   = "example",
};
```

The scheduler build system will generate this vmlinux.h file as part of the
scheduler build pipeline. It looks for a vmlinux file in the following
order:

1. If the O= environment variable is defined, at `$O/vmlinux`
2. If the KBUILD_OUTPUT= environment variable is defined, at
   `$KBUILD_OUTPUT/vmlinux`
3. At `../../vmlinux` (i.e. at the root of the kernel tree where you're
   compiling the schedulers)
4. `/sys/kernel/btf/vmlinux`
5. `/boot/vmlinux-$(uname -r)`

In other words, if you have compiled a kernel in your local repo, its vmlinux
file will be used to generate vmlinux.h. Otherwise, it will be the vmlinux of
the kernel you're currently running on. This means that if you're running on a
kernel with sched_ext support, you may not need to compile a local kernel at
all.

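To see which vmlinux would be picked up on a given machine, the lookup order
listed above can be reproduced by hand. This is a sketch of the documented
order only, not the build system's own code; `find_vmlinux` is a hypothetical
helper name, and the relative path is resolved against the current directory.

```bash
# find_vmlinux: print the first candidate that exists, in the documented order.
find_vmlinux() {
	for f in "${O:+$O/vmlinux}" "${KBUILD_OUTPUT:+$KBUILD_OUTPUT/vmlinux}" \
		 ../../vmlinux /sys/kernel/btf/vmlinux \
		 "/boot/vmlinux-$(uname -r)"; do
		# Candidates for unset variables expand to "" and are skipped.
		if [ -n "$f" ] && [ -e "$f" ]; then
			echo "$f"
			return 0
		fi
	done
	return 1
}

find_vmlinux || echo "no vmlinux found" >&2
```

The result can then be passed to the `bpftool` command shown earlier, e.g.
`bpftool btf dump file "$(find_vmlinux)" format c > vmlinux.h`.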
### Aside on CO-RE

One of the cooler features of BPF is that it supports
[CO-RE](https://nakryiko.com/posts/bpf-core-reference-guide/) (Compile Once Run
Everywhere). This feature allows you to reference fields inside of structs with
types defined internal to the kernel, and not have to recompile if you load the
BPF program on a different kernel with the field at a different offset. In our
example above, we print out a task name with `p->comm`. CO-RE would perform
relocations for that access when the program is loaded to ensure that it's
referencing the correct offset for the currently running kernel.

## Compiling the schedulers

Once you have your toolchain set up, and a vmlinux that can be used to generate
a full vmlinux.h file, you can compile the schedulers using `make`:

```bash
$ make -j$(nproc)
```

# Example schedulers

This directory contains the following example schedulers. These schedulers are
for testing and demonstrating different aspects of sched_ext. While some may be
useful in limited scenarios, they are not intended to be practical.

For more scheduler implementations, tools and documentation, visit
https://github.com/sched-ext/scx.

## scx_simple

A simple scheduler that provides an example of a minimal sched_ext scheduler.
scx_simple can be run in either global weighted vtime mode, or FIFO mode.

Though very simple, in limited scenarios, this scheduler can perform reasonably
well on single-socket systems with a unified L3 cache.

## scx_qmap

Another simple, yet slightly more complex scheduler that provides an example of
a basic weighted FIFO queuing policy. It also provides examples of some common
useful BPF features, such as sleepable per-task storage allocation in the
`ops.prep_enable()` callback, and using the `BPF_MAP_TYPE_QUEUE` map type to
enqueue tasks. It also illustrates how core-sched support could be implemented.

## scx_central

A "central" scheduler where scheduling decisions are made from a single CPU.
This scheduler illustrates how scheduling decisions can be dispatched from a
single CPU, allowing other cores to run with infinite slices, without timer
ticks, and without having to incur the overhead of making scheduling decisions.

The approach demonstrated by this scheduler may be useful for any workload that
benefits from minimizing scheduling overhead and timer ticks. An example of
where this could be particularly useful is running VMs, where running with
infinite slices and no timer ticks allows the VM to avoid unnecessary expensive
vmexits.

## scx_flatcg

A flattened cgroup hierarchy scheduler. This scheduler implements hierarchical
weight-based cgroup CPU control by flattening the cgroup hierarchy into a single
layer, by compounding the active weight share at each level. The effect of this
is a much more performant CPU controller, which does not need to descend down
cgroup trees in order to properly compute a cgroup's share.

Similar to scx_simple, in limited scenarios, this scheduler can perform
reasonably well on single-socket systems with a unified L3 cache and show
significantly lowered hierarchical scheduling overhead.


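To make the flattening in scx_flatcg concrete: read literally, "compounding the
active weight share at each level" means a cgroup's flattened share is the
product, over each level on the path from the root, of its weight divided by
the total active weight at that level. The sketch below is illustrative
arithmetic only. `flat_share` is a hypothetical helper, the tree and weights
are made up, and this is not how scx_flatcg itself is implemented.

```bash
# flat_share W/S ...: multiply one weight/sum fraction per hierarchy level,
# where W is a cgroup's weight and S is the total active weight at its level.
flat_share() {
	printf '%s\n' "$@" |
		awk -F/ 'BEGIN { s = 1 } { s *= $1 / $2 } END { printf "%.4f\n", s }'
}

# Hypothetical tree: the root has A (weight 100) and B (weight 300);
# B has children C (weight 100) and D (weight 100).
flat_share 300/400 100/200	# share of C: 0.75 * 0.50, prints 0.3750
flat_share 100/400		# share of A, prints 0.2500
```

In this made-up tree the flattened shares of A, C and D are 0.25, 0.375 and
0.375, which sum to 1, so the three leaves can be compared in a single layer.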
# Troubleshooting

There are a number of common issues that you may run into when building the
schedulers. We'll go over some of the common ones here.

## Build Failures

### Old version of clang

```
error: static assertion failed due to requirement 'SCX_DSQ_FLAG_BUILTIN': bpftool generated vmlinux.h is missing high bits for 64bit enums, upgrade clang and pahole
        _Static_assert(SCX_DSQ_FLAG_BUILTIN,
                       ^~~~~~~~~~~~~~~~~~~~
1 error generated.
```

This means you built the kernel or the schedulers with an older version of
clang than what's supported (i.e. older than 16.0.0). To remediate this:

1. Run `which clang` and `clang --version` to make sure you're using a
   sufficiently new version of clang.

2. Run `make fullclean` in the root path of the repository.

3. Rebuild the kernel, and then your example schedulers.

The schedulers are also cleaned if you invoke `make mrproper` in the root
directory of the tree.

### Stale kernel build / incomplete vmlinux.h file

As described above, you'll need a `vmlinux.h` file that was generated from a
vmlinux built with BTF, and with sched_ext support enabled. If you don't,
you'll see errors such as the following which indicate that a type being
referenced in a scheduler is unknown:

```
/path/to/sched_ext/tools/sched_ext/user_exit_info.h:25:23: note: forward declaration of 'struct scx_exit_info'

const struct scx_exit_info *ei)

^
```

In order to resolve this, please follow the steps above in
[Getting a vmlinux.h file](#getting-a-vmlinuxh-file) in order to ensure your
schedulers are using a vmlinux.h file that includes the requisite types.

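Before rebuilding anything, it can help to check whether the generated header
actually defines the sched_ext types. The following is a minimal sketch:
`missing_scx_types` is a hypothetical helper name, only two representative
types are checked, and it assumes that `bpftool` writes full definitions as
`struct NAME {` at the start of a line.

```bash
# missing_scx_types FILE: print each sched_ext type that has no full
# struct definition in FILE (a generated vmlinux.h).
missing_scx_types() {
	for t in sched_ext_ops scx_exit_info; do
		grep -q "^struct $t {" "$1" 2>/dev/null || echo "struct $t"
	done
}

# Check the vmlinux.h in the current directory, if there is one.
if [ -f vmlinux.h ]; then
	missing_scx_types vmlinux.h
fi
```

Any type printed here suggests that the header was generated from a vmlinux
without sched_ext support.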
## Misc

### llvm: [OFF]

You may see the following output when building the schedulers:

```
Auto-detecting system features:
...                         clang-bpf-co-re: [ on  ]
...                                    llvm: [ OFF ]
...                                  libcap: [ on  ]
...                                  libbfd: [ on  ]
```

Seeing `llvm: [ OFF ]` here is not an issue. You can safely ignore it.
                                                      
