.. SPDX-License-Identifier: GPL-2.0

============================================================
Intel(R) Speed Select Technology User Guide
============================================================

The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
collection of features that give more granular control over CPU performance.
With Intel(R) SST, one server can be configured for power and performance for a
variety of diverse workload requirements.

Refer to the links below for an overview of the technology:

- https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
- https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf

These capabilities are further enhanced in some of the newer generations of
server platforms where these features can be enumerated and controlled
dynamically without pre-configuring via BIOS setup options. This dynamic
configuration is done via mailbox commands to the hardware. One way to enumerate
and configure these features is by using the Intel Speed Select utility.

This document explains how to use the Intel Speed Select tool to enumerate and
control Intel(R) SST features. This document gives example commands and explains
how these commands change the power and performance profile of the system under
test. Using this tool as an example, customers can replicate the messaging
implemented in the tool in their production software.

intel-speed-select configuration tool
======================================

Most Linux distributions may include the "intel-speed-select" tool as a package.
If not, it can be built by downloading the Linux kernel tree from kernel.org. Once
downloaded, the tool can be built without building the full kernel.

From the kernel tree, run the following commands::

 # cd tools/power/x86/intel-speed-select/
 # make
 # make install

Getting Help
------------

To get help with the tool, execute the command below::

 # intel-speed-select --help

The top-level help describes arguments and features. Notice that there is a
multi-level help structure in the tool.
For example, to get help for the feature "perf-profile"::

 # intel-speed-select perf-profile --help

To get help on a command, another level of help is provided. For example, for
the command "info"::

 # intel-speed-select perf-profile info --help

Summary of platform capability
------------------------------

To check the current platform and driver capabilities, execute::

 # intel-speed-select --info

For example on a test system::

 # intel-speed-select --info
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Platform: API version : 1
 Platform: Driver version : 1
 Platform: mbox supported : 1
 Platform: mmio supported : 1
 Intel(R) SST-PP (feature perf-profile) is supported
 TDP level change control is unlocked, max level: 4
 Intel(R) SST-TF (feature turbo-freq) is supported
 Intel(R) SST-BF (feature base-freq) is not supported
 Intel(R) SST-CP (feature core-power) is supported

Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
------------------------------------------------------------------------

This feature allows configuration of a server dynamically based on workload
performance requirements. This helps users during deployment as they do not have
to choose a specific server configuration statically. This Intel(R) Speed Select
Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
that allows multiple optimized performance profiles per system. Each profile
defines a set of CPUs that need to be online, with the rest offline, to sustain a
guaranteed base frequency. Once the user issues a command to use a specific
performance profile and meets the CPU online/offline requirement, the user can
expect a change in the base frequency dynamically. This feature is called
"perf-profile" when using the Intel Speed Select tool.

Number of performance levels
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There can be multiple performance profiles on a system.
To get the number of
profiles, execute the command below::

 # intel-speed-select perf-profile get-config-levels
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       get-config-levels:4
 package-1
   die-0
     cpu-14
       get-config-levels:4

On this system under test, there are 4 performance profiles in addition to the
base performance profile (which is performance level 0).

Lock/Unlock status
~~~~~~~~~~~~~~~~~~

Even if there are multiple performance profiles, it is possible that they
are locked. If they are locked, users cannot issue a command to change the
performance state. There may be a BIOS setup option to unlock them; otherwise,
check with your system vendor.

To check if the system is locked, execute the following command::

 # intel-speed-select perf-profile get-lock-status
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       get-lock-status:0
 package-1
   die-0
     cpu-14
       get-lock-status:0

In this case, lock status is 0, which means that the system is unlocked.

Properties of a performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the properties of a specific performance level (for example, level 0),
execute the command below::

 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       perf-profile-level-0
         cpu-count:28
         enable-cpu-mask:000003ff,f0003fff
         enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
         thermal-design-power-ratio:26
         base-frequency(MHz):2600
         speed-select-turbo-freq:disabled
         speed-select-base-freq:disabled
         ...
         ...
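
The "enable-cpu-mask" and "enable-cpu-list" fields describe the same set of
CPUs. The sketch below decodes a mask into a list, assuming the mask is printed
as comma-separated 32-bit hexadecimal words with the most significant word
first, which is consistent with the list shown above. The function name
mask_to_cpu_list is hypothetical and is not part of the tool.

```shell
#!/bin/sh
# Decode a comma-separated hex CPU mask into a comma-separated CPU list.
# Assumption: 32-bit words, most significant word first.
mask_to_cpu_list() {
    mask=$1
    cpu_base=0
    list=""
    # Walk the words from the rightmost (least significant) to the leftmost.
    while [ -n "$mask" ]; do
        word=${mask##*,}                 # take the last word
        case $mask in
            *,*) mask=${mask%,*} ;;      # drop the word just taken
            *)   mask="" ;;              # that was the final word
        esac
        value=$((0x$word))
        bit=0
        while [ "$bit" -lt 32 ]; do
            if [ $(( ($value >> $bit) & 1 )) -eq 1 ]; then
                list="${list:+$list,}$(($cpu_base + $bit))"
            fi
            bit=$(($bit + 1))
        done
        cpu_base=$(($cpu_base + 32))
    done
    echo "$list"
}

# Mask of performance level 0 from the output above.
mask_to_cpu_list 000003ff,f0003fff
```

For the level 0 mask this prints CPUs 0 to 13 and 28 to 41, matching the
"enable-cpu-list" field.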

Here the -l option is used to specify a performance level.

If the option -l is omitted, then this command will print information about all
the performance levels. The above command is printing properties of the
performance level 0.

For this performance profile, at most the CPUs listed in
"enable-cpu-mask/enable-cpu-list" can be "online". When that condition is met,
a base frequency of 2600 MHz can be maintained. To understand more, execute
"intel-speed-select perf-profile info" for performance level 4::

 # intel-speed-select perf-profile info -l 4
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       perf-profile-level-4
         cpu-count:28
         enable-cpu-mask:000000fa,f0000faf
         enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
         thermal-design-power-ratio:28
         base-frequency(MHz):2800
         speed-select-turbo-freq:disabled
         speed-select-base-freq:unsupported
         ...
         ...

There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list".
Consequently, if
the user only keeps these CPUs online and the rest "offline", then the base
frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.

Get current performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the current performance level, execute::

 # intel-speed-select perf-profile get-config-current-level
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       get-config-current_level:0

First verify that the base_frequency displayed by the cpufreq sysfs is correct::

 # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
 2600000

This matches the base-frequency(MHz) field value displayed by the
"perf-profile info" command for performance level 0 (the cpufreq frequency is
in kHz).
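
The unit conversion behind this comparison can be sketched as follows. The
helper names khz_to_mhz and base_freq_matches are hypothetical, and the values
are copied from the outputs above rather than read from a live system.

```shell
#!/bin/sh
# Convert a cpufreq value in kHz to MHz.
khz_to_mhz() {
    echo $(( $1 / 1000 ))
}

# Succeed when a cpufreq base_frequency value (kHz, first argument) equals
# the "base-frequency(MHz)" field of "perf-profile info" (second argument).
base_freq_matches() {
    [ "$(khz_to_mhz "$1")" -eq "$2" ]
}

# On a live system the first argument could instead be read from
# /sys/devices/system/cpu/cpu0/cpufreq/base_frequency.
if base_freq_matches 2600000 2600; then
    echo "base frequency matches performance level 0"
else
    echo "base frequency does not match"
fi
```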

To check if the average frequency is equal to the base frequency for a 100% busy
workload, disable turbo::

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Then run a busy workload on all CPUs, for example::

 # stress -c 64

To verify the base frequency, run turbostat::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1

 Package Core    CPU     Bzy_MHz
         -       -       2600
 0       0       0       2600
 0       1       1       2600
 0       2       2       2600
 0       3       3       2600
 0       4       4       2600
 .       .       .       .


Changing performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To change the performance level to 4, execute::

 # intel-speed-select -d perf-profile set-config-level -l 4 -o
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       perf-profile
         set_tdp_level:success

In the command above, "-o" is optional. If it is specified, then it will also
offline CPUs which are not present in the enable_cpu_mask for this performance
level.
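
As a rough preview of which package 0 CPUs fall outside the level 4 mask, the
two "enable-cpu-list" values shown earlier can be compared. This is only an
illustrative sketch: the helper name cpu_list_diff is hypothetical, and the
tool performs the actual offlining itself when "-o" is given.

```shell
#!/bin/sh
# Print the CPUs that appear in the first comma-separated list but not in
# the second one.
cpu_list_diff() {
    keep=",$2,"
    result=""
    for cpu in $(printf '%s' "$1" | tr ',' ' '); do
        case $keep in
            *",$cpu,"*) ;;                          # still enabled
            *) result="${result:+$result,}$cpu" ;;  # missing from second list
        esac
    done
    echo "$result"
}

# enable-cpu-list values of level 0 and level 4 from the outputs above.
level0=0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
level4=0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39

cpu_list_diff "$level0" "$level4"
```

With the lists above, the result is 4,6,12,13,32,34,40,41.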

Now if the base_frequency is checked::

 # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
 2800000

This shows that the base frequency has now increased from 2600 MHz at
performance level 0 to 2800 MHz at performance level 4. As a result, any
workload that can use fewer CPUs can see a boost of 200 MHz compared to
performance level 0.

Changing performance level via BMC Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is possible to change the SST-PP level using an out of band (OOB) agent (via
some remote management console, through the BMC "Baseboard Management
Controller" interface). This mode is supported from the Sapphire Rapids
processor generation. The kernel and tool changes to support this mode were
added in Linux kernel version 5.18. To enable this feature, kernel config
"CONFIG_INTEL_HFI_THERMAL" is required. The minimum version of the tool that
supports this feature is "v1.12", which is part of Linux kernel version 5.18.

To support such configuration, this tool can be used as a daemon.
Add the command line option --oob::

 # intel-speed-select --oob
 Intel(R) Speed Select Technology
 Executing on CPU model:143[0x8f]
 OOB mode is enabled and will run as daemon

In this mode the tool will online/offline CPUs based on the new performance
level.

Check presence of other Intel(R) SST features
---------------------------------------------

Each of the performance profiles also specifies whether there is support for
the other two Intel(R) SST features (Intel(R) Speed Select Technology - Base
Frequency (Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo
Frequency (Intel(R) SST-TF)).

For example, from the output of "perf-profile info" above, for level 0 and level
4:

For level 0::

 speed-select-turbo-freq:disabled
 speed-select-base-freq:disabled

For level 4::

 speed-select-turbo-freq:disabled
 speed-select-base-freq:unsupported

Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
changed from "disabled" to "unsupported" compared to performance level 0.

This means that at performance level 4, the "speed-select-base-freq" feature is
not supported. However, at performance level 0, this feature is "supported", but
currently "disabled", meaning the user has not activated this feature. In
contrast, "speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both
performance levels, but currently not activated by the user.

The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
is supported on a platform.
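
A script that needs this information could read the status fields from saved
"perf-profile info" output. The following is a minimal sketch; the function
names feature_status and describe_status are hypothetical, and the "enabled"
status string is an assumption, since only "disabled" and "unsupported" appear
in the outputs above.

```shell
#!/bin/sh
# Print the status of one field (for example "speed-select-base-freq")
# from "perf-profile info" output read on stdin.
feature_status() {
    awk -F: -v key="$1" '
        { gsub(/^[ \t]+/, "", $1) }     # strip leading indentation
        $1 == key { print $2; exit }    # first match wins
    '
}

# Map a status string to the meaning described in the text above.
describe_status() {
    case $1 in
        enabled)     echo "supported and active" ;;  # assumed status string
        disabled)    echo "supported but not activated" ;;
        unsupported) echo "not supported at this level" ;;
        *)           echo "unknown status: $1" ;;
    esac
}

# Sample input copied from the level 4 output above.
level4='speed-select-turbo-freq:disabled
speed-select-base-freq:unsupported'

status=$(printf '%s\n' "$level4" | feature_status speed-select-base-freq)
describe_status "$status"
```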

Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP)
---------------------------------------------------------------

Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) is an interface that
allows users to define per core priority. This defines a mechanism to distribute
power among cores when there is a power constrained scenario. This defines a
class of service (CLOS) configuration.

The user can configure up to 4 class of service configurations. Each CLOS group
configuration allows definitions of parameters, which affect how the frequency
can be limited and how power is distributed. Each CPU core can be tied to a class
of service and hence an associated priority. The granularity is at the core
level, not at the per CPU level.

Enable CLOS based prioritization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To use the CLOS based prioritization feature, firmware must be informed to enable
and use a priority type. There is a default per platform priority type, which
can be changed with an optional command line parameter.

To enable and check the options, execute::

 # intel-speed-select core-power enable --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Enable core-power for a package/die
         Clos Enable: Specify priority type with [--priority|-p]
                  0: Proportional, 1: Ordered

There are two priority types:

- Ordered

Priority for ordered throttling is defined based on the index of the assigned
CLOS group, where CLOS0 gets the highest priority (throttled last).

Priority order is:
CLOS0 > CLOS1 > CLOS2 > CLOS3.

- Proportional

When proportional priority is used, there is an additional parameter called
frequency_weight, which can be specified per CLOS group. The goal of
proportional priority is to provide each core with the requested minimum, then
distribute all remaining (excess/deficit) budgets in proportion to a defined
weight. This proportional priority can be configured using the "core-power
config" command.

To enable with the platform default priority type, execute::

 # intel-speed-select core-power enable
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       core-power
         enable:success
 package-1
   die-0
     cpu-6
       core-power
         enable:success

The scope of this enable is per package, or per die when a package contains
multiple dies. To check if CLOS is enabled and get the priority type, the
"core-power info" command can be used. For example, to check the status of the
core-power feature on CPU 0, execute::

 # intel-speed-select -c 0 core-power info
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       core-power
         support-status:supported
         enable-status:enabled
         clos-enable-status:enabled
         priority-type:proportional
 package-1
   die-0
     cpu-24
       core-power
         support-status:supported
         enable-status:enabled
         clos-enable-status:enabled
         priority-type:proportional

Configuring CLOS groups
~~~~~~~~~~~~~~~~~~~~~~~

Each CLOS group has its own attributes including min, max, freq_weight and
desired. These parameters can be configured with the "core-power config" command.
Defaults are used for any parameter the user does not set, except the clos id,
which is mandatory. To check core-power config options, execute::

 # intel-speed-select core-power config --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Set core-power configuration for one of the four clos ids
         Specify targeted clos id with [--clos|-c]
         Specify clos Proportional Priority [--weight|-w]
         Specify clos min in MHz with [--min|-n]
         Specify clos max in MHz with [--max|-m]

For example::

 # intel-speed-select core-power config -c 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 clos epp is not specified, default: 0
 clos frequency weight is not specified, default: 0
 clos min is not specified, default: 0 MHz
 clos max is not specified, default: 25500 MHz
 clos desired is not specified, default: 0
 package-0
   die-0
     cpu-0
       core-power
         config:success
 package-1
   die-0
     cpu-6
       core-power
         config:success

The user has the option to change defaults. For example, the user can set
"min" to the base frequency so that the guaranteed base frequency is always
available.

Get the current CLOS configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To check the current configuration, "core-power get-config" can be used. For
example, to get the configuration of CLOS 0::

 # intel-speed-select core-power get-config -c 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-0
       core-power
         clos:0
         epp:0
         clos-proportional-priority:0
         clos-min:0 MHz
         clos-max:Max Turbo frequency
         clos-desired:0 MHz
 package-1
   die-0
     cpu-24
       core-power
         clos:0
         epp:0
         clos-proportional-priority:0
         clos-min:0 MHz
         clos-max:Max Turbo frequency
         clos-desired:0 MHz

Associating a CPU with a CLOS group
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To associate a CPU with a CLOS group, the "core-power assoc" command can be used::

 # intel-speed-select core-power assoc --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Associate a clos id to a CPU
         Specify targeted clos id with [--clos|-c]


For example, to associate CPU 10 with CLOS group 3, execute::

 # intel-speed-select -c 10 core-power assoc -c 3
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
   die-0
     cpu-10
       core-power
         assoc:success

Once a CPU is associated, its sibling CPUs are also associated with the same
CLOS group. Once associated, avoid changing Linux "cpufreq" subsystem scaling
frequency limits.

To check the existing association for a CPU, the "core-power get-assoc" command
can be used.
For example, to get the association of CPU 10, execute::

 # intel-speed-select -c 10 core-power get-assoc
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-1
  die-0
    cpu-10
      get-assoc
        clos:3

This shows that CPU 10 is part of CLOS group 3.


Disable CLOS based prioritization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To disable, execute::

 # intel-speed-select core-power disable

Some features, such as Intel(R) SST-TF, can only be enabled while CLOS based
prioritization is enabled. Disabling CLOS based prioritization while
Intel(R) SST-TF is enabled can cause Intel(R) SST-TF to fail, so the "disable"
command displays an error if Intel(R) SST-TF is still enabled. In that case,
disable the Intel(R) SST-TF feature first.

Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
-------------------------------------------------------------------

The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
the user control base frequency.
If some critical workload threads demand
constant high guaranteed performance, then this feature can be used to execute
the thread at higher base frequency on specific sets of CPUs (high priority
CPUs) at the cost of lower base frequency (low priority CPUs) on other CPUs.
This feature does not require taking the low priority CPUs offline.

The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
Performance Profile (Intel(R) SST-PP) performance level configuration. It is
possible that only certain performance levels support Intel(R) SST-BF. It is also
possible that only the base performance level (level = 0) supports
Intel(R) SST-BF. Consequently, first select the desired performance level to
enable this feature.

In the system under test here, Intel(R) SST-BF is supported at the base
performance level 0, but currently disabled. For example, for level 0::

 # intel-speed-select -c 0 perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...

        speed-select-base-freq:disabled
        ...

Before enabling Intel(R) SST-BF and measuring its impact on workload
performance, run the workload once and record its performance as a baseline to
compare against.

Here the user wants more guaranteed performance. For this reason, it is likely
that turbo is disabled. To disable turbo, execute::

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Based on the output of "intel-speed-select perf-profile info -l 0", the
guaranteed base frequency is 2600 MHz.


Measure baseline performance for comparison
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To compare, pick a multi-threaded workload where each thread can be scheduled on
separate CPUs. The "Hackbench pipe" test is a good example of how to improve
performance using Intel(R) SST-BF.
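
Before recording the baseline, it may be worth confirming that turbo is really
off. The following is a minimal sketch, assuming the intel_pstate driver is in
use (it reads the same sysfs file that is written above). The function name
"turbo_is_disabled" and its optional root argument are illustrative only and
are not part of any tool:

```shell
# Return success when intel_pstate reports turbo as disabled.
# The optional argument is a hypothetical hook that overrides the sysfs
# root, which makes the function easy to try against a scratch directory.
turbo_is_disabled() {
    root="${1:-/sys}"
    file="$root/devices/system/cpu/intel_pstate/no_turbo"
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
}
```

For example, "turbo_is_disabled && echo turbo is off" prints a message only
when the no_turbo file contains 1.
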

Below, the workload is measuring average scheduler wakeup latency, so a lower
number means better performance::

 # taskset -c 3,4 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
      Total time: 6.102 [sec]
      6.102445 usecs/op
      163868 ops/sec

While the above test is running, turbostat output shows that two of the CPUs
are busy and reaching the max. frequency (which is the base frequency, because
turbo is disabled). The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 0        0     0    1000
 0        1     1    1005
 0        2     2    1000
 0        3     3    2600
 0        4     4    2600
 0        5     5    1000
 0        6     6    1000
 0        7     7    1005
 0        8     8    1005
 0        9     9    1000
 0        10    10   1000
 0        11    11   995
 0        12    12   1000
 0        13    13   1000

From the above turbostat output, both CPU 3 and CPU 4 are very busy and reach
the full guaranteed frequency of 2600 MHz.
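
Picking the busy CPUs out of a longer turbostat capture by eye gets tedious.
As a sketch only, the helper below filters saved turbostat output, assuming
exactly the four-column layout requested above (Package, Core, CPU, Bzy_MHz)
with one header line. The name "busy_cpus" is made up for this example:

```shell
# Print the CPU numbers whose Bzy_MHz is at or above a threshold.
# Reads turbostat output on stdin; the first line is treated as the header
# and skipped. Column 3 is the CPU number, column 4 is Bzy_MHz.
busy_cpus() {
    awk -v min="$1" 'NR > 1 && $4 + 0 >= min + 0 { print $3 }'
}
```

With the output above saved to a file, "busy_cpus 2600 < turbostat.txt"
prints 3 and 4, one per line.
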

Intel(R) SST-BF Capabilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get capabilities of Intel(R) SST-BF for the current performance level 0,
execute::

 # intel-speed-select base-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-base-freq
        high-priority-base-frequency(MHz):3000
        high-priority-cpu-mask:00000216,00002160
        high-priority-cpu-list:5,6,8,13,33,34,36,41
        low-priority-base-frequency(MHz):2400
        tjunction-temperature(C):125
        thermal-design-power(W):205

The above capabilities show that some CPUs on this system can offer a base
frequency of 3000 MHz, compared to the standard base frequency at this
performance level. These CPUs are fixed, and they are presented via
high-priority-cpu-list/high-priority-cpu-mask. If the Intel(R) SST-BF
feature is selected, the low priority CPUs (which are not in
high-priority-cpu-list) can only offer up to 2400 MHz.
As a result, if this
clipping of low priority CPUs is acceptable, then the user can enable the
Intel(R) SST-BF feature. It suits the above "sched pipe" workload in
particular: since only two CPUs are used, both threads can be scheduled on high
priority CPUs and get a boost of 400 MHz.

Enable Intel(R) SST-BF
~~~~~~~~~~~~~~~~~~~~~~

To enable Intel(R) SST-BF feature, execute::

 # intel-speed-select base-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      base-freq
        enable:success
 package-1
  die-0
    cpu-14
      base-freq
        enable:success

In this case, the -a option is optional. It not only enables Intel(R) SST-BF, but
also adjusts the priority of cores using Intel(R) Speed Select Technology Core
Power (Intel(R) SST-CP) features. This option sets the minimum performance of each
Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP) class to
maximum performance so that the hardware will give maximum performance possible
for each CPU.
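
Whether or not -a is used, the workload only benefits when it is pinned to
CPUs from the high priority set reported by "base-freq info". That output
gives the set both as a list and as a mask. The sketch below decodes a mask
value, assuming it is a comma-separated sequence of 32-bit hexadecimal words
with the most significant word first. The function name "mask_to_cpus" is
hypothetical:

```shell
# Expand a comma-separated hex CPU mask (32-bit words, most significant
# word first) into a sorted, comma-separated list of CPU numbers.
mask_to_cpus() {
    words=$(echo "$1" | tr ',' ' ')
    # Count the words so the first one gets the highest word index.
    n=0
    for w in $words; do n=$((n + 1)); done
    idx=$((n - 1))
    list=""
    for w in $words; do
        bit=0
        while [ "$bit" -lt 32 ]; do
            # Test one bit of the current word.
            if [ $(( (0x$w >> bit) & 1 )) -eq 1 ]; then
                list="$list $((idx * 32 + bit))"
            fi
            bit=$((bit + 1))
        done
        idx=$((idx - 1))
    done
    # Sort numerically and join with commas.
    echo $list | tr ' ' '\n' | sort -n | paste -sd, -
}
```

Running "mask_to_cpus 00000216,00002160" prints 5,6,8,13,33,34,36,41, which
matches the high-priority-cpu-list value reported by "base-freq info".
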

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-BF:

- Discover Intel(R) SST-BF and note low and high priority base frequency
- Note the high priority CPU list
- Enable CLOS using core-power feature set
- Configure CLOS parameters. Use CLOS.min to set the minimum performance
- Subscribe desired CPUs to CLOS groups

With this configuration, the same workload is executed while pinned to high
priority CPUs (CPU 5 and 6 in this case)::

 # taskset -c 5,6 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
      Total time: 5.627 [sec]
      5.627922 usecs/op
      177685 ops/sec

This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
improved (latency reduced from 6.102 to 5.628 usecs/op) by about 7.8%. From the
turbostat output, it can be observed that the high priority CPUs reached
3000 MHz compared to 2600 MHz.
The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 0        0     0    2151
 0        1     1    2166
 0        2     2    2175
 0        3     3    2175
 0        4     4    2175
 0        5     5    3000
 0        6     6    3000
 0        7     7    2180
 0        8     8    2662
 0        9     9    2176
 0        10    10   2175
 0        11    11   2176
 0        12    12   2176
 0        13    13   2661

Disable Intel(R) SST-BF
~~~~~~~~~~~~~~~~~~~~~~~

To disable the Intel(R) SST-BF feature, execute::

 # intel-speed-select base-freq disable -a


Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
--------------------------------------------------------------------

This feature enables the ability to set different "All core turbo ratio limits"
to cores based on the priority. By using this feature, some cores can be
configured to get higher turbo frequency by designating them as high priority at
the cost of lower or no turbo frequency on the low priority cores.

For this reason, this feature is only useful when the system is busy utilizing
all CPUs, but the user wants some configurable option to get high performance on
some CPUs.

The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
depends on the Intel(R) Speed Select Technology - Performance Profile
(Intel(R) SST-PP) performance level configuration. It is possible that only a
certain performance level supports Intel(R) SST-TF. It is also possible that only
the base performance level (level = 0) supports Intel(R) SST-TF. Hence, first
select the desired performance level to enable this feature.

In the system under test here, Intel(R) SST-TF is supported at the base
performance level 0, but currently disabled::

 # intel-speed-select -c 0 perf-profile info -l 0
 Intel(R) Speed Select Technology
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        speed-select-turbo-freq:disabled
        ...
        ...


To check if performance can be improved using the Intel(R) SST-TF feature, get
the turbo frequency properties with Intel(R) SST-TF enabled and compare them to
the base turbo capability of this system.

Get Base turbo capability
~~~~~~~~~~~~~~~~~~~~~~~~~

To get the base turbo capability of performance level 0, execute::

 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        turbo-ratio-limits-sse
          bucket-0
            core-count:2
            max-turbo-frequency(MHz):3200
          bucket-1
            core-count:4
            max-turbo-frequency(MHz):3100
          bucket-2
            core-count:6
            max-turbo-frequency(MHz):3100
          bucket-3
            core-count:8
            max-turbo-frequency(MHz):3100
          bucket-4
            core-count:10
            max-turbo-frequency(MHz):3100
          bucket-5
            core-count:12
            max-turbo-frequency(MHz):3100
          bucket-6
            core-count:14
            max-turbo-frequency(MHz):3100
          bucket-7
            core-count:16
            max-turbo-frequency(MHz):3100

Based on the data above, when all the CPUs are busy, a max. frequency of 3100
MHz can be achieved. Suppose there is some busy workload on CPUs 0-11 (e.g.
stress), and the "hackbench pipe" workload is executed on CPU 12 and 13::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
      Total time: 5.705 [sec]
      5.705488 usecs/op
      175269 ops/sec

The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 0        0     0    3000
 0        1     1    3000
 0        2     2    3000
 0        3     3    3000
 0        4     4    3000
 0        5     5    3100
 0        6     6    3100
 0        7     7    3000
 0        8     8    3100
 0        9     9    3000
 0        10    10   3000
 0        11    11   3000
 0        12    12   3100
 0        13    13   3100

Based on the turbostat output, the performance is limited by the frequency cap
of 3100 MHz.
To check if the hackbench performance can be improved for CPU 12 and CPU
13, first check the capability of the Intel(R) SST-TF feature for this performance
level.

Get Intel(R) SST-TF Capability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the capability, the "turbo-freq info" command can be used::

 # intel-speed-select turbo-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-turbo-freq
          bucket-0
            high-priority-cores-count:2
            high-priority-max-frequency(MHz):3200
            high-priority-max-avx2-frequency(MHz):3200
            high-priority-max-avx512-frequency(MHz):3100
          bucket-1
            high-priority-cores-count:4
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          bucket-2
            high-priority-cores-count:6
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          speed-select-turbo-freq-clip-frequencies
            low-priority-max-frequency(MHz):2600
            low-priority-max-avx2-frequency(MHz):2400
            low-priority-max-avx512-frequency(MHz):2100

Based on the output above, there is an Intel(R) SST-TF bucket with two high
priority cores. If only two cores are set as high priority, then the max.
turbo frequency on those cores can be increased to 3200 MHz. This is 100 MHz
more than the base turbo capability for all cores.

In turn, for the hackbench workload, two CPUs can be set as high priority and
the rest as low priority. One side effect is that once enabled, the low priority
cores will be clipped to a lower frequency of 2600 MHz.
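
Comparing buckets against the all-core turbo limit can be done mechanically.
The following sketch pulls the core count and non-AVX maximum frequency of each
bucket out of saved "turbo-freq info" output. It assumes the field names appear
exactly as in the output above, and the name "tf_buckets" is made up for this
example:

```shell
# Print "<high-priority-core-count> <max-frequency-MHz>" for each SST-TF
# bucket found in "turbo-freq info" output supplied on stdin.
# Only the non-AVX high priority frequency is reported; the avx2/avx512
# and low priority lines do not match the second pattern.
tf_buckets() {
    awk -F: '
        /high-priority-cores-count:/          { count = $2 }
        /high-priority-max-frequency\(MHz\):/ { print count + 0, $2 + 0 }
    '
}
```

With the output above saved to a file, "tf_buckets < tf-info.txt" prints
"2 3200", "4 3100" and "6 3100" on separate lines, so only the two-core bucket
exceeds the 3100 MHz all-core limit.
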

Enable Intel(R) SST-TF
~~~~~~~~~~~~~~~~~~~~~~

To enable Intel(R) SST-TF, execute::

 # intel-speed-select -c 12,13 turbo-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-12
      turbo-freq
        enable:success
 package-0
  die-0
    cpu-13
      turbo-freq
        enable:success
 package--1
  die-0
    cpu-63
      turbo-freq --auto
        enable:success

In this case, the option "-a" is optional. If set, it enables the Intel(R) SST-TF
feature and also sets the CPUs to high and low priority using Intel(R) Speed
Select Technology Core Power (Intel(R) SST-CP) features. The CPU numbers passed
with the "-c" argument are marked as high priority, including their siblings.
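
Because the siblings of each listed CPU are marked as high priority as well, it
can help to check which logical CPUs share a core before choosing the "-c"
list. A minimal sketch, assuming the standard Linux sysfs CPU topology files
are available. The function name "cpu_siblings" and its optional root argument
are illustrative only:

```shell
# Print the hardware thread siblings of a CPU as reported by sysfs.
# The second argument is a hypothetical hook that overrides the sysfs root.
cpu_siblings() {
    root="${2:-/sys}"
    cat "$root/devices/system/cpu/cpu$1/topology/thread_siblings_list"
}
```

For example, "cpu_siblings 12" prints the logical CPUs that share a core with
CPU 12, including CPU 12 itself.
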

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-TF:

- Discover Intel(R) SST-TF and note buckets of high priority cores and maximum frequency

- Enable CLOS using core-power feature set

- Configure CLOS parameters

- Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency

When the same hackbench workload is executed, schedule the hackbench threads on
the high priority CPUs::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
      Total time: 5.510 [sec]
      5.510165 usecs/op
      180826 ops/sec

This improves performance by around 3.4% (latency reduced from 5.705 to 5.510
usecs/op) on a busy system. Here the turbostat output shows that CPU 12 and
CPU 13 are getting a 100 MHz boost. The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 ...
 0        12    12   3200
 0        13    13   3200
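
The improvement figures in this guide can be reproduced from the "usecs/op"
values that perf bench reports. The helper below is only a sketch of that
arithmetic, and the name "latency_gain" is made up for this example:

```shell
# Percentage reduction in latency when going from "before" to "after",
# both given in usecs/op, printed with two decimal places.
latency_gain() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (before - after) * 100 / before }'
}
```

For the Intel(R) SST-TF runs, "latency_gain 5.705488 5.510165" prints 3.42.
For the Intel(R) SST-BF runs, "latency_gain 6.102445 5.627922" prints 7.78.
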