Linux/Documentation/admin-guide/pm/intel_pstate.rst


Differences between /Documentation/admin-guide/pm/intel_pstate.rst (Version linux-6.12-rc7) and /Documentation/admin-guide/pm/intel_pstate.rst (Version unix-v6-master)


  1 .. SPDX-License-Identifier: GPL-2.0               
  2 .. include:: <isonum.txt>                         
  3                                                   
  4 ===============================================
  5 ``intel_pstate`` CPU Performance Scaling Driver
  6 ===============================================
  7                                                   
  8 :Copyright: |copy| 2017 Intel Corporation         
  9                                                   
 10 :Author: Rafael J. Wysocki <rafael.j.wysocki@in    
 11                                                   
 12                                                   
 13 General Information                               
 14 ===================                               
 15                                                   
 16 ``intel_pstate`` is a part of the                 
 17 :doc:`CPU performance scaling subsystem <cpufr    
 18 (``CPUFreq``).  It is a scaling driver for the    
 19 generations of Intel processors.  Note, howeve    
 20 may not be supported.  [To understand ``intel_    
 21 how ``CPUFreq`` works in general, so this is t    
 22 Documentation/admin-guide/pm/cpufreq.rst if yo    
 23                                                   
 24 For the processors supported by ``intel_pstate    
 25 than just an operating frequency or an operati    
 26 LinuxCon Europe 2015 presentation by Kristen A    
 27 information about that).  For this reason, the    
 28 by ``intel_pstate`` internally follows the har    
 29 refer to Intel Software Developer’s Manual [    
 30 uses frequencies for identifying operating per    
 31 frequencies are involved in the user space int    
 32 ``intel_pstate`` maps its internal representat    
 33 (fortunately, that mapping is unambiguous).  A    
 34 practical for ``intel_pstate`` to supply the `    
 35 available frequencies due to the possible size    
 36 that.  Some functionality of the core is limit    
 37                                                   
 38 Since the hardware P-state selection interface    
 39 available at the logical CPU level, the driver    
 40 CPUs.  Consequently, if ``intel_pstate`` is in    
 41 object corresponds to one logical CPU and ``CP    
 42 equivalent to CPUs.  In particular, this means    
 43 time the corresponding CPU is taken offline an    
 44 it goes back online.                              
 45                                                   
 46 ``intel_pstate`` is not modular, so it cannot     
 47 only way to pass early-configuration-time para    
 48 command line.  However, its configuration can     
 49 great extent.  In some configurations it even     
 50 ``sysfs`` which allows another ``CPUFreq`` sca    
 51 registered (see `below <status_attr_>`_).         
 52                                                   
 53                                                   
 54 Operation Modes                                   
 55 ===============                                   
 56                                                   
 57 ``intel_pstate`` can operate in two different     
 58 active mode, it uses its own internal performa    
 59 allows the hardware to do performance scaling     
 60 mode it responds to requests made by a generic    
 61 a certain performance scaling algorithm.  Whic    
 62 depends on what kernel command line options ar    
 63 the processor.                                    
 64                                                   
 65 Active Mode                                       
 66 -----------                                       
 67                                                   
 68 This is the default operation mode of ``intel_    
 69 hardware-managed P-states (HWP) support.  If i    
 70 ``scaling_driver`` policy attribute in ``sysfs    
 71 contains the string "intel_pstate".               
 72                                                   
 73 In this mode the driver bypasses the scaling g    
 74 provides its own scaling algorithms for P-stat    
 75 can be applied to ``CPUFreq`` policies in the     
 76 governors (that is, through the ``scaling_gove    
 77 ``sysfs``).  [Note that different P-state sele    
 78 different policies, but that is not recommende    
 79                                                   
 80 They are not generic scaling governors, but th    
 81 names of some of those governors.  Moreover, c    
 82 do not work in the same way as the generic gov    
 83 For example, the ``powersave`` P-state selecti    
 84 ``intel_pstate`` is not a counterpart of the g    
 85 (roughly, it corresponds to the ``schedutil``     
 86                                                   
 87 There are two P-state selection algorithms pro    
 88 active mode: ``powersave`` and ``performance``    
 89 depends on whether or not the hardware-managed    
 90 enabled in the processor and possibly on the p    
 91                                                   
 92 Which of the P-state selection algorithms is u    
 93 :c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMA    
 94 Namely, if that option is set, the ``performan    
 95 default, and the other one will be used by def    
 96                                                   
 97 Active Mode With HWP                              
 98 ~~~~~~~~~~~~~~~~~~~~                              
 99                                                   
100 If the processor supports the HWP feature, it     
101 processor initialization and cannot be disable    
102 to avoid enabling it by passing the ``intel_ps    
103 kernel in the command line.                       
104                                                   
105 If the HWP feature has been enabled, ``intel_p    
106 select P-states by itself, but still it can gi    
107 internal P-state selection logic.  What those     
108 selection algorithm has been applied to the gi    
109 corresponds to).                                  
110                                                   
111 Even though the P-state selection is carried o    
112 ``intel_pstate`` registers utilization update     
113 in this mode.  However, they are not used for     
114 algorithm, but for periodic updates of the cur    
115 be made available from the ``scaling_cur_freq`    
116                                                   
117 HWP + ``performance``                             
118 .....................                             
119                                                   
120 In this configuration ``intel_pstate`` will wr    
121 Energy-Performance Preference (EPP) knob (if s    
122 Energy-Performance Bias (EPB) knob (otherwise)    
123 internal P-state selection logic is expected t    
124                                                   
125 This will override the EPP/EPB setting coming     
126 (see `Energy vs Performance Hints`_ below).  M    
127 the EPP/EPB to a value different from 0 ("perf    
128 configuration will be rejected.                   
129                                                   
130 Also, in this configuration the range of P-sta    
131 internal P-state selection logic is always res    
132 (that is, the maximum P-state that the driver     
133                                                   
134 HWP + ``powersave``                               
135 ...................                               
136                                                   
137 In this configuration ``intel_pstate`` will se    
138 Energy-Performance Preference (EPP) knob (if s    
139 Energy-Performance Bias (EPB) knob (otherwise)    
140 previously set to via ``sysfs`` (or whatever d    
141 set to by the platform firmware).  This usuall    
142 internal P-state selection logic to be less pe    
143                                                   
144 Active Mode Without HWP                           
145 ~~~~~~~~~~~~~~~~~~~~~~~                           
146                                                   
147 This operation mode is optional for processors    
148 feature or when the ``intel_pstate=no_hwp`` ar    
149 the command line.  The active mode is used in     
150 ``intel_pstate=active`` argument is passed to     
151 In this mode ``intel_pstate`` may refuse to wo    
152 recognized by it.  [Note that ``intel_pstate``    
153 any processor with the HWP feature enabled.]      
154                                                   
155 In this mode ``intel_pstate`` registers utiliz    
156 CPU scheduler in order to run a P-state select    
157 ``powersave`` or ``performance``, depending on    
158 setting in ``sysfs``.  The current CPU frequen    
159 available from the ``scaling_cur_freq`` policy    
160 periodically updated by those utilization upda    
161                                                   
162 ``performance``                                   
163 ...............                                   
164                                                   
165 Without HWP, this P-state selection algorithm     
166 the processor model and platform configuration    
167                                                   
168 It selects the maximum P-state it is allowed t    
169 ``sysfs``, every time the driver configuration    
170 (e.g. via ``sysfs``).                             
171                                                   
172 This is the default P-state selection algorith    
173 :c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMA    
174 is set.                                           
175                                                   
176 ``powersave``                                     
177 .............                                     
178                                                   
179 Without HWP, this P-state selection algorithm     
180 implemented by the generic ``schedutil`` scali    
181 utilization metric used by it is based on numb    
182 registers of the CPU.  It generally selects P-    
183 current CPU utilization.                          
184                                                   
185 This algorithm is run by the driver's utilizat    
186 given CPU when it is invoked by the CPU schedu    
187 every 10 ms.  Like in the ``performance`` case    
188 is not touched if the new P-state turns out to    
189 one.                                              
190                                                   
191 This is the default P-state selection algorith    
192 :c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMA    
193 is not set.                                       
194                                                   
195 Passive Mode                                      
196 ------------                                      
197                                                   
198 This is the default operation mode of ``intel_    
199 hardware-managed P-states (HWP) support.  It i    
200 ``intel_pstate=passive`` argument is passed to    
201 regardless of whether or not the given process    
202 ``intel_pstate=no_hwp`` setting causes the dri    
203 if it is not combined with ``intel_pstate=acti    
204 without HWP support, in this mode ``intel_psta    
205 processors that are not recognized by it if HW    
206 through the kernel command line.                  
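
The mode selection rules described above can be sketched as a small decision function. This is only an illustration, not the driver's code: the function name ``initial_mode`` and its parameters are hypothetical, and the sketch ignores the case where the driver refuses to work on an unrecognized processor.

```python
from typing import Iterable


def initial_mode(hwp_supported: bool, args: Iterable[str] = ()) -> str:
    """Guess the initial operation mode from intel_pstate= arguments.

    `args` holds the values passed as intel_pstate=<value> on the kernel
    command line, for example {"no_hwp", "active"} (hypothetical input).
    """
    args = set(args)
    # intel_pstate=passive always selects the passive mode.
    if "passive" in args:
        return "passive"
    # intel_pstate=active selects the active mode, even with no_hwp.
    if "active" in args:
        return "active"
    # Otherwise the default depends on whether HWP ends up enabled.
    hwp_enabled = hwp_supported and "no_hwp" not in args
    return "active" if hwp_enabled else "passive"
```

For instance, ``initial_mode(True, {"no_hwp"})`` yields ``"passive"``, while adding ``"active"`` to the same set yields ``"active"``.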
207                                                   
208 If the driver works in this mode, the ``scalin    
209 ``sysfs`` for all ``CPUFreq`` policies contain    
210 Then, the driver behaves like a regular ``CPUF    
211 it is invoked by generic scaling governors whe    
212 hardware in order to change the P-state of a C    
213 ``schedutil`` governor can invoke it directly     
214                                                   
215 While in this mode, ``intel_pstate`` can be us    
216 scaling governors listed by the ``scaling_avai    
217 in ``sysfs`` (and the P-state selection algori    
218 used).  Then, it is responsible for the config    
219 corresponding to CPUs and provides the ``CPUFr    
220 governors attached to the policy objects) with    
221 maximum and minimum operating frequencies supp    
222 the so-called "turbo" frequency ranges).  In o    
223 the entire range of available P-states is expo    
224 ``CPUFreq`` core.  However, in this mode the d    
225 utilization update callbacks with the CPU sche    
226 information comes from the ``CPUFreq`` core (a    
227 by the current scaling governor for the given     
228                                                   
229                                                   
230 .. _turbo:                                        
231                                                   
232 Turbo P-states Support                            
233 ======================                            
234                                                   
235 In the majority of cases, the entire range of     
236 ``intel_pstate`` can be divided into two sub-r    
237 different types of processor behavior, above a    
238 will be referred to as the "turbo threshold" i    
239                                                   
240 The P-states above the turbo threshold are ref    
241 the whole sub-range of P-states they belong to    
242 range".  These names are related to the Turbo     
243 multicore processor to opportunistically incre    
244 cores if there is enough power to do that and     
245 thermal envelope of the processor package to b    
246                                                   
247 Specifically, if software sets the P-state of     
248 (that is, above the turbo threshold), the proc    
249 performance scaling control for that core and     
250 choice going forward.  However, that permissio    
251 different processor generations.  Namely, the     
252 processors will never use any P-states above t    
253 the given core, even if it is within the turbo    
254 processor generations will take it as a licens    
255 turbo range, even above the one set by softwar    
256 processors setting any P-state from the turbo     
257 to put the given core into all turbo P-states     
258 supported one as it sees fit.                     
259                                                   
260 One important property of turbo P-states is th    
261 precisely, there is no guarantee that any CPUs    
262 those states indefinitely, because the power d    
263 package may change over time or the thermal e
264 be exceeded if a turbo P-state was used for to    
265                                                   
266 In turn, the P-states below the turbo threshol    
267 fact, if one of them is set by software, the p    
268 it to a lower one unless in a thermal stress o    
269 situation (a higher P-state may still be used     
270 the same package at the same time, for example    
271                                                   
272 Some processors allow multiple cores to be in     
273 but the maximum P-state that can be set for th    
274 of cores running concurrently.  The maximum tu    
275 cores at the same time usually is lower than t    
276 2 cores, which in turn usually is lower than t    
277 be set for 1 core.  The one-core maximum turbo    
278 supported one overall.                            
279                                                   
280 The maximum supported turbo P-state, the turbo    
281 non-turbo P-state) and the minimum supported P    
282 processor model and can be determined by readi    
283 registers (MSRs).  Moreover, some processors s    
284 (Thermal Design Power) feature and, when that     
285 threshold effectively becomes a configurable v    
286 platform firmware.                                
287                                                   
288 Unlike ``_PSS`` objects in the ACPI tables, ``    
289 the entire range of available P-states, includ    
290 ``CPUFreq`` core and (in the passive mode) to     
291 generally causes turbo P-states to be set more    
292 used relative to ACPI-based CPU performance sc    
293 for more information).                            
294                                                   
295 Moreover, since ``intel_pstate`` always knows     
296 (even if the Configurable TDP feature is enabl    
297 ``no_turbo`` attribute in ``sysfs`` (described    
298 work as expected in all cases (that is, if set    
299 always should prevent ``intel_pstate`` from us    
300                                                   
301                                                   
302 Processor Support                                 
303 =================                                 
304                                                   
305 To handle a given processor ``intel_pstate`` r    
306 pieces of information on it to be known, inclu    
307                                                   
308  * The minimum supported P-state.                 
309                                                   
310  * The maximum supported `non-turbo P-state <t    
311                                                   
312  * Whether or not turbo P-states are supported    
313                                                   
314  * The maximum supported `one-core turbo P-sta    
315    are supported).                                
316                                                   
317  * The scaling formula to translate the driver    
318    of P-states into frequencies and the other     
319                                                   
320 Generally, ways to obtain that information are    
321 or family.  Although it often is possible to o    
322 itself (using model-specific registers), there    
323 manuals need to be consulted to get to it too.    
324                                                   
325 For this reason, there is a list of supported     
326 the driver initialization will fail if the det    
327 list, unless it supports the HWP feature.  [Th    
328 information listed above is the same for all o    
329 HWP feature, which is why ``intel_pstate`` wor    
330                                                   
331                                                   
332 User Space Interface in ``sysfs``                 
333 =================================                 
334                                                   
335 Global Attributes                                 
336 -----------------                                 
337                                                   
338 ``intel_pstate`` exposes several global attrib    
339 control its functionality at the system level.    
340 ``/sys/devices/system/cpu/intel_pstate/`` dire    
341                                                   
342 Some of them are not present if the ``intel_ps    
343 argument is passed to the kernel in the comman    
344                                                   
345 ``max_perf_pct``                                  
346         Maximum P-state the driver is allowed     
347         maximum supported performance level (t    
348         P-state <turbo_>`_).                      
349                                                   
350         This attribute will not be exposed if     
351         ``intel_pstate=per_cpu_perf_limits`` a    
352         command line.                             
353                                                   
354 ``min_perf_pct``                                  
355         Minimum P-state the driver is allowed     
356         maximum supported performance level (t    
357         P-state <turbo_>`_).                      
358                                                   
359         This attribute will not be exposed if     
360         ``intel_pstate=per_cpu_perf_limits`` a    
361         command line.                             
362                                                   
363 ``num_pstates``                                   
364         Number of P-states supported by the pr    
365         inclusive) including both turbo and no    
366         `Turbo P-states Support`_).               
367                                                   
368         This attribute is present only if the     
369         for all of the CPUs in the system.        
370                                                   
371         The value of this attribute is not aff    
372         setting described `below <no_turbo_att    
373                                                   
374         This attribute is read-only.              
375                                                   
376 ``turbo_pct``                                     
377         Ratio of the `turbo range <turbo_>`_ s    
378         range of supported P-states, in percen    
379                                                   
380         This attribute is present only if the     
381         for all of the CPUs in the system.        
382                                                   
383         This attribute is read-only.              
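
As a rough illustration of how ``num_pstates`` and ``turbo_pct`` relate to the P-state boundaries, the sketch below derives both from three hypothetical inputs. The function name, the parameter names and the rounding (integer division, rounding down) are assumptions; the driver's own arithmetic may round differently.

```python
def pstate_summary(min_pstate, max_nonturbo_pstate, max_turbo_pstate):
    """Return (num_pstates, turbo_pct) for the given P-state boundaries."""
    # Number of P-states between the minimum and the maximum, inclusive.
    num_pstates = max_turbo_pstate - min_pstate + 1
    # P-states strictly above the turbo threshold form the turbo range.
    turbo_states = max_turbo_pstate - max_nonturbo_pstate
    # Share of the turbo range in the whole range, in percent (rounded down).
    turbo_pct = 100 * turbo_states // num_pstates
    return num_pstates, turbo_pct
```

With a minimum P-state of 1, a turbo threshold of 30 and a maximum turbo P-state of 40, this gives 40 P-states, a quarter of which are turbo ones.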
384                                                   
385 .. _no_turbo_attr:                                
386                                                   
387 ``no_turbo``                                      
388         If set (equal to 1), the driver is not    
389         (see `Turbo P-states Support`_).  If u    
390         default), turbo P-states can be set by    
391         [Note that ``intel_pstate`` does not s    
392         attribute (supported by some other sca    
393         by this one.]                             
394                                                   
395         This attribute does not affect the max    
396         supplied to the ``CPUFreq`` core and e    
397         but it affects the maximum possible va    
398         (see `Interpretation of Policy Attribu    
399                                                   
400 ``hwp_dynamic_boost``                             
401         This attribute is only present if ``in    
402         `active mode with the HWP feature enab    
403         the processor.  If set (equal to 1), i    
404         to be increased dynamically for a shor    
405         waiting on I/O is selected to run on a    
406         of this mechanism is to improve perfor    
407                                                   
408         This setting has no effect on logical     
409         is directly set to the highest non-tur    
410                                                   
411 .. _status_attr:                                  
412                                                   
413 ``status``                                        
414         Operation mode of the driver: "active"    
415                                                   
416         "active"                                  
417                 The driver is functional and i    
418                 <Active Mode_>`_.                 
419                                                   
420         "passive"                                 
421                 The driver is functional and i    
422                 <Passive Mode_>`_.                
423                                                   
424         "off"                                     
425                 The driver is not functional (    
426                 driver with the ``CPUFreq`` co    
427                                                   
428         This attribute can be written to in or    
429         operation mode or to unregister it.  T    
430         one of the possible values of it and,     
431         cause the driver to switch over to the    
432         that string - or to be unregistered in    
433         switching over from the active mode to    
434         way around causes the driver to be unr    
435         with a different set of callbacks, so     
436         as well as the per-policy ones) are th    
437         values, possibly depending on the targ    
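
A minimal sketch of reading the ``status`` attribute from user space follows. The directory is the one named in this section; the helper name ``read_status`` and its ``base`` parameter are hypothetical and exist only to make the example testable against another directory.

```python
from pathlib import Path

# The three values described for the ``status`` attribute.
VALID_STATUS = ("active", "passive", "off")


def read_status(base="/sys/devices/system/cpu/intel_pstate"):
    """Return the driver's operation mode as reported by ``status``."""
    value = (Path(base) / "status").read_text().strip()
    if value not in VALID_STATUS:
        raise ValueError(f"unexpected status value: {value!r}")
    return value
```

Writing one of the same strings back to the attribute (as root) requests a mode switch, which, as noted above, resets the driver's attributes.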
438                                                   
439 ``energy_efficiency``                             
440         This attribute is only present on plat    
441         Lake or Coffee Lake desktop CPU model.    
442         optimizations are disabled on these CP    
443         Enabling energy-efficiency optimizatio    
444         frequency with or without the HWP feat    
445         optimizations are done only in the tur    
446         they are done in the entire available     
447         attribute to "1" enables the energy-ef    
448         to "0" disables them.                     
449                                                   
450 Interpretation of Policy Attributes               
451 -----------------------------------               
452                                                   
453 The interpretation of some ``CPUFreq`` policy     
454 Documentation/admin-guide/pm/cpufreq.rst is sp    
455 as the current scaling driver and it generally    
456 `operation mode <Operation Modes_>`_.             
457                                                   
458 First of all, the values of the ``cpuinfo_max_    
459 ``scaling_cur_freq`` attributes are produced b    
460 multiplier to the internal P-state representat    
461 Also, the values of the ``scaling_max_freq`` a    
462 attributes are capped by the frequency corresp    
463 the driver is allowed to set.                     
464                                                   
465 If the ``no_turbo`` `global attribute <no_turb    
466 not allowed to use turbo P-states, so the maxi    
467 and ``scaling_min_freq`` is limited to the max    
468 Accordingly, setting ``no_turbo`` causes ``sca    
469 ``scaling_min_freq`` to go down to that value     
470 However, the old values of ``scaling_max_freq`    
471 restored after unsetting ``no_turbo``, unless     
472 to after ``no_turbo`` was set.                    
473                                                   
474 If ``no_turbo`` is not set, the maximum possib    
475 and ``scaling_min_freq`` corresponds to the ma    
476 which also is the value of ``cpuinfo_max_freq`    
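
The effect of ``no_turbo`` on the highest value that ``scaling_max_freq`` can take may be sketched as follows. The function and parameter names are hypothetical, and the default multiplier of 100000 kHz per P-state is an assumption chosen for illustration; the real multiplier is processor-specific.

```python
def max_scaling_freq_khz(max_nonturbo_pstate, max_turbo_pstate,
                         no_turbo, scaling_khz=100_000):
    """Upper bound for scaling_max_freq, in kHz.

    scaling_khz is an assumed P-state-to-frequency multiplier.
    """
    # With no_turbo set, the turbo range is off limits.
    top_pstate = max_nonturbo_pstate if no_turbo else max_turbo_pstate
    # Frequencies are produced by applying the multiplier to the P-state.
    return top_pstate * scaling_khz
```

Under these assumptions, a turbo threshold of 24 and a maximum turbo P-state of 39 give a bound of 2400000 kHz with ``no_turbo`` set and 3900000 kHz otherwise.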
477                                                   
478 Next, the following policy attributes have spe    
479 ``intel_pstate`` works in the `active mode <Ac    
480                                                   
481 ``scaling_available_governors``                   
482         List of P-state selection algorithms p    
483                                                   
484 ``scaling_governor``                              
485         P-state selection algorithm provided b    
486         use with the given policy.                
487                                                   
488 ``scaling_cur_freq``                              
489         Frequency of the average P-state of th    
490         policy for the time interval between t    
491         driver's utilization update callback b    
492                                                   
493 One more policy attribute is present if the HW    
494 processor:                                        
495                                                   
496 ``base_frequency``                                
497         Shows the base frequency of the CPU. A    
498         in the turbo frequency range.             
499                                                   
500 The meaning of these attributes in the `passiv    
501 same as for other scaling drivers.                
502                                                   
503 Additionally, the value of the ``scaling_drive    
504 depends on the operation mode of the driver.      
505 "intel_pstate" (in the `active mode <Active Mo    
506 `passive mode <Passive Mode_>`_).                 
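
A user-space tool could infer the operation mode from ``scaling_driver`` along these lines. The helper name is hypothetical. The passive-mode string is cut off in the text above; ``intel_cpufreq`` is the value reported by the upstream driver and is used here on that basis.

```python
def mode_from_scaling_driver(name):
    """Map a scaling_driver string to an intel_pstate operation mode.

    Returns None when another scaling driver is in use.
    """
    modes = {
        "intel_pstate": "active",    # string named in the text above
        "intel_cpufreq": "passive",  # upstream value, truncated in the text
    }
    return modes.get(name.strip())
```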
507                                                   
508 Coordination of P-State Limits                    
509 ------------------------------                    
510                                                   
511 ``intel_pstate`` allows P-state limits to be s    
512 the ``max_perf_pct`` and ``min_perf_pct`` `glo    
513 <Global Attributes_>`_ or via the ``scaling_ma    
514 ``CPUFreq`` policy attributes.  The coordinati    
515 on the following rules, regardless of the curr    
516                                                   
517  1. All CPUs are affected by the global limits    
518     requested to run faster than the global ma    
519     requested to run slower than the global mi    
520                                                   
521  2. Each individual CPU is affected by its own    
522     cannot be requested to run faster than its    
523     cannot be requested to run slower than its    
524     effective performance depends on whether t    
525     P-states, hyper-threading is enabled and o    
526     from other CPUs. When the platform doesn't sup
527     effective performance can be more than the    
528     other CPUs are requesting higher performan    
529     core P-states support, when hyper-threadin    
530     is requesting higher performance, the othe    
531     performance than their policy limits.         
532                                                   
533  3. The global and per-policy limits can be se    
534                                                   
535 In the `active mode with the HWP feature enabl    
536 resulting effective values are written into ha    
537 limits change in order to request its internal    
538 set P-states within these limits.  Otherwise,     
539 by scaling governors (in the `passive mode <Pa    
540 every time before setting a new P-state for a     
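
One way to picture the coordination rules is to intersect the global and per-policy limits. The sketch below is an assumption-laden simplification: it works in kHz rather than in the driver's internal P-state units, every name in it is hypothetical, and the final clamp (minimum never above maximum) is an assumed tie-breaking rule, not something stated above.

```python
def effective_limits_khz(policy_min, policy_max,
                         min_perf_pct, max_perf_pct, max_freq):
    """Combine global percentage limits with per-policy frequency limits.

    max_freq stands for the frequency of the highest supported turbo
    P-state, which the global percentages are relative to.
    """
    global_min = max_freq * min_perf_pct // 100
    global_max = max_freq * max_perf_pct // 100
    # A CPU can run no faster than either maximum allows...
    eff_max = min(policy_max, global_max)
    # ...and no slower than either minimum allows.
    eff_min = max(policy_min, global_min)
    # Assumed rule: the minimum never exceeds the maximum.
    eff_min = min(eff_min, eff_max)
    return eff_min, eff_max
```

For example, policy limits of 800000 to 3000000 kHz combined with ``min_perf_pct`` of 30 and ``max_perf_pct`` of 80 on a 4000000 kHz part give an effective range of 1200000 to 3000000 kHz.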
541                                                   
542 Additionally, if the ``intel_pstate=per_cpu_pe    
543 is passed to the kernel, ``max_perf_pct`` and     
544 at all and the only way to set the limits is b    
545                                                   
546                                                   
547 Energy vs Performance Hints                       
548 ---------------------------                       
549                                                   
550 If the hardware-managed P-states (HWP) is enab    
551 attributes, intended to allow user space to he    
552 processor's internal P-state selection logic b    
553 energy-efficiency, or somewhere between the tw    
554 ``CPUFreq`` policy directory in ``sysfs``.  Th    
555                                                   
556 ``energy_performance_preference``                 
557         Current value of the energy vs perform    
558         (or the CPU represented by it).           
559                                                   
560         The hint can be changed by writing to     
561                                                   
562 ``energy_performance_available_preferences``      
563         List of strings that can be written to    
564         ``energy_performance_preference`` attr    
565                                                   
566         They represent different energy vs per    
567         self-explanatory, except that ``defaul    
568         value was set by the platform firmware    
569                                                   
570 Strings written to the ``energy_performance_pr    
571 internally translated to integer values writte    
572 Energy-Performance Preference (EPP) knob (if s    
573 Energy-Performance Bias (EPB) knob. It is also    
574 integer value from 0 to 255, if the EPP fea    
575 feature is not present, writing integer value     
576 supported. In this case, the user can use the     
577 "/sys/devices/system/cpu/cpu*/power/energy_per    
578                                                   
579 [Note that tasks may be migrated from one CPU     
580 load-balancing algorithm and if different ener    
581 set for those CPUs, that may lead to undesirab    
582 issues it is better to set the same energy vs     
583 or to pin every task potentially sensitive to     
584                                                   
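The rules above for what may be written to ``energy_performance_preference`` can be sketched as a small validation helper. This is only an illustration: the function name ``normalize_epp`` and the default list of hint strings are assumptions, and on a real system the accepted strings should be read from ``energy_performance_available_preferences`` rather than hard-coded.

```python
# Assumed list of hint strings, used only as a default for this sketch.
# The authoritative list is the content of the
# energy_performance_available_preferences attribute on the running system.
ASSUMED_PREFERENCES = "default performance balance_performance balance_power power"


def normalize_epp(value, available=ASSUMED_PREFERENCES, epp_supported=True):
    """Return the text to write to energy_performance_preference.

    Accepts either one of the listed hint strings or a raw integer from
    0 to 255 (the latter only when the EPP feature is present).
    Raises ValueError for anything else.
    """
    text = str(value).strip()
    if text in available.split():
        return text
    if text.isascii() and text.isdigit():
        # Raw integers are only meaningful when the EPP feature is present.
        if not epp_supported:
            raise ValueError("raw integer hints need the EPP feature")
        if 0 <= int(text) <= 255:
            return text
        raise ValueError("integer hint out of range 0..255: " + text)
    raise ValueError("unknown hint: " + text)
```

The returned string is what would be written to the attribute in the policy directory. The helper itself touches no files, so it can be tried on any machine.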
585 .. _acpi-cpufreq:                                 
586                                                   
587 ``intel_pstate`` vs ``acpi-cpufreq``              
588 ====================================              
589                                                   
590 On the majority of systems supported by ``inte    
591 provided by the platform firmware contain ``_P    
592 that can be used for CPU performance scaling (    
593 [3]_ for details on the ``_PSS`` objects and t    
594 returned by them).                                
595                                                   
596 The information returned by the ACPI ``_PSS``     
597 ``acpi-cpufreq`` scaling driver.  On systems s    
598 the ``acpi-cpufreq`` driver uses the same hard    
599 interface, but the set of P-states it can use     
600 output.                                           
601                                                   
602 On those systems each ``_PSS`` object returns     
603 the corresponding CPU which basically is a sub    
604 be used by ``intel_pstate`` on the same system    
605 `turbo range <turbo_>`_ is represented by one     
606 convention, the frequency returned by ``_PSS``    
607 than the frequency of the highest non-turbo P-    
608 corresponding P-state representation (followin    
609 returned for it matches the maximum supported     
610 special value 255 meaning essentially "go as h    
611                                                   
612 The list of P-states returned by ``_PSS`` is r    
613 available frequencies supplied by ``acpi-cpufr    
614 scaling governors and the minimum and maximum     
615 it come from that list as well.  In particular    
616 of the turbo range described above, this means    
617 frequency reported by ``acpi-cpufreq`` is high    
618 of the highest supported non-turbo P-state lis    
619 affects decisions made by the scaling governor    
620 ``performance``.                                  
621                                                   
622 For example, if a given governor attempts to s    
623 estimated CPU load and maps the load of 100% t    
624 (possibly multiplied by a constant), then it w    
625 the turbo threshold if ``acpi-cpufreq`` is use    
626 in that case the turbo range corresponds to a     
627 band it can use (1 MHz vs 1 GHz or more).  In     
628 the turbo range for the highest loads and the     
629 benefit from running at turbo frequencies will    
630 instead.                                          
631                                                   
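The effect described above can be made concrete with a little arithmetic. The sketch below assumes a purely load-proportional governor and a hypothetical processor whose highest non-turbo P-state runs at 2000 MHz and whose highest turbo P-state runs at 3000 MHz. Both numbers and both function names are illustrative, not taken from any real platform or governor.

```python
def proportional_target_mhz(load, max_freq_mhz):
    """Frequency a purely load-proportional governor would request."""
    return load * max_freq_mhz


def min_load_for_turbo(turbo_threshold_mhz, max_freq_mhz):
    """Load fraction above which the requested frequency exceeds the threshold."""
    return turbo_threshold_mhz / max_freq_mhz


# Hypothetical processor (assumed values, for illustration only).
NON_TURBO_MHZ = 2000
MAX_TURBO_MHZ = 3000

# With acpi-cpufreq the whole turbo range is one entry, 1 MHz above non-turbo.
acpi_max = NON_TURBO_MHZ + 1
acpi_load = min_load_for_turbo(NON_TURBO_MHZ, acpi_max)

# A driver that exposes the full turbo range reports the real maximum.
full_range_load = min_load_for_turbo(NON_TURBO_MHZ, MAX_TURBO_MHZ)

print(f"1 MHz turbo band:    turbo requested above {acpi_load:.2%} load")
print(f"1000 MHz turbo band: turbo requested above {full_range_load:.2%} load")
```

Under these assumptions a 90% load maps to about 1801 MHz in the first case, which is below the turbo threshold, and to 2700 MHz in the second, which is well inside the turbo range.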
632 One more issue related to that may appear on s    
633 `Configurable TDP feature <turbo_>`_ allowing     
634 turbo threshold.  Namely, if that is not coord    
635 returned by ``_PSS`` properly, there may be mo    
636 a turbo P-state in those lists and there may b    
637 turbo range (if desirable or necessary).  Usua    
638 P-states overall, ``acpi-cpufreq`` simply avoi    
639 by ``_PSS``, but that is not sufficient when t    
640 the list returned by it.                          
641                                                   
642 Apart from the above, ``acpi-cpufreq`` works l    
643 `passive mode <Passive Mode_>`_, except that t    
644 is limited to the ones listed by the ACPI ``_P    
645                                                   
646                                                   
647 Kernel Command Line Options for ``intel_pstate    
648 ==============================================    
649                                                   
650 Several kernel command line options can be use    
651 parameters to ``intel_pstate`` in order to enf    
652 of them have to be prepended with the ``intel_    
653                                                   
654 ``disable``                                       
655         Do not register ``intel_pstate`` as th    
656         processor is supported by it.             
657                                                   
658 ``active``                                        
659         Register ``intel_pstate`` in the `acti    
660         with.                                     
661                                                   
662 ``passive``                                       
663         Register ``intel_pstate`` in the `pass    
664         start with.                               
665                                                   
666 ``force``                                         
667         Register ``intel_pstate`` as the scali    
668         ``acpi-cpufreq`` even if the latter is    
669                                                   
670         This may prevent some platform feature    
671         power capping) that rely on the availa    
672         information from functioning as expect    
673         caution.                                  
674                                                   
675         This option does not work with process    
676         ``intel_pstate`` and on platforms wher    
677         driver is used instead of ``acpi-cpufr    
678                                                   
679 ``no_hwp``                                        
680         Do not enable the hardware-managed P-s    
681         supported by the processor.               
682                                                   
683 ``hwp_only``                                      
684         Register ``intel_pstate`` as the scali    
685         hardware-managed P-states (HWP) featur    
686                                                   
687 ``support_acpi_ppc``                              
688         Take ACPI ``_PPC`` performance limits     
689                                                   
690         If the preferred power management prof    
691         Description Table) is set to "Enterpri    
692         Server", the ACPI ``_PPC`` limits are     
693         and this option has no effect.            
694                                                   
695 ``per_cpu_perf_limits``                           
696         Use per-logical-CPU P-State limits (se    
697         Limits`_ for details).                    
698                                                   
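To check which of the options above were passed on a given boot, the kernel command line can be inspected for ``intel_pstate=`` arguments. The following is a minimal sketch: the helper names are hypothetical, and it only reports what is on the command line without modelling how the driver interprets the options.

```python
# Option names as listed in this section.
KNOWN_OPTIONS = {
    "disable", "active", "passive", "force", "no_hwp",
    "hwp_only", "support_acpi_ppc", "per_cpu_perf_limits",
}


def intel_pstate_options(cmdline):
    """Return the intel_pstate= arguments found in a kernel command line."""
    prefix = "intel_pstate="
    found = []
    for token in cmdline.split():
        if token.startswith(prefix):
            found.append(token[len(prefix):])
    return found


def unknown_options(options):
    """Return the arguments that are not among the documented option names."""
    return [opt for opt in options if opt not in KNOWN_OPTIONS]


# Example with a made-up command line; on a running system the string
# could instead be read from /proc/cmdline.
sample = "root=/dev/sda1 quiet intel_pstate=passive intel_pstate=no_hwp"
print(intel_pstate_options(sample))
```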
699                                                   
700 Diagnostics and Tuning                            
701 ======================                            
702                                                   
703 Trace Events                                      
704 ------------                                      
705                                                   
706 There are two static trace events that can be     
707 diagnostics.  One of them is the ``cpu_frequen    
708 by ``CPUFreq``, and the other one is the ``pst    
709 to ``intel_pstate``.  Both of them are trigger    
710 it works in the `active mode <Active Mode_>`_.    
711                                                   
712 The following sequence of shell commands can b    
713 their output (if the kernel is generally confi    
714                                                   
715  # cd /sys/kernel/tracing/                        
716  # echo 1 > events/power/pstate_sample/enable     
717  # echo 1 > events/power/cpu_frequency/enable     
718  # cat trace                                      
719  gnome-terminal--4510  [001] ..s.  1177.680733    
720  cat-5235  [002] ..s.  1177.681723: cpu_freque    
721                                                   
722 If ``intel_pstate`` works in the `passive mode    
723 ``cpu_frequency`` trace event will be triggere    
724 scaling governor (for the policies it is attac    
725 core (for the policies with other scaling gove    
726                                                   
727 ``ftrace``                                        
728 ----------                                        
729                                                   
730 The ``ftrace`` interface can be used for low-l    
731 ``intel_pstate``.  For example, to check how o    
732 P-state is called, the ``ftrace`` filter can b    
733 :c:func:`intel_pstate_set_pstate`::               
734                                                   
735  # cd /sys/kernel/tracing/                        
736  # cat available_filter_functions | grep -i ps    
737  intel_pstate_set_pstate                          
738  intel_pstate_cpu_init                            
739  ...                                              
740  # echo intel_pstate_set_pstate > set_ftrace_f    
741  # echo function > current_tracer                 
742  # cat trace | head -15                           
743  # tracer: function                               
744  #                                                
745  # entries-in-buffer/entries-written: 80/80       
746  #                                                
747  #                              _-----=> irqs-    
748  #                             / _----=> need-    
749  #                            | / _---=> hardi    
750  #                            || / _--=> preem    
751  #                            ||| /     delay     
752  #           TASK-PID   CPU#  ||||    TIMESTAM    
753  #              | |       |   ||||       |        
754              Xorg-3129  [000] ..s.  2537.64484    
755   gnome-terminal--4510  [002] ..s.  2537.64984    
756       gnome-shell-3409  [001] ..s.  2537.65085    
757            <idle>-0     [000] ..s.  2537.65484    
758                                                   
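Once the function tracer output has been saved to a file, the calls can be counted per CPU with a short script. This is a sketch under assumptions: the helper name ``calls_per_cpu`` is hypothetical, the sample lines are made up, and the regular expression assumes the ``TASK-PID [CPU] flags TIMESTAMP: function`` layout shown in the header above.

```python
import re
from collections import Counter

# Matches "[CPU] flags TIMESTAMP: function" in a function tracer line.
LINE_RE = re.compile(r"\[(\d+)\]\s+\S+\s+[\d.]+:\s+(\w+)")


def calls_per_cpu(trace_text, function="intel_pstate_set_pstate"):
    """Count how many times `function` appears in the trace, keyed by CPU."""
    counts = Counter()
    for line in trace_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip the tracer header
        match = LINE_RE.search(line)
        if match and match.group(2) == function:
            counts[int(match.group(1))] += 1
    return counts


# Made-up sample in the assumed layout (task names, timestamps and the
# caller name are hypothetical).
SAMPLE = """\
# tracer: function
   task_a-100  [000] ..s.  10.000100: intel_pstate_set_pstate <-hypothetical_caller
   task_b-200  [002] ..s.  10.005100: intel_pstate_set_pstate <-hypothetical_caller
   task_a-100  [000] ..s.  10.010100: intel_pstate_set_pstate <-hypothetical_caller
"""
print(dict(calls_per_cpu(SAMPLE)))
```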
759                                                   
760 References                                        
761 ==========                                        
762                                                   
763 .. [1] Kristen Accardi, *Balancing Power and P    
764        https://events.static.linuxfound.org/si    
765                                                   
766 .. [2] *Intel® 64 and IA-32 Architectures Sof    
767        https://www.intel.com/content/www/us/en    
768                                                   
769 .. [3] *Advanced Configuration and Power Inter    
770        https://uefi.org/sites/default/files/re    
                                                      
