TOMOYO Linux Cross Reference
Linux/Documentation/power/energy-model.rst

Differences between /Documentation/power/energy-model.rst (Version linux-6.12-rc7) and /Documentation/power/energy-model.rst (Version linux-4.15.18)


  1 .. SPDX-License-Identifier: GPL-2.0               
  2                                                   
  3 =======================                           
  4 Energy Model of devices                           
  5 =======================                           
  6                                                   
  7 1. Overview                                       
  8 -----------                                       
  9                                                   
 10 The Energy Model (EM) framework serves as an i    
 11 the power consumed by devices at various perfo    
 12 subsystems willing to use that information to     
 13                                                   
 14 The source of the information about the power     
 15 from one platform to another. These power cost    
 16 devicetree data in some cases. In others, the     
 17 Alternatively, userspace might be best positio    
 18 each and every client subsystem to re-implemen    
 19 possible source of information on its own, the    
 20 abstraction layer which standardizes the forma    
 21 kernel, hence enabling to avoid redundant work    
 22                                                   
 23 The power values might be expressed in micro-W    
 24 Multiple subsystems might use the EM and it is    
 25 check that the requirements for the power valu    
 26 can be found in the Energy-Aware Scheduler doc    
 27 Documentation/scheduler/sched-energy.rst. For     
 28 powercap power values expressed in an 'abstrac    
 29 These subsystems are more interested in estima    
 30 thus the real micro-Watts might be needed. An     
 31 be found in the Intelligent Power Allocation i    
 32 Documentation/driver-api/thermal/power_allocat    
 33 Kernel subsystems might implement automatic de    
 34 registered devices have inconsistent scale (ba    
 35 An important thing to keep in mind is that when t
 36 an 'abstract scale' deriving real energy in mi    
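
As an illustration of such a scale check, here is a minimal userspace sketch (not kernel code). The type `struct toy_domain`, its `microwatts` field and the helper `toy_scales_consistent()` are hypothetical names introduced only for this example, assuming each registered domain records whether its power values are real micro-Watts or an 'abstract scale'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a registered performance domain. */
struct toy_domain {
	const char *name;
	bool microwatts;	/* true: real micro-Watts, false: abstract scale */
};

/*
 * Return true when every domain uses the same power scale, so that
 * power values from different domains can be compared or summed.
 * An empty or single-entry list is trivially consistent.
 */
static bool toy_scales_consistent(const struct toy_domain *doms, size_t nr)
{
	size_t i;

	assert(doms != NULL || nr == 0);

	for (i = 1; i < nr; i++) {
		if (doms[i].microwatts != doms[0].microwatts)
			return false;
	}
	return true;
}
```

A client that sums power across domains could run a check of this kind once, after all domains are registered, and refuse to operate on a mixed set.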
 37                                                   
 38 The figure below depicts an example of drivers    
 39 approach is applicable to any architecture) pr    
 40 framework, and interested clients reading the     
 41                                                   
 42        +---------------+  +-----------------+     
 43        | Thermal (IPA) |  | Scheduler (EAS) |     
 44        +---------------+  +-----------------+     
 45                |                   | em_cpu_en    
 46                |                   | em_cpu_ge    
 47                +---------+         |         +    
 48                          |         |         |    
 49                          v         v         v    
 50                         +---------------------    
 51                         |    Energy Model         
 52                         |     Framework           
 53                         +---------------------    
 54                            ^       ^       ^      
 55                            |       |       | e    
 56                 +----------+       |       +--    
 57                 |                  |              
 58         +---------------+  +---------------+      
 59         |  cpufreq-dt   |  |   arm_scmi    |      
 60         +---------------+  +---------------+      
 61                 ^                  ^              
 62                 |                  |              
 63         +--------------+   +---------------+      
 64         | Device Tree  |   |   Firmware    |      
 65         +--------------+   +---------------+      
 66                                                   
 67 In case of CPU devices the EM framework manage    
 68 'performance domain' in the system. A performa    
 69 whose performance is scaled together. Performa    
 70 1-to-1 mapping with CPUFreq policies. All CPUs    
 71 required to have the same micro-architecture.     
 72 domains can have different micro-architectures    
 73                                                   
 74 To better reflect power variation due to stati    
 75 supports runtime modifications of the power va    
 76 RCU to free the modifiable EM perf_state table    
 77 scheduler, also uses RCU to access this memory    
 78 API for allocating/freeing the new memory for     
 79 The old memory is freed automatically using RC    
 80 are no owners anymore for the given EM runtime    
 81 using kref mechanism. The device driver which     
 82 should call EM API to free it safely when it's    
 83 framework will handle the clean-up when it's p    
 84                                                   
 85 The kernel code which wants to modify the EM va
 86 access using a mutex. Therefore, the device dr    
 87 context when it tries to modify the EM.           
 88                                                   
 89 With the runtime modifiable EM we switch from     
 90 runtime static EM' (system property) design to    
 91 changed during runtime according e.g. to the w    
 92 property) design.                                 
 93                                                   
 94 It is possible also to modify the CPU performa    
 95 performance state. Thus, the full power and pe    
 96 is an exponential curve) can be changed accord    
 97 or system property.                               
 98                                                   
 99                                                   
100 2. Core APIs                                      
101 ------------                                      
102                                                   
103 2.1 Config options                                
104 ^^^^^^^^^^^^^^^^^^                                
105                                                   
106 CONFIG_ENERGY_MODEL must be enabled to use the    
107                                                   
108                                                   
109 2.2 Registration of performance domains           
110 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^           
111                                                   
112 Registration of 'advanced' EM                     
113 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                     
114                                                   
115 The 'advanced' EM gets its name due to the fac    
116 to provide more precise power model. It's not
117 formula in the framework (like it is in 'simpl    
118 the real power measurements performed for each    
119 registration method should be preferred in cas    
120 (leakage) is important.                           
121                                                   
122 Drivers are expected to register performance d    
123 calling the following API::                       
124                                                   
125   int em_dev_register_perf_domain(struct devic    
126                 struct em_data_callback *cb, c    
127                                                   
128 Drivers must provide a callback function retur    
129 for each performance state. The callback funct    
130 to fetch data from any relevant location (DT,     
131 deemed necessary. Only for CPU devices, driver    
132 performance domains using cpumask. For other d    
133 argument must be set to NULL.                     
134 The last argument 'microwatts' is important to    
135 subsystems which use EM might rely on this fla    
136 the same scale. If there are different scales,    
137 to return warning/error, stop working or panic    
138 See Section 3. for an example of driver implem    
139 callback, or Section 2.4 for further documenta    
140                                                   
141 Registration of EM using DT                       
142 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~            
143                                                   
144 The EM can also be registered using OPP frame
145 "operating-points-v2". Each OPP entry in DT ca    
146 "opp-microwatt" containing micro-Watts power v    
147 allows a platform to register EM power values     
148 (static + dynamic). These power values might b    
149 experiments and measurements.                     
150                                                   
151 Registration of 'artificial' EM                   
152 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                   
153                                                   
154 There is an option to provide a custom callbac    
155 knowledge about power value for each performan    
156 .get_cost() is optional and provides the 'cost    
157 This is useful for platforms that only provide    
158 efficiency between CPU types, where one could     
159 create an abstract power model. But even an ab    
160 sometimes be hard to fit in, given the input p    
161 The .get_cost() allows to provide the 'cost' v    
162 efficiency of the CPUs. This would allow to pr    
163 has different relation than what would be forc    
164 formulas calculating 'cost' values. To registe    
165 driver must set the flag 'microwatts' to 0, pr    
166 and provide .get_cost() callback. The EM frame    
167 properly during registration. A flag EM_PERF_D    
168 platform. Special care should be taken by othe    
169 to test and treat this flag properly.             
170                                                   
171 Registration of 'simple' EM                       
172 ~~~~~~~~~~~~~~~~~~~~~~~~~~~                       
173                                                   
174 The 'simple' EM is registered using the framew    
175 cpufreq_register_em_with_opp(). It implements     
176 math formula::                                    
177                                                   
178         Power = C * V^2 * f                       
179                                                   
180 The EM which is registered using this method m    
181 physics of a real device, e.g. when static pow    
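
The formula above can be tried out with a small userspace sketch. The function name `toy_dynamic_power_uw()` and the choice of units (pF, mV, MHz, micro-Watts) are assumptions made for this example only. They do not describe how the framework scales its inputs internally.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Dynamic power estimate, Power = C * V^2 * f, in integer arithmetic.
 * Units are an assumption of this sketch: capacitance in pF, voltage in mV,
 * frequency in MHz, result in micro-Watts:
 *   pF * mV^2 * MHz = 1e-12 * 1e-6 * 1e6 W = 1e-12 W = 1e-6 uW
 * hence the final division by 1000000.
 */
static uint64_t toy_dynamic_power_uw(uint64_t cap_pf, uint64_t mv, uint64_t mhz)
{
	/* Keep inputs small enough that the product fits in 64 bits. */
	assert(cap_pf <= 1000000 && mv <= 10000 && mhz <= 10000);

	return cap_pf * mv * mv * mhz / 1000000;
}
```

With 1000 pF, 1000 mV and 1000 MHz this yields 1000000 uW (1 W). Halving the voltage quarters the result, while halving the frequency only halves it. The formula has no static (leakage) term, so a model built this way only captures dynamic power.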
182                                                   
183                                                   
184 2.3 Accessing performance domains                 
185 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                 
186                                                   
187 There are two API functions which provide the     
188 em_cpu_get() which takes CPU id as an argument    
189 pointer as an argument. It depends on the subs    
190 going to use, but in case of CPU devices both     
191 performance domain.                               
192                                                   
193 Subsystems interested in the energy model of a    
194 em_cpu_get() API. The energy model tables are     
195 the performance domains, and kept in memory un    
196                                                   
197 The energy consumed by a performance domain ca    
198 em_cpu_energy() API. The estimation is perform    
199 CPUfreq governor is in use in case of CPU devi    
200 not provided for other types of devices.
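
The idea behind such an estimation can be sketched in userspace C. This is not the em_cpu_energy() implementation. The structure `struct toy_perf_state` and the helpers `toy_pick_state()` and `toy_domain_energy()` are hypothetical, and the sketch assumes a governor that selects the lowest performance state able to serve the busiest CPU of the domain, with 'cost' acting as an energy coefficient per unit of utilization.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified performance state (not the kernel structure). */
struct toy_perf_state {
	unsigned long performance;	/* capacity delivered at this state */
	unsigned long cost;		/* energy coefficient per unit of utilization */
};

/*
 * Pick the lowest state whose performance covers max_util. The table is
 * expected in ascending order; fall back to the highest state otherwise.
 */
static const struct toy_perf_state *
toy_pick_state(const struct toy_perf_state *table, size_t nr_states,
	       unsigned long max_util)
{
	size_t i;

	assert(table != NULL && nr_states > 0);

	for (i = 0; i < nr_states; i++) {
		if (table[i].performance >= max_util)
			return &table[i];
	}
	return &table[nr_states - 1];
}

/* Estimate domain energy as the chosen state's cost times the summed utilization. */
static unsigned long toy_domain_energy(const struct toy_perf_state *table,
				       size_t nr_states,
				       unsigned long max_util,
				       unsigned long sum_util)
{
	return toy_pick_state(table, nr_states, max_util)->cost * sum_util;
}
```

The busiest CPU decides the performance state for the whole domain, while the summed utilization of all CPUs in the domain decides how much energy is spent at that state.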
201                                                   
202 More details about the above APIs can be found    
203 or in Section 2.5                                 
204                                                   
205                                                   
206 2.4 Runtime modifications                         
207 ^^^^^^^^^^^^^^^^^^^^^^^^^                         
208                                                   
209 Drivers willing to update the EM at runtime sh    
210 function to allocate a new instance of the mod    
211 below::                                           
212                                                   
213   struct em_perf_table __rcu *em_table_alloc(s    
214                                                   
215 This allows to allocate a structure which cont    
216 also RCU and kref needed by the EM framework.     
217 contains array 'struct em_perf_state state[]'     
218 states in ascending order. That list must be p    
219 which wants to update the EM. The list of freq    
220 existing EM (created during boot). The content    
221 must be populated by the driver as well.          
222                                                   
223 This is the API which does the EM update, usin    
224                                                   
225   int em_dev_update_perf_domain(struct device     
226                         struct em_perf_table _    
227                                                   
228 Drivers must provide a pointer to the allocate    
229 'struct em_perf_table'. That new EM will be sa    
230 and will be visible to other sub-systems in th    
231 The main design goal for this API is to be fas    
232 or memory allocations at runtime. When pre-com    
233 device driver, then it should be possible to s
234 performance overhead.                             
235                                                   
236 In order to free the EM, provided earlier by t    
237 is unloaded), there is a need to call the API:    
238                                                   
239   void em_table_free(struct em_perf_table __rc    
240                                                   
241 It will allow the EM framework to safely remov    
242 no other sub-system using it, e.g. EAS.           
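
The ownership rules described above (allocate, publish, then drop the driver's reference) can be modelled with a single-threaded userspace sketch. All names here (`struct toy_table`, `struct toy_domain`, `toy_table_alloc()`, `toy_table_put()`, `toy_domain_update()`) are hypothetical. The sketch only models the usage counting and deliberately leaves out RCU, so the old table is released immediately instead of after readers have finished.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical table with an embedded usage counter (models the kref). */
struct toy_table {
	int refcount;
	int nr_states;
	unsigned long power[];	/* one value per performance state */
};

/* Hypothetical domain holding the currently published table. */
struct toy_domain {
	struct toy_table *table;
};

/* Allocate a table; the caller (the driver) owns the initial reference. */
static struct toy_table *toy_table_alloc(int nr_states)
{
	struct toy_table *t;

	assert(nr_states > 0);

	t = calloc(1, sizeof(*t) + (size_t)nr_states * sizeof(t->power[0]));
	if (!t)
		return NULL;

	t->refcount = 1;
	t->nr_states = nr_states;
	return t;
}

/* Drop one reference. Returns 1 if the table was freed, 0 otherwise. */
static int toy_table_put(struct toy_table *t)
{
	assert(t != NULL && t->refcount > 0);

	if (--t->refcount > 0)
		return 0;

	free(t);
	return 1;
}

/* Publish a new table: the domain takes its own reference and drops the old one. */
static void toy_domain_update(struct toy_domain *pd, struct toy_table *new_table)
{
	struct toy_table *old = pd->table;

	new_table->refcount++;
	pd->table = new_table;
	if (old)
		toy_table_put(old);
}
```

In this model a driver that performs a one-time update calls `toy_table_put()` right after `toy_domain_update()`. The table then stays alive through the domain's reference and is released when a later update replaces it.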
243                                                   
244 To use the power values in other sub-systems (    
245 a need to call an API which protects the reader a
246 table data::                                      
247                                                   
248   struct em_perf_state *em_perf_state_from_pd(    
249                                                   
250 It returns the 'struct em_perf_state' pointer     
251 states in ascending order.                        
252 This function must be called in the RCU read l    
253 rcu_read_lock()). When the EM table is not nee    
254 call rcu_read_unlock(). In this way the EM saf
255 and protects the users. It also allows the EM     
256 and free it. More details on how to use it can be
257 example driver.                                   
258                                                   
259 There is a dedicated API for device drivers to c
260 values::                                          
261                                                   
262   int em_dev_compute_costs(struct device *dev,    
263                            int nr_states);        
264                                                   
265 These 'cost' values from EM are used in EAS. T    
266 together with the number of entries and device    
267 of the cost values is done properly the return    
268 The function takes care of the right setting of i
269 state as well. It updates em_perf_state::flags    
270 Then such prepared new EM can be passed to the    
271 function, which will allow to use it.             
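
To make the notion of 'cost' values and per-state flags more concrete, here is a userspace sketch. It is not the em_dev_compute_costs() implementation. The names `struct toy_perf_state`, `TOY_STATE_INEFFICIENT` and `toy_compute_costs()` are hypothetical, and both the cost formula (power scaled by 10 and divided by performance) and the inefficiency rule (a state is flagged when a higher state has an equal or lower cost) are assumptions of this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_STATE_INEFFICIENT	0x1UL	/* hypothetical flag bit */

/* Hypothetical, simplified performance state (not the kernel structure). */
struct toy_perf_state {
	unsigned long performance;
	unsigned long power;
	unsigned long cost;
	unsigned long flags;
};

/*
 * Fill in 'cost' and 'flags' for a table sorted in ascending order.
 * Returns 0 on success, -1 on invalid input.
 */
static int toy_compute_costs(struct toy_perf_state *table, int nr_states)
{
	unsigned long prev_cost = ULONG_MAX;
	int i;

	if (!table || nr_states <= 0)
		return -1;

	/* Walk from the highest state down to the lowest one. */
	for (i = nr_states - 1; i >= 0; i--) {
		if (!table[i].performance)
			return -1;

		/* Assumed formula: power per unit of performance, scaled by 10. */
		table[i].cost = table[i].power * 10 / table[i].performance;

		if (table[i].cost >= prev_cost) {
			/* A higher state is at least as cheap: mark this one. */
			table[i].flags |= TOY_STATE_INEFFICIENT;
		} else {
			table[i].flags &= ~TOY_STATE_INEFFICIENT;
			prev_cost = table[i].cost;
		}
	}
	return 0;
}
```

In this model a flagged state delivers less performance than some higher state while costing at least as much per unit of work, so a client choosing states for efficiency could skip it.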
272                                                   
273 More details about the above APIs can be found    
274 or in Section 3.2 with an example code showing    
275 updating mechanism in a device driver.            
276                                                   
277                                                   
278 2.5 Description details of this API               
279 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^               
280 .. kernel-doc:: include/linux/energy_model.h      
281    :internal:                                     
282                                                   
283 .. kernel-doc:: kernel/power/energy_model.c       
284    :export:                                       
285                                                   
286                                                   
287 3. Examples                                       
288 -----------                                       
289                                                   
290 3.1 Example driver with EM registration           
291 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^           
292                                                   
293 The CPUFreq framework supports dedicated callb    
294 the EM for a given CPU(s) 'policy' object: cpu    
295 That callback has to be implemented properly f    
296 because the framework would call it at the rig    
297 This section provides a simple example of a CP    
298 performance domain in the Energy Model framewo    
299 protocol. The driver implements an est_power()    
300 EM framework::                                    
301                                                   
302   -> drivers/cpufreq/foo_cpufreq.c                
303                                                   
304   01    static int est_power(struct device *de    
305   02                    unsigned long *KHz)       
306   03    {                                         
307   04            long freq, power;                 
308   05                                              
309   06            /* Use the 'foo' protocol to c    
310   07            freq = foo_get_freq_ceil(dev,     
311   08            if (freq < 0)
312   09                    return freq;              
313   10                                              
314   11            /* Estimate the power cost for    
315   12            power = foo_estimate_power(dev    
316   13            if (power < 0)
317   14                    return power;             
318   15                                              
319   16            /* Return the values to the EM    
320   17            *mW = power;                      
321   18            *KHz = freq;                      
322   19                                              
323   20            return 0;                         
324   21    }                                         
325   22                                              
326   23    static void foo_cpufreq_register_em(st    
327   24    {                                         
328   25            struct em_data_callback em_cb     
329   26            struct device *cpu_dev;           
330   27            int nr_opp;                       
331   28                                              
332   29            cpu_dev = get_cpu_device(cpuma    
333   30                                              
334   31            /* Find the number of OPPs for    
335   32            nr_opp = foo_get_nr_opp(policy    
336   33                                              
337   34            /* And register the new perfor    
338   35            em_dev_register_perf_domain(cp    
339   36                                        tr    
340   37    }                                         
341   38                                              
342   39    static struct cpufreq_driver foo_cpufr    
343   40            .register_em = foo_cpufreq_reg    
344   41    };                                        
345                                                   
346                                                   
347 3.2 Example driver with EM modification           
348 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^           
349                                                   
350 This section provides a simple example of a th    
351 The driver implements a foo_thermal_em_update(    
352 up periodically to check the temperature and m    
353                                                   
354   -> drivers/soc/example/example_em_mod.c         
355                                                   
356   01    static void foo_get_new_em(struct foo_    
357   02    {                                         
358   03            struct em_perf_table __rcu *em    
359   04            struct em_perf_state *table, *    
360   05            struct device *dev = ctx->dev;    
361   06            struct em_perf_domain *pd;        
362   07            unsigned long freq;               
363   08            int i, ret;                       
364   09                                              
365   10            pd = em_pd_get(dev);              
366   11            if (!pd)                          
367   12                    return;                   
368   13                                              
369   14            em_table = em_table_alloc(pd);    
370   15            if (!em_table)                    
371   16                    return;                   
372   17                                              
373   18            new_table = em_table->state;      
374   19                                              
375   20            rcu_read_lock();                  
376   21            table = em_perf_state_from_pd(    
377   22            for (i = 0; i < pd->nr_perf_st    
378   23                    freq = table[i].freque    
379   24                    foo_get_power_perf_val    
380   25            }                                 
381   26            rcu_read_unlock();                
382   27                                              
383   28            /* Calculate 'cost' values for    
384   29            ret = em_dev_compute_costs(dev    
385   30            if (ret) {                        
386   31                    dev_warn(dev, "EM: com    
387   32                    em_table_free(em_table
388   33                    return;                   
389   34            }                                 
390   35                                              
391   36            ret = em_dev_update_perf_domai    
392   37            if (ret) {                        
393   38                    dev_warn(dev, "EM: upd    
394   39                    em_table_free(em_table
395   40                    return;                   
396   41            }                                 
397   42                                              
398   43            /*                                
399   44             * Since it's one-time-update     
400   45             * The EM framework will later    
401   46             */                               
402   47            em_table_free(em_table);          
403   48    }                                         
404   49                                              
405   50    /*                                        
406   51     * Function called periodically to che    
407   52     * update the EM if needed                
408   53     */                                       
409   54    static void foo_thermal_em_update(stru    
410   55    {                                         
411   56            struct device *dev = ctx->dev;    
412   57            int cpu;                          
413   58                                              
414   59            ctx->temperature = foo_get_tem    
415   60            if (ctx->temperature < FOO_EM_    
416   61                    return;                   
417   62                                              
418   63            foo_get_new_em(ctx);              
419   64    }                                         
                                                      
