.. SPDX-License-Identifier: GPL-2.0

================
CPU Idle Cooling
================

Situation:
----------

Under certain circumstances a SoC can reach a critical temperature
limit and is unable to stabilize the temperature around a temperature
control. When the SoC has to stabilize the temperature, the kernel can
act on a cooling device to mitigate the dissipated power. When the
critical temperature is reached, a decision must be taken to reduce
the temperature, which in turn impacts performance.

Another situation is when the silicon temperature continues to
increase even after the dynamic leakage is reduced to its minimum by
clock gating the component. This runaway phenomenon can continue due
to the static leakage. The only solution is to power down the
component, thus dropping the dynamic and static leakage, which
allows the component to cool down.

Last but not least, the system can ask for a specific power budget but
because of the OPP density, we can only choose an OPP with a power
budget lower than the requested one and under-utilize the CPU, thus
losing performance. In other words, one OPP under-utilizes the CPU
with a power less than the requested power budget and the next OPP
exceeds the power budget. An intermediate OPP could have been used if
it were present.

Solutions:
----------

If we can remove the static and the dynamic leakage for a specific
duration in a controlled period, the SoC temperature will
decrease. Acting on the idle state duration or the idle cycle
injection period, we can mitigate the temperature by modulating the
power budget.

The Operating Performance Point (OPP) density has a great influence on
the control precision of cpufreq. However, OPP density varies widely
between vendors, and some have large power gaps between OPPs, which
results in loss of performance during thermal control and loss of
power in other scenarios.
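
The effect of a sparse OPP table on a power budget can be illustrated with a minimal sketch. The OPP table, its values and the helper names below are hypothetical and only serve the illustration; they do not come from any real platform or kernel API.

```python
# Hypothetical OPP table: (frequency in MHz, power in mW).
# The values are illustrative assumptions, not measurements.
OPPS = [(600, 300), (1200, 900), (1800, 2100)]


def pick_opp(budget_mw, opps=OPPS):
    """Return the highest-power OPP that fits in the budget, or None."""
    fitting = [opp for opp in opps if opp[1] <= budget_mw]
    if not fitting:
        return None
    return max(fitting, key=lambda opp: opp[1])


def unused_budget(budget_mw, opps=OPPS):
    """Return the part of the budget left unused by the selected OPP."""
    opp = pick_opp(budget_mw, opps)
    if opp is None:
        return budget_mw
    return budget_mw - opp[1]
```

With a requested budget of 1500 mW, this sketch selects the 1200 MHz OPP and leaves 600 mW unused, because the next OPP exceeds the budget and no intermediate OPP exists.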

At a specific OPP, we can assume that injecting idle cycles on all
CPUs belonging to the same cluster, with a duration greater than the
cluster idle state target residency, drops the static and the
dynamic leakage for this period (modulo the energy needed to enter
this state). So the sustainable power with idle cycles has a linear
relation with the OPP’s sustainable power and can be computed with a
coefficient similar to::

	    Power(IdleCycle) = Coef x Power(OPP)

Idle Injection:
---------------

The base concept of the idle injection is to force the CPU to go to an
idle state for a specified time each control cycle; it provides
another way to control CPU power and heat in addition to
cpufreq. Ideally, if all CPUs belonging to the same cluster inject
their idle cycles synchronously, the cluster can reach its power down
state with a minimum power consumption and reduce the static leakage
to almost zero. However, this idle cycle injection adds extra
latencies as the CPUs will have to wake up from a deep sleep state.

We use a fixed duration of idle injection that gives an acceptable
performance penalty and a fixed latency. Mitigation can be increased
or decreased by modulating the duty cycle of the idle injection.

::

     ^
     |
     |
     |-------                         -------
     |_______|_______________________|_______|___________

     <------>
       idle  <---------------------->
                    running

      <----------------------------->
              duty cycle 25%


The implementation of the cooling device bases the number of states on
the duty cycle percentage. When no mitigation is happening, the cooling
device state is zero, meaning the duty cycle is 0%.

When the mitigation begins, depending on the governor's policy, a
starting state is selected. With a fixed idle duration and the duty
cycle (aka the cooling device state), the running duration can be
computed.

The governor will change the cooling device state, and thus the duty
cycle, and this variation will modulate the cooling effect.
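
As a minimal sketch of that computation, assuming the duty cycle is the share of each period spent idle (as the 25% diagram suggests), the running duration follows from the fixed idle duration. The function name and the 10000 us idle duration used in the note are assumptions made for illustration.

```python
def running_duration_from_duty_cycle(idle_duration_us, duty_cycle_percent):
    """Return the running time per period, in microseconds.

    Assumes duty_cycle_percent is the percentage of the period spent
    idle, so that idle / (idle + running) == duty_cycle_percent / 100.
    """
    if not 0 < duty_cycle_percent <= 100:
        # A duty cycle of 0% means no idle injection at all, so there
        # is no period to compute.
        raise ValueError("duty cycle must be in the range (0, 100]")
    return idle_duration_us * (100 - duty_cycle_percent) / duty_cycle_percent
```

With an assumed idle duration of 10000 us, a 25% duty cycle gives 30000 us of running time per period and a 50% duty cycle gives 10000 us: raising the cooling device state shortens the running time while the idle time stays fixed.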

::

     ^
     |
     |
     |-------                 -------
     |_______|_______________|_______|___________

     <------>
       idle  <-------------->
                running

      <--------------------->
          duty cycle 33%


     ^
     |
     |
     |-------         -------
     |_______|_______|_______|___________

     <------>
       idle  <------>
              running

      <------------->
       duty cycle 50%

The idle injection duration value must comply with the constraints:

- It is less than or equal to the latency we tolerate when the
  mitigation begins. It is platform dependent and will depend on the
  user experience, reactivity vs performance trade off we want. This
  value should be specified.

- It is greater than the target residency of the idle state we want
  to enter for thermal mitigation; otherwise we end up consuming more
  energy.

Power considerations
--------------------

When we reach the thermal trip point, we have to sustain a specified
power for a specific temperature but at this time we consume::

  Power = Capacitance x Voltage^2 x Frequency x Utilisation

... which is more than the sustainable power (or there is something
wrong in the system setup). The ‘Capacitance’ and ‘Utilisation’ are
fixed values, while ‘Voltage’ and ‘Frequency’ are fixed artificially
because we don’t want to change the OPP. We can group the
‘Capacitance’ and the ‘Utilisation’ into a single term which is the
‘Dynamic Power Coefficient (Cdyn)’. Simplifying the above, we have::

  Pdyn = Cdyn x Voltage^2 x Frequency

The power allocator governor will ask us somehow to reduce our power
in order to target the sustainable power defined in the device
tree. So with the idle injection mechanism, we want an average power
(Ptarget) resulting in an amount of time running at full power on a
specific OPP and idle another amount of time. That could be put in an
equation::

  P(opp)target = ((Trunning x P(opp)running) + (Tidle x P(opp)idle)) /
                        (Trunning + Tidle)

  ...

  Tidle = Trunning x ((P(opp)running / P(opp)target) - 1)

This form assumes that the power consumed while idle, P(opp)idle, is
negligible. At this point if we know the running period for the CPU,
that gives us the idle injection we need. Alternatively if we have the
idle injection duration, we can compute the running duration with::

  Trunning = Tidle / ((P(opp)running / P(opp)target) - 1)

Practically, if the running power is less than the targeted power, we
end up with a negative time value, so obviously the equation usage is
bound to a power reduction, hence a higher OPP is needed to have the
running power greater than the targeted power.
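
The two relations can be sketched as follows. The function names are hypothetical, the units are arbitrary as long as they are consistent, and the idle power is assumed to be zero. Rejecting a running power that does not exceed the target is one possible way to handle the negative time case, not the only one.

```python
def _power_ratio_minus_one(p_running, p_target):
    """Return (P(opp)running / P(opp)target) - 1 after validating inputs."""
    if p_target <= 0:
        raise ValueError("target power must be positive")
    if p_running <= p_target:
        # The equations only apply to a power reduction; otherwise the
        # resulting duration would be zero, negative or undefined.
        raise ValueError("running power must exceed the target power")
    return p_running / p_target - 1


def idle_duration_from_power(t_running, p_running, p_target):
    """Tidle = Trunning x ((P(opp)running / P(opp)target) - 1)."""
    return t_running * _power_ratio_minus_one(p_running, p_target)


def running_duration_from_power(t_idle, p_running, p_target):
    """Trunning = Tidle / ((P(opp)running / P(opp)target) - 1)."""
    return t_idle / _power_ratio_minus_one(p_running, p_target)
```

For instance, running at twice the target power requires as much idle time as running time, which corresponds to a 50% duty cycle.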

However, in this demonstration we ignore three aspects:

 * The static leakage is not defined here. We can introduce it in the
   equation, but we assume it will be zero most of the time as it is
   difficult to get the values from the SoC vendors

 * The idle state wake up latency (or entry + exit latency) is not
   taken into account. It must be added in the equation in order to
   rigorously compute the idle injection

 * The injected idle duration must be greater than the idle state
   target residency, otherwise we end up consuming more energy and
   potentially invert the mitigation effect

So the final equation is::

  Trunning = (Tidle - Twakeup ) /
                (((P(opp)dyn + P(opp)static ) - P(opp)target) / P(opp)target )
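
A minimal sketch of this computation, built on the earlier relation Trunning = Tidle / ((P(opp)running / P(opp)target) - 1) with the running power taken as the sum of dynamic and static power and the idle time reduced by the wake up latency. The function name, parameter names and units are assumptions made for illustration; the static power defaults to zero, as the text suggests it usually will be.

```python
def running_duration_with_latency(t_idle_us, t_wakeup_us, p_dyn_mw,
                                  p_target_mw, p_static_mw=0.0):
    """Return the running duration in microseconds.

    The effective idle time is the injected idle time minus the wake up
    latency; the running power is the dynamic plus the static power.
    """
    if p_target_mw <= 0:
        raise ValueError("target power must be positive")
    p_running_mw = p_dyn_mw + p_static_mw
    if p_running_mw <= p_target_mw:
        # Only a power reduction yields a meaningful duration.
        raise ValueError("running power must exceed the target power")
    if t_idle_us <= t_wakeup_us:
        # No effective idle time is left once the latency is paid.
        raise ValueError("idle duration must exceed the wake up latency")
    excess_ratio = (p_running_mw - p_target_mw) / p_target_mw
    return (t_idle_us - t_wakeup_us) / excess_ratio
```

With an assumed 10000 us idle injection, 1000 us wake up latency, 1800 mW dynamic power, 200 mW static power and a 1000 mW target, this gives 9000 us of running time. The sketch does not check the target residency constraint, which depends on the idle state chosen by the platform.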