.. SPDX-License-Identifier: GPL-2.0

================
CPU Idle Cooling
================

Situation:
----------

Under certain circumstances a SoC can reach a critical temperature
limit and be unable to stabilize the temperature around a temperature
control. When the SoC has to stabilize the temperature, the kernel can
act on a cooling device to mitigate the dissipated power. When the
critical temperature is reached, a decision must be taken to reduce
the temperature, which in turn impacts performance.

Another situation is when the silicon temperature continues to
increase even after the dynamic leakage is reduced to its minimum by
clock gating the component. This runaway phenomenon can continue due
to the static leakage. The only solution is to power down the
component, thus dropping both the dynamic and the static leakage, which
allows the component to cool down.

Last but not least, the system can ask for a specific power budget but
because of the OPP density, we can only choose an OPP with a power
budget lower than the requested one and under-utilize the CPU, thus
losing performance.
In other words, one OPP under-utilizes the CPU
with a power less than the requested power budget and the next OPP
exceeds the power budget. An intermediate OPP could have been used if
it were present.

Solutions:
----------

If we can remove the static and the dynamic leakage for a specific
duration in a controlled period, the SoC temperature will
decrease. By acting on the idle state duration or the idle cycle
injection period, we can mitigate the temperature by modulating the
power budget.

The Operating Performance Point (OPP) density has a great influence on
the control precision of cpufreq. However, OPP density varies widely
between vendors, and some have a large power gap between OPPs,
which results in loss of performance during thermal control and
loss of power in other scenarios.

At a specific OPP, we can assume that injecting idle cycles on all CPUs
belonging to the same cluster, with a duration greater than the cluster
idle state target residency, drops the static and the
dynamic leakage for this period (modulo the energy needed to enter
this state).
So the sustainable power with idle cycles has a linear
relation with the OPP’s sustainable power and can be computed with a
coefficient similar to::

	    Power(IdleCycle) = Coef x Power(OPP)

Idle Injection:
---------------

The base concept of the idle injection is to force the CPU to go to an
idle state for a specified time each control cycle. It provides
another way to control CPU power and heat in addition to
cpufreq. Ideally, if all CPUs belonging to the same cluster inject
their idle cycles synchronously, the cluster can reach its power down
state with a minimum power consumption and reduce the static leakage
to almost zero. However, this idle cycle injection adds extra
latency as the CPUs have to wake up from a deep sleep state.

We use a fixed duration of idle injection that gives an acceptable
performance penalty and a fixed latency. Mitigation can be increased
or decreased by modulating the duty cycle of the idle injection.

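
To make the relation between the fixed idle duration and the duty cycle
concrete, here is a minimal Python sketch. It is an illustration only,
not the kernel implementation: the function name and the numbers are
hypothetical, and it assumes the duty cycle is the idle share of one
period, that is idle / (idle + running).

```python
def running_duration_from_duty_cycle_us(idle_duration_us: int,
                                        duty_cycle_pct: int) -> int:
    """Running time per period for a fixed idle time and a duty cycle.

    Assumption: duty cycle = idle / (idle + running), in percent.
    A duty cycle of 0% means no idle injection at all, so it is
    rejected here rather than mapped to an infinite running time.
    """
    if not 0 < duty_cycle_pct <= 100:
        raise ValueError("duty cycle must be in the range 1..100")
    # idle / (idle + running) = duty / 100
    #   =>  running = idle * (100 - duty) / duty
    return idle_duration_us * (100 - duty_cycle_pct) // duty_cycle_pct


if __name__ == "__main__":
    # Hypothetical 10 ms idle injection at two duty cycles.
    for duty in (25, 50):
        print(duty, running_duration_from_duty_cycle_us(10_000, duty))
```

With a 10 ms idle duration, a 25% duty cycle leaves 30 ms of running
time per period and a 50% duty cycle leaves 10 ms, so raising the duty
cycle shortens the running time while the injected idle time stays fixed.
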

::

     ^
     |
     |
     |-------                         -------
     |_______|_______________________|_______|___________

     <------>
       idle  <---------------------->
                    running

      <----------------------------->
              duty cycle 25%


The implementation of the cooling device bases the number of states on
the duty cycle percentage. When no mitigation is happening the cooling
device state is zero, meaning the duty cycle is 0%.

When the mitigation begins, depending on the governor's policy, a
starting state is selected. With a fixed idle duration and the duty
cycle (aka the cooling device state), the running duration can be
computed.

The governor changes the cooling device state, and thus the duty cycle,
and this variation modulates the cooling effect.

::

     ^
     |
     |
     |-------                 -------
     |_______|_______________|_______|___________

     <------>
       idle  <-------------->
                running

      <--------------------->
          duty cycle 33%


     ^
     |
     |
     |-------         -------
     |_______|_______|_______|___________

     <------>
       idle  <------>
              running

      <------------->
       duty cycle 50%

The idle injection duration value must comply with the constraints:

- It is less than or equal to the latency we tolerate when the
  mitigation begins. It is platform dependent and will depend on the
  user experience, reactivity vs performance trade off we want. This
  value should be specified.

- It is greater than the idle state’s target residency we want to go
  for thermal mitigation, otherwise we end up consuming more energy.

Power considerations
--------------------

When we reach the thermal trip point, we have to sustain a specified
power for a specific temperature but at this time we consume::

 Power = Capacitance x Voltage^2 x Frequency x Utilisation

... which is more than the sustainable power (or there is something
wrong in the system setup).
The ‘Capacitance’ and ‘Utilisation’ are
fixed values, while ‘Voltage’ and ‘Frequency’ are fixed artificially
because we don’t want to change the OPP. We can group the
‘Capacitance’ and the ‘Utilisation’ into a single term which is the
‘Dynamic Power Coefficient (Cdyn)’. Simplifying the above, we have::

 Pdyn = Cdyn x Voltage^2 x Frequency

The power allocator governor will ask us somehow to reduce our power
in order to target the sustainable power defined in the device
tree. So with the idle injection mechanism, we want an average power
(Ptarget) resulting from an amount of time running at full power on a
specific OPP and another amount of time idle. That can be put in an
equation::

 P(opp)target = ((Trunning x P(opp)running) + (Tidle x P(opp)idle)) /
                        (Trunning + Tidle)

  ...

 Tidle = Trunning x ((P(opp)running / P(opp)target) - 1)

At this point if we know the running period for the CPU, that gives us
the idle injection we need. Alternatively if we have the idle
injection duration, we can compute the running duration with::

 Trunning = Tidle / ((P(opp)running / P(opp)target) - 1)

Practically, if the running power is less than the targeted power, we
end up with a negative time value, so obviously the equation usage is
bound to a power reduction, hence a higher OPP is needed to have the
running power greater than the targeted power.

However, in this demonstration we ignore three aspects:

 * The static leakage is not defined here, we can introduce it in the
   equation but assuming it will be zero most of the time as it is
   difficult to get the values from the SoC vendors

 * The idle state wake up latency (or entry + exit latency) is not
   taken into account, it must be added in the equation in order to
   rigorously compute the idle injection

 * The injected idle duration must be greater than the idle state
   target residency, otherwise we end up consuming more energy and
   potentially invert the mitigation effect


So the final equation is::

 Trunning = (Tidle - Twakeup) /
                (((P(opp)dyn + P(opp)static) / P(opp)target) - 1)
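
As a worked illustration, the following Python sketch computes a running
duration from an idle injection duration, a wake up latency and the
power figures. It is a sketch under stated assumptions, not the kernel
implementation: the function name, the units and the numbers are
hypothetical, the idle power is taken as zero as in the derivation
above, and the power ratio is applied in the same direction as the
relation Trunning = Tidle / ((P(opp)running / P(opp)target) - 1), with
the wake up latency subtracted from the idle time and the static power
added to the running power.

```python
def running_duration_for_target_us(idle_us: float, wakeup_us: float,
                                   p_dyn_mw: float, p_static_mw: float,
                                   p_target_mw: float) -> float:
    """Running time whose average power with the idle time hits the target.

    Assumptions (illustrative only): power while idle is zero, and only
    the idle time left after the wake up latency counts as cooling time.
    """
    p_running_mw = p_dyn_mw + p_static_mw
    if p_target_mw <= 0:
        raise ValueError("target power must be positive")
    if p_running_mw <= p_target_mw:
        # The equation only applies to a power reduction; otherwise the
        # result would be negative or undefined.
        raise ValueError("running power must exceed the target power")
    if idle_us <= wakeup_us:
        raise ValueError("idle duration must exceed the wake up latency")
    effective_idle_us = idle_us - wakeup_us
    return effective_idle_us / (p_running_mw / p_target_mw - 1)


if __name__ == "__main__":
    # Hypothetical figures: 10 ms idle, 1 ms wake up latency,
    # 900 mW dynamic + 100 mW static, 500 mW target.
    print(running_duration_for_target_us(10_000, 1_000, 900, 100, 500))
```

With these hypothetical figures the running power is twice the target,
so the running duration equals the effective idle time of 9 ms. A
running power at or below the target is rejected, which matches the
remark that the equation only applies to a power reduction.
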