.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

=========================
System Suspend Code Flows
=========================

:Copyright: |copy| 2020 Intel Corporation

:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

At least one global system-wide transition needs to be carried out for the
system to get from the working state into one of the supported
:doc:`sleep states <sleep-states>`.  Hibernation requires more than one
transition to occur for this purpose, but the other sleep states, commonly
referred to as *system-wide suspend* (or simply *system suspend*) states, need
only one.

For those sleep states, the transition from the working state of the system into
the target sleep state is referred to as *system suspend* too (in the majority
of cases, whether this means a transition or a sleep state of the system should
be clear from the context) and the transition back from the sleep state into the
working state is referred to as *system resume*.

The kernel code flows associated with the suspend and resume transitions for
different sleep states of the system are quite similar, but there are some
significant differences between the :ref:`suspend-to-idle <s2idle>` code flows
and the code flows related to the :ref:`suspend-to-RAM <s2ram>` and
:ref:`standby <standby>` sleep states.

The :ref:`suspend-to-RAM <s2ram>` and :ref:`standby <standby>` sleep states
cannot be implemented without platform support and the difference between them
boils down to the platform-specific actions carried out by the suspend and
resume hooks that need to be provided by the platform driver to make them
available.  Apart from that, the suspend and resume code flows for these sleep
states are mostly identical, so the two of them are referred to collectively as
*platform-dependent suspend* states in what follows.


.. _s2idle_suspend:

Suspend-to-idle Suspend Code Flow
=================================

The following steps are taken in order to transition the system from the working
state to the :ref:`suspend-to-idle <s2idle>` sleep state:

 1. Invoking system-wide suspend notifiers.

    Kernel subsystems can register callbacks to be invoked when the suspend
    transition is about to occur and when the resume transition has finished.

    That allows them to prepare for the change of the system state and to clean
    up after getting back to the working state.

 2. Freezing tasks.

    Tasks are frozen primarily in order to avoid unchecked hardware accesses
    from user space through MMIO regions or I/O registers exposed directly to
    it and to prevent user space from entering the kernel while the next step
    of the transition is in progress (which could be problematic for various
    reasons).

    All user space tasks are intercepted as though they were sent a signal and
    put into uninterruptible sleep until the end of the subsequent system resume
    transition.

    The kernel threads that choose to be frozen during system suspend for
    specific reasons are frozen subsequently, but they are not intercepted.
    Instead, they are expected to periodically check whether or not they need
    to be frozen and to put themselves into uninterruptible sleep if so.  [Note,
    however, that kernel threads can use locking and other concurrency controls
    available in kernel space to synchronize themselves with system suspend and
    resume, which can be much more precise than the freezing, so the latter is
    not a recommended option for kernel threads.]

 3. Suspending devices and reconfiguring IRQs.

    Devices are suspended in four phases called *prepare*, *suspend*,
    *late suspend* and *noirq suspend* (see :ref:`driverapi_pm_devices` for more
    information on what exactly happens in each phase).

    Every device is visited in each phase, but typically it is not physically
    accessed in more than two of them.

    The runtime PM API is disabled for every device during the *late* suspend
    phase and high-level ("action") interrupt handlers are prevented from being
    invoked before the *noirq* suspend phase.

    Interrupts are still handled after that, but they are only acknowledged to
    interrupt controllers without performing any device-specific actions that
    would be triggered in the working state of the system (those actions are
    deferred until the subsequent system resume transition as described
    `below <s2idle_resume_>`_).

    IRQs associated with system wakeup devices are "armed" so that the resume
    transition of the system is started when one of them signals an event.

 4. Freezing the scheduler tick and suspending timekeeping.

    When all devices have been suspended, CPUs enter the idle loop and are put
    into the deepest available idle state.  While doing that, each of them
    "freezes" its own scheduler tick so that the timer events associated with
    the tick do not occur until the CPU is woken up by another interrupt source.

    The last CPU to enter the idle state also stops the timekeeping which
    (among other things) prevents high resolution timers from triggering going
    forward until the first CPU that is woken up restarts the timekeeping.
    That allows the CPUs to stay in the deep idle state for a relatively long
    time in one go.

    From this point on, the CPUs can only be woken up by non-timer hardware
    interrupts.  If that happens, they go back to the idle state unless the
    interrupt that woke up one of them comes from an IRQ that has been armed for
    system wakeup, in which case the system resume transition is started.

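The tick and timekeeping handling described in step 4 can be pictured as a
single counter shared by all CPUs.  The sketch below is a user-space toy model
under stated assumptions, not kernel code: ``TOY_NR_CPUS``,
``struct toy_s2idle`` and all of the ``toy_*`` functions are hypothetical names
introduced here for illustration, and a fixed number of CPUs is assumed.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4	/* assumption: fixed CPU count for the model */

/* Toy state only; these are not kernel data structures. */
struct toy_s2idle {
	int frozen_cpus;		/* CPUs whose tick is frozen */
	bool timekeeping_running;	/* false while timekeeping is stopped */
};

void toy_init(struct toy_s2idle *s)
{
	s->frozen_cpus = 0;
	s->timekeeping_running = true;
}

/* A CPU entering the idle state freezes its own scheduler tick. */
void toy_tick_freeze(struct toy_s2idle *s)
{
	assert(s->frozen_cpus < TOY_NR_CPUS);
	s->frozen_cpus++;
	/* The last CPU to get here also stops the timekeeping. */
	if (s->frozen_cpus == TOY_NR_CPUS)
		s->timekeeping_running = false;
}

/* A CPU woken up by a non-timer interrupt unfreezes its tick. */
void toy_tick_unfreeze(struct toy_s2idle *s)
{
	assert(s->frozen_cpus > 0);
	/* The first CPU to wake up restarts the timekeeping. */
	if (s->frozen_cpus == TOY_NR_CPUS)
		s->timekeeping_running = true;
	s->frozen_cpus--;
}

/*
 * Model one wakeup of one CPU.  Returns true if the system resume
 * transition should start, false if the CPU went back to idle.
 */
bool toy_wake_cpu(struct toy_s2idle *s, bool irq_armed_for_wakeup)
{
	toy_tick_unfreeze(s);
	if (irq_armed_for_wakeup)
		return true;
	/* Not a wakeup IRQ: the CPU re-enters the idle state. */
	toy_tick_freeze(s);
	return false;
}
```

Once ``toy_tick_freeze()`` has been called for every modeled CPU, a call to
``toy_wake_cpu()`` with ``false`` leaves the timekeeping stopped again on
return, while a call with ``true`` returns with the timekeeping running, which
corresponds to the start of the resume transition.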

.. _s2idle_resume:

Suspend-to-idle Resume Code Flow
================================

The following steps are taken in order to transition the system from the
:ref:`suspend-to-idle <s2idle>` sleep state into the working state:

 1. Resuming timekeeping and unfreezing the scheduler tick.

    When one of the CPUs is woken up (by a non-timer hardware interrupt), it
    leaves the idle state entered in the last step of the preceding suspend
    transition, restarts the timekeeping (unless it has been restarted already
    by another CPU that woke up earlier) and the scheduler tick on that CPU is
    unfrozen.

    If the interrupt that has woken up the CPU was armed for system wakeup,
    the system resume transition begins.

 2. Resuming devices and restoring the working-state configuration of IRQs.

    Devices are resumed in four phases called *noirq resume*, *early resume*,
    *resume* and *complete* (see :ref:`driverapi_pm_devices` for more
    information on what exactly happens in each phase).

    Every device is visited in each phase, but typically it is not physically
    accessed in more than two of them.

    The working-state configuration of IRQs is restored after the *noirq* resume
    phase and the runtime PM API is re-enabled for every device whose driver
    supports it during the *early* resume phase.

 3. Thawing tasks.

    Tasks frozen in step 2 of the preceding `suspend <s2idle_suspend_>`_
    transition are "thawed", which means that they are woken up from the
    uninterruptible sleep that they went into at that time and user space tasks
    are allowed to exit the kernel.

 4. Invoking system-wide resume notifiers.

    This is analogous to step 1 of the `suspend <s2idle_suspend_>`_ transition
    and the same set of callbacks is invoked at this point, but a different
    "notification type" parameter value is passed to them.

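The notifier handling in step 1 of the suspend flow and in step 4 of the resume
flow can be illustrated with a small callback chain.  In the kernel, callbacks
of this kind are registered with ``register_pm_notifier()`` and the event values
used for system suspend are ``PM_SUSPEND_PREPARE`` and ``PM_POST_SUSPEND``.
The sketch below is only a user-space toy model of that idea: every ``toy_*``
and ``TOY_*`` name in it is a hypothetical stand-in, not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the "notification type" values. */
enum toy_pm_event {
	TOY_SUSPEND_PREPARE,	/* suspend transition is about to occur */
	TOY_POST_SUSPEND,	/* resume transition has finished */
};

typedef void (*toy_pm_notifier_fn)(enum toy_pm_event event, void *data);

#define TOY_MAX_NOTIFIERS 8

struct toy_notifier {
	toy_pm_notifier_fn fn;
	void *data;
};

struct toy_chain {
	struct toy_notifier entries[TOY_MAX_NOTIFIERS];
	size_t count;
};

/* Add a callback to the chain; returns 0 on success, -1 if it is full. */
int toy_register(struct toy_chain *c, toy_pm_notifier_fn fn, void *data)
{
	if (c->count == TOY_MAX_NOTIFIERS)
		return -1;
	c->entries[c->count].fn = fn;
	c->entries[c->count].data = data;
	c->count++;
	return 0;
}

/* Invoke the same set of callbacks, passing the given event type. */
void toy_notify(const struct toy_chain *c, enum toy_pm_event event)
{
	size_t i;

	for (i = 0; i < c->count; i++)
		c->entries[i].fn(event, c->entries[i].data);
}

/* Example subsystem that prepares before suspend and cleans up after. */
struct toy_subsys {
	int prepared;
	int cleanups;
};

void toy_subsys_notifier(enum toy_pm_event event, void *data)
{
	struct toy_subsys *s = data;

	assert(s != NULL);
	if (event == TOY_SUSPEND_PREPARE) {
		s->prepared = 1;
	} else {
		s->prepared = 0;
		s->cleanups++;
	}
}
```

A single registration of ``toy_subsys_notifier()`` is enough for both
transitions: ``toy_notify()`` is called once with ``TOY_SUSPEND_PREPARE`` before
tasks are frozen and once with ``TOY_POST_SUSPEND`` after they have been thawed.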

Platform-dependent Suspend Code Flow
====================================

The following steps are taken in order to transition the system from the working
state to a platform-dependent suspend state:

 1. Invoking system-wide suspend notifiers.

    This step is the same as step 1 of the suspend-to-idle suspend transition
    described `above <s2idle_suspend_>`_.

 2. Freezing tasks.

    This step is the same as step 2 of the suspend-to-idle suspend transition
    described `above <s2idle_suspend_>`_.

 3. Suspending devices and reconfiguring IRQs.

    This step is analogous to step 3 of the suspend-to-idle suspend transition
    described `above <s2idle_suspend_>`_, but the arming of IRQs for system
    wakeup generally does not have any effect on the platform.

    There are platforms that can go into a very deep low-power state internally
    when all CPUs in them are in sufficiently deep idle states and all I/O
    devices have been put into low-power states.  On those platforms,
    suspend-to-idle can reduce system power very effectively.

    On other platforms, however, low-level components (like interrupt
    controllers) need to be turned off in a platform-specific way (implemented
    in the hooks provided by the platform driver) to achieve comparable power
    reduction.

    That usually prevents in-band hardware interrupts from waking up the system,
    so system wakeup has to be arranged in a special platform-dependent way.
    In that case, the configuration of system wakeup sources usually starts when
    system wakeup devices are suspended and is finalized by the platform suspend
    hooks later on.

 4. Disabling non-boot CPUs.

    On some platforms the suspend hooks mentioned above must run in a one-CPU
    configuration of the system (in particular, the hardware cannot be accessed
    by any code running in parallel with the platform suspend hooks that may,
    and often do, trap into the platform firmware in order to finalize the
    suspend transition).

    For this reason, the CPU offline/online (CPU hotplug) framework is used
    to take all of the CPUs in the system, except for one (the boot CPU),
    offline (typically, the CPUs that have been taken offline go into deep idle
    states).

    This means that all tasks are migrated away from those CPUs and all IRQs are
    rerouted to the only CPU that remains online.

 5. Suspending core system components.

    This prepares the core system components for (possibly) losing power going
    forward and suspends the timekeeping.

 6. Platform-specific power removal.

    This is expected to remove power from all of the system components except
    for the memory controller and RAM (in order to preserve the contents of the
    latter) and some devices designated for system wakeup.

    In many cases control is passed to the platform firmware which is expected
    to finalize the suspend transition as needed.

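The effect of step 4 on tasks and IRQs can be pictured with a small user-space
model.  This is a sketch under stated assumptions rather than a description of
the CPU hotplug framework: the ``TOY_*`` sizes, ``struct toy_system`` and the
``toy_*`` functions are hypothetical, the initial round-robin placement is
arbitrary, and tasks are moved straight to the boot CPU for simplicity.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS  4
#define TOY_NR_IRQS  6
#define TOY_NR_TASKS 5
#define TOY_BOOT_CPU 0

struct toy_system {
	bool cpu_online[TOY_NR_CPUS];
	int irq_target[TOY_NR_IRQS];	/* CPU each IRQ is routed to */
	int task_cpu[TOY_NR_TASKS];	/* CPU each task runs on */
};

/* Spread IRQs and tasks over all CPUs (an arbitrary starting point). */
void toy_system_init(struct toy_system *sys)
{
	int i;

	for (i = 0; i < TOY_NR_CPUS; i++)
		sys->cpu_online[i] = true;
	for (i = 0; i < TOY_NR_IRQS; i++)
		sys->irq_target[i] = i % TOY_NR_CPUS;
	for (i = 0; i < TOY_NR_TASKS; i++)
		sys->task_cpu[i] = i % TOY_NR_CPUS;
}

/* Take one non-boot CPU offline, moving its tasks and IRQs away. */
void toy_cpu_offline(struct toy_system *sys, int cpu)
{
	int i;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS && cpu != TOY_BOOT_CPU);
	for (i = 0; i < TOY_NR_TASKS; i++)
		if (sys->task_cpu[i] == cpu)
			sys->task_cpu[i] = TOY_BOOT_CPU;
	for (i = 0; i < TOY_NR_IRQS; i++)
		if (sys->irq_target[i] == cpu)
			sys->irq_target[i] = TOY_BOOT_CPU;
	sys->cpu_online[cpu] = false;
}

/* Leave only the boot CPU online. */
void toy_disable_nonboot_cpus(struct toy_system *sys)
{
	int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (cpu != TOY_BOOT_CPU)
			toy_cpu_offline(sys, cpu);
}

/* Count the CPUs that are still online. */
int toy_online_cpus(const struct toy_system *sys)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (sys->cpu_online[cpu])
			n++;
	return n;
}
```

After ``toy_disable_nonboot_cpus()`` has run, every entry of ``irq_target`` and
``task_cpu`` refers to ``TOY_BOOT_CPU``, which is the one-CPU configuration that
the platform suspend hooks are said to require on some platforms.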

Platform-dependent Resume Code Flow
===================================

The following steps are taken in order to transition the system from a
platform-dependent suspend state into the working state:

 1. Platform-specific system wakeup.

    The platform is woken up by a signal from one of the designated system
    wakeup devices (which need not be an in-band hardware interrupt) and
    control is passed back to the kernel (the working configuration of the
    platform may need to be restored by the platform firmware before the
    kernel gets control again).

 2. Resuming core system components.

    The suspend-time configuration of the core system components is restored and
    the timekeeping is resumed.

 3. Re-enabling non-boot CPUs.

    The CPUs disabled in step 4 of the preceding suspend transition are taken
    back online and their suspend-time configuration is restored.

 4. Resuming devices and restoring the working-state configuration of IRQs.

    This step is the same as step 2 of the suspend-to-idle resume transition
    described `above <s2idle_resume_>`_.

 5. Thawing tasks.

    This step is the same as step 3 of the suspend-to-idle resume transition
    described `above <s2idle_resume_>`_.

 6. Invoking system-wide resume notifiers.

    This step is the same as step 4 of the suspend-to-idle resume transition
    described `above <s2idle_resume_>`_.
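
Comparing the two platform-dependent flows step by step shows that each resume
step reverts one suspend step, in the opposite order.  The sketch below only
restates that pairing as a lookup.  The step titles are taken from the lists
above, while the ``toy_*`` names and the 1-based numbering helper are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_STEPS 6

/* Step titles of the platform-dependent suspend flow, in order. */
static const char *const toy_suspend_steps[TOY_NR_STEPS] = {
	"Invoking system-wide suspend notifiers",
	"Freezing tasks",
	"Suspending devices and reconfiguring IRQs",
	"Disabling non-boot CPUs",
	"Suspending core system components",
	"Platform-specific power removal",
};

/* Step titles of the platform-dependent resume flow, in order. */
static const char *const toy_resume_steps[TOY_NR_STEPS] = {
	"Platform-specific system wakeup",
	"Resuming core system components",
	"Re-enabling non-boot CPUs",
	"Resuming devices and restoring the working-state configuration of IRQs",
	"Thawing tasks",
	"Invoking system-wide resume notifiers",
};

/* Map a 1-based suspend step number to the resume step reverting it. */
int toy_resume_step_for(int suspend_step)
{
	assert(suspend_step >= 1 && suspend_step <= TOY_NR_STEPS);
	return TOY_NR_STEPS + 1 - suspend_step;
}

/* Look up a suspend step title by its 1-based number. */
const char *toy_suspend_step_name(int step)
{
	assert(step >= 1 && step <= TOY_NR_STEPS);
	return toy_suspend_steps[step - 1];
}

/* Look up a resume step title by its 1-based number. */
const char *toy_resume_step_name(int step)
{
	assert(step >= 1 && step <= TOY_NR_STEPS);
	return toy_resume_steps[step - 1];
}
```

For example, ``toy_resume_step_for(4)`` yields 3, matching the statement that
the CPUs disabled in step 4 of the suspend transition are taken back online in
step 3 of the resume transition.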