.. SPDX-License-Identifier: GPL-2.0

==============================
Running nested guests with KVM
==============================

A nested guest is the ability to run a guest inside another guest (it
can be KVM-based or a different hypervisor).  The straightforward
example is a KVM guest that in turn runs on a KVM guest (the rest of
this document is built on this example)::

              .----------------.  .----------------.
              |                |  |                |
              |      L2        |  |      L2        |
              | (Nested Guest) |  | (Nested Guest) |
              |                |  |                |
              |----------------'--'----------------|
              |                                    |
              |       L1 (Guest Hypervisor)        |
              |          KVM (/dev/kvm)            |
              |                                    |
      .------------------------------------------------------.
      |                 L0 (Host Hypervisor)                 |
      |                    KVM (/dev/kvm)                    |
      |------------------------------------------------------|
      |        Hardware (with virtualization extensions)     |
      '------------------------------------------------------'

Terminology:

- L0 – level-0; the bare metal host, running KVM

- L1 – level-1 guest; a VM running on L0; also called the "guest
  hypervisor", as it itself is capable of running KVM.

- L2 – level-2 guest; a VM running on L1, this is the "nested guest"

.. note:: The above diagram is modelled after the x86 architecture;
          s390x, ppc64 and other architectures are likely to have
          a different design for nesting.

          For example, s390x always has an LPAR (LogicalPARtition)
          hypervisor running on bare metal, adding another layer and
          resulting in at least four levels in a nested setup — L0 (bare
          metal, running the LPAR hypervisor), L1 (host hypervisor), L2
          (guest hypervisor), L3 (nested guest).

          This document will stick with the three-level terminology (L0,
          L1, and L2) for all architectures; and will largely focus on
          x86.


Use Cases
---------

There are several scenarios where nested KVM can be useful, to name a
few:

- As a developer, you want to test your software on different operating
  systems (OSes).  Instead of renting multiple VMs from a Cloud
  Provider, using nested KVM lets you rent a large enough "guest
  hypervisor" (level-1 guest).  This in turn allows you to create
  multiple nested guests (level-2 guests), running different OSes, on
  which you can develop and test your software.

- Live migration of "guest hypervisors" and their nested guests, for
  load balancing, disaster recovery, etc.

- VM image creation tools (e.g. ``virt-install``, etc) often run
  their own VM, and users expect these to work inside a VM.

- Some OSes use virtualization internally for security (e.g. to let
  applications run safely in isolation).

Enabling "nested" (x86)
-----------------------

From Linux kernel v4.20 onwards, the ``nested`` KVM parameter is enabled
by default for Intel and AMD.  (Though your Linux distribution might
override this default.)

In case you are running a Linux kernel older than v4.20, to enable
nesting, set the ``nested`` KVM module parameter to ``Y`` or ``1``.  To
persist this setting across reboots, you can add it in a config file, as
shown below:

1. On the bare metal host (L0), list the kernel modules and ensure that
   the KVM modules are loaded::

    $ lsmod | grep -i kvm
    kvm_intel             133627  0
    kvm                   435079  1 kvm_intel

2. Show information for the ``kvm_intel`` module::

    $ modinfo kvm_intel | grep -i nested
    parm:           nested:bool

3. For the nested KVM configuration to persist across reboots, place the
   below in ``/etc/modprobe.d/kvm_intel.conf`` (create the file if it
   doesn't exist)::

    $ cat /etc/modprobe.d/kvm_intel.conf
    options kvm-intel nested=y

4. Unload and re-load the KVM Intel module::

    $ sudo rmmod kvm-intel
    $ sudo modprobe kvm-intel

5. Verify that the ``nested`` parameter for KVM is enabled::

    $ cat /sys/module/kvm_intel/parameters/nested
    Y

For AMD hosts, the process is the same as above, except that the module
name is ``kvm-amd``.

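As a sketch of the same flow on an AMD host (the config file name here is
just an example, and the reported value is illustrative; ``nested`` is an
integer parameter for ``kvm-amd``, so it is typically shown as ``1``
rather than ``Y``)::

    $ cat /etc/modprobe.d/kvm_amd.conf
    options kvm-amd nested=1

    $ sudo rmmod kvm-amd
    $ sudo modprobe kvm-amd

    $ cat /sys/module/kvm_amd/parameters/nested
    1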

Additional nested-related kernel parameters (x86)
-------------------------------------------------

If your hardware is sufficiently advanced (Intel Haswell processor or
higher, which has newer hardware virt extensions), the following
additional features will also be enabled by default: "Shadow VMCS
(Virtual Machine Control Structure)" and APIC Virtualization on your
bare metal host (L0).  Parameters for Intel hosts::

    $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
    Y

    $ cat /sys/module/kvm_intel/parameters/enable_apicv
    Y

    $ cat /sys/module/kvm_intel/parameters/ept
    Y

.. note:: If you suspect your L2 (i.e. nested guest) is running slower,
          ensure the above are enabled (particularly
          ``enable_shadow_vmcs`` and ``ept``).

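On AMD hosts the rough analogues live under ``kvm_amd`` rather than
``kvm_intel``; the following is a sketch, assuming your ``kvm_amd``
module exposes these parameters::

    # NPT (Nested Page Tables) is AMD's counterpart to Intel's EPT
    $ cat /sys/module/kvm_amd/parameters/npt

    # AVIC is AMD's APIC virtualization
    $ cat /sys/module/kvm_amd/parameters/avic

    $ cat /sys/module/kvm_amd/parameters/nested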

Starting a nested guest (x86)
-----------------------------

Once your bare metal host (L0) is configured for nesting, you should be
able to start an L1 guest with::

    $ qemu-kvm -cpu host [...]

The above will pass through the host CPU's capabilities as-is to the
guest; or, for better live migration compatibility, use a named CPU
model supported by QEMU, e.g.::

    $ qemu-kvm -cpu Haswell-noTSX-IBRS,vmx=on

then the guest hypervisor will subsequently be capable of running a
nested guest with accelerated KVM.

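To confirm from inside the L1 guest that nesting actually reached it, a
quick check (shown for an Intel host; substitute ``svm`` for ``vmx`` on
AMD) is::

    # run inside the L1 guest
    $ grep -c -w vmx /proc/cpuinfo    # greater than 0 if the virt extension is exposed to L1
    $ ls -l /dev/kvm                  # present once the KVM modules are loaded in L1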

Enabling "nested" (s390x)
-------------------------

1. On the host hypervisor (L0), enable the ``nested`` parameter on
   s390x::

    $ rmmod kvm
    $ modprobe kvm nested=1

.. note:: On s390x, the kernel parameter ``hpage`` is mutually exclusive
          with the ``nested`` parameter — i.e. to be able to enable
          ``nested``, the ``hpage`` parameter *must* be disabled.

2. The guest hypervisor (L1) must be provided with the ``sie`` CPU
   feature — with QEMU, this can be done by using "host passthrough"
   (via the command-line ``-cpu host``).

3. Now the KVM module can be loaded in the L1 (guest hypervisor)::

    $ modprobe kvm

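To sanity-check the result, you can verify that nesting is enabled on L0
and that the ``sie`` facility is visible inside L1 (the outputs shown are
illustrative)::

    # on the host hypervisor (L0)
    $ cat /sys/module/kvm/parameters/nested
    1

    # inside the guest hypervisor (L1)
    $ grep -wo sie /proc/cpuinfo | head -1
    sie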

Live migration with nested KVM
------------------------------

Migrating an L1 guest, with a *live* nested guest in it, to another
bare metal host, works as of Linux kernel 5.3 and QEMU 4.2.0 for
Intel x86 systems, and even on older versions for s390x.

On AMD systems, once an L1 guest has started an L2 guest, the L1 guest
should no longer be migrated or saved (refer to QEMU documentation on
"savevm"/"loadvm") until the L2 guest shuts down.  Attempting to migrate
or save-and-load an L1 guest while an L2 guest is running will result in
undefined behavior.  You might see a ``kernel BUG!`` entry in ``dmesg``, a
kernel 'oops', or an outright kernel panic.  Such a migrated or loaded L1
guest can no longer be considered stable or secure, and must be restarted.
Migrating an L1 guest merely configured to support nesting, while not
actually running L2 guests, is expected to function normally even on AMD
systems but may fail once guests are started.

Migrating an L2 guest is always expected to succeed, so all the following
scenarios should work even on AMD systems (a libvirt-based sketch follows
this list):

- Migrating a nested guest (L2) to another L1 guest on the *same* bare
  metal host.

- Migrating a nested guest (L2) to another L1 guest on a *different*
  bare metal host.

- Migrating a nested guest (L2) to a bare metal host.

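For example, when the L2 guest is managed by libvirt inside L1,
live-migrating it to another L1 can be as simple as a ``virsh migrate``
(the domain name and destination URI below are placeholders)::

    # run inside the source L1 (guest hypervisor)
    $ virsh migrate --live --verbose l2-guest qemu+ssh://other-l1.example.com/system
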
Reporting bugs from nested setups
---------------------------------

Debugging "nested" problems can involve sifting through log files across
L0, L1 and L2; this can result in tedious back-and-forth between the bug
reporter and the bug fixer.

- Mention that you are in a "nested" setup.  If you are running any kind
  of "nesting" at all, say so.  Unfortunately, this needs to be called
  out because when reporting bugs, people tend to forget to even
  *mention* that they're using nested virtualization.

- Ensure you are actually running KVM on KVM.  Sometimes people do not
  have KVM enabled for their guest hypervisor (L1), which results in
  them running with pure emulation, or what QEMU calls "TCG", while
  believing they are running nested KVM; that is, they confuse "nested
  virt" (which could also mean QEMU on KVM) with "nested KVM" (KVM on
  KVM).  A quick way to check is sketched below.

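A minimal sketch of such a check, assuming the L2 guest is managed by
libvirt inside L1 (the domain name is a placeholder)::

    # run inside the guest hypervisor (L1)
    $ ls -l /dev/kvm                                  # must exist for KVM-on-KVM
    $ grep -c -w -E 'vmx|svm' /proc/cpuinfo           # greater than 0 if virt extensions are exposed to L1
    $ virsh dumpxml l2-guest | grep '<domain type'    # should say type='kvm', not type='qemu'
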
Information to collect (generic)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following is not an exhaustive list, but a very good starting point
(a small collection sketch follows the list):

  - Kernel, libvirt, and QEMU version from L0

  - Kernel, libvirt and QEMU version from L1

  - QEMU command-line of L1 -- when using libvirt, you'll find it here:
    ``/var/log/libvirt/qemu/instance.log``

  - QEMU command-line of L2 -- as above, when using libvirt, get the
    complete libvirt-generated QEMU command-line

  - ``cat /proc/cpuinfo`` from L0

  - ``cat /proc/cpuinfo`` from L1

  - ``lscpu`` from L0

  - ``lscpu`` from L1

  - Full ``dmesg`` output from L0

  - Full ``dmesg`` output from L1

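A small sketch that captures most of the generic items above into files
you can attach to a bug report; run it once on L0 and once on L1 (the
output file names are arbitrary)::

    $ uname -r           > kernel-version.txt
    $ virsh version      > virt-versions.txt     # libvirt and QEMU versions, when using libvirt
    $ cat /proc/cpuinfo  > cpuinfo.txt
    $ lscpu              > lscpu.txt
    $ sudo dmesg         > dmesg.txt
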
x86-specific info to collect
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Both of the commands below, ``x86info`` and ``dmidecode``, should be
available under those names on most Linux distributions (a capture
sketch follows the list):

  - Output of: ``x86info -a`` from L0

  - Output of: ``x86info -a`` from L1

  - Output of: ``dmidecode`` from L0

  - Output of: ``dmidecode`` from L1

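For instance, to capture these on L0 (repeat inside L1; the output file
names are arbitrary, and ``dmidecode`` typically needs root)::

    $ x86info -a      > x86info-l0.txt
    $ sudo dmidecode  > dmidecode-l0.txt
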
s390x-specific info to collect
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Along with the generic details mentioned earlier, the following is
also recommended:

  - ``/proc/sysinfo`` from L1; this will also include the info from L0
