.. SPDX-License-Identifier: GPL-2.0

Confidential Computing VMs
==========================
Hyper-V can create and run Linux guests that a
(CoCo) VMs. Such VMs cooperate with the physic
the confidentiality and integrity of data in t
face of a hypervisor/VMM that has been comprom
CoCo VMs on Hyper-V share the generic CoCo VM
objectives described in Documentation/security
that Hyper-V specific code in Linux refers to
"isolation VMs".

A Linux CoCo VM on Hyper-V requires the cooper
following:

* Physical hardware with a processor that supp

* The hardware runs a version of Windows/Hyper

* The VM runs a version of Linux that supports

The physical hardware requirements are as foll

* AMD processor with SEV-SNP. Hyper-V does not
  SEV, or SEV-ES encryption, and such encrypti
  VM on Hyper-V.

* Intel processor with TDX

To create a CoCo VM, the "Isolated VM" attribu
when the VM is created. A VM cannot be changed
or vice versa, after it is created.

Operational Modes
-----------------
Hyper-V CoCo VMs can run in two modes. The mod
created and cannot be changed during the life

* Fully-enlightened mode. In this mode, the gu
  enlightened to understand and manage all asp

* Paravisor mode. In this mode, a paravisor la
  host provides some operations needed to run
  system can have fewer CoCo enlightenments th
  fully-enlightened case.

Conceptually, fully-enlightened mode and parav
points on a spectrum spanning the degree of gu
as a CoCo VM. Fully-enlightened mode is one en
implementation of paravisor mode is the other
aspects of running as a CoCo VM are handled by
guest OS with no knowledge of memory encryptio
can run successfully. However, the Hyper-V imp
does not go this far, and is somewhere in the
aspects of CoCo VMs are handled by the Hyper-V
must be enlightened for other aspects. Unfortu
standardized enumeration of feature/functions
paravisor, and there is no standardized mechan
paravisor for the feature/functions it provide
the paravisor provides is hard-coded in the gu

Paravisor mode has similarities to the `Coconut project`_
a limited paravisor to provide services to the
However, the Hyper-V paravisor generally handl
than is currently envisioned for Coconut, and
guest enlightenments required" end of the spec

.. _Coconut project: https://github.com/coconu

In the CoCo VM threat model, the paravisor is
and must be trusted by the guest OS. By implic
protect itself against a potentially malicious
protects against a potentially malicious guest

The hardware architectural approach to fully-e
varies depending on the underlying processor.

* With AMD SEV-SNP processors, in fully-enligh
  VMPL 0 and has full control of the guest con
  guest OS runs in VMPL 2 and the paravisor ru
  running in VMPL 0 has privileges that the gu
  Certain operations require the guest to invo
  paravisor mode the guest OS operates in "vir
  as defined by the SEV-SNP architecture. This
  of memory encryption when a paravisor is use

* With Intel TDX processor, in fully-enlighten
  L1 VM. In paravisor mode, TD partitioning is
  L1 VM, and the guest OS runs in a nested L2

Hyper-V exposes a synthetic MSR to guests that
MSR indicates if the underlying processor uses
whether a paravisor is being used. It is strai
kernel image that can boot and run properly on
either mode.

Paravisor Effects
-----------------
Running in paravisor mode affects the followin
CoCo VM functionality:

* Initial guest memory setup. When a new VM is
  paravisor runs first and sets up the guest p
  guest Linux does normal memory initializatio
  appropriate ranges as decrypted (shared). In
  perform the early boot memory setup steps th
  AMD SEV-SNP in fully-enlightened mode.

* #VC/#VE exception handling. In paravisor mod
  CoCo VM to route #VC and #VE exceptions to V
  respectively, and not the guest Linux. Conse
  do not run in the guest Linux and are not a
  Linux guest in paravisor mode.

* CPUID flags. Both AMD SEV-SNP and Intel TDX
  guest indicating that the VM is operating wi
  support. While these CPUID flags are visible
  the paravisor filters out these flags and th
  Throughout the Linux kernel, explicitly test
  eliminated in favor of the cc_platform_has()
  abstracting the differences between SEV-SNP
  cc_platform_has() abstraction also allows th
  to selectively enable aspects of CoCo VM fun
  flags are not set. The exception is early bo
  tests the CPUID SEV-SNP flag. But not having
  mode VM achieves the desired effect or not r
  boot memory setup.

* Device emulation. In paravisor mode, the Hyp
  emulation of devices such as the IO-APIC and
  happens in the paravisor in the guest contex
  context), MMIO accesses to these devices mus
  of the decrypted references that would be us
  VM. The __ioremap_caller() function has been
  check whether a particular address range sho
  (private). See the "is_private_mmio" callbac

* Encrypt/decrypt memory transitions. In a CoC
  memory between encrypted and decrypted requi
  hypervisor/VMM. This is done via callbacks i
  __set_memory_enc_pgtable(). In fully-enlight
  TDX implementations of these callbacks are u
  specific set of callbacks is used. These cal
  that the paravisor can coordinate the transi
  as necessary. See hv_vtom_init() where these

* Interrupt injection. In fully enlightened mo
  could inject interrupts into the guest OS at
  architectural rules. For full protection, th
  enlightenments that use the interrupt inject
  by CoCo-capable processors. In paravisor mod
  interrupt injection into the guest OS, and e
  sees interrupts that are "legal". The paravi
  management features provided by the CoCo-cap
  masking these complexities from the guest OS

Hyper-V Hypercalls
------------------
When in fully-enlightened mode, hypercalls mad
directly to the hypervisor, just as in a non-C
normal hypercalls trap to the paravisor first,
hypervisor. But the paravisor is idiosyncratic
hypercalls made by the Linux guest must always
hypervisor. These hypercall sites test for a p
a special invocation sequence. See hv_post_mes

Guest communication with Hyper-V
--------------------------------
Separate from the generic Linux kernel handlin
CoCo VMs, Hyper-V has VMBus and VMBus devices
shared between the Linux guest and the host. T
marked decrypted to enable communication. Furt
includes a compromised and potentially malicio
against leaking any unintended data to the hos

These Hyper-V and VMBus memory pages are marke

* VMBus monitor pages

* Synthetic interrupt controller (synic) relat
  the paravisor)

* Per-cpu hypercall input and output pages (un

* VMBus ring buffers. The direct mapping is ma
  __vmbus_establish_gpadl(). The secondary map
  hv_ringbuffer_init() must also include the "

When the guest writes data to memory that is s
ensure that only the intended data is written.
be initialized to zeros before copying into th
kernel data is not inadvertently given to the

Similarly, when the guest reads memory that is
validate the data before acting on it so that
the guest to expose unintended data. Doing suc
because the host can modify the shared memory
validation is performed. For messages passed f
VMBus ring buffer, the length of the message i
copied into a temporary (encrypted) buffer for
processing. The copying adds a small amount of
to protect against a malicious host. See hv_pk

Many drivers for VMBus devices have been "hard
validate messages received over VMBus, instead
acting cooperatively. Such drivers are marked
vmbus_devs[] table. Other drivers for VMBus de
CoCo VM have not been hardened, and they are n
VM. See vmbus_is_valid_offer() where such devi

Two VMBus devices depend on the Hyper-V host t
storvsc for disk I/O and netvsc for network I/
Linux kernel DMA APIs, and so bounce buffering
memory is done implicitly. netvsc has two mode
mode goes through send and receive buffer spac
by the netvsc driver, and is used for most sma
receive buffers are marked decrypted by __vmbu
the netvsc driver explicitly copies packets to
equivalent of bounce buffering between encrypt
already part of the data path. The second mode
DMA APIs, and is bounce buffered through swiot
storvsc.

Finally, the VMBus virtual PCI driver needs sp
Linux PCI device drivers access PCI config spa
by the Linux PCI subsystem. On Hyper-V, these
space, and the access traps to Hyper-V for emu
encryption prevents Hyper-V from reading the g
emulate the access. So in a CoCo VM, these fun
with arguments explicitly describing the acces
_hv_pcifront_read_config() and _hv_pcifront_wr
"use_calls" flag indicating to use hypercalls.

load_unaligned_zeropad()
------------------------
When transitioning memory between encrypted an
set_memory_encrypted() or set_memory_decrypted
the memory isn't in use and isn't referenced w
progress. The transition has multiple steps, a
the Hyper-V host. The memory is in an inconsis
complete. A reference while the state is incon
exception that can't be cleanly fixed up.

However, the kernel load_unaligned_zeropad() m
references that can't be prevented by the call
set_memory_decrypted(), so there's specific co
handler to fixup this case. But a CoCo VM runn
configured to run with a paravisor, with the #
the paravisor. There's no architectural way to
the guest kernel, and in such a case, the load
in the #VC/#VE handlers doesn't run.

To avoid this problem, the Hyper-V specific fu
hypervisor of the transition mark pages as "no
is in progress. If load_unaligned_zeropad() ca
normal page fault is generated instead of #VC
based handlers for load_unaligned_zeropad() fi
encrypted/decrypted transition is complete, th
again. See hv_vtom_clear_present() and hv_vtom