.. SPDX-License-Identifier: GPL-2.0

===============================
Software Guard eXtensions (SGX)
===============================

Overview
========

Software Guard eXtensions (SGX) hardware enabl
to set aside private memory regions of code an

* Privileged (ring-0) ENCLS functions orchestr
  regions.
* Unprivileged (ring-3) ENCLU functions allow
  execute inside the regions.

These memory regions are called enclaves. An e
fixed set of entry points. Each entry point ca
at a time. While the enclave is loaded from a
ENCLS functions, only the threads inside the e
region is denied from outside access by the CP
from LLC.

The support can be determined by

	``grep sgx /proc/cpuinfo``

SGX must both be supported in the processor an
appears to be unsupported on a system which ha
support is enabled in the BIOS. If a BIOS pre
and "Software Enabled" modes for SGX, choose "

Enclave Page Cache
==================

SGX utilizes an *Enclave Page Cache (EPC)* to
with an enclave. It is contained in a BIOS-res
Unlike pages used for regular memory, pages ca
the enclave during enclave construction with s

Only a CPU executing inside an enclave can dir
However, a CPU executing inside an enclave may
enclave.

The kernel manages enclave memory similar to h

Enclave Page Types
------------------

**SGX Enclave Control Structure (SECS)**
   Enclave's address range, attributes and oth
   by this structure.

**Regular (REG)**
   Regular EPC pages contain the code and data

**Thread Control Structure (TCS)**
   Thread Control Structure pages define the e
   track the execution state of an enclave thr

**Version Array (VA)**
   Version Array pages contain 512 slots, each
   number for a page evicted from the EPC.

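
As a worked example of the Version Array description above, the following
Python sketch computes how many VA pages are needed to track a given number of
evicted pages. The 512-slot figure comes from the text. The 4 KiB page size,
the 8-byte slot size derived from it, and the helper name ``va_pages_needed``
are assumptions made for illustration. This is not kernel code.

```python
PAGE_SIZE = 4096          # assumed EPC page size in bytes (4 KiB)
VA_SLOTS_PER_PAGE = 512   # slot count stated in the text above
# Under the page-size assumption, each slot is 8 bytes wide.
VA_SLOT_SIZE = PAGE_SIZE // VA_SLOTS_PER_PAGE


def va_pages_needed(evicted_pages: int) -> int:
    """Return the number of VA pages needed for one slot per evicted page.

    Hypothetical helper: it only does the ceiling division implied by
    "512 slots per VA page".
    """
    if evicted_pages < 0:
        raise ValueError("evicted_pages must be non-negative")
    # Ceiling division without floating point.
    return -(-evicted_pages // VA_SLOTS_PER_PAGE)


def evicted_bytes_covered(va_pages: int) -> int:
    """Return how many bytes of evicted EPC a number of VA pages can track."""
    return va_pages * VA_SLOTS_PER_PAGE * PAGE_SIZE
```

Under these assumptions a single VA page can track 2 MiB of evicted enclave
memory, so the bookkeeping overhead is one EPC page per 512 evicted pages.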

Enclave Page Cache Map
----------------------

The processor tracks EPC pages in a hardware m
*Enclave Page Cache Map (EPCM)*. The EPCM con
which describes the owning enclave, access rig
things.

EPCM permissions are separate from the normal
kernel from, for instance, allowing writes to
remain read-only. EPCM permissions may only i
top of normal x86 page permissions.

For all intents and purposes, the SGX architec
invalidate all EPCM entries at will. This req
handle an EPCM fault at any time. In practice
power transitions when the ephemeral key that

Application interface
=====================

Enclave build functions
-----------------------

In addition to the traditional compiler and li
separate enclave “build” process. Enclave
executed (entered). The first step in building
**/dev/sgx_enclave** device. Since enclave me
access, special privileged instructions are th
pages and establish enclave page permissions.

.. kernel-doc:: arch/x86/kernel/cpu/sgx/ioctl.
   :functions: sgx_ioc_enclave_create
               sgx_ioc_enclave_add_pages
               sgx_ioc_enclave_init
               sgx_ioc_enclave_provision

Enclave runtime management
--------------------------

Systems supporting SGX2 additionally support c
enclaves: modifying enclave page permissions a
adding and removing of enclave pages. When an
within its address range that does not have a
regular page will be dynamically added to the
still required to run EACCEPT on the new page

.. kernel-doc:: arch/x86/kernel/cpu/sgx/ioctl.
   :functions: sgx_ioc_enclave_restrict_permis
               sgx_ioc_enclave_modify_types
               sgx_ioc_enclave_remove_pages

Enclave vDSO
------------

Entering an enclave can only be done through S
functions, and is a non-trivial process. Beca
transitioning to and from an enclave, enclaves
handle the actual transitions.
This is roughl
implementations are used by most applications

Another crucial characteristic of enclaves is
as part of their normal operation that need to
unique to SGX.

Instead of the traditional signal mechanism to
can leverage special exception fixup provided
vDSO function wraps low-level transitions to/f
ERESUME. The vDSO function intercepts excepti
a signal and return the fault information dire
the need to juggle signal handlers.

.. kernel-doc:: arch/x86/include/uapi/asm/sgx.
   :functions: vdso_sgx_enter_enclave_t

ksgxd
=====

SGX support includes a kernel thread called *k

EPC sanitization
----------------

ksgxd is started when SGX initializes. Enclav
for use when the processor powers on or resets
use since the reset, enclave pages may be in a
occur after a crash and kexec() cycle, for ins
reinitializes all enclave pages so that they c

The sanitization is done by going through EPC
EREMOVE function to each physical page. Some e
hardware dependencies on other pages which pre
Executing two EREMOVE passes removes the depen

Page reclaimer
--------------

Similar to the core kswapd, ksgxd is responsi
overcommitment of enclave memory. If the syst
*ksgxd* “swaps” enclave memory to normal m

Launch Control
==============

SGX provides a launch control mechanism. After
copied, kernel executes EINIT function, which
this the CPU can execute inside the enclave.

EINIT function takes an RSA-3072 signature of
checks that the measurement is correct and sig
hashed to the four **IA32_SGXLEPUBKEYHASH{0, 1
SHA256 of a public key.

Those MSRs can be configured by the BIOS to be
Linux supports only writable configuration in
kernel on launch control policy. Before callin
the MSRs to match the enclave's signing key.
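
The relationship between a signing key and the four hash MSRs described above
can be sketched in Python. The sketch assumes that the hashed value is the
384-byte RSA-3072 modulus in little-endian byte order, and that the 256-bit
digest is split into four little-endian 64-bit words, one per MSR. Both
details, and the function name ``lepubkeyhash_words``, are assumptions for
illustration and should be checked against the SDM and the kernel sources.

```python
import hashlib
import struct

RSA3072_MODULUS_BYTES = 384  # 3072 bits


def lepubkeyhash_words(modulus_le: bytes) -> tuple:
    """Return four 64-bit words derived from SHA-256 of a signer modulus.

    Hypothetical helper. Assumes the input is the little-endian modulus and
    that word i corresponds to the i-th hash MSR.
    """
    if len(modulus_le) != RSA3072_MODULUS_BYTES:
        raise ValueError("expected a 384-byte RSA-3072 modulus")
    digest = hashlib.sha256(modulus_le).digest()  # 32 bytes
    # Split the digest into four unsigned little-endian 64-bit integers.
    return struct.unpack("<4Q", digest)


if __name__ == "__main__":
    # Placeholder modulus of all zero bytes, not a real key.
    for index, word in enumerate(lepubkeyhash_words(bytes(384))):
        print(f"word {index}: {word:#018x}")
```

Comparing these four words with the values a platform exposes is one way to
reason about whether an enclave signed with a given key would pass EINIT under
a locked configuration.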

Encryption engines
==================

In order to conceal the enclave data while it
memory controller has an encryption engine to
enclave memory.

In CPUs prior to Ice Lake, the Memory Encrypti
encrypt pages leaving the CPU caches. MEE uses
SRAM to maintain integrity of the encrypted da
anti-replay protection but does not scale to l
required to update the Merkle tree grows logar
memory size.

CPUs starting from Ice Lake use Total Memory En
MEE. TME-based SGX implementations do not have
means integrity and replay-attacks are not mit
additional changes to prevent cipher text from
aliases from being created.

DMA to enclave memory is blocked by range regi
(SDM section 41.10).

Usage Models
============

Shared Library
--------------

Sensitive data and the code that acts on it is
into a separate library. The library is then l
into an enclave. The application can then make
the enclave through special SGX instructions.
configured to marshal function parameters into
call the correct library function.

Application Container
---------------------

An application may be loaded into a container
configured with a library OS and run-time whic
The enclave run-time and library OS work toget
when a thread enters the enclave.

Impact of Potential Kernel SGX Bugs
===================================

EPC leaks
---------

When EPC page leaks happen, a WARNING like thi

	"EREMOVE returned ... and an EPC page was leak

This is effectively a kernel use-after-free of
to the way SGX works, the bug is detected at f
adding the page back to the pool of available
intentionally leaks the page to avoid addition

When this happens, the kernel will likely soon
SGX will likely become unusable because the me
limited.
However, while this may be fatal to S
is unlikely to be impacted and should continue

As a result, when this happens, user should st
SGX workloads, (or just any new workloads), an
workloads. Although a machine reboot can recov
should be reported to Linux developers.


Virtual EPC
===========

The implementation has also a virtual EPC driv
in guests. Unlike the SGX driver, an EPC page
EPC driver doesn't have a specific enclave ass
because KVM doesn't track how a guest uses EPC

As a result, the SGX core page reclaimer doesn
pages allocated to KVM guests through the virt
user wants to deploy SGX applications both on
on the same machine, the user should reserve e
total virtual EPC size of all SGX VMs from the
host SGX applications so they can run with acc

Architectural behavior is to restore all EPC p
state also after a guest reboot. Because this
through the privileged ``ENCLS[EREMOVE]`` inst
provides the ``SGX_IOC_VEPC_REMOVE_ALL`` ioctl
on all pages in the virtual EPC.

``EREMOVE`` can fail for three reasons. Users
to expected failures and handle them as follow

1. Page removal will always fail when any thre
   enclave to which the page belongs. In this
   return ``EBUSY`` independent of whether it
   some pages; userspace can avoid these failu
   of any vcpu which maps the virtual EPC.

2. Page removal will cause a general protectio
   ``EREMOVE`` happen concurrently for pages t
   "SECS" metadata pages. This can happen if
   invocations to ``SGX_IOC_VEPC_REMOVE_ALL``,
   file descriptor in the guest is closed at t
   ``SGX_IOC_VEPC_REMOVE_ALL``; it will also b
   This can be avoided in userspace by seriali
   and to close(), but in general it should no

3. Finally, page removal will fail for SECS me
   have child pages.
   Child pages can be remov
   ``SGX_IOC_VEPC_REMOVE_ALL`` on all ``/dev/s
   mapped into the guest. This means that the
   twice: an initial set of calls to remove ch
   set of calls to remove SECS pages. The sec
   required for those mappings that returned a
   first call. It indicates a bug in the kern
   if any of the second round of ``SGX_IOC_VEP
   a return code other than 0.
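
The two-round removal described in the last item can be sketched as follows.
This is a minimal Python model of the control flow, not a definitive
implementation. The function name ``remove_all_two_pass`` and the
``remove_all`` callback are hypothetical. In a real program the callback would
issue ``SGX_IOC_VEPC_REMOVE_ALL`` on the given file descriptor (for instance
through ``fcntl.ioctl`` with the request number taken from the uapi header) and
return the ioctl result, which this sketch assumes is 0 on full success and
nonzero when some pages could not be removed.

```python
from typing import Callable, Iterable, List


def remove_all_two_pass(
    fds: Iterable[int],
    remove_all: Callable[[int], int],
) -> List[int]:
    """Run the removal on every mapping, then retry the ones that failed.

    Returns the list of file descriptors that needed the second round.
    Raises RuntimeError if a second-round call still reports a failure,
    which the text above describes as a bug.
    """
    # First round: finish all mappings before retrying any of them, so that
    # child pages are gone everywhere before SECS pages are attempted again.
    needs_retry = [fd for fd in fds if remove_all(fd) != 0]

    # Second round: only mappings that reported a failure are revisited.
    for fd in needs_retry:
        result = remove_all(fd)
        if result != 0:
            raise RuntimeError(
                f"second removal round failed on fd {fd} (result {result})"
            )
    return needs_retry
```

Injecting the ioctl as a callback keeps the sketch testable without SGX
hardware. Serializing the calls, as this single-threaded loop does, also
avoids the concurrent-``EREMOVE`` case from item 2.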