.. SPDX-License-Identifier: GPL-2.0

=================
KVM VCPU Requests
=================

Overview
========

KVM supports an internal API enabling threads
perform some activity. For example, a thread
its TLB with a VCPU request. The API consists

  /* Check if any requests are pending for VCP
  bool kvm_request_pending(struct kvm_vcpu *vc

  /* Check if VCPU @vcpu has request @req pend
  bool kvm_test_request(int req, struct kvm_vc

  /* Clear request @req for VCPU @vcpu. */
  void kvm_clear_request(int req, struct kvm_v

  /*
   * Check if VCPU @vcpu has request @req pend
   * pending it will be cleared and a memory b
   * another in kvm_make_request(), will be is
   */
  bool kvm_check_request(int req, struct kvm_v

  /*
   * Make request @req of VCPU @vcpu. Issues a
   * with another in kvm_check_request(), prio
   */
  void kvm_make_request(int req, struct kvm_vc

  /* Make request @req of all VCPUs of the VM
  bool kvm_make_all_cpus_request(struct kvm *k

Typically a requester wants the VCPU to perfor
as possible after making the request. This me
(kvm_make_request() calls) are followed by a c
and kvm_make_all_cpus_request() has the kickin
into it.

VCPU Kicks
----------

The goal of a VCPU kick is to bring a VCPU thr
order to perform some KVM maintenance. To do
a guest mode exit. However, a VCPU thread may
time of the kick. Therefore, depending on the
thread, there are two other actions a kick may
are listed below:

1) Send an IPI. This forces a guest mode exit
2) Waking a sleeping VCPU. Sleeping VCPUs are
   mode that wait on waitqueues. Waking them
   the waitqueues, allowing the threads to run
   may be suppressed, see KVM_REQUEST_NO_WAKEU
3) Nothing. When the VCPU is not in guest mod
   sleeping, then there is nothing to do.
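
The choice among the three kick actions listed above can be sketched in
plain C. This is only an illustrative userspace model under stated
assumptions, not KVM code: ``enum vcpu_state``, ``enum kick_action``,
``choose_kick_action()``, ``kick_examples()`` and the ``suppress_wakeup``
parameter are hypothetical names introduced for this sketch, and the model
deliberately ignores memory ordering and the real ``vcpu->mode`` values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of where a VCPU thread is at kick time. */
enum vcpu_state {
	VCPU_STATE_IN_GUEST,	/* executing guest code */
	VCPU_STATE_SLEEPING,	/* outside guest mode, waiting on a waitqueue */
	VCPU_STATE_OTHER,	/* outside guest mode and not sleeping */
};

/* The three actions a kick may take, mirroring the list above. */
enum kick_action {
	KICK_SEND_IPI,		/* 1) force a guest mode exit */
	KICK_WAKE_UP,		/* 2) wake a sleeping VCPU */
	KICK_NOTHING,		/* 3) nothing to do */
};

/*
 * Toy decision function (an assumption of this sketch, not a KVM API).
 * @suppress_wakeup models a request for which waking a sleeping VCPU
 * has been suppressed.
 */
enum kick_action choose_kick_action(enum vcpu_state state, bool suppress_wakeup)
{
	switch (state) {
	case VCPU_STATE_IN_GUEST:
		return KICK_SEND_IPI;
	case VCPU_STATE_SLEEPING:
		return suppress_wakeup ? KICK_NOTHING : KICK_WAKE_UP;
	default:
		return KICK_NOTHING;
	}
}

/* Example checks for the model; call from any test harness. */
void kick_examples(void)
{
	assert(choose_kick_action(VCPU_STATE_IN_GUEST, false) == KICK_SEND_IPI);
	assert(choose_kick_action(VCPU_STATE_SLEEPING, false) == KICK_WAKE_UP);
	assert(choose_kick_action(VCPU_STATE_OTHER, false) == KICK_NOTHING);
}
```

Under this model, a sleeping VCPU whose wakeup is suppressed falls through
to the "nothing" action, so ``choose_kick_action(VCPU_STATE_SLEEPING, true)``
yields ``KICK_NOTHING``.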

VCPU Mode
---------

VCPUs have a mode state, ``vcpu->mode``, that
guest is running in guest mode or not, as well
outside guest mode states. The architecture m
ensure VCPU requests are seen by VCPUs (see "E
as well as to avoid sending unnecessary IPIs (
even to ensure IPI acknowledgements are waited
Acknowledgements"). The following modes are d

OUTSIDE_GUEST_MODE

  The VCPU thread is outside guest mode.

IN_GUEST_MODE

  The VCPU thread is in guest mode.

EXITING_GUEST_MODE

  The VCPU thread is transitioning from IN_GUE
  OUTSIDE_GUEST_MODE.

READING_SHADOW_PAGE_TABLES

  The VCPU thread is outside guest mode, but i
  certain VCPU requests, namely KVM_REQ_TLB_FL
  thread is done reading the page tables.

VCPU Request Internals
======================

VCPU requests are simply bit indices of the ``
This means general bitops, like those document
also be used, e.g. ::

  clear_bit(KVM_REQ_UNBLOCK & KVM_REQUEST_MASK

However, VCPU request users should refrain fro
break the abstraction. The first 8 bits are r
independent requests; all additional bits are
dependent requests.

Architecture Independent Requests
---------------------------------

KVM_REQ_TLB_FLUSH

  KVM's common MMU notifier may need to flush
  entries, calling kvm_flush_remote_tlbs() to
  choose to use the common kvm_flush_remote_tl
  need to handle this VCPU request.

KVM_REQ_VM_DEAD

  This request informs all VCPUs that the VM i
  fatal error or because the VM's state has be

KVM_REQ_UNBLOCK

  This request informs the vCPU to exit kvm_vc
  example from timer handlers that run on the
  or in order to update the interrupt routing
  devices will wake up the vCPU.

KVM_REQ_OUTSIDE_GUEST_MODE

  This "request" ensures the target vCPU has e
  sender of the request continuing on. No act
  and so no request is actually logged for the
  to a "kick", but unlike a kick it guarantees
  guest mode. A kick only guarantees the vCPU
  future, e.g. a previous kick may have starte
  guarantee the to-be-kicked vCPU has fully ex

KVM_REQUEST_MASK
----------------

VCPU requests should be masked by KVM_REQUEST_
bitops. This is because only the lower 8 bits
request's number. The upper bits are used as
flags are defined.

VCPU Request Flags
------------------

KVM_REQUEST_NO_WAKEUP

  This flag is applied to requests that only n
  from VCPUs running in guest mode. That is,
  to be awakened for these requests. Sleeping
  requests when they are awakened later for so

KVM_REQUEST_WAIT

  When requests with this flag are made with k
  then the caller will wait for each VCPU to a
  proceeding. This flag only applies to VCPUs
  If, for example, the VCPU is sleeping, so no
  the requesting thread does not wait. This m
  safely combined with KVM_REQUEST_NO_WAKEUP.
  Acknowledgements" for more information about
  KVM_REQUEST_WAIT.

VCPU Requests with Associated State
===================================

Requesters that want the receiving VCPU to han
the newly written state is observable to the r
by the time it observes the request. This mea
must be inserted after writing the new state a
request bit. Additionally, on the receiving V
corresponding read barrier must be inserted af
and before proceeding to read the new state as
scenario 3, Message and Flag, of [lwn-mb]_ and
[memory-barriers]_.

The pair of functions, kvm_check_request() and
the memory barriers, allowing this requirement
the API.

Ensuring Requests Are Seen
==========================

When making requests to VCPUs, we want to avoi
executing in guest mode for an arbitrarily long
request. We can be sure this won't happen as
thread checks kvm_request_pending() before ent
kick will send an IPI to force an exit from gu
Extra care must be taken to cover the period a
kvm_request_pending() check and before it has
IPIs will only trigger guest mode exits for VC
mode or at least have already disabled interru
enter guest mode. This means that an optimize
Reduction") must be certain when it's safe to
solution, which all architectures except s390

- set ``vcpu->mode`` to IN_GUEST_MODE between
  the last kvm_request_pending() check;
- enable interrupts atomically when entering t

This solution also requires memory barriers to
the requesting thread and the receiving VCPU.
can exclude the possibility of a VCPU thread o
!kvm_request_pending() on its last check and t
the next request made of it, even if the reque
the check. This is done by way of the Dekker
(scenario 10 of [lwn-mb]_). As the Dekker pat
this solution pairs ``vcpu->mode`` with ``vcpu
them into the pattern gives::

  CPU1                                    CPU2
  =================                       ====
  local_irq_disable();
  WRITE_ONCE(vcpu->mode, IN_GUEST_MODE);  kvm_
  smp_mb();                               smp_
  if (kvm_request_pending(vcpu)) {        if (

      ...abort guest entry...
  }                                       }

As stated above, the IPI is only useful for VC
that have already disabled interrupts. This i
the Dekker pattern has been extended to disabl
``vcpu->mode`` to IN_GUEST_MODE. WRITE_ONCE()
pedantically implement the memory barrier patt
compiler doesn't interfere with ``vcpu->mode``
accesses.

IPI Reduction
-------------

As only one IPI is needed to get a VCPU to che
then they may be coalesced. This is easily do
sending kick also change the VCPU mode to some
transitional state, EXITING_GUEST_MODE, is use

Waiting for Acknowledgements
----------------------------

Some requests, those with the KVM_REQUEST_WAIT
be sent, and the acknowledgements to be waited
VCPU threads are in modes other than IN_GUEST_
is when a target VCPU thread is in READING_SHA
is set after disabling interrupts. To support
KVM_REQUEST_WAIT flag changes the condition fo
checking that the VCPU is IN_GUEST_MODE to che
OUTSIDE_GUEST_MODE.

Request-less VCPU Kicks
-----------------------

As the determination of whether or not to send
two-variable Dekker memory barrier pattern, th
request-less VCPU kicks are almost never corre
that a non-IPI generating kick will still resu
receiving VCPU, as the final kvm_request_pendi
request-accompanying kicks, then the kick may
all. If, for instance, a request-less kick wa
just about to set its mode to IN_GUEST_MODE, m
the VCPU thread may continue its entry without
whatever it was the kick was meant to initiate

One exception is x86's posted interrupt mechan
even the request-less VCPU kick is coupled wit
local_irq_disable() + smp_mb() pattern describ
(Outstanding Notification) in the posted inter
role of ``vcpu->requests``. When sending a po
set before reading ``vcpu->mode``; dually, in
vmx_sync_pir_to_irr() reads PIR after setting
IN_GUEST_MODE.

Additional Considerations
=========================

Sleeping VCPUs
--------------

VCPU threads may need to consider requests bef
functions that may put them to sleep, e.g. kvm
do or not, and, if they do, which requests nee
architecture dependent. kvm_vcpu_block() call
to check if it should awaken. One reason to d
architectures a function where requests may be

References
==========

.. [atomic-ops] Documentation/atomic_bitops.tx
.. [memory-barriers] Documentation/memory-barr
.. [lwn-mb] https://lwn.net/Articles/573436/