.. SPDX-License-Identifier: GPL-2.0

VMBus
=====
VMBus is a software construct provided by Hyper-V to guest VMs.  It
consists of a control path and common facilities used by synthetic
devices that Hyper-V presents to guest VMs.  The control path is
used to offer synthetic devices to the guest VM and, in some cases,
to rescind those devices.  The common facilities include software
channels for communicating between the device driver in the guest VM
and the synthetic device implementation that is part of Hyper-V, and
signaling primitives to allow Hyper-V and the guest to interrupt
each other.

VMBus is modeled in Linux as a bus, with the expected /sys/bus/vmbus
entry in a running Linux guest.  The VMBus driver (drivers/hv/vmbus_drv.c)
establishes the VMBus control path with the Hyper-V host, then
registers itself as a Linux bus driver.  It implements the standard
bus functions for adding and removing devices to/from the bus.

Most synthetic devices offered by Hyper-V have a corresponding Linux
device driver.  These devices include:

* SCSI controller
* NIC
* Graphics frame buffer
* Keyboard
* Mouse
* PCI device pass-thru
* Heartbeat
* Time Sync
* Shutdown
* Memory balloon
* Key/Value Pair (KVP) exchange with Hyper-V
* Hyper-V online backup (a.k.a. VSS)

Guest VMs may have multiple instances of the synthetic SCSI
controller, synthetic NIC, and PCI pass-thru devices.  Other
synthetic devices are limited to a single instance per VM.  Not
listed above are a small number of synthetic devices offered by
Hyper-V that are used only by Windows guests and for which Linux
does not have a driver.

Hyper-V uses the terms "VSP" and "VSC" in describing synthetic
devices.  "VSP" refers to the Hyper-V code that implements a
particular synthetic device, while "VSC" refers to the driver for
the device in the guest VM.  For example, the Linux driver for the
synthetic NIC is referred to as "netvsc" and the Linux driver for
the synthetic SCSI controller is "storvsc".  These drivers contain
functions with names like "storvsc_connect_to_vsp".
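A VSC registers with the VMBus bus driver using the same pattern as
any other Linux device driver.  The sketch below is illustrative
only: the "example_vsc" names are hypothetical and the keyboard
class ID (HV_KBD_GUID) is used as a placeholder, but struct
hv_driver, struct hv_vmbus_device_id, vmbus_driver_register(), and
vmbus_driver_unregister() are the real interfaces from
include/linux/hyperv.h (in recent kernels the remove callback
returns void)::

  #include <linux/module.h>
  #include <linux/hyperv.h>

  /* Match against the class ID GUID in Hyper-V's offer message. */
  static const struct hv_vmbus_device_id id_table[] = {
          { HV_KBD_GUID, },    /* placeholder class ID */
          { },
  };
  MODULE_DEVICE_TABLE(vmbus, id_table);

  static int example_vsc_probe(struct hv_device *dev,
                               const struct hv_vmbus_device_id *dev_id)
  {
          /* Open the primary channel, negotiate the protocol, etc. */
          return 0;
  }

  static void example_vsc_remove(struct hv_device *dev)
  {
          /* Close channels and free state; see the rescind discussion. */
  }

  static struct hv_driver example_vsc_drv = {
          .name = "example_vsc",
          .id_table = id_table,
          .probe = example_vsc_probe,
          .remove = example_vsc_remove,
  };

  static int __init example_vsc_init(void)
  {
          return vmbus_driver_register(&example_vsc_drv);
  }
  module_init(example_vsc_init);

  static void __exit example_vsc_exit(void)
  {
          vmbus_driver_unregister(&example_vsc_drv);
  }
  module_exit(example_vsc_exit);

  MODULE_LICENSE("GPL");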
VMBus channels
--------------
An instance of a synthetic device uses VMBus channels to communicate
between the VSP and the VSC.  Channels are bi-directional and used
for passing messages.  Most synthetic devices use a single channel,
but the synthetic SCSI controller and synthetic NIC may use multiple
channels to achieve higher performance and greater parallelism.

Each channel consists of two ring buffers.  These are classic ring
buffers from a university data structures textbook.  If the read
and write pointers are equal, the ring buffer is considered to be
empty, so a full ring buffer always has at least one byte unused.
The "in" ring buffer is for messages from the Hyper-V host to the
guest, and the "out" ring buffer is for messages from the guest to
the Hyper-V host.  In Linux, the "in" and "out" designations are as
viewed by the guest side.  The ring buffers are memory that is
shared between the guest and the host, and they follow the standard
paradigm where the memory is allocated by the guest, with the list
of GPAs that make up the ring buffer communicated to the host.  Each
ring buffer consists of a header page (4 Kbytes) with the read and
write indices and some control flags, followed by the memory for the
actual ring.  The size of the ring is determined by the VSC in the
guest and is specific to each synthetic device.  The list of GPAs
making up the ring is communicated to the Hyper-V host over the
VMBus control path as a GPA Descriptor List (GPADL).  See function
vmbus_establish_gpadl().
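Because "read index == write index" means empty, the amount of free
space follows from simple modular arithmetic.  The helper below is
an illustrative stand-alone sketch, not kernel code; the kernel's
real equivalents are hv_get_bytes_to_read() and
hv_get_bytes_to_write() in include/linux/hyperv.h::

  #include <linux/types.h>

  /*
   * Free bytes available to a writer.  One byte is deliberately
   * sacrificed: if the writer were allowed to advance the write
   * index all the way back to the read index, a full ring would
   * be indistinguishable from an empty one.
   */
  static inline u32 example_bytes_avail_to_write(u32 read_idx,
                                                 u32 write_idx,
                                                 u32 ring_size)
  {
          u32 used = (write_idx >= read_idx) ?
                     write_idx - read_idx :
                     ring_size - (read_idx - write_idx);

          return ring_size - used - 1;
  }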
Each ring buffer is mapped into contiguous Linux kernel virtual
space in three parts:  1) the 4 Kbyte header page, 2) the memory
that makes up the ring itself, and 3) a second mapping of the memory
that makes up the ring itself.  Because (2) and (3) are contiguous
in kernel virtual space, the code that copies data to and from the
ring buffer need not be concerned with ring buffer wrap-around.
Once a copy operation has completed, the read or write index may
need to be reset to point back into the first mapping, but the
actual data copy does not need to be broken into two parts.  This
approach also allows complex data structures to be easily accessed
directly in the ring without handling wrap-around.

On arm64 with page sizes > 4 Kbytes, the header page must still be
passed to Hyper-V as a 4 Kbyte area.  But the memory for the actual
ring must be aligned to PAGE_SIZE and have a size that is a multiple
of PAGE_SIZE so that the duplicate mapping trick can be done.  Hence
a portion of the header page is unused and not communicated to
Hyper-V.  This case is handled by vmbus_establish_gpadl().
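The duplicate mapping is created by handing each ring data page to
vmap() twice.  The following is a condensed sketch of the logic in
hv_ringbuffer_init() in drivers/hv/ring_buffer.c (simplified from
the actual code; error paths and unrelated setup omitted)::

  /*
   * pages[0] is the header page; pages[1..page_cnt-1] make up the
   * ring data.  Map the header once and the data pages twice,
   * back-to-back, so copies can run past the physical end of the
   * ring into the second mapping instead of wrapping.
   */
  pages_wraparound = kcalloc(page_cnt * 2 - 1,
                             sizeof(struct page *), GFP_KERNEL);
  if (!pages_wraparound)
          return -ENOMEM;

  pages_wraparound[0] = pages;
  for (i = 0; i < 2 * (page_cnt - 1); i++)
          pages_wraparound[i + 1] =
                  &pages[i % (page_cnt - 1) + 1];

  ring_info->ring_buffer = (struct hv_ring_buffer *)
          vmap(pages_wraparound, page_cnt * 2 - 1, VM_MAP, PAGE_KERNEL);

  kfree(pages_wraparound);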
Hyper-V enforces a limit on the aggregate amount of guest memory
that can be shared with the host via GPADLs.  This limit ensures
that a rogue guest can't force the consumption of excessive host
resources.  For Windows Server 2019 and later, this limit is
approximately 1280 Mbytes.  For versions prior to Windows Server
2019, the limit is approximately 384 Mbytes.

VMBus channel messages
----------------------
All messages sent in a VMBus channel have a standard header that
includes the message length, the offset of the message payload, some
flags, and a transactionID.  The portion of the message after the
header is unique to each VSP/VSC pair.
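In Linux this standard header is struct vmpacket_descriptor, defined
in include/linux/hyperv.h.  The length and offset fields are in
units of 8 bytes, hence the "8" suffix::

  struct vmpacket_descriptor {
          u16 type;       /* packet type, e.g. VM_PKT_DATA_INBAND */
          u16 offset8;    /* payload offset from start of header, in 8-byte units */
          u16 len8;       /* total packet length, in 8-byte units */
          u16 flags;      /* e.g. completion-requested */
          u64 trans_id;   /* the transactionID (a.k.a. requestID) */
  } __packed;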
Messages follow one of two patterns:

* Unidirectional:  Either side sends a message and does not
  expect a response message
* Request/response:  One side (usually the guest) sends a message
  and expects a response

The transactionID (a.k.a. "requestID") is for matching requests &
responses.  Some synthetic devices allow multiple requests to be in-
flight simultaneously, so the guest specifies a transactionID when
sending a request.  Hyper-V sends back the same transactionID in the
matching response.

Messages passed between the VSP and VSC are control messages.  For
example, a message sent from the storvsc driver might be "execute
this SCSI command".  If a message also implies some data transfer
between the guest and the Hyper-V host, the actual data to be
transferred may be embedded with the control message, or it may be
specified as a separate data buffer that the Hyper-V host will
access as a DMA operation.  The former case is used when the size of
the data is small and the cost of copying the data to and from the
ring buffer is minimal.  For example, time sync messages from the
Hyper-V host to the guest contain the actual time value.  When the
data is larger, a separate data buffer is used.  In this case, the
control message contains a list of GPAs that describe the data
buffer.  For example, the storvsc driver uses this approach to
specify the data buffers to/from which disk I/O is done.

Three functions exist to send VMBus channel messages:

1. vmbus_sendpacket():  Control-only messages and messages with
   embedded data -- no GPAs (see the sketch after this list)
2. vmbus_sendpacket_pagebuffer(): Message with list of GPAs
   identifying data to transfer.  An offset and length are
   associated with each GPA so that multiple discontinuous areas
   of guest memory can be targeted.
3. vmbus_sendpacket_mpb_desc(): Message with list of GPAs
   identifying data to transfer.  A single offset and length is
   associated with a list of GPAs.  The GPAs must describe a
   single logical area of guest memory to be targeted.
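As an illustration of case 1, a VSC might send a control-only
request and ask for a completion.  The struct example_request and
the use of the request's address as the transactionID are
hypothetical (though the latter is a common pattern), while the
vmbus_sendpacket() signature, VM_PKT_DATA_INBAND, and
VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED come from
include/linux/hyperv.h::

  struct example_request req = { };  /* device-specific payload */
  int ret;

  /*
   * The transactionID must let the driver match the host's
   * completion to this request; the request's address works
   * because it is unique among in-flight requests.
   */
  ret = vmbus_sendpacket(dev->channel, &req, sizeof(req),
                         (unsigned long)&req, VM_PKT_DATA_INBAND,
                         VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
  if (ret)
          return ret;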
Historically, Linux guests have trusted Hyper-V to send well-formed
and valid messages, and Linux drivers for synthetic devices did not
fully validate messages.  With the introduction of processor
technologies that fully encrypt guest memory and that allow the
guest to not trust the hypervisor (AMD SEV-SNP, Intel TDX), trusting
the Hyper-V host is no longer a valid assumption.  The drivers for
VMBus synthetic devices are being updated to fully validate any
values read from memory that is shared with Hyper-V, which includes
messages from VMBus devices.  To facilitate such validation,
messages read by the guest from the "in" ring buffer are copied to a
temporary buffer that is not shared with Hyper-V.  Validation is
performed in this temporary buffer without the risk of Hyper-V
maliciously modifying the message after it is validated but before
it is used.

Synthetic Interrupt Controller (synic)
--------------------------------------
Hyper-V provides each guest CPU with a synthetic interrupt controller
that is used by VMBus for host-guest communication.  While each synic
defines 16 synthetic interrupts (SINT), Linux uses only one
(VMBUS_MESSAGE_SINT).  All interrupts related to communication between
the Hyper-V host and a guest CPU use that SINT.

The SINT is mapped to a single per-CPU architectural interrupt (i.e.,
an 8-bit x86/x64 interrupt vector, or an arm64 PPI INTID).  Because
each CPU in the guest has a synic and may receive VMBus interrupts,
they are best modeled in Linux as per-CPU interrupts.  This model works
well on arm64 where a single per-CPU Linux IRQ is allocated for
VMBUS_MESSAGE_SINT.  This IRQ appears in /proc/interrupts as an IRQ
labelled "Hyper-V VMbus".  Since x86/x64 lacks support for per-CPU
IRQs, the x86 interrupt vector is statically allocated
(HYPERVISOR_CALLBACK_VECTOR) across all CPUs and explicitly coded to
call vmbus_isr().  In this case, there's no Linux IRQ, and the
interrupts are visible in aggregate in /proc/interrupts on the "HYP"
line.

The synic provides the means to demultiplex the architectural
interrupt into one or more logical interrupts and route the logical
interrupt to the proper VMBus handler in Linux.  This demultiplexing
is done by vmbus_isr() and related functions that access synic data
structures.

The synic is not modeled in Linux as an irq chip or irq domain, and
the demultiplexed logical interrupts are not Linux IRQs.  As such,
they don't appear in /proc/interrupts or /proc/irq.  The CPU affinity
for one of these logical interrupts is controlled by an entry under
/sys/bus/vmbus as described below.

VMBus interrupts
----------------
VMBus provides a mechanism for the guest to interrupt the host when
the guest has queued new messages in a ring buffer.  The host
expects that the guest will send an interrupt only when an "out"
ring buffer transitions from empty to non-empty.  If the guest sends
interrupts at other times, the host deems such interrupts to be
unnecessary.  If a guest sends an excessive number of unnecessary
interrupts, the host may throttle that guest by suspending its
execution for a few seconds to prevent a denial-of-service attack.

Similarly, the host will interrupt the guest via the synic when
it sends a new message on the VMBus control path, or when a VMBus
channel "in" ring buffer transitions from empty to non-empty due to
the host inserting a new VMBus channel message.  The control message
stream and each VMBus channel "in" ring buffer are separate logical
interrupts that are demultiplexed by vmbus_isr().  It demultiplexes
by first checking for channel interrupts by calling
vmbus_chan_sched(), which looks at a synic bitmap to determine which
channels have pending interrupts on this CPU.  If multiple channels
have pending interrupts for this CPU, they are processed
sequentially.  When all channel interrupts have been processed,
vmbus_isr() checks for and processes any messages received on the
VMBus control path.
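A condensed sketch of that flow, based on vmbus_isr() in
drivers/hv/vmbus_drv.c (timer and entropy handling omitted for
brevity)::

  void vmbus_isr(void)
  {
          struct hv_per_cpu_context *hv_cpu =
                  this_cpu_ptr(hv_context.cpu_context);
          struct hv_message *msg;

          /* Schedule processing for channels with pending interrupts. */
          vmbus_chan_sched(hv_cpu);

          /* Then check for a message on the VMBus control path. */
          msg = (struct hv_message *)hv_cpu->synic_message_page +
                  VMBUS_MESSAGE_SINT;
          if (msg->header.message_type != HVMSG_NONE)
                  tasklet_schedule(&hv_cpu->msg_dpc);
  }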
The guest CPU that a VMBus channel will interrupt is selected by the
guest when the channel is created, and the host is informed of that
selection.  VMBus devices are broadly grouped into two categories:

1. "Slow" devices that need only one VMBus channel.  The devices
   (such as keyboard, mouse, heartbeat, and timesync) generate
   relatively few interrupts.  Their VMBus channels are all
   assigned to interrupt the VMBUS_CONNECT_CPU, which is always
   CPU 0.

2. "High speed" devices that may use multiple VMBus channels for
   higher parallelism and performance.  These devices include the
   synthetic SCSI controller and synthetic NIC.  Their VMBus
   channel interrupts are assigned to CPUs that are spread out
   among the available CPUs in the VM so that interrupts on
   multiple channels can be processed in parallel.

The assignment of VMBus channel interrupts to CPUs is done in the
function init_vp_index().  This assignment is done outside of the
normal Linux interrupt affinity mechanism, so the interrupts are
neither "unmanaged" nor "managed" interrupts.

The CPU that a VMBus channel will interrupt can be seen in
/sys/bus/vmbus/devices/<deviceGUID>/channels/<channelRelID>/cpu.
When running on later versions of Hyper-V, the CPU can be changed
by writing a new value to this sysfs entry, as shown below.  Because
VMBus channel interrupts are not Linux IRQs, there are no entries in
/proc/interrupts or /proc/irq corresponding to individual VMBus
channel interrupts.
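For example, to move a channel's interrupt to CPU 4 (the device GUID
and channel relid below are placeholders; the write succeeds only if
the target CPU is online and the Hyper-V version supports
re-assignment)::

  # cat /sys/bus/vmbus/devices/<deviceGUID>/channels/<channelRelID>/cpu
  0
  # echo 4 > /sys/bus/vmbus/devices/<deviceGUID>/channels/<channelRelID>/cpu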
An online CPU in a Linux guest may not be taken offline if it has
VMBus channel interrupts assigned to it.  Any such channel
interrupts must first be manually reassigned to another CPU as
described above.  When no channel interrupts are assigned to the
CPU, it can be taken offline.

The VMBus channel interrupt handling code is designed to work
correctly even if an interrupt is received on a CPU other than the
CPU assigned to the channel.  Specifically, the code does not use
CPU-based exclusion for correctness.  In normal operation, Hyper-V
will interrupt the assigned CPU.  But when the CPU assigned to a
channel is being changed via sysfs, the guest doesn't know exactly
when Hyper-V will make the transition.  The code must work correctly
even if there is a time lag before Hyper-V starts interrupting the
new CPU.  See comments in target_cpu_store().

VMBus device creation/deletion
------------------------------
Hyper-V and the Linux guest have a separate message-passing path
that is used for synthetic device creation and deletion.  This
path does not use a VMBus channel.  See vmbus_post_msg() and
vmbus_on_msg_dpc().

The first step is for the guest to connect to the generic
Hyper-V VMBus mechanism.  As part of establishing this connection,
the guest and Hyper-V agree on a VMBus protocol version they will
use.  This negotiation allows newer Linux kernels to run on older
Hyper-V versions, and vice versa.

The guest then tells Hyper-V to "send offers".  Hyper-V sends an
offer message to the guest for each synthetic device that the VM
is configured to have.  Each VMBus device type has a fixed GUID
known as the "class ID", and each VMBus device instance is also
identified by a GUID.  The offer message from Hyper-V contains
both GUIDs to uniquely (within the VM) identify the device.
There is one offer message for each device instance, so a VM with
two synthetic NICs will get two offer messages with the NIC
class ID.  The ordering of offer messages can vary from boot-to-boot
and must not be assumed to be consistent in Linux code.  Offer
messages may also arrive long after Linux has initially booted
because Hyper-V supports adding devices, such as synthetic NICs,
to running VMs.  A new offer message is processed by
vmbus_process_offer(), which indirectly invokes
vmbus_add_channel_work().

Upon receipt of an offer message, the guest identifies the device
type based on the class ID, and invokes the correct driver to set up
the device.  Driver/device matching is performed using the standard
Linux mechanism.

The device driver probe function opens the primary VMBus channel to
the corresponding VSP.  It allocates guest memory for the channel
ring buffers and shares the ring buffer with the Hyper-V host by
giving the host a list of GPAs for the ring buffer memory.  See
vmbus_establish_gpadl().
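A sketch of this step in a hypothetical probe function:
vmbus_open()'s signature is the real one from
include/linux/hyperv.h, while the ring sizes and callback shown are
illustrative (drivers typically compute sizes with the
VMBUS_RING_SIZE() macro)::

  static void example_vsc_chan_callback(void *context)
  {
          struct hv_device *dev = context;

          /* Read and process packets from the "in" ring buffer. */
  }

  static int example_vsc_probe(struct hv_device *dev,
                               const struct hv_vmbus_device_id *dev_id)
  {
          int ret;

          /*
           * Allocates the "out" and "in" ring buffers and shares
           * them with the host via a GPADL (vmbus_establish_gpadl()
           * is called internally).
           */
          ret = vmbus_open(dev->channel,
                           16 * PAGE_SIZE,      /* send ring size */
                           16 * PAGE_SIZE,      /* recv ring size */
                           NULL, 0,             /* no user data */
                           example_vsc_chan_callback, dev);
          if (ret)
                  return ret;

          /* Exchange protocol setup messages with the VSP here. */
          return 0;
  }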
Once the ring buffer is set up, the device driver and VSP exchange
setup messages via the primary channel.  These messages may include
negotiating the device protocol version to be used between the Linux
VSC and the VSP on the Hyper-V host.  The setup messages may also
include creating additional VMBus channels, which are somewhat
mis-named as "sub-channels" since they are functionally
equivalent to the primary channel once they are created.

Finally, the device driver may create entries in /dev as with
any device driver.

The Hyper-V host can send a "rescind" message to the guest to
remove a device that was previously offered.  Linux drivers must
handle such a rescind message at any time.  Rescinding a device
invokes the device driver "remove" function to cleanly shut
down the device and remove it.  Once a synthetic device is
rescinded, neither Hyper-V nor Linux retains any state about
its previous existence.  Such a device might be re-added later,
in which case it is treated as an entirely new device.  See
vmbus_onoffer_rescind().
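Because no state survives a rescind, the "remove" callback is
typically a straightforward teardown.  A minimal sketch, continuing
the hypothetical example_vsc driver from earlier; vmbus_close() is
the real counterpart of vmbus_open()::

  static void example_vsc_remove(struct hv_device *dev)
  {
          /* Quiesce the device, then close the channel. */
          vmbus_close(dev->channel);

          /* Free driver-private state; nothing is retained. */
  }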