.. SPDX-License-Identifier: GPL-2.0

============
NET_FAILOVER
============

Overview
========

The net_failover driver provides an automated failover mechanism. It exposes
APIs to create and destroy a failover master netdev, and it manages the
primary and standby slave netdevs that are registered via the generic
failover infrastructure.

The failover netdev acts as a master device and controls two slave devices.
The original paravirtual interface is registered as the 'standby' slave
netdev, and a passthru/VF device with the same MAC address is registered as
the 'primary' slave netdev. Both the 'standby' and 'failover' netdevs are
associated with the same 'pci' device. The user accesses the network
interface via the 'failover' netdev. The 'failover' netdev chooses the
'primary' netdev as the default for transmits when it is available with its
link up and running.

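The transmit choice described above can be modelled in a few lines of shell.
This is only a simplified sketch, not the driver code: it assumes an interface
is usable when its sysfs 'operstate' reads 'up', and the function names and
the SYSFS_NET override are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Simplified model of the failover transmit choice; not the driver code.
# SYSFS_NET is a hypothetical override so the logic can be exercised
# against a scratch directory instead of the real /sys/class/net.
: "${SYSFS_NET:=/sys/class/net}"

# Succeeds when the named interface exists and reports operstate "up".
link_ready() {
    [ -n "$1" ] && [ "$(cat "$SYSFS_NET/$1/operstate" 2>/dev/null)" = "up" ]
}

# pick_tx_slave PRIMARY STANDBY
# Prints the slave that would carry transmits: the primary when it is
# ready, otherwise the standby. Returns non-zero when neither is usable.
pick_tx_slave() {
    if link_ready "$1"; then
        echo "$1"
    elif link_ready "$2"; then
        echo "$2"
    else
        return 1
    fi
}
```

Calling pick_tx_slave with the primary and standby interface names, in that
order, prints the name of the slave this model expects to carry traffic.
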
This can be used by paravirtual drivers to enable an alternate low-latency
datapath. It also enables hypervisor-controlled live migration of a VM with a
directly attached VF by failing over to the paravirtual datapath when the VF
is unplugged.

virtio-net accelerated datapath: STANDBY mode
=============================================

net_failover enables a hypervisor-controlled accelerated datapath for
virtio-net enabled VMs in a transparent manner, with no or minimal guest
userspace changes.

To support this, the hypervisor needs to enable the VIRTIO_NET_F_STANDBY
feature on the virtio-net interface and assign the same MAC address to both
the virtio-net and VF interfaces.

Here is an example libvirt XML snippet that shows such a configuration:
::

  <interface type='network'>
    <mac address='52:54:00:00:12:53'/>
    <source network='enp66s0f0_br'/>
    <target dev='tap01'/>
    <model type='virtio'/>
    <driver name='vhost' queues='4'/>
    <link state='down'/>
    <teaming type='persistent'/>
    <alias name='ua-backup0'/>
  </interface>
  <interface type='hostdev' managed='yes'>
    <mac address='52:54:00:00:12:53'/>
    <source>
      <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
    </source>
    <teaming type='transient' persistent='ua-backup0'/>
  </interface>

In this configuration, the first device definition is for the virtio-net
interface, which acts as the 'persistent' device, indicating that this
interface will always be plugged in. This is specified by the 'teaming' tag,
whose required 'type' attribute has the value 'persistent'. The link state for
the virtio-net device is set to 'down' to ensure that the 'failover' netdev
prefers the VF passthrough device for normal communication. The virtio-net
device will be brought UP during live migration to allow uninterrupted
communication.

The second device definition is for the VF passthrough interface. Here the
'teaming' tag is provided with type 'transient', indicating that this device
may periodically be unplugged. A second attribute, 'persistent', points to the
alias name declared for the virtio-net device.

Booting a VM with the above configuration results in the following three
interfaces being created in the VM:
::

  4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
      inet 192.168.12.53/24 brd 192.168.12.255 scope global dynamic ens10
         valid_lft 42482sec preferred_lft 42482sec
      inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
         valid_lft forever preferred_lft forever
  5: ens10nsby: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master ens10 state DOWN group default qlen 1000
      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
  7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff

Here, ens10 is the 'failover' master interface, ens10nsby is the slave 'standby'
virtio-net interface, and ens11 is the slave 'primary' VF passthrough interface.

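The master/slave relationship can also be read back from one-line-per-interface
output such as that produced by ``ip -o link show``. The following is a
minimal sketch; the function name list_failover_slaves is hypothetical, and it
assumes each input line carries the ``master <name>`` and ``state <value>``
fields seen in the listing above.

```shell
#!/bin/sh
# list_failover_slaves MASTER
# Reads one-line-per-interface `ip -o link show` style text on stdin and
# prints "<slave> <state>" for every interface enslaved to MASTER.
# The function name is a hypothetical helper used for illustration.
list_failover_slaves() {
    awk -v master="$1" '
    {
        name = $2
        sub(/:$/, "", name)          # strip the trailing colon from "ens11:"
        m = ""; s = ""
        for (i = 3; i < NF; i++) {
            if ($i == "master") m = $(i + 1)
            if ($i == "state")  s = $(i + 1)
        }
        if (m == master) print name, s
    }'
}

# Usage inside the guest (ens10 is the failover master in the example):
#   ip -o link show | list_failover_slaves ens10
```
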
Note that some user space network configuration daemons, such as
systemd-networkd and ifupdown, do not understand the 'net_failover' device.
On the first boot, the VM might end up with both the 'failover' device and the
VF acquiring IP addresses (either the same or different) from the DHCP server,
which results in a lack of connectivity to the VM. Some tweaks to these
network configuration daemons might therefore be needed to make sure that an
IP address is received only on the 'failover' device.

Below is the patch snippet used with the 'cloud-ifupdown-helper' script found
on Debian cloud images:

::

  @@ -27,6 +27,8 @@ do_setup() {
       local working="$cfgdir/.$INTERFACE"
       local final="$cfgdir/$INTERFACE"

  +    if [ -d "/sys/class/net/${INTERFACE}/master" ]; then exit 0; fi
  +
       if ifup --no-act "$INTERFACE" > /dev/null 2>&1; then
           # interface is already known to ifupdown, no need to generate cfg
           log "Skipping configuration generation for $INTERFACE"


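The same sysfs test can be reused outside the patched helper to decide which
interfaces a configuration tool should touch. This is a minimal sketch under
that assumption; the function names and the SYSFS_NET override are
hypothetical and are not part of 'cloud-ifupdown-helper'.

```shell
#!/bin/sh
# SYSFS_NET is a hypothetical override so the check can be tried against
# a scratch directory instead of the real /sys/class/net.
: "${SYSFS_NET:=/sys/class/net}"

# Succeeds when the interface has a 'master' entry, i.e. it is enslaved
# (the same condition the patch above tests with -d).
is_enslaved() {
    [ -e "$SYSFS_NET/$1/master" ]
}

# Prints every interface that has no master and is therefore a candidate
# for address configuration.
configurable_interfaces() {
    for dev in "$SYSFS_NET"/*; do
        [ -d "$dev" ] || continue
        name=${dev##*/}
        is_enslaved "$name" || echo "$name"
    done
}
```

With the three interfaces from the example above, only the 'failover' master
would be reported, since both slaves carry a 'master' entry.
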
Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
==================================================================

net_failover also enables hypervisor-controlled live migration to be supported
with VMs that have directly attached SR-IOV VF devices by automatic failover to
the paravirtual datapath when the VF is unplugged.

Here is a sample script that shows the steps to initiate live migration from
the source hypervisor. Note: It is assumed that the VM is connected to a
software bridge 'br0', which has a single VF attached to it along with the
vnet device to the VM. This is not the VF that was passed through to the VM
(seen in the vf.xml file).
::

  # cat vf.xml
  <interface type='hostdev' managed='yes'>
    <mac address='52:54:00:00:12:53'/>
    <source>
      <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
    </source>
    <teaming type='transient' persistent='ua-backup0'/>
  </interface>

  # Source Hypervisor migrate.sh
  #!/bin/bash

  DOMAIN=vm-01
  PF=ens6np0
  VF=ens6v1             # VF attached to the bridge.
  VF_NUM=1
  TAP_IF=vmtap01        # virtio-net interface in the VM.
  VF_XML=vf.xml

  MAC=52:54:00:00:12:53
  ZERO_MAC=00:00:00:00:00:00

  # REMOTE_HOST (the destination hypervisor) must be set by the caller.
  : "${REMOTE_HOST:?REMOTE_HOST is not set}"

  # Set the virtio-net interface up.
  virsh domif-setlink $DOMAIN $TAP_IF up

  # Remove the VF that was passthrough'd to the VM.
  virsh detach-device --live --config $DOMAIN $VF_XML

  ip link set $PF vf $VF_NUM mac $ZERO_MAC

  # Add FDB entry for traffic to continue going to the VM via
  # the VF -> br0 -> vnet interface path.
  bridge fdb add $MAC dev $VF
  bridge fdb add $MAC dev $TAP_IF master

  # Migrate the VM
  virsh migrate --live --persistent $DOMAIN qemu+ssh://$REMOTE_HOST/system

  # Clean up FDB entries after migration completes.
  bridge fdb del $MAC dev $VF
  bridge fdb del $MAC dev $TAP_IF master

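The FDB add and delete steps in migrate.sh mirror each other, so they can be
wrapped in one helper and previewed before touching the bridge. This is a
sketch only: the run and fdb_redirect functions and the DRY_RUN switch are
hypothetical additions, while the ``bridge fdb`` invocations are the ones
used in the script above.

```shell
#!/bin/sh
# run CMD...: execute the command, or only print it when DRY_RUN is set.
# DRY_RUN is a hypothetical switch added for illustration.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# fdb_redirect add|del MAC VF TAP_IF
# Adds or removes the two FDB entries that keep traffic flowing to the VM
# over the VF -> br0 -> vnet path while the passthrough VF is detached.
fdb_redirect() {
    run bridge fdb "$1" "$2" dev "$3"
    run bridge fdb "$1" "$2" dev "$4" master
}
```

Setting DRY_RUN=1 before calling ``fdb_redirect add $MAC $VF $TAP_IF`` prints
the two commands instead of running them.
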
On the destination hypervisor, a shared bridge 'br0' is created before migration
starts, and a VF from the destination PF is added to the bridge. Similarly, an
appropriate FDB entry is added.

The following script is executed on the destination hypervisor once migration
completes. It reattaches the VF to the VM and brings down the virtio-net
interface.

::

  # reattach-vf.sh
  #!/bin/bash

  bridge fdb del 52:54:00:00:12:53 dev ens36v0
  bridge fdb del 52:54:00:00:12:53 dev vmtap01 master
  virsh attach-device --config --live vm-01 vf.xml
  virsh domif-setlink vm-01 vmtap01 down
