===========
NFS LOCALIO
===========

Overview
========

The LOCALIO auxiliary RPC protocol allows the Linux NFS client and
server to reliably handshake to determine if they are on the same
host. Select "NFS client and server support for LOCALIO auxiliary
protocol" in menuconfig to enable CONFIG_NFS_LOCALIO in the kernel
config (both CONFIG_NFS_FS and CONFIG_NFSD must also be enabled).

Once an NFS client and server handshake as "local", the client will
bypass the network RPC protocol for read, write and commit operations.
Due to this XDR and RPC bypass, these operations run faster.

The LOCALIO auxiliary protocol's implementation, which uses the same
connection as NFS traffic, follows the pattern established by the NFS
ACL protocol extension.

The LOCALIO auxiliary protocol is needed to allow robust discovery of
clients local to their servers. In a private implementation that
preceded use of this LOCALIO protocol, a fragile sockaddr network
address based match against all local network interfaces was attempted.
But unlike the LOCALIO protocol, the sockaddr-based matching didn't
handle use of iptables or containers.

The robust handshake between local client and server is just the
beginning. The ultimate use case this locality makes possible is that
the client can open files and issue reads, writes and commits directly
to the server without having to go over the network. The requirement is
to perform these loopback NFS operations as efficiently as possible,
which is particularly useful for container use cases (e.g. Kubernetes)
where it is possible to run an IO job local to the server.

The performance advantage realized from LOCALIO's ability to bypass
using XDR and RPC for reads, writes and commits can be extreme, e.g.:

  fio for 20 secs with directio, qd of 8, 16 libaio threads:
  - With LOCALIO:
    4K read:    IOPS=979k,  BW=3825MiB/s (4011MB/s)(74.7GiB/20002msec)
    4K write:   IOPS=165k,  BW=646MiB/s (678MB/s)(12.6GiB/20002msec)
    128K read:  IOPS=402k,  BW=49.1GiB/s (52.7GB/s)(982GiB/20002msec)
    128K write: IOPS=11.5k, BW=1433MiB/s (1503MB/s)(28.0GiB/20004msec)

  - Without LOCALIO:
    4K read:    IOPS=79.2k, BW=309MiB/s (324MB/s)(6188MiB/20003msec)
    4K write:   IOPS=59.8k, BW=234MiB/s (245MB/s)(4671MiB/20002msec)
    128K read:  IOPS=33.9k, BW=4234MiB/s (4440MB/s)(82.7GiB/20004msec)
    128K write: IOPS=11.5k, BW=1434MiB/s (1504MB/s)(28.0GiB/20011msec)

  fio for 20 secs with directio, qd of 8, 1 libaio thread:
  - With LOCALIO:
    4K read:    IOPS=230k,  BW=898MiB/s (941MB/s)(17.5GiB/20001msec)
    4K write:   IOPS=22.6k, BW=88.3MiB/s (92.6MB/s)(1766MiB/20001msec)
    128K read:  IOPS=38.8k, BW=4855MiB/s (5091MB/s)(94.8GiB/20001msec)
    128K write: IOPS=11.4k, BW=1428MiB/s (1497MB/s)(27.9GiB/20001msec)

  - Without LOCALIO:
    4K read:    IOPS=77.1k, BW=301MiB/s (316MB/s)(6022MiB/20001msec)
    4K write:   IOPS=32.8k, BW=128MiB/s (135MB/s)(2566MiB/20001msec)
    128K read:  IOPS=24.4k, BW=3050MiB/s (3198MB/s)(59.6GiB/20001msec)
    128K write: IOPS=11.4k, BW=1430MiB/s (1500MB/s)(27.9GiB/20001msec)

FAQ
===

1. What are the use cases for LOCALIO?

   a. Workloads where the NFS client and server are on the same host
      realize improved IO performance. In particular, it is common when
      running containerised workloads for jobs to find themselves
      running on the same host as the knfsd server being used for
      storage.

2. What are the requirements for LOCALIO?

   a. Bypass use of the network RPC protocol as much as possible. This
      includes bypassing XDR and RPC for open, read, write and commit
      operations.
   b. Allow client and server to autonomously discover if they are
      running local to each other without making any assumptions about
      the local network topology.
   c. Support the use of containers by being compatible with relevant
      namespaces (e.g. network, user, mount).
   d. Support all versions of NFS.
      NFSv3 is of particular importance because it has wide enterprise
      usage and pNFS flexfiles makes use of it for the data path.

3. Why doesn't LOCALIO just compare IP addresses or hostnames when
   deciding if the NFS client and server are co-located on the same
   host?

   Since one of the main use cases is containerised workloads, we cannot
   assume that IP addresses will be shared between the client and
   server. This sets up a requirement for a handshake protocol that
   needs to go over the same connection as the NFS traffic in order to
   identify that the client and the server really are running on the
   same host. The handshake uses a secret that is sent over the wire,
   and can be verified by both parties by comparing with a value stored
   in shared kernel memory if they are truly co-located.

4. Does LOCALIO improve pNFS flexfiles?

   Yes, LOCALIO complements pNFS flexfiles by allowing it to take
   advantage of NFS client and server locality. Policy that initiates
   client IO as close as possible to the server where the data is stored
   naturally benefits from the data path optimization LOCALIO provides.

5. Why not develop a new pNFS layout to enable LOCALIO?

   A new pNFS layout could be developed, but doing so would put the
   onus on the server to somehow discover that the client is co-located
   when deciding to hand out the layout.
   There is value in a simpler approach (as provided by LOCALIO) that
   allows the NFS client to negotiate and leverage locality without
   requiring more elaborate modeling and discovery of such locality in a
   more centralized manner.

6. Why is having the client perform a server-side file OPEN, without
   using RPC, beneficial? Is the benefit pNFS specific?

   Avoiding the use of XDR and RPC for file opens is beneficial to
   performance regardless of whether pNFS is used. Especially when
   dealing with small files, it's best to avoid going over the wire
   whenever possible; otherwise it could reduce or even negate the
   benefits of avoiding the wire for doing the small file I/O itself.
   Given LOCALIO's requirements, the current approach of having the
   client perform a server-side file open, without using RPC, is ideal.
   If in the future requirements change then we can adapt accordingly.

7. Why is LOCALIO only supported with UNIX Authentication (AUTH_UNIX)?

   Strong authentication is usually tied to the connection itself. It
   works by establishing a context that is cached by the server, and
   that acts as the key for discovering the authorisation token, which
   can then be passed to rpc.mountd to complete the authentication
   process. On the other hand, in the case of AUTH_UNIX, the credential
   that was passed over the wire is used directly as the key in the
   upcall to rpc.mountd. This simplifies the authentication process, and
   so makes AUTH_UNIX easier to support.

8. How do export options that translate RPC user IDs behave for LOCALIO
   operations (e.g. root_squash, all_squash)?

   Export options that translate user IDs are managed by nfsd_setuser()
   which is called by nfsd_setuser_and_check_port() which is called by
   __fh_verify(). So they get handled exactly the same way for LOCALIO
   as they do for non-LOCALIO.

9. How does LOCALIO make certain that object lifetimes are managed
   properly given NFSD and NFS operate in different contexts?

   See the detailed "NFS Client and Server Interlock" section below.

RPC
===

The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL"
RPC method that allows the Linux NFS client to verify the local Linux
NFS server can see the nonce (single-use UUID) the client generated and
made available in nfs_common. This protocol isn't part of an IETF
standard, nor does it need to be, considering it is a Linux-to-Linux
auxiliary RPC protocol that amounts to an implementation detail.

The UUID_IS_LOCAL method encodes the client generated uuid_t in terms of
the fixed UUID_SIZE (16 bytes). The fixed size opaque encode and decode
XDR methods are used instead of the less efficient variable sized
methods.
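
To illustrate the fixed-size versus variable-size distinction, the
following user-space C sketch contrasts the two XDR opaque encodings for
a 16-byte UUID. It is not the kernel's implementation: the helper names
(xdr_pad4, encode_opaque_fixed, encode_opaque_variable and
decode_uuid_fixed) are hypothetical and exist only for this example. The
wire layout follows the generic XDR rules (4-byte alignment, and a
big-endian length word in front of variable-size opaque data).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_SIZE 16

/* Round a byte count up to the 4-byte XDR alignment unit. */
size_t xdr_pad4(size_t len)
{
    return (len + 3u) & ~(size_t)3u;
}

/*
 * Fixed-size opaque: both sides already know the length, so only the
 * zero-padded bytes are written. Returns the number of bytes written.
 */
size_t encode_opaque_fixed(uint8_t *out, const uint8_t *data, size_t len)
{
    size_t padded = xdr_pad4(len);

    assert(out != NULL && data != NULL);
    memcpy(out, data, len);
    memset(out + len, 0, padded - len);
    return padded;
}

/*
 * Variable-size opaque: a 4-byte big-endian length word precedes the
 * zero-padded bytes. Returns the number of bytes written.
 */
size_t encode_opaque_variable(uint8_t *out, const uint8_t *data, uint32_t len)
{
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
    return 4 + encode_opaque_fixed(out + 4, data, len);
}

/*
 * Decode a fixed-size UUID. There is no length word to parse; the only
 * check is that enough bytes are available. Returns 0 or -1.
 */
int decode_uuid_fixed(const uint8_t *in, size_t avail, uint8_t uuid[UUID_SIZE])
{
    if (avail < UUID_SIZE)
        return -1;
    memcpy(uuid, in, UUID_SIZE);
    return 0;
}
```

In this sketch a 16-byte UUID occupies 16 bytes in the fixed-size form
and 20 bytes in the variable-size form, and the fixed-size decoder has no
length field to read or validate.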

The RPC program number for the NFS_LOCALIO_PROGRAM is 400122 (as assigned
by IANA, see https://www.iana.org/assignments/rpc-program-numbers/ ):

  Linux Kernel Organization       400122  nfslocalio

The LOCALIO protocol spec in rpcgen syntax is::

  /* raw RFC 9562 UUID */
  #define UUID_SIZE 16
  typedef u8 uuid_t<UUID_SIZE>;

  program NFS_LOCALIO_PROGRAM {
      version LOCALIO_V1 {
          void
              NULL(void) = 0;

          void
              UUID_IS_LOCAL(uuid_t) = 1;
      } = 1;
  } = 400122;

LOCALIO uses the same transport connection as NFS traffic. As such,
LOCALIO is not registered with rpcbind.

NFS Common and Client/Server Handshake
======================================

fs/nfs_common/nfslocalio.c provides interfaces that enable an NFS client
to generate a nonce (single-use UUID) and associated short-lived
nfs_uuid_t struct, and to register it with nfs_common for subsequent
lookup and verification by the NFS server. If it matches, the NFS server
populates members in the nfs_uuid_t struct. The NFS client then uses
nfs_common to transfer the nfs_uuid_t from nfs_common's uuids_list to
the nn->nfsd_serv clients_list. See:
fs/nfs/localio.c:nfs_local_probe()

nfs_common's nfs_uuids list is the basis for LOCALIO enablement. As such
it has members that point to nfsd memory for direct use by the client
(e.g. 'net' is the server's network namespace, through it the client can
access nn->nfsd_serv with proper rcu read access). It is this client
and server synchronization that enables advanced usage and lifetime of
objects to span from the host kernel's nfsd to per-container knfsd
instances that are connected to NFS clients running on the same local
host.

NFS Client and Server Interlock
===============================

LOCALIO provides the nfs_uuid_t object and associated interfaces to
allow proper network namespace (net-ns) and NFSD object refcounting:

    We don't want to keep a long-term counted reference on each NFSD's
    net-ns in the client because that prevents a server container from
    completely shutting down.

    So we avoid taking a reference at all and rely on the per-cpu
    reference to the server (detailed below) being sufficient to keep
    the net-ns active. This involves allowing the NFSD's net-ns exit
    code to iterate all active clients and clear their ->net pointers
    (which are needed to find the per-cpu-refcount for the nfsd_serv).

    Details:

     - Embed nfs_uuid_t in nfs_client. nfs_uuid_t provides a list_head
       that can be used to find the client. It does add the 16-byte
       uuid_t to nfs_client so it is bigger than needed (given that
       uuid_t is only used during the initial NFS client and server
       LOCALIO handshake to determine if they are local to each other).
       If that is really a problem we can find a fix.

     - When the nfs server confirms that the uuid_t is local, it moves
       the nfs_uuid_t onto a per-net-ns list in NFSD's nfsd_net.

     - When each server's net-ns is shutting down, a "pre_exit" handler
       clears ->net in all of these nfs_uuid_t. There is a
       synchronize_rcu() call between pre_exit() handlers and exit()
       handlers so any caller that sees nfs_uuid_t ->net as not NULL can
       safely manage the per-cpu-refcount for nfsd_serv.

     - The client's nfs_uuid_t is passed to nfsd_open_local_fh() so it
       can safely dereference ->net in a private rcu_read_lock() section
       to allow safe access to the associated nfsd_net and nfsd_serv.

So LOCALIO required the introduction and use of NFSD's percpu_ref to
interlock nfsd_destroy_serv() and nfsd_open_local_fh(), to ensure each
nn->nfsd_serv is not destroyed while in use by nfsd_open_local_fh().
This warrants a more detailed explanation:

    nfsd_open_local_fh() uses nfsd_serv_try_get() before opening its
    nfsd_file handle and then the caller (NFS client) must drop the
    reference for the nfsd_file and associated nn->nfsd_serv using
    nfs_file_put_local() once it has completed its IO.

    This interlock relies heavily on nfsd_open_local_fh() being able to
    deal safely with the possibility that the NFSD's net-ns (and
    nfsd_net by association) may have been destroyed by
    nfsd_destroy_serv() via nfsd_shutdown_net(), which is only possible
    given the nfs_uuid_t ->net pointer management detailed above.

All told, this elaborate interlock of the NFS client and server has been
verified to fix an easy to hit crash that would occur if an NFSD
instance running in a container, with a LOCALIO client mounted, is
shut down. Upon restart of the container and associated NFSD, the client
would go on to crash with a NULL pointer dereference caused by the
LOCALIO client attempting to nfsd_open_local_fh(), using nn->nfsd_serv,
without having a proper reference on nn->nfsd_serv.
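
The ordering that this interlock enforces can be modeled in a few lines
of user-space C. This is a single-threaded sketch under stated
assumptions, not kernel code: a plain counter and flag stand in for the
percpu_ref, an ordinary pointer read stands in for the rcu_read_lock()
protected dereference of ->net, and every model_* name is hypothetical.
It only shows the sequence of steps: clear ->net, refuse new references,
and let teardown finish once outstanding references are dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for nn->nfsd_serv together with its percpu_ref. */
struct model_serv {
    long refs;   /* references held by in-flight local opens */
    bool dying;  /* set once shutdown begins; blocks new references */
};

/* Hypothetical stand-in for the server's network namespace. */
struct model_net {
    struct model_serv serv;
};

/* Hypothetical stand-in for nfs_uuid_t: the client's view of the server. */
struct model_uuid {
    struct model_net *net;  /* cleared by the server's pre_exit step */
};

/* Take a reference unless shutdown has already started. */
bool model_serv_try_get(struct model_serv *serv)
{
    if (serv->dying)
        return false;
    serv->refs++;
    return true;
}

/* Drop a reference taken by model_serv_try_get(). */
void model_serv_put(struct model_serv *serv)
{
    assert(serv->refs > 0);
    serv->refs--;
}

/*
 * Client-side open: fail with -ENXIO if the server detached ->net or
 * no longer hands out references; otherwise return the held server.
 */
int model_open_local(struct model_uuid *uuid, struct model_serv **held)
{
    struct model_net *net = uuid->net;  /* the kernel reads this under RCU */

    if (net == NULL || !model_serv_try_get(&net->serv))
        return -ENXIO;
    *held = &net->serv;
    return 0;
}

/* Server shutdown, step 1: detach the client so it cannot find the server. */
void model_pre_exit(struct model_uuid *uuid)
{
    uuid->net = NULL;
}

/*
 * Server shutdown, step 2: refuse new references. Returns true when no
 * references remain, meaning teardown may complete.
 */
bool model_destroy_serv(struct model_serv *serv)
{
    serv->dying = true;
    return serv->refs == 0;
}
```

In this model an open that started before shutdown keeps the server
alive until model_serv_put() is called, while an open attempted after
model_pre_exit() fails with -ENXIO instead of touching freed state.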

NFS Client issues IO instead of Server
======================================

Because LOCALIO is focused on protocol bypass to achieve improved IO
performance, alternatives to the traditional NFS wire protocol (SUNRPC
with XDR) must be provided to access the backing filesystem.

See fs/nfs/localio.c:nfs_local_open_fh() and
fs/nfsd/localio.c:nfsd_open_local_fh() for the interface that makes
focused use of select nfs server objects to allow a client local to a
server to open a file pointer without needing to go over the network.

The client's fs/nfs/localio.c:nfs_local_open_fh() will call into the
server's fs/nfsd/localio.c:nfsd_open_local_fh() and carefully access
both the associated nfsd network namespace and nn->nfsd_serv in terms of
RCU. If nfsd_open_local_fh() finds that the client no longer sees valid
nfsd objects (be it struct net or nn->nfsd_serv) it returns -ENXIO
to nfs_local_open_fh() and the client will try to reestablish the
LOCALIO resources needed by calling nfs_local_probe() again. This
recovery is needed if/when an nfsd instance running in a container were
to reboot while a LOCALIO client is connected to it.

Once the client has an open nfsd_file pointer it will issue reads,
writes and commits directly to the underlying local filesystem (normally
done by the nfs server). As such, for these operations, the NFS client
is issuing IO to the underlying local filesystem that it is sharing with
the NFS server. See: fs/nfs/localio.c:nfs_local_doio() and
fs/nfs/localio.c:nfs_local_commit().

Security
========

LOCALIO is only supported when UNIX-style authentication (AUTH_UNIX, aka
AUTH_SYS) is used.

Care is taken to ensure the same NFS security mechanisms are used
(authentication, etc) regardless of whether LOCALIO or regular NFS
access is used. The auth_domain established as part of the traditional
NFS client access to the NFS server is also used for LOCALIO.

Relative to containers, LOCALIO gives the client access to the network
namespace the server has. This is required to allow the client to access
the server's per-namespace nfsd_net struct. With traditional NFS, the
client is afforded this same level of access (albeit in terms of the NFS
protocol via SUNRPC). No other namespaces (user, mount, etc) have been
altered or purposely extended from the server to the client.

Testing
=======

The LOCALIO auxiliary protocol and associated NFS LOCALIO read, write
and commit access have proven stable against various test scenarios:

- Client and server both on the same host.

- All permutations of client and server support enablement for both
  local and remote client and server.

- Testing against NFS storage products that don't support the LOCALIO
  protocol was also performed.

- Client on host, server within a container (for both v3 and v4.2).
  The container testing was in terms of podman managed containers and
  includes a successful container stop/restart scenario.

- Formalizing these test scenarios in terms of existing test
  infrastructure is on-going. Initial regular coverage is provided in
  terms of ktest running xfstests against a LOCALIO-enabled NFS loopback
  mount configuration, and includes lockdep and KASAN coverage, see:
  https://evilpiepirate.org/~testdashboard/ci?user=snitzer&branch=snitm-nfs-next
  https://github.com/koverstreet/ktest

- Various kdevops testing (in terms of "Chuck's BuildBot") has been
  performed to regularly verify the LOCALIO changes haven't caused any
  regressions to non-LOCALIO NFS use cases.

- All of Hammerspace's various sanity tests pass with LOCALIO enabled
  (this includes numerous pNFS and flexfiles tests).