==========================
Reference counting in pnfs
==========================

There are several inter-related caches.  We have layouts which can
reference multiple devices, each of which can reference multiple data
servers.  Each data server can be referenced by multiple devices.  Each
device can be referenced by multiple layouts.  To keep all of this
straight, we need to reference count.


struct pnfs_layout_hdr
======================

The on-the-wire command LAYOUTGET corresponds to struct
pnfs_layout_segment, usually referred to by the variable name lseg.
Each nfs_inode may hold a pointer to a cache of these layout
segments in nfsi->layout, of type struct pnfs_layout_hdr.

We reference the header for the inode pointing to it, across each
outstanding RPC call that references it (LAYOUTGET, LAYOUTRETURN,
LAYOUTCOMMIT), and for each lseg held within.

Each header is also (when non-empty) put on a list associated with
struct nfs_client (cl_layouts).  Being put on this list does not bump
the reference count, as the layout is kept around by the lseg that
keeps it in the list.

deviceid_cache
==============

lsegs reference device ids, which are resolved per nfs_client and
layout driver type.  The device ids are held in an RCU cache (struct
nfs4_deviceid_cache).  The cache itself is referenced across each
mount.  The entries (struct nfs4_deviceid) themselves are held across
the lifetime of each lseg referencing them.

RCU is used because the deviceid is basically a write once, read many
data structure.  The hlist size of 32 buckets needs better
justification, but seems reasonable given that we can have multiple
deviceids per filesystem, and multiple filesystems per nfs_client.

The hash code is copied from the nfsd code base.  A discussion of
hashing and variations of this algorithm can be found `here
<http://groups.google.com/group/comp.lang.c/browse_thread/thread/9522965e2b8d3809>`_.

data server cache
=================

File driver devices refer to data servers, which are kept in a module
level cache.  A data server's reference is held over the lifetime of
the deviceid pointing to it.

lseg
====

An lseg maintains an extra reference corresponding to the NFS_LSEG_VALID
bit, which holds it in the pnfs_layout_hdr's list.  When the final lseg
is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED
bit is set, preventing any new lsegs from being added.

layout drivers
==============

pNFS uses what are called layout drivers.  The standard defines 4 basic
layout types: "files", "objects", "blocks", and "flexfiles".  For each
of these types there is a layout driver with a common function-vector
table, whose entries are called by the nfs-client pnfs-core to
implement the different layout types.

- Files-layout-driver code is in: fs/nfs/filelayout/.. directory
- Blocks-layout-driver code is in: fs/nfs/blocklayout/.. directory
- Flexfiles-layout-driver code is in: fs/nfs/flexfilelayout/.. directory

blocks-layout setup
===================

TODO: Document the setup needs of the blocks layout driver
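
example: header reference counting
==================================

The rules given in the struct pnfs_layout_hdr section can be sketched as
a small user-space model.  This is only an illustration under stated
assumptions, not the kernel implementation: struct hdr_model and every
hdr_* function below are hypothetical names, and a plain int stands in
for whatever counter type the kernel uses.  The model takes one
reference for the inode, one per outstanding RPC, and one per lseg,
while membership of the client list takes none.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; NOT the kernel's struct pnfs_layout_hdr. */
struct hdr_model {
	int refs;            /* total references held on the header */
	int nr_lsegs;        /* lsegs currently cached in the header */
	bool on_client_list; /* models cl_layouts membership (no reference) */
};

static void hdr_get(struct hdr_model *h)
{
	h->refs++;
}

/* Drop one reference; returns true if it was the last one. */
static bool hdr_put(struct hdr_model *h)
{
	assert(h->refs > 0);
	return --h->refs == 0;
}

/* The inode starts pointing at the header: one reference. */
static void hdr_attach_inode(struct hdr_model *h)
{
	h->refs = 0;
	h->nr_lsegs = 0;
	h->on_client_list = false;
	hdr_get(h);
}

/* Each outstanding RPC (e.g. LAYOUTGET) pins the header while in flight. */
static void hdr_rpc_begin(struct hdr_model *h)
{
	hdr_get(h);
}

static bool hdr_rpc_end(struct hdr_model *h)
{
	return hdr_put(h);
}

/* Each cached lseg holds one reference; a non-empty header is listed. */
static void hdr_add_lseg(struct hdr_model *h)
{
	hdr_get(h);
	h->nr_lsegs++;
	h->on_client_list = true; /* listing itself takes no reference */
}

static bool hdr_remove_lseg(struct hdr_model *h)
{
	assert(h->nr_lsegs > 0);
	if (--h->nr_lsegs == 0)
		h->on_client_list = false; /* empty header leaves the list */
	return hdr_put(h);
}

/* Walk one lifecycle and return the final reference count. */
static int demo_lifecycle(void)
{
	struct hdr_model h;

	hdr_attach_inode(&h); /* inode reference */
	hdr_rpc_begin(&h);    /* LAYOUTGET in flight */
	hdr_add_lseg(&h);     /* reply caches one lseg */
	hdr_rpc_end(&h);      /* RPC completes */
	hdr_remove_lseg(&h);  /* lseg dropped, header unlisted */
	hdr_put(&h);          /* inode lets go */
	return h.refs;
}
```

In this model the list flag changes only when the first lseg is added or
the last one is removed, which mirrors the statement that the lseg,
rather than the list, is what keeps a listed header alive.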