===============
Persistent data
===============

Introduction
============

The more-sophisticated device-mapper targets require complex metadata
that is managed in kernel.  In late 2010 we were seeing that various
different targets were rolling their own data structures, for example:

- Mikulas Patocka's multisnap implementation
- Heinz Mauelshagen's thin provisioning target
- Another btree-based caching target posted to dm-devel
- Another multi-snapshot target based on a design of Daniel Phillips

Maintaining these data structures takes a lot of work, so if possible
we'd like to reduce the number.

The persistent-data library is an attempt to provide a re-usable
framework for people who want to store metadata in device-mapper
targets.  It's currently used by the thin-provisioning target and an
upcoming hierarchical storage target.

Overview
========

The main documentation is in the header files which can all be found
under drivers/md/persistent-data.

The block manager
-----------------

dm-block-manager.[hc]

This provides access to the data on disk in fixed-size blocks.  There
is a read/write locking interface to prevent concurrent accesses, and
to keep data that is being used in the cache.

Clients of persistent-data are unlikely to use this directly.

The transaction manager
-----------------------

dm-transaction-manager.[hc]

This restricts access to blocks and enforces copy-on-write semantics.
The only way you can get hold of a writable block through the
transaction manager is by shadowing an existing block (i.e. doing
copy-on-write) or allocating a fresh one.  Shadowing is elided within
the same transaction so performance is reasonable.  The commit method
ensures that all data is flushed before it writes the superblock.
On power failure your metadata will be as it was when last committed.
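
The shadowing rule above can be illustrated with a small userspace
model.  This is only a sketch under stated assumptions: every name in
it (``toy_tm``, ``toy_tm_new_block``, ``toy_tm_shadow``,
``toy_tm_commit``) is hypothetical and is not the kernel interface
declared in dm-transaction-manager.h.  It also leaves out reference
counting, locking and the on-disk superblock, so it shows only when a
copy is made and when it is elided.

```c
/* Toy userspace model of copy-on-write shadowing.
 * All names are hypothetical; this is NOT the kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_BLOCKS  16
#define TOY_BLOCK_SIZE 8

struct toy_tm {
	unsigned char data[TOY_NR_BLOCKS][TOY_BLOCK_SIZE];
	bool allocated[TOY_NR_BLOCKS]; /* block is in use */
	bool fresh[TOY_NR_BLOCKS];     /* created in the current transaction */
};

void toy_tm_init(struct toy_tm *tm)
{
	memset(tm, 0, sizeof(*tm));
}

/* Allocate a zeroed block; returns its index, or -1 if none is free. */
int toy_tm_new_block(struct toy_tm *tm)
{
	for (int b = 0; b < TOY_NR_BLOCKS; b++) {
		if (!tm->allocated[b]) {
			tm->allocated[b] = true;
			tm->fresh[b] = true;
			memset(tm->data[b], 0, TOY_BLOCK_SIZE);
			return b;
		}
	}
	return -1;
}

/* Return the index of a writable block holding the contents of orig.
 * A block created in the current transaction is returned as-is (the
 * shadow is elided); a committed block is copied to a new location.
 * Unlike a real implementation, the old block is never released. */
int toy_tm_shadow(struct toy_tm *tm, int orig)
{
	int copy;

	if (orig < 0 || orig >= TOY_NR_BLOCKS || !tm->allocated[orig])
		return -1;
	if (tm->fresh[orig])
		return orig;

	copy = toy_tm_new_block(tm);
	if (copy < 0)
		return -1;
	memcpy(tm->data[copy], tm->data[orig], TOY_BLOCK_SIZE);
	return copy;
}

/* End the transaction: every block now counts as committed, so the
 * next write to it has to go through a new shadow. */
void toy_tm_commit(struct toy_tm *tm)
{
	memset(tm->fresh, 0, sizeof(tm->fresh));
}
```

In this model, shadowing a block that was allocated in the current
transaction returns the same index, while shadowing it after
``toy_tm_commit()`` returns a different index and leaves the committed
copy untouched.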

The Space Maps
--------------

dm-space-map.h
dm-space-map-metadata.[hc]
dm-space-map-disk.[hc]

On-disk data structures that keep track of reference counts of blocks.
They also act as the allocator of new blocks.  There are currently two
implementations: a simpler one for managing blocks on a different
device (e.g. thinly-provisioned data blocks); and one for managing
the metadata space.  The latter is complicated by the need to store
its own data within the space it's managing.

The data structures
-------------------

dm-btree.[hc]
dm-btree-remove.c
dm-btree-spine.c
dm-btree-internal.h

Currently there is only one data structure, a hierarchical btree.
There are plans to add more.  For example, something with an
array-like interface would see a lot of use.

The btree is 'hierarchical' in that you can define it to be composed
of nested btrees, and take multiple keys.  For example, the
thin-provisioning target uses a btree with two levels of nesting.
The first maps a device id to a mapping tree, and that in turn maps a
virtual block to a physical block.

Values stored in the btrees can have arbitrary size.  Keys are always
64 bits, although nesting allows you to use multiple keys.
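
The two-level keying used by the thin-provisioning example can be
sketched in userspace as follows.  This is an illustration only: it
uses sorted arrays and binary search in place of btrees, and all names
(``toy_device``, ``toy_mapping``, ``toy_lookup``) are hypothetical and
not part of dm-btree.h.  It shows how an array of two 64-bit keys
selects first a per-device mapping and then a physical block within it.

```c
/* Toy model of a two-level nested lookup.  Sorted arrays stand in
 * for btrees; all names are hypothetical, NOT the kernel API. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_mapping {
	uint64_t virt_block;
	uint64_t phys_block;
};

struct toy_device {
	uint64_t dev_id;
	const struct toy_mapping *mappings; /* sorted by virt_block */
	size_t nr_mappings;
};

/* Bottom level: binary search for a virtual block. */
const struct toy_mapping *toy_find_mapping(const struct toy_mapping *m,
					   size_t n, uint64_t virt)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (m[mid].virt_block == virt)
			return &m[mid];
		if (m[mid].virt_block < virt)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Top level: binary search for a device id (array sorted by dev_id). */
const struct toy_device *toy_find_device(const struct toy_device *d,
					 size_t n, uint64_t dev_id)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (d[mid].dev_id == dev_id)
			return &d[mid];
		if (d[mid].dev_id < dev_id)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* keys[0] is the device id, keys[1] the virtual block.
 * Returns 0 and sets *phys on success, -1 if either key is missing. */
int toy_lookup(const struct toy_device *devs, size_t nr_devs,
	       const uint64_t keys[2], uint64_t *phys)
{
	const struct toy_device *dev = toy_find_device(devs, nr_devs, keys[0]);
	const struct toy_mapping *m;

	if (!dev)
		return -1;
	m = toy_find_mapping(dev->mappings, dev->nr_mappings, keys[1]);
	if (!m)
		return -1;
	*phys = m->phys_block;
	return 0;
}
```

Because each level has its own key space, the same virtual block
number can map to different physical blocks on different devices.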