Differences between /Documentation/admin-guide/device-mapper/log-writes.rst (Version linux-6.12-rc7) and /Documentation/admin-guide/device-mapper/log-writes.rst (Version linux-6.9.12)


=============
dm-log-writes
=============

This target takes 2 devices, one to pass all IO to normally, and one to log all
of the write operations to.  This is intended for file system developers wishing
to verify the integrity of metadata or data as the file system is written to.
There is a log_write_entry written for every WRITE request and the target is
able to take arbitrary data from userspace to insert into the log.  The data
that is in the WRITE requests is copied into the log to make the replay happen
exactly as it happened originally.

Log Ordering
============

We log things in order of completion once we are sure the write is no longer in
cache.  This means that normal WRITE requests are not actually logged until the
next REQ_PREFLUSH request.  This is to make it easier for userspace to replay
the log in a way that correlates to what is on disk and not what is in cache,
to make it easier to detect improper waiting/flushing.

This works by attaching all WRITE requests to a list once the write completes.
Once we see a REQ_PREFLUSH request we splice this list onto the request and once
the FLUSH request completes we log all of the WRITEs and then the FLUSH.  Only
completed WRITEs, at the time the REQ_PREFLUSH is issued, are added in order to
simulate the worst case scenario with regard to power failures.  Consider the
following example (W means write, C means complete):

        W1,W2,W3,C3,C2,Wflush,C1,Cflush

The log would show the following:

        W3,W2,flush,W1....

Again, this is to simulate what is actually on disk.  This allows us to detect
cases where a power failure at a particular point in time would create an
inconsistent file system.
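
The ordering rule can be illustrated with a small shell model.  This is only a
sketch under stated assumptions: ``simulate_log`` is a hypothetical helper, not
part of dm-log-writes, and it tracks event names rather than real requests.  It
takes the example sequence above and prints which entries have reached the log
and which are still waiting for the next flush.

```shell
# Toy model of the ordering rule described above; not the kernel code.
# Events: Wn = write n issued, Cn = write n completed,
#         Wflush / Cflush = flush issued / flush completed.
simulate_log() {
    completed=""   # writes completed since the last flush was issued
    attached=""    # writes spliced onto the in-flight flush
    log=""         # entries in the order they reach the log
    for ev in "$@"; do
        case "$ev" in
            Wflush) attached="$completed"; completed="" ;;
            Cflush) log="$log$attached flush"; attached="" ;;
            C*)     completed="$completed W${ev#C}" ;;
            W*)     ;;  # issuing a write logs nothing by itself
        esac
    done
    echo "logged:$log"
    echo "pending:$completed"
}

simulate_log W1 W2 W3 C3 C2 Wflush C1 Cflush
# prints:
#   logged: W3 W2 flush
#   pending: W1
```

W1 completed after the flush was issued, so in this model it stays pending and
would only be logged with a later flush, matching the ``W3,W2,flush,W1....``
order shown above.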

Any REQ_FUA requests bypass this flushing mechanism and are logged as soon as
they complete as those requests will obviously bypass the device cache.

Any REQ_OP_DISCARD requests are treated like WRITE requests.  Otherwise we would
have all the DISCARD requests, and then the WRITE requests and then the FLUSH
request.  Consider the following example:

        WRITE block 1, DISCARD block 1, FLUSH

If we logged DISCARD when it completed, the replay would look like this:

        DISCARD 1, WRITE 1, FLUSH

which isn't quite what happened and wouldn't be caught during the log replay.
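
To see why the order matters, the two sequences can be replayed against a
single block in a toy model.  This is a sketch, assuming a hypothetical
``replay_block`` helper that only tracks whether block 1 ends up holding data
or discarded; it does not touch any device.

```shell
# Replay a sequence of operations on one block and print its final state.
# Toy model only: FLUSH and unknown operations do not change the state.
replay_block() {
    state="empty"
    for op in "$@"; do
        case "$op" in
            WRITE)   state="data" ;;
            DISCARD) state="discarded" ;;
            *)       ;;
        esac
    done
    echo "$state"
}

replay_block WRITE DISCARD FLUSH   # order that was issued; prints: discarded
replay_block DISCARD WRITE FLUSH   # order if DISCARD were logged early; prints: data
```

The reordered replay leaves data in a block that the file system had
discarded, so the replayed device would not match what actually happened.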

Target interface
================

i) Constructor

   log-writes <dev_path> <log_dev_path>

   ============= ==============================================
   dev_path      Device that all of the IO will go to normally.
   log_dev_path  Device where the log entries are written to.
   ============= ==============================================
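
As a sketch of how the constructor line fits into a device-mapper table, the
helper below builds the full table string.  ``log_writes_table`` is a
hypothetical name, the size is given in 512-byte sectors, and the sector count
and device paths in the call are example values only.

```shell
# Build a device-mapper table line for a log-writes target.
#   $1 = target length in 512-byte sectors
#   $2 = dev_path      (device that receives the IO normally)
#   $3 = log_dev_path  (device that receives the log entries)
log_writes_table() {
    printf '0 %s log-writes %s %s\n' "$1" "$2" "$3"
}

log_writes_table 2097152 /dev/sdb /dev/sdc
# prints: 0 2097152 log-writes /dev/sdb /dev/sdc
```

On a real system the length would typically come from
``blockdev --getsz <dev_path>`` and the result would be passed to
``dmsetup create log --table``, which requires root.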

ii) Status

    <#logged entries> <highest allocated sector>

    =========================== ========================
    #logged entries             Number of logged entries
    highest allocated sector    Highest allocated sector
    =========================== ========================
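
A script can pick the two status fields out of a status line along these
lines.  This is a sketch that assumes the line has the shape printed by
``dmsetup status <name>``, that is, start sector, length and target name
followed by the two fields above.  ``parse_log_writes_status`` is a
hypothetical helper and the numbers in the sample line are made up.

```shell
# Extract the log-writes status fields from one status line.
# Assumed input shape:
#   <start> <length> log-writes <#logged entries> <highest allocated sector>
parse_log_writes_status() {
    # Split the line into positional parameters on whitespace.
    set -- $1
    # Reject lines that belong to some other target type.
    [ "$3" = "log-writes" ] || return 1
    echo "entries=$4 highest_sector=$5"
}

parse_log_writes_status "0 2097152 log-writes 42 8192"
# prints: entries=42 highest_sector=8192
```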

iii) Messages

    mark <description>

        You can use a dmsetup message to set an arbitrary mark in a log.
        For example, say you want to fsck a file system after every
        write, but first you need to replay up to the mkfs to make sure
        we're fsck'ing something reasonable.  You would do something like
        this::

          mkfs.btrfs -f /dev/mapper/log
          dmsetup message log 0 mark mkfs
          <run test>

        This would allow you to replay the log up to the mkfs mark and
        then replay from that point on doing the fsck check in the
        interval that you want.

        Every log has a mark at the end labeled "dm-log-writes-end".

Userspace component
===================

There is a userspace tool that will replay the log for you in various ways.
It can be found here: https://github.com/josefbacik/log-writes

Example usage
=============

Say you want to test fsync on your file system.  You would do something like
this::

  TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"
  dmsetup create log --table "$TABLE"
  mkfs.btrfs -f /dev/mapper/log
  dmsetup message log 0 mark mkfs

  mount /dev/mapper/log /mnt/btrfs-test
  <some test that does fsync at the end>
  dmsetup message log 0 mark fsync
  md5sum /mnt/btrfs-test/foo
  umount /mnt/btrfs-test

  dmsetup remove log
  replay-log --log /dev/sdc --replay /dev/sdb --end-mark fsync
  mount /dev/sdb /mnt/btrfs-test
  md5sum /mnt/btrfs-test/foo
  <verify md5sums are correct>

Another option is to do a complicated file system operation and verify the file
system is consistent during the entire operation.  You could do this with::

  TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"
  dmsetup create log --table "$TABLE"
  mkfs.btrfs -f /dev/mapper/log
  dmsetup message log 0 mark mkfs

  mount /dev/mapper/log /mnt/btrfs-test
  <fsstress to dirty the fs>
  btrfs filesystem balance /mnt/btrfs-test
  umount /mnt/btrfs-test
  dmsetup remove log

  replay-log --log /dev/sdc --replay /dev/sdb --end-mark mkfs
  btrfsck /dev/sdb
  replay-log --log /dev/sdc --replay /dev/sdb --start-mark mkfs \
        --fsck "btrfsck /dev/sdb" --check fua

This will replay the log until it sees a FUA request, run the fsck command and,
if the fsck passes, replay to the next FUA, until the log is completed or
the fsck command exits abnormally.
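
The replay-and-check loop described above can be pictured with a toy model.
This is a sketch of the control flow only, not how replay-log is implemented:
``replay_with_checks``, ``always_ok`` and ``fail_at_4`` are hypothetical names,
log entries are plain words, and the check command stands in for the fsck.

```shell
# Walk a list of log entries, running a check after every "fua" entry.
#   $1     = name of the check command (called with the entry number)
#   $2...  = log entries; "fua" marks a FUA request
replay_with_checks() {
    check_cmd="$1"; shift
    n=0
    for entry in "$@"; do
        n=$((n + 1))
        # A real replay would write the entry to the replay device here.
        if [ "$entry" = "fua" ]; then
            "$check_cmd" "$n" || { echo "check failed after entry $n"; return 1; }
        fi
    done
    echo "replayed $n entries"
}

always_ok() { return 0; }          # stand-in for an fsck that always passes
fail_at_4() { [ "$1" -ne 4 ]; }    # stand-in for an fsck that fails at entry 4

replay_with_checks always_ok w w fua w fua        # prints: replayed 5 entries
replay_with_checks fail_at_4 w w fua fua || true  # prints: check failed after entry 4
```

Stopping at the first failed check identifies the earliest FUA boundary at
which the replayed device was inconsistent.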
                                                      
