.. SPDX-License-Identifier: GPL-2.0

Journal (jbd2)
--------------

The ext4 filesystem employs a journal, first introduced in ext3, to protect
the filesystem against metadata inconsistencies in the case of a system crash.
Up to 10,240,000 file system blocks (see mke2fs(8) for more details on journal
size limits) can be reserved inside the filesystem as a place to land
“important” data writes on disk as quickly as possible. Once the important
data transaction is fully written to the disk and flushed from the disk write
cache, a record of the data being committed is also written to the journal. At
some later point in time, the journal code writes the transactions to their
final locations on disk (this could involve a lot of seeking or a lot of small
read-write-erases) before erasing the commit record. Should the system
crash during the second slow write, the journal can be replayed all the
way to the latest commit record, guaranteeing the atomicity of whatever
gets written through the journal to the disk. The effect of this is to
guarantee that the filesystem does not become stuck midway through a
metadata update.

For performance reasons, ext4 by default only writes filesystem metadata
through the journal. This means that file data blocks are *not*
guaranteed to be in any consistent state after a crash. If this default
guarantee level (``data=ordered``) is not satisfactory, there is a mount
option to control journal behavior. If ``data=journal``, all data and
metadata are written to disk through the journal. This is slower but
safest. If ``data=writeback``, dirty data blocks are not flushed to the
disk before the metadata are written to disk through the journal.

In ``data=ordered`` mode, ext4 also supports fast commits, which
help reduce commit latency significantly. The default ``data=ordered``
mode works by logging metadata blocks to the journal. In fast commit
mode, ext4 only stores the minimal delta needed to recreate the
affected metadata in fast commit space that is shared with JBD2.
Once the fast commit area fills up, or if a fast commit is not possible,
or if the JBD2 commit timer goes off, ext4 performs a traditional full
commit. A full commit invalidates all the fast commits that happened before
it and thus empties the fast commit area for further fast
commits. This feature needs to be enabled at mkfs time.

The journal inode is typically inode 8. The first 68 bytes of the
journal inode are replicated in the ext4 superblock. The journal itself
is a normal (but hidden) file within the filesystem. The file usually
consumes an entire block group, though mke2fs tries to put it in the
middle of the disk.

All fields in jbd2 are written to disk in big-endian order. This is the
opposite of ext4.

NOTE: Both ext4 and ocfs2 use jbd2.

The maximum size of a journal embedded in an ext4 filesystem is 2^32
blocks. jbd2 itself does not seem to care.

Layout
~~~~~~

Generally speaking, the journal has this format:

.. list-table::
   :widths: 16 48 16
   :header-rows: 1

   * - Superblock
     - descriptor_block (data_blocks or revocation_block) [more data or
       revocations] commit_block
     - [more transactions...]
   * -
     - One transaction
     -

Notice that a transaction begins with either a descriptor and some data,
or a block revocation list. A finished transaction always ends with a
commit. If there is no commit record (or the checksums don't match), the
transaction will be discarded during replay.

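The commit rule above can be illustrated with a small Python sketch. It is
a deliberate simplification, not the kernel's recovery code: it assumes the
block headers have already been read into ``(h_blocktype, h_sequence)``
pairs in log order, it ignores checksums and data blocks, and the function
name ``committed_transactions`` is hypothetical.

```python
COMMIT = 2  # h_blocktype value of a block commit record


def committed_transactions(headers, first_sequence):
    """Return the IDs of transactions that end with a commit record.

    headers: iterable of (h_blocktype, h_sequence) pairs, in log order.
    first_sequence: the first transaction ID expected in the log.
    """
    committed = []
    expected = first_sequence
    for blocktype, sequence in headers:
        if sequence != expected:
            break  # unexpected transaction ID: treat as the end of the log
        if blocktype == COMMIT:
            committed.append(sequence)
            expected += 1  # the next transaction must carry the next ID
    return committed


# Transactions 5 and 6 are complete; transaction 7 has no commit record.
log = [(1, 5), (2, 5), (5, 6), (1, 6), (2, 6), (1, 7)]
replayable = committed_transactions(log, first_sequence=5)
```

In this sketch transaction 7 is left out of the result, which mirrors the
statement that a transaction without a commit record is discarded.
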
External Journal
~~~~~~~~~~~~~~~~

Optionally, an ext4 filesystem can be created with an external journal
device (as opposed to an internal journal, which uses a reserved inode).
In this case, on the filesystem device, ``s_journal_inum`` should be
zero and ``s_journal_uuid`` should be set. On the journal device there
will be an ext4 super block in the usual place, with a matching UUID.
The journal superblock will be in the next full block after the
superblock.

.. list-table::
   :widths: 12 12 12 32 12
   :header-rows: 1

   * - 1024 bytes of padding
     - ext4 Superblock
     - Journal Superblock
     - descriptor_block (data_blocks or revocation_block) [more data or
       revocations] commit_block
     - [more transactions...]
   * -
     -
     -
     - One transaction
     -

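As a rough illustration of "the next full block after the superblock", the
following Python sketch computes the byte offset of the journal superblock
on an external journal device. It assumes the ext4 superblock occupies 1024
bytes starting at byte 1024; the constant and function names are
hypothetical and chosen only for this example.

```python
EXT4_SB_OFFSET = 1024  # padding that precedes the ext4 superblock
EXT4_SB_SIZE = 1024    # assumed on-disk size of the ext4 superblock


def journal_sb_offset(block_size: int) -> int:
    """Byte offset of the first full block after the ext4 superblock."""
    end_of_ext4_sb = EXT4_SB_OFFSET + EXT4_SB_SIZE
    # Round the end of the ext4 superblock up to a block boundary.
    blocks = -(-end_of_ext4_sb // block_size)
    return blocks * block_size
```

Under these assumptions a 1024-byte block size places the journal
superblock in block 2, while larger block sizes place it in block 1.
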
Block Header
~~~~~~~~~~~~

Every block in the journal starts with a common 12-byte header
``struct journal_header_s``:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - __be32
     - h_magic
     - jbd2 magic number, 0xC03B3998.
   * - 0x4
     - __be32
     - h_blocktype
     - Description of what this block contains. See the jbd2_blocktype_ table
       below.
   * - 0x8
     - __be32
     - h_sequence
     - The transaction ID that goes with this block.

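A minimal sketch of decoding this header from a raw journal block, using
Python's ``struct`` module with a big-endian format string. The helper name
``parse_block_header`` is hypothetical; only the field layout and the magic
number come from the table above.

```python
import struct

JBD2_MAGIC = 0xC03B3998
# Three big-endian 32-bit fields: h_magic, h_blocktype, h_sequence.
HEADER = struct.Struct(">III")


def parse_block_header(block: bytes):
    """Return (h_blocktype, h_sequence), or None if this is not a jbd2 block."""
    if len(block) < HEADER.size:
        return None
    magic, blocktype, sequence = HEADER.unpack_from(block, 0)
    if magic != JBD2_MAGIC:
        return None
    return blocktype, sequence


# A synthetic 1024-byte commit block (type 2) for transaction 7.
sample = struct.pack(">III", JBD2_MAGIC, 2, 7) + bytes(1012)
header = parse_block_header(sample)
```

Data blocks written through the journal do not carry this header, so a
``None`` result does not by itself indicate corruption.
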
.. _jbd2_blocktype:

The journal block type can be any one of:

.. list-table::
   :widths: 16 64
   :header-rows: 1

   * - Value
     - Description
   * - 1
     - Descriptor. This block precedes a series of data blocks that were
       written through the journal during a transaction.
   * - 2
     - Block commit record. This block signifies the completion of a
       transaction.
   * - 3
     - Journal superblock, v1.
   * - 4
     - Journal superblock, v2.
   * - 5
     - Block revocation records. This speeds up recovery by enabling the
       journal to skip writing blocks that were subsequently rewritten.

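For tooling that inspects a journal, the block type values can be modelled
as an enumeration. This is only an illustrative sketch: the member names
below are made up for readability and are not claimed to match the kernel's
identifiers.

```python
from enum import IntEnum


class Jbd2BlockType(IntEnum):
    """h_blocktype values from the table above (member names are illustrative)."""
    DESCRIPTOR = 1
    COMMIT = 2
    SUPERBLOCK_V1 = 3
    SUPERBLOCK_V2 = 4
    REVOKE = 5


def classify(h_blocktype: int) -> str:
    """Map an h_blocktype value to a readable name."""
    try:
        return Jbd2BlockType(h_blocktype).name
    except ValueError:
        return "UNKNOWN"
```
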
Super Block
~~~~~~~~~~~

The super block for the journal is much simpler than ext4's.
The key data kept within are the size of the journal and where to find the
start of the log of transactions.

The journal superblock is recorded as ``struct journal_superblock_s``,
which is 1024 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * -
     -
     -
     - Static information describing the journal.
   * - 0x0
     - journal_header_t (12 bytes)
     - s_header
     - Common header identifying this as a superblock.
   * - 0xC
     - __be32
     - s_blocksize
     - Journal device block size.
   * - 0x10
     - __be32
     - s_maxlen
     - Total number of blocks in this journal.
   * - 0x14
     - __be32
     - s_first
     - First block of log information.
   * -
     -
     -
     - Dynamic information describing the current state of the log.
   * - 0x18
     - __be32
     - s_sequence
     - First commit ID expected in log.
   * - 0x1C
     - __be32
     - s_start
     - Block number of the start of log. Contrary to the comments, this field
       being zero does not imply that the journal is clean!
   * - 0x20
     - __be32
     - s_errno
     - Error value, as set by jbd2_journal_abort().
   * -
     -
     -
     - The remaining fields are only valid in a v2 superblock.
   * - 0x24
     - __be32
     - s_feature_compat
     - Compatible feature set. See the table jbd2_compat_ below.
   * - 0x28
     - __be32
     - s_feature_incompat
     - Incompatible feature set. See the table jbd2_incompat_ below.
   * - 0x2C
     - __be32
     - s_feature_ro_compat
     - Read-only compatible feature set. There aren't any of these currently.
   * - 0x30
     - __u8
     - s_uuid[16]
     - 128-bit uuid for journal. This is compared against the copy in the ext4
       super block at mount time.
   * - 0x40
     - __be32
     - s_nr_users
     - Number of file systems sharing this journal.
   * - 0x44
     - __be32
     - s_dynsuper
     - Location of dynamic super block copy. (Not used?)
   * - 0x48
     - __be32
     - s_max_transaction
     - Limit of journal blocks per transaction. (Not used?)
   * - 0x4C
     - __be32
     - s_max_trans_data
     - Limit of data blocks per transaction. (Not used?)
   * - 0x50
     - __u8
     - s_checksum_type
     - Checksum algorithm used for the journal.  See jbd2_checksum_type_ for
       more info.
   * - 0x51
     - __u8[3]
     - s_padding2
     -
   * - 0x54
     - __be32
     - s_num_fc_blocks
     - Number of fast commit blocks in the journal.
   * - 0x58
     - __be32
     - s_head
     - Block number of the head (first unused block) of the journal, only
       up-to-date when the journal is empty.
   * - 0x5C
     - __u32
     - s_padding[40]
     -
   * - 0xFC
     - __be32
     - s_checksum
     - Checksum of the entire superblock, with this field set to zero.
   * - 0x100
     - __u8
     - s_users[16*48]
     - IDs of all file systems sharing the log. e2fsprogs/Linux don't allow
       shared external journals, but I imagine Lustre (or ocfs2?), which use
       the jbd2 code, might.

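The offsets in the table can be exercised with a short Python sketch that
decodes a 1024-byte journal superblock. It is a reading aid rather than a
reference parser: it extracts only a subset of the fields, does not verify
``s_checksum``, and the function name ``parse_journal_superblock`` is
hypothetical.

```python
import struct

JBD2_MAGIC = 0xC03B3998
# s_header (3 x __be32) plus s_blocksize .. s_errno (6 x __be32): 0x0-0x23.
SB_V1 = struct.Struct(">9I")
V1_FIELDS = ("h_magic", "h_blocktype", "h_sequence", "s_blocksize",
             "s_maxlen", "s_first", "s_sequence", "s_start", "s_errno")


def parse_journal_superblock(buf: bytes) -> dict:
    """Decode selected fields of a journal superblock into a dict."""
    if len(buf) < 1024:
        raise ValueError("journal superblock is 1024 bytes long")
    sb = dict(zip(V1_FIELDS, SB_V1.unpack_from(buf, 0)))
    if sb["h_magic"] != JBD2_MAGIC:
        raise ValueError("bad jbd2 magic number")
    if sb["h_blocktype"] == 4:  # the fields below exist only in a v2 superblock
        (sb["s_feature_compat"], sb["s_feature_incompat"],
         sb["s_feature_ro_compat"]) = struct.unpack_from(">3I", buf, 0x24)
        sb["s_uuid"] = bytes(buf[0x30:0x40])
        (sb["s_nr_users"],) = struct.unpack_from(">I", buf, 0x40)
        sb["s_checksum_type"] = buf[0x50]
        (sb["s_checksum"],) = struct.unpack_from(">I", buf, 0xFC)
    return sb


# Synthetic v2 superblock: 4096-byte blocks, 32768 blocks, log starts at block 1.
raw = bytearray(1024)
struct.pack_into(">9I", raw, 0, JBD2_MAGIC, 4, 0, 4096, 32768, 1, 10, 0, 0)
struct.pack_into(">3I", raw, 0x24, 0, 0x12, 0)
raw[0x50] = 4
fields = parse_journal_superblock(bytes(raw))
```

Every multi-byte field is unpacked with a ``>`` (big-endian) format, which
follows from jbd2 writing all of its fields to disk in big-endian order.
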
.. _jbd2_compat:

The journal compat features are any combination of the following:

.. list-table::
   :widths: 16 64
   :header-rows: 1

   * - Value
     - Description
   * - 0x1
     - Journal maintains checksums on the data blocks.
       (JBD2_FEATURE_COMPAT_CHECKSUM)

.. _jbd2_incompat:

The journal incompat features are any combination of the following:

.. list-table::
   :widths: 16 64
   :header-rows: 1

   * - Value
     - Description
   * - 0x1
     - Journal has block revocation records. (JBD2_FEATURE_INCOMPAT_REVOKE)
   * - 0x2
     - Journal can deal with 64-bit block numbers.
       (JBD2_FEATURE_INCOMPAT_64BIT)
   * - 0x4
     - Journal commits asynchronously. (JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT)
   * - 0x8
     - This journal uses v2 of the checksum on-disk format. Each journal
       metadata block gets its own checksum, and the block tags in the
       descriptor table contain checksums for each of the data blocks in the
       journal. (JBD2_FEATURE_INCOMPAT_CSUM_V2)
   * - 0x10
     - This journal uses v3 of the checksum on-disk format. This is the same as
       v2, but the journal block tag size is fixed regardless of the size of
       block numbers. (JBD2_FEATURE_INCOMPAT_CSUM_V3)
   * - 0x20
     - Journal has fast commit blocks. (JBD2_FEATURE_INCOMPAT_FAST_COMMIT)

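Because the incompat features form a bitmask, a reader of
``s_feature_incompat`` typically needs to split it into known flags and
leftover bits. The sketch below shows one way to do that in Python; the
shortened member names and the helper ``decode_incompat`` are assumptions
made for this example.

```python
from enum import IntFlag


class Jbd2Incompat(IntFlag):
    """s_feature_incompat bits from the table above (names shortened)."""
    REVOKE = 0x1
    BIT64 = 0x2
    ASYNC_COMMIT = 0x4
    CSUM_V2 = 0x8
    CSUM_V3 = 0x10
    FAST_COMMIT = 0x20


KNOWN_MASK = 0x3F  # OR of every bit listed in the table


def decode_incompat(value: int):
    """Return (names of the known flags that are set, unknown bits)."""
    names = [flag.name for flag in Jbd2Incompat if value & flag]
    unknown = value & ~KNOWN_MASK
    return names, unknown
```

A non-zero ``unknown`` result means the journal uses a feature this sketch
does not know about, in which case a careful tool would stop rather than
guess at the on-disk format.
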
.. _jbd2_checksum_type:

Journal checksum type codes are one of the following.  crc32 or crc32c are the
most likely choices.

.. list-table::
   :widths: 16 64
   :header-rows: 1

   * - Value
     - Description
   * - 1
     - CRC32
   * - 2
     - MD5
   * - 3
     - SHA1
   * - 4
     - CRC32C

Descriptor Block
~~~~~~~~~~~~~~~~

The descriptor block contains an array of journal block tags that
describe the final locations of the data blocks that follow in the
journal. Descriptor blocks are open-coded instead of being completely
described by a data structure, but here is the block structure anyway.
Descriptor blocks consume at least 36 bytes, but use a full block:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - journal_header_t
     - (open coded)
     - Common block header.
   * - 0xC
     - struct journal_block_tag_s
     - open coded array[]
     - Enough tags either to fill up the block or to describe all the data
       blocks that follow this descriptor block.

Journal block tags have any of the following formats, depending on which
journal feature and block tag flags are set.

If JBD2_FEATURE_INCOMPAT_CSUM_V3 is set, the journal block tag is
defined as ``struct journal_block_tag3_s``, which looks like the
following. The size is 16 or 32 bytes.

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - __be32
     - t_blocknr
     - Lower 32-bits of the location of where the corresponding data block
       should end up on disk.
   * - 0x4
     - __be32
     - t_flags
     - Flags that go with the descriptor. See the table jbd2_tag_flags_ for
       more info.
   * - 0x8
     - __be32
     - t_blocknr_high
     - Upper 32-bits of the location of where the corresponding data block
       should end up on disk. This is zero if JBD2_FEATURE_INCOMPAT_64BIT is
       not enabled.
   * - 0xC
     - __be32
     - t_checksum
     - Checksum of the journal UUID, the sequence number, and the data block.
   * -
     -
     -
     - This field appears to be open coded. It always comes at the end of the
       tag, after t_checksum. This field is not present if the "same UUID" flag
       is set.
   * - 0x10
     - char
     - uuid[16]
     - A UUID to go with this tag. This field appears to be copied from the
       ``j_uuid`` field in ``struct journal_s``, but only tune2fs touches that
       field.

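As an illustration of the CSUM_V3 layout, here is a minimal decoding sketch. ``struct tag3_view``, ``be32_at()`` and ``decode_tag3()`` are hypothetical names invented for this example. It assumes the optional UUID sits immediately after ``t_checksum`` (byte 16), which is what the stated size of 16 or 32 bytes implies.

```c
#include <stddef.h>
#include <stdint.h>

#define TAG_FLAG_SAME_UUID 0x2u	/* "same UUID" bit from the tag flags table */

/* Read a big-endian 32-bit value from an unaligned byte buffer. */
static uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Host-order view of one tag (hypothetical type, illustration only). */
struct tag3_view {
	uint64_t blocknr;	/* t_blocknr_high and t_blocknr combined */
	uint32_t flags;		/* t_flags */
	uint32_t checksum;	/* t_checksum */
	const uint8_t *uuid;	/* NULL when the "same UUID" flag is set */
	size_t size;		/* bytes consumed by this tag: 16 or 32 */
};

/* Decode one tag from buf; returns 0 on success, -1 if buf is too short. */
static int decode_tag3(const uint8_t *buf, size_t len, struct tag3_view *out)
{
	if (len < 16)
		return -1;
	out->blocknr = ((uint64_t)be32_at(buf + 0x8) << 32) | be32_at(buf);
	out->flags = be32_at(buf + 0x4);
	out->checksum = be32_at(buf + 0xC);
	out->uuid = NULL;
	out->size = 16;
	if (!(out->flags & TAG_FLAG_SAME_UUID)) {
		if (len < 32)
			return -1;
		out->uuid = buf + 16;
		out->size = 32;
	}
	return 0;
}
```

A caller walking a descriptor block would start at offset 0xC and advance by ``size`` after each successful call.
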
.. _jbd2_tag_flags:

The journal tag flags are any combination of the following:

.. list-table::
   :widths: 16 64
   :header-rows: 1

   * - Value
     - Description
   * - 0x1
     - On-disk block is escaped. The first four bytes of the data block just
       happened to match the jbd2 magic number.
   * - 0x2
     - This block has the same UUID as previous, therefore the UUID field is
       omitted.
   * - 0x4
     - The data block was deleted by the transaction. (Not used?)
   * - 0x8
     - This is the last tag in this descriptor block.

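Because the flags are independent bits, each one is tested with a mask. The sketch below names the four bits and adds three small predicates; the ``JBD2_FLAG_*`` spellings are assumed here, and the ``tag_*()`` helpers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tag flag bits, valued as in the table above. */
enum {
	JBD2_FLAG_ESCAPE    = 0x1,	/* first four bytes were zeroed in the journal */
	JBD2_FLAG_SAME_UUID = 0x2,	/* UUID field omitted from this tag */
	JBD2_FLAG_DELETED   = 0x4,	/* block deleted by the transaction */
	JBD2_FLAG_LAST_TAG  = 0x8,	/* no more tags in this descriptor block */
};

/* Hypothetical predicates over a host-order t_flags value. */
static bool tag_is_escaped(uint32_t flags)
{
	return (flags & JBD2_FLAG_ESCAPE) != 0;
}

static bool tag_has_uuid(uint32_t flags)
{
	return (flags & JBD2_FLAG_SAME_UUID) == 0;
}

static bool tag_is_last(uint32_t flags)
{
	return (flags & JBD2_FLAG_LAST_TAG) != 0;
}
```

For example, a ``t_flags`` value of 0xA marks the last tag of a descriptor block and says that no UUID follows it.
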
If JBD2_FEATURE_INCOMPAT_CSUM_V3 is NOT set, the journal block tag
is defined as ``struct journal_block_tag_s``, which looks like the
following. The size is 8, 12, 24, or 28 bytes:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - __be32
     - t_blocknr
     - Lower 32-bits of the location of where the corresponding data block
       should end up on disk.
   * - 0x4
     - __be16
     - t_checksum
     - Checksum of the journal UUID, the sequence number, and the data block.
       Note that only the lower 16 bits are stored.
   * - 0x6
     - __be16
     - t_flags
     - Flags that go with the descriptor. See the table jbd2_tag_flags_ for
       more info.
   * -
     -
     -
     - This next field is only present if the super block indicates support for
       64-bit block numbers.
   * - 0x8
     - __be32
     - t_blocknr_high
     - Upper 32-bits of the location of where the corresponding data block
       should end up on disk.
   * -
     -
     -
     - This field appears to be open coded. It always comes at the end of the
       tag, after t_flags or t_blocknr_high. This field is not present if the
       "same UUID" flag is set.
   * - 0x8 or 0xC
     - char
     - uuid[16]
     - A UUID to go with this tag. This field appears to be copied from the
       ``j_uuid`` field in ``struct journal_s``, but only tune2fs touches that
       field.

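The tag sizes quoted in this section (16 or 32 bytes with CSUM_V3; 8, 12, 24, or 28 bytes without) follow from three yes/no questions. The function below is a sketch of that arithmetic only, under the assumption that the sizes stated in the text are complete; ``tag_size()`` is a hypothetical name, and the kernel's own size calculation may account for feature combinations not described here.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Size in bytes of one journal block tag, per the tables above.
 *   csum_v3:   JBD2_FEATURE_INCOMPAT_CSUM_V3 is set
 *   has_64bit: the superblock advertises 64-bit block numbers
 *   same_uuid: the tag carries the "same UUID" flag (0x2)
 */
static size_t tag_size(bool csum_v3, bool has_64bit, bool same_uuid)
{
	size_t sz;

	if (csum_v3)
		sz = 16;			/* fixed part, t_blocknr_high always present */
	else
		sz = has_64bit ? 12 : 8;	/* t_blocknr_high is optional */

	if (!same_uuid)
		sz += 16;			/* trailing uuid[16] */
	return sz;
}
```

For instance, ``tag_size(false, true, false)`` gives 28: a 12-byte tag with ``t_blocknr_high`` followed by a 16-byte UUID.
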
If JBD2_FEATURE_INCOMPAT_CSUM_V2 or
JBD2_FEATURE_INCOMPAT_CSUM_V3 is set, the end of the block is a
``struct jbd2_journal_block_tail``, which looks like this:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - __be32
     - t_checksum
     - Checksum of the journal UUID + the descriptor block, with this field set
       to zero.

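The phrase "with this field set to zero" means the checksum is computed as if the last four bytes of the block were zero, and only then stored there. The sketch below shows that step with a plain bitwise CRC32C. It is an illustration under assumptions: the seeding and final inversion used by the kernel's checksum helpers may differ from this textbook CRC32C, and the ``block_tail_*()`` names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Textbook bitwise CRC32C (Castagnoli), chainable like zlib's crc32(). */
static uint32_t crc32c(uint32_t crc, const uint8_t *buf, size_t len)
{
	crc = ~crc;
	while (len--) {
		crc ^= *buf++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Checksum of UUID + block, treating the 4-byte tail as zero. */
static uint32_t block_tail_csum(const uint8_t uuid[16],
				const uint8_t *block, size_t blocksize)
{
	static const uint8_t zero[4];
	uint32_t c = crc32c(0, uuid, 16);

	c = crc32c(c, block, blocksize - 4);
	return crc32c(c, zero, 4);
}

/* Store the checksum big-endian in the last four bytes of the block. */
static void block_tail_store(const uint8_t uuid[16],
			     uint8_t *block, size_t blocksize)
{
	uint32_t c = block_tail_csum(uuid, block, blocksize);
	uint8_t *tail = block + blocksize - 4;

	tail[0] = (uint8_t)(c >> 24);
	tail[1] = (uint8_t)(c >> 16);
	tail[2] = (uint8_t)(c >> 8);
	tail[3] = (uint8_t)c;
}

/* Recompute and compare against the stored tail. */
static bool block_tail_verify(const uint8_t uuid[16],
			      const uint8_t *block, size_t blocksize)
{
	const uint8_t *tail = block + blocksize - 4;
	uint32_t stored = ((uint32_t)tail[0] << 24) | ((uint32_t)tail[1] << 16) |
			  ((uint32_t)tail[2] << 8) | (uint32_t)tail[3];

	return stored == block_tail_csum(uuid, block, blocksize);
}
```

Feeding four zero bytes in place of the tail avoids modifying the buffer during verification, which is why ``block_tail_verify()`` can take a ``const`` block.
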
Data Block
~~~~~~~~~~

In general, the data blocks being written to disk through the journal
are written verbatim into the journal file after the descriptor block.
However, if the first four bytes of the block match the jbd2 magic
number then those four bytes are replaced with zeroes and the “escaped”
flag is set in the descriptor block tag.

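The escaping rule can be sketched in a few lines. The magic number 0xC03B3998 is stored big-endian on disk, so the comparison is against the bytes C0 3B 39 98. ``escape_for_journal()`` and ``unescape_on_replay()`` are hypothetical names; the second function is the inverse step that a replay would need, which the paragraph above implies but does not spell out.

```c
#include <stdint.h>
#include <string.h>

#define JBD2_FLAG_ESCAPE 0x1u

/* jbd2 magic number 0xC03B3998 as it appears on disk (big-endian). */
static const uint8_t jbd2_magic_bytes[4] = {0xC0, 0x3B, 0x39, 0x98};

/*
 * Prepare a data block for the journal. If it starts with the magic
 * number, zero those four bytes and return the flag to OR into the tag;
 * otherwise leave the block alone and return 0.
 */
static uint32_t escape_for_journal(uint8_t *block)
{
	if (memcmp(block, jbd2_magic_bytes, 4) != 0)
		return 0;
	memset(block, 0, 4);
	return JBD2_FLAG_ESCAPE;
}

/* Inverse step (assumed): restore the magic bytes of an escaped block. */
static void unescape_on_replay(uint8_t *block, uint32_t tag_flags)
{
	if (tag_flags & JBD2_FLAG_ESCAPE)
		memcpy(block, jbd2_magic_bytes, 4);
}
```

Without the escape, a scan of the journal could mistake such a data block for a journal metadata block, since both would begin with the same four bytes.
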
Revocation Block
~~~~~~~~~~~~~~~~

A revocation block is used to prevent replay of a block in an earlier
transaction. This is used to mark blocks that were journalled at one
time but are no longer journalled. Typically this happens if a metadata
block is freed and re-allocated as a file data block; in this case, a
journal replay after the file block was written to disk will cause
corruption.

**NOTE**: This mechanism is NOT used to express “this journal block is
superseded by this other journal block”, as the author (djwong)
mistakenly thought. Any block being added to a transaction will cause
the removal of all existing revocation records for that block.

Revocation blocks are described by
``struct jbd2_journal_revoke_header_s``; they are at least 16 bytes in
length, but use a full block:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - journal_header_t
     - r_header
     - Common block header.
   * - 0xC
     - __be32
     - r_count
     - Number of bytes used in this block.
   * - 0x10
     - __be32 or __be64
     - blocks[0]
     - Blocks to revoke.

After r_count is a linear array of block numbers that are effectively
revoked by this transaction. The size of each block number is 8 bytes if
the superblock advertises 64-bit block number support, or 4 bytes
otherwise.

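A reader of a revocation block therefore needs ``r_count`` and the record width to find the revoked block numbers. The sketch below assumes that ``r_count`` counts bytes from the start of the block, so it includes the 16 bytes of header and ``r_count`` itself; ``revoke_count()`` and ``revoke_entry()`` are hypothetical helpers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian value of n bytes (n <= 8) from an unaligned buffer. */
static uint64_t be_at(const uint8_t *p, size_t n)
{
	uint64_t v = 0;

	while (n--)
		v = (v << 8) | *p++;
	return v;
}

/*
 * Number of revoke records in the block, or -1 if r_count is
 * inconsistent with the block size or the record width.
 */
static long revoke_count(const uint8_t *block, size_t blocksize, bool has_64bit)
{
	size_t rec = has_64bit ? 8 : 4;
	uint64_t used = be_at(block + 0xC, 4);	/* r_count */

	if (used < 16 || used > blocksize || (used - 16) % rec != 0)
		return -1;
	return (long)((used - 16) / rec);
}

/* The i-th revoked block number; the caller keeps i below revoke_count(). */
static uint64_t revoke_entry(const uint8_t *block, size_t i, bool has_64bit)
{
	size_t rec = has_64bit ? 8 : 4;

	return be_at(block + 0x10 + i * rec, rec);
}
```

With ``r_count`` equal to 24, the block holds two 32-bit records or one 64-bit record, depending on the superblock feature.
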
If JBD2_FEATURE_INCOMPAT_CSUM_V2 or
JBD2_FEATURE_INCOMPAT_CSUM_V3 is set, the end of the revocation
block is a ``struct jbd2_journal_revoke_tail``, which has this format:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - __be32
     - r_checksum
     - Checksum of the journal UUID + revocation block.

Commit Block
~~~~~~~~~~~~

The commit block is a sentry that indicates that a transaction has been
completely written to the journal. Once this commit block reaches the
journal, the data stored with this transaction can be written to their
final locations on disk.

The commit block is described by ``struct commit_header``, whose fields
span 60 bytes (0x3C) according to the offsets below (but it uses a full
block):

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - journal_header_s
     - (open coded)
     - Common block header.
   * - 0xC
     - unsigned char
     - h_chksum_type
     - The type of checksum to use to verify the integrity of the data blocks
       in the transaction. See jbd2_checksum_type_ for more info.
   * - 0xD
     - unsigned char
     - h_chksum_size
     - The number of bytes used by the checksum. Most likely 4.
   * - 0xE
     - unsigned char
     - h_padding[2]
     -
   * - 0x10
     - __be32
     - h_chksum[JBD2_CHECKSUM_BYTES]
     - 32 bytes of space to store checksums. If
       JBD2_FEATURE_INCOMPAT_CSUM_V2 or JBD2_FEATURE_INCOMPAT_CSUM_V3
       is set, the first ``__be32`` is the checksum of the journal UUID and
       the entire commit block, with this field zeroed. If
       JBD2_FEATURE_COMPAT_CHECKSUM is set, the first ``__be32`` is the
       crc32 of all the blocks already written to the transaction.
   * - 0x30
     - __be64
     - h_commit_sec
     - The time that the transaction was committed, in seconds since the epoch.
   * - 0x38
     - __be32
     - h_commit_nsec
     - Nanoseconds component of the above timestamp.

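The offsets in the table can be checked mechanically. The sketch below mirrors the table in a host-side struct (``struct commit_header_layout``, a name made up for this example, with the 12-byte common header expanded into three 32-bit words) and lets the compiler confirm the offsets. ``commit_time_from_block()`` is a hypothetical helper that reads the big-endian timestamp straight from a raw block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JBD2_CHECKSUM_BYTES (32 / sizeof(uint32_t))

/* Host-side mirror of the table; on disk, multi-byte fields are big-endian. */
struct commit_header_layout {
	uint32_t h_magic;		/* common block header, 12 bytes */
	uint32_t h_blocktype;
	uint32_t h_sequence;
	unsigned char h_chksum_type;
	unsigned char h_chksum_size;
	unsigned char h_padding[2];
	uint32_t h_chksum[JBD2_CHECKSUM_BYTES];
	uint64_t h_commit_sec;
	uint32_t h_commit_nsec;
};

static_assert(offsetof(struct commit_header_layout, h_chksum_type) == 0x0C, "h_chksum_type");
static_assert(offsetof(struct commit_header_layout, h_chksum_size) == 0x0D, "h_chksum_size");
static_assert(offsetof(struct commit_header_layout, h_padding) == 0x0E, "h_padding");
static_assert(offsetof(struct commit_header_layout, h_chksum) == 0x10, "h_chksum");
static_assert(offsetof(struct commit_header_layout, h_commit_sec) == 0x30, "h_commit_sec");
static_assert(offsetof(struct commit_header_layout, h_commit_nsec) == 0x38, "h_commit_nsec");

struct commit_time {
	uint64_t sec;
	uint32_t nsec;
};

/* Decode the commit timestamp from the raw bytes of a commit block. */
static struct commit_time commit_time_from_block(const uint8_t *block)
{
	struct commit_time t = {0, 0};

	for (int i = 0; i < 8; i++)
		t.sec = (t.sec << 8) | block[0x30 + i];
	for (int i = 0; i < 4; i++)
		t.nsec = (t.nsec << 8) | block[0x38 + i];
	return t;
}
```

The last field ends at 0x3C, so the described fields cover 60 bytes of the block; the compiler may still round ``sizeof`` up for alignment.
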
Fast commits
~~~~~~~~~~~~

The fast commit area is organized as a log of tag-length-value (TLV) records.
Each TLV begins with a ``struct ext4_fc_tl``, which stores the tag and the
length of the entire field, followed by a variable-length, tag-specific value.
Here is the list of supported tags and their meanings:

.. list-table::
   :widths: 8 20 20 32
   :header-rows: 1

   * - Tag
     - Meaning
     - Value struct
     - Description
   * - EXT4_FC_TAG_HEAD
     - Fast commit area header
     - ``struct ext4_fc_head``
     - Stores the TID of the transaction after which these fast commits should
       be applied.
   * - EXT4_FC_TAG_ADD_RANGE
     - Add extent to inode
     - ``struct ext4_fc_add_range``
     - Stores the inode number and the extent to be added to this inode.
   * - EXT4_FC_TAG_DEL_RANGE
     - Remove logical offsets from inode
     - ``struct ext4_fc_del_range``
     - Stores the inode number and the logical offset range that needs to be
       removed.
   * - EXT4_FC_TAG_CREAT
     - Create directory entry for a newly created file
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode number and directory entry of the
       newly created file.
   * - EXT4_FC_TAG_LINK
     - Link a directory entry to an inode
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode number and directory entry.
   * - EXT4_FC_TAG_UNLINK
     - Unlink a directory entry of an inode
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode number and directory entry.
   * - EXT4_FC_TAG_PAD
     - Padding (unused area)
     - None
     - Unused bytes in the fast commit area.
   * - EXT4_FC_TAG_TAIL
     - Mark the end of a fast commit
     - ``struct ext4_fc_tail``
     - Stores the TID of the commit and the CRC of the fast commit that this
       tag terminates.

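A TLV log like this is read by repeatedly parsing a small header and skipping over the value. The iterator below is a generic sketch, not the ext4 parser. It assumes a 4-byte header holding a 16-bit little-endian tag and a 16-bit little-endian length, and it assumes that the length counts only the value bytes; check both assumptions against ``struct ext4_fc_tl`` before relying on them. ``struct tlv`` and ``tlv_next()`` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One parsed record (hypothetical type, illustration only). */
struct tlv {
	uint16_t tag;
	uint16_t len;		/* length of the value in bytes (assumed) */
	const uint8_t *val;	/* points into the caller's buffer */
};

static uint16_t le16_at(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Parse the record at *off and advance *off past it.
 * Returns 1 if a record was parsed, 0 at the end of the buffer,
 * and -1 if the header or value would run past the buffer.
 */
static int tlv_next(const uint8_t *buf, size_t size, size_t *off, struct tlv *out)
{
	size_t left = size - *off;

	if (left == 0)
		return 0;
	if (left < 4)
		return -1;
	out->tag = le16_at(buf + *off);
	out->len = le16_at(buf + *off + 2);
	if (out->len > left - 4)
		return -1;
	out->val = buf + *off + 4;
	*off += 4 + (size_t)out->len;
	return 1;
}
```

A replay loop would call ``tlv_next()`` until it returns 0 or sees the tail tag, dispatching on ``tag`` and treating a return of -1 as a corrupt or truncated log.
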
Fast Commit Replay Idempotence
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fast commit tags are idempotent in nature provided the recovery code follows
certain rules. The guiding principle that the commit path follows while
committing is that it stores the result of a particular operation instead of
storing the procedure.

Let's consider this rename operation: 'mv /a /b'. Let's assume dirent '/a'
was associated with inode 10. During fast commit, instead of storing this
operation as a procedure "rename a to b", we store the resulting file system
state as a "series" of outcomes:

- Link dirent b to inode 10
- Unlink dirent a
- Inode 10 with valid refcount

Now when the recovery code runs, it needs to "enforce" this state on the file
system. This is what guarantees idempotence of fast commit replay.

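The idea of enforcing outcomes can be tried on a toy model. In the sketch below a directory is a small array mapping names to inode numbers, and each outcome sets the directory to the recorded state regardless of what it finds. Everything here (``struct toy_dir``, the ``toy_*()`` functions, ``replay_rename_outcomes()``) is hypothetical and far simpler than the real replay code; the inode refcount outcome is left out.

```c
#include <string.h>

#define TOY_SLOTS 8

struct toy_dirent {
	char name[16];
	unsigned int ino;	/* 0 means the slot is unused */
};

struct toy_dir {
	struct toy_dirent slot[TOY_SLOTS];
};

static struct toy_dirent *toy_find(struct toy_dir *d, const char *name)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		if (d->slot[i].ino && strcmp(d->slot[i].name, name) == 0)
			return &d->slot[i];
	return NULL;
}

/* Outcome "link name to ino": afterwards name refers to ino. */
static void toy_enforce_link(struct toy_dir *d, const char *name, unsigned int ino)
{
	struct toy_dirent *e = toy_find(d, name);

	for (int i = 0; !e && i < TOY_SLOTS; i++)
		if (!d->slot[i].ino)
			e = &d->slot[i];
	if (!e)
		return;		/* toy directory is full */
	strncpy(e->name, name, sizeof(e->name) - 1);
	e->name[sizeof(e->name) - 1] = '\0';
	e->ino = ino;
}

/* Outcome "unlink name": afterwards name does not exist. */
static void toy_enforce_unlink(struct toy_dir *d, const char *name)
{
	struct toy_dirent *e = toy_find(d, name);

	if (e)
		e->ino = 0;
}

static unsigned int toy_lookup(struct toy_dir *d, const char *name)
{
	struct toy_dirent *e = toy_find(d, name);

	return e ? e->ino : 0;
}

/* The outcomes recorded for 'mv /a /b' when a was inode 10. */
static void replay_rename_outcomes(struct toy_dir *d)
{
	toy_enforce_link(d, "b", 10);
	toy_enforce_unlink(d, "a");
}
```

Running ``replay_rename_outcomes()`` once or several times leaves the same directory: ``b`` refers to inode 10 and ``a`` is gone, because neither step depends on the state it starts from.
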
Let's take an example of a procedure that is not idempotent and see how fast
commits make it idempotent. Consider the following sequence of operations:

1) rm A
2) mv B A
3) read A

If we store this sequence of operations as is, then the replay is not
idempotent. Let's say that during replay we crash after (2). During the second
replay, file A (which was actually created as a result of the "mv B A"
operation) would get deleted. Thus, a file named A would be absent when we try
to read A. So, this sequence of operations is not idempotent. However, as
mentioned above, instead of storing the procedure, fast commits store the
outcome of each procedure. Thus the fast commit log for the above procedure
would be as follows:

(Let's assume dirent A was linked to inode 10 and dirent B was linked to
inode 11 before the replay)

1) Unlink A
2) Link A to inode 11
3) Unlink B
4) Inode 11

If we crash after (3) we will have file A linked to inode 11. During the second
replay, we will remove file A (inode 11). But we will create it back and make
734 it point to inode 11. We won't find B, so we'l    734 it point to inode 11. We won't find B, so we'll just skip that step. At this
735 point, the refcount for inode 11 is not reliab    735 point, the refcount for inode 11 is not reliable, but that gets fixed by the
736 replay of last inode 11 tag. Thus, by converti    736 replay of last inode 11 tag. Thus, by converting a non-idempotent procedure
737 into a series of idempotent outcomes, fast com    737 into a series of idempotent outcomes, fast commits ensured idempotence during
738 the replay.                                       738 the replay.
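The two logs above can be compared in a small simulation. This is a sketch
under stated assumptions, not the kernel implementation: the helpers
``replay_procedures`` and ``replay_outcomes``, the ``stop_after`` parameter
used to model a crash, and the dict-based file system are all hypothetical.

```python
def replay_procedures(fs, ops, stop_after=None):
    """Replay operations as procedures; fs maps file name -> inode number."""
    for step, (verb, *args) in enumerate(ops, start=1):
        if verb == "rm":
            fs.pop(args[0], None)
        elif verb == "mv":
            src, dst = args
            if src in fs:                 # a missing source is skipped
                fs[dst] = fs.pop(src)
        # "read" does not change state, so it has no branch here
        if step == stop_after:            # simulate a crash during replay
            break
    return fs

def replay_outcomes(fs, nlink, tags, stop_after=None):
    """Replay outcome tags; each tag enforces a target state."""
    for step, (kind, *args) in enumerate(tags, start=1):
        if kind == "unlink":
            fs.pop(args[0], None)
        elif kind == "link":
            name, ino = args
            fs[name] = ino
        elif kind == "inode":
            ino, count = args
            nlink[ino] = count            # final, authoritative refcount
        if step == stop_after:
            break
    return fs

ops = [("rm", "A"), ("mv", "B", "A"), ("read", "A")]
tags = [("unlink", "A"), ("link", "A", 11), ("unlink", "B"), ("inode", 11, 1)]

# Procedural log: crash after step 2, then replay from the start.
proc_fs = {"A": 10, "B": 11}
replay_procedures(proc_fs, ops, stop_after=2)
replay_procedures(proc_fs, ops)
proc_has_a = "A" in proc_fs

# Outcome log: crash after step 3, then replay from the start.
fs, nlink = {"A": 10, "B": 11}, {10: 1, 11: 1}
replay_outcomes(fs, nlink, tags, stop_after=3)
replay_outcomes(fs, nlink, tags)
```

In this model the procedural replay ends with no file named A, while the
outcome replay ends with A linked to inode 11 and the refcount taken from the
final inode tag.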

Journal Checkpoint
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Checkpointing the journal ensures all transactions and their associated buffers
are submitted to the disk. In-progress transactions are waited upon and included
in the checkpoint. Checkpointing is used internally during critical updates to
the filesystem, including journal recovery, filesystem resizing, and freeing of
the journal_t structure.

A journal checkpoint can be triggered from userspace via the ioctl
EXT4_IOC_CHECKPOINT. This ioctl takes a single u64 argument for flags.
Currently, three flags are supported. First, EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN
can be used to verify input to the ioctl. It returns an error if there is any
invalid input; otherwise it returns success without performing
any checkpointing. This can be used to check whether the ioctl exists on a
system and to verify there are no issues with arguments or flags. The
other two flags are EXT4_IOC_CHECKPOINT_FLAG_DISCARD and
EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT. These flags cause the journal blocks to be
discarded or zero-filled, respectively, after the journal checkpoint is
complete. EXT4_IOC_CHECKPOINT_FLAG_DISCARD and EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT
cannot both be set. The ioctl may be useful when snapshotting a system or for
complying with content deletion SLOs.
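
The flag rules described above can be illustrated with a small userspace
check. This is a sketch only: ``check_checkpoint_flags`` is a hypothetical
helper, it does not issue the ioctl, and the numeric flag values are an
assumption that should be verified against the kernel's ext4 headers.

```python
# Assumed numeric values; verify against the kernel's ext4 headers.
EXT4_IOC_CHECKPOINT_FLAG_DISCARD = 0x1
EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT = 0x2
EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN = 0x4

VALID_FLAGS = (EXT4_IOC_CHECKPOINT_FLAG_DISCARD
               | EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT
               | EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN)

def check_checkpoint_flags(flags):
    """Raise ValueError if the flag combination breaks the documented rules."""
    if flags & ~VALID_FLAGS:
        raise ValueError("unknown checkpoint flag bits: %#x" % (flags & ~VALID_FLAGS))
    both = EXT4_IOC_CHECKPOINT_FLAG_DISCARD | EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT
    if flags & both == both:
        raise ValueError("DISCARD and ZEROOUT cannot both be set")
    return flags

# A dry run combined with DISCARD is a consistent request in this model.
ok = check_checkpoint_flags(
    EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN | EXT4_IOC_CHECKPOINT_FLAG_DISCARD)

# DISCARD together with ZEROOUT is rejected.
try:
    check_checkpoint_flags(
        EXT4_IOC_CHECKPOINT_FLAG_DISCARD | EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT)
    conflict_rejected = False
except ValueError:
    conflict_rejected = True
```

On a real system the same validation can be delegated to the kernel by
setting EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN, which checks the input without
performing any checkpointing.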
                                                      
