TOMOYO Linux Cross Reference
Linux/Documentation/filesystems/ext4/journal.rst

Differences between /Documentation/filesystems/ext4/journal.rst (Version linux-6.12-rc7) and /Documentation/filesystems/ext4/journal.rst (Version linux-2.4.37.11)


  1 .. SPDX-License-Identifier: GPL-2.0               
  2                                                   
  3 Journal (jbd2)                                    
  4 --------------                                    
  5                                                   
  6 Introduced in ext3, the ext4 filesystem employ    
  7 filesystem against metadata inconsistencies in    
  8 to 10,240,000 file system blocks (see man mke2    
  9 size limits) can be reserved inside the filesy    
  10 “important” data writes on-disk as quickly
 11 data transaction is fully written to the disk     
 12 cache, a record of the data being committed is    
 13 some later point in time, the journal code wri    
 14 final locations on disk (this could involve a     
 15 read-write-erases) before erasing the commit r    
 16 crash during the second slow write, the journa    
 17 way to the latest commit record, guaranteeing     
 18 gets written through the journal to the disk.     
 19 guarantee that the filesystem does not become     
 20 metadata update.                                  
 21                                                   
 22 For performance reasons, ext4 by default only     
 23 through the journal. This means that file data    
 24 guaranteed to be in any consistent state after    
 25 guarantee level (``data=ordered``) is not sati    
 26 option to control journal behavior. If ``data=    
 27 metadata are written to disk through the journ    
 28 safest. If ``data=writeback``, dirty data bloc    
 29 disk before the metadata are written to disk t    
 30                                                   
 31 In case of ``data=ordered`` mode, Ext4 also su    
 32 help reduce commit latency significantly. The     
 33 mode works by logging metadata blocks to the j    
 34 mode, Ext4 only stores the minimal delta neede    
 35 affected metadata in fast commit space that is    
 36 Once the fast commit area fills in or if fast     
 37 or if JBD2 commit timer goes off, Ext4 perform    
 38 A full commit invalidates all the fast commits    
 39 it and thus it makes the fast commit area empt    
 40 commits. This feature needs to be enabled at m    
 41                                                   
 42 The journal inode is typically inode 8. The fi    
 43 journal inode are replicated in the ext4 super    
  44 is a normal (but hidden) file within the filesys
 45 consumes an entire block group, though mke2fs     
 46 middle of the disk.                               
 47                                                   
 48 All fields in jbd2 are written to disk in big-    
 49 opposite of ext4.                                 
 50                                                   
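To make the byte-order difference concrete, the following Python sketch packs the jbd2 magic number (0xC03B3998, from the Block Header section) in both orders. It is only an illustration; the variable names are not taken from any kernel source.

```python
import struct

value = 0xC03B3998  # jbd2 magic number from the Block Header section

big = struct.pack(">I", value)     # byte order used by jbd2 on disk
little = struct.pack("<I", value)  # byte order used by ext4, for contrast

print(big.hex())     # c03b3998
print(little.hex())  # 98393bc0
```

A reader that decodes journal blocks therefore needs ``>`` (big-endian) format strings throughout, even when the surrounding ext4 structures are decoded with ``<``.
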
 51 NOTE: Both ext4 and ocfs2 use jbd2.               
 52                                                   
 53 The maximum size of a journal embedded in an e    
 54 blocks. jbd2 itself does not seem to care.        
 55                                                   
 56 Layout                                            
 57 ~~~~~~                                            
 58                                                   
 59 Generally speaking, the journal has this forma    
 60                                                   
 61 .. list-table::                                   
 62    :widths: 16 48 16                              
 63    :header-rows: 1                                
 64                                                   
 65    * - Superblock                                 
 66      - descriptor_block (data_blocks or revoca    
  67        revocations] commit_block
 68      - [more transactions...]                     
 69    * -                                            
 70      - One transaction                            
 71      -                                            
 72                                                   
 73 Notice that a transaction begins with either a    
 74 or a block revocation list. A finished transac    
 75 commit. If there is no commit record (or the c    
 76 transaction will be discarded during replay.      
 77                                                   
 78 External Journal                                  
 79 ~~~~~~~~~~~~~~~~                                  
 80                                                   
 81 Optionally, an ext4 filesystem can be created     
 82 device (as opposed to an internal journal, whi    
 83 In this case, on the filesystem device, ``s_jo    
 84 zero and ``s_journal_uuid`` should be set. On     
 85 will be an ext4 super block in the usual place    
 86 The journal superblock will be in the next ful    
 87 superblock.                                       
 88                                                   
 89 .. list-table::                                   
 90    :widths: 12 12 12 32 12                        
 91    :header-rows: 1                                
 92                                                   
 93    * - 1024 bytes of padding                      
 94      - ext4 Superblock                            
 95      - Journal Superblock                         
 96      - descriptor_block (data_blocks or revoca    
  97        revocations] commit_block
 98      - [more transactions...]                     
 99    * -                                            
100      -                                            
101      -                                            
102      - One transaction                            
103      -                                            
104                                                   
105 Block Header                                      
106 ~~~~~~~~~~~~                                      
107                                                   
108 Every block in the journal starts with a commo    
109 ``struct journal_header_s``:                      
110                                                   
111 .. list-table::                                   
112    :widths: 8 8 24 40                             
113    :header-rows: 1                                
114                                                   
115    * - Offset                                     
116      - Type                                       
117      - Name                                       
118      - Description                                
119    * - 0x0                                        
120      - __be32                                     
121      - h_magic                                    
122      - jbd2 magic number, 0xC03B3998.             
123    * - 0x4                                        
124      - __be32                                     
125      - h_blocktype                                
126      - Description of what this block contains    
127        below.                                     
128    * - 0x8                                        
129      - __be32                                     
130      - h_sequence                                 
131      - The transaction ID that goes with this     
132                                                   
133 .. _jbd2_blocktype:                               
134                                                   
135 The journal block type can be any one of:         
136                                                   
137 .. list-table::                                   
138    :widths: 16 64                                 
139    :header-rows: 1                                
140                                                   
141    * - Value                                      
142      - Description                                
143    * - 1                                          
144      - Descriptor. This block precedes a serie    
145        written through the journal during a tr    
146    * - 2                                          
147      - Block commit record. This block signifi    
148        transaction.                               
149    * - 3                                          
150      - Journal superblock, v1.                    
151    * - 4                                          
152      - Journal superblock, v2.                    
153    * - 5                                          
154      - Block revocation records. This speeds u    
155        journal to skip writing blocks that wer    
156                                                   
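As a rough illustration of the 12-byte common header and the block type table above, the following Python sketch decodes the header of a raw journal block. The helper name ``parse_header`` and the text labels in ``BLOCK_TYPES`` are made up for this example; only the offsets, the magic number, and the numeric type values come from the tables.

```python
import struct

JBD2_MAGIC = 0xC03B3998
HEADER = struct.Struct(">III")  # h_magic, h_blocktype, h_sequence (big-endian)

# Descriptive labels (not kernel identifiers) for the h_blocktype values.
BLOCK_TYPES = {
    1: "descriptor",
    2: "commit",
    3: "superblock v1",
    4: "superblock v2",
    5: "revocation",
}

def parse_header(block: bytes):
    """Return (type label, transaction ID), or None if this is not a jbd2 block."""
    if len(block) < HEADER.size:
        return None
    magic, blocktype, sequence = HEADER.unpack_from(block, 0)
    if magic != JBD2_MAGIC:
        return None
    return BLOCK_TYPES.get(blocktype, "unknown"), sequence

# A synthetic commit block for transaction 7, padded to 1024 bytes.
sample = HEADER.pack(JBD2_MAGIC, 2, 7) + bytes(1012)
print(parse_header(sample))       # ('commit', 7)
print(parse_header(bytes(1024)))  # None
```

Blocks that do not start with the magic number are treated here as "not a journal metadata block", which is also how a journalled data block would normally look.
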
157 Super Block                                       
158 ~~~~~~~~~~~                                       
159                                                   
160 The super block for the journal is much simple    
 161 The key data kept within are the size of the journ
162 start of the log of transactions.                 
163                                                   
164 The journal superblock is recorded as ``struct    
165 which is 1024 bytes long:                         
166                                                   
167 .. list-table::                                   
168    :widths: 8 8 24 40                             
169    :header-rows: 1                                
170                                                   
171    * - Offset                                     
172      - Type                                       
173      - Name                                       
174      - Description                                
175    * -                                            
176      -                                            
177      -                                            
178      - Static information describing the journ    
179    * - 0x0                                        
180      - journal_header_t (12 bytes)                
181      - s_header                                   
182      - Common header identifying this as a sup    
183    * - 0xC                                        
184      - __be32                                     
185      - s_blocksize                                
186      - Journal device block size.                 
187    * - 0x10                                       
188      - __be32                                     
189      - s_maxlen                                   
190      - Total number of blocks in this journal.    
191    * - 0x14                                       
192      - __be32                                     
193      - s_first                                    
194      - First block of log information.            
195    * -                                            
196      -                                            
197      -                                            
198      - Dynamic information describing the curr    
199    * - 0x18                                       
200      - __be32                                     
201      - s_sequence                                 
202      - First commit ID expected in log.           
203    * - 0x1C                                       
204      - __be32                                     
205      - s_start                                    
206      - Block number of the start of log. Contr    
207        being zero does not imply that the jour    
208    * - 0x20                                       
209      - __be32                                     
210      - s_errno                                    
211      - Error value, as set by jbd2_journal_abo    
212    * -                                            
213      -                                            
214      -                                            
215      - The remaining fields are only valid in     
216    * - 0x24                                       
217      - __be32                                     
 218      - s_feature_compat
219      - Compatible feature set. See the table j    
220    * - 0x28                                       
221      - __be32                                     
222      - s_feature_incompat                         
223      - Incompatible feature set. See the table    
224    * - 0x2C                                       
225      - __be32                                     
226      - s_feature_ro_compat                        
227      - Read-only compatible feature set. There    
228    * - 0x30                                       
229      - __u8                                       
230      - s_uuid[16]                                 
231      - 128-bit uuid for journal. This is compa    
232        super block at mount time.                 
233    * - 0x40                                       
234      - __be32                                     
235      - s_nr_users                                 
236      - Number of file systems sharing this jou    
237    * - 0x44                                       
238      - __be32                                     
239      - s_dynsuper                                 
240      - Location of dynamic super block copy. (    
241    * - 0x48                                       
242      - __be32                                     
243      - s_max_transaction                          
244      - Limit of journal blocks per transaction    
245    * - 0x4C                                       
246      - __be32                                     
247      - s_max_trans_data                           
248      - Limit of data blocks per transaction. (    
249    * - 0x50                                       
250      - __u8                                       
251      - s_checksum_type                            
252      - Checksum algorithm used for the journal    
253        more info.                                 
254    * - 0x51                                       
255      - __u8[3]                                    
256      - s_padding2                                 
257      -                                            
258    * - 0x54                                       
259      - __be32                                     
260      - s_num_fc_blocks                            
261      - Number of fast commit blocks in the jou    
262    * - 0x58                                       
263      - __be32                                     
264      - s_head                                     
265      - Block number of the head (first unused     
266        up-to-date when the journal is empty.      
267    * - 0x5C                                       
268      - __u32                                      
269      - s_padding[40]                              
270      -                                            
271    * - 0xFC                                       
272      - __be32                                     
273      - s_checksum                                 
274      - Checksum of the entire superblock, with    
275    * - 0x100                                      
276      - __u8                                       
277      - s_users[16*48]                             
278      - ids of all file systems sharing the log    
279        shared external journals, but I imagine    
280        the jbd2 code, might.                      
281                                                   
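The fixed offsets in the table above lend themselves to a single ``struct`` format. The following Python sketch decodes the fields from offset 0x0 through ``s_nr_users`` plus ``s_checksum``; it is a minimal reader under the assumption that the input is a v2 superblock, and ``parse_superblock`` is a hypothetical helper, not a kernel or e2fsprogs function. It does not verify the checksum.

```python
import struct
import uuid

JBD2_MAGIC = 0xC03B3998

# Offsets 0x0 through 0x43: twelve __be32 fields, s_uuid[16], s_nr_users.
SB_PREFIX = struct.Struct(">12I16sI")
FIELDS = (
    "h_magic", "h_blocktype", "h_sequence",
    "s_blocksize", "s_maxlen", "s_first",
    "s_sequence", "s_start", "s_errno",
    "s_feature_compat", "s_feature_incompat", "s_feature_ro_compat",
    "s_uuid", "s_nr_users",
)

def parse_superblock(raw: bytes) -> dict:
    """Decode the leading fields of a 1024-byte journal superblock."""
    if len(raw) < 1024:
        raise ValueError("journal superblock is 1024 bytes long")
    sb = dict(zip(FIELDS, SB_PREFIX.unpack_from(raw, 0)))
    if sb["h_magic"] != JBD2_MAGIC or sb["h_blocktype"] not in (3, 4):
        raise ValueError("not a journal superblock")
    # Fields after s_errno are only meaningful for a v2 superblock (type 4).
    sb["s_uuid"] = uuid.UUID(bytes=sb["s_uuid"])
    (sb["s_checksum"],) = struct.unpack_from(">I", raw, 0xFC)
    return sb

# Synthetic v2 superblock: 4096-byte blocks, 8192 blocks, log starts at block 1.
raw = bytearray(1024)
SB_PREFIX.pack_into(raw, 0, JBD2_MAGIC, 4, 0, 4096, 8192, 1,
                    5, 0, 0, 0, 0x12, 0, bytes(16), 1)
sb = parse_superblock(bytes(raw))
print(sb["s_blocksize"], sb["s_maxlen"], hex(sb["s_feature_incompat"]))
# 4096 8192 0x12
```
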
282 .. _jbd2_compat:                                  
283                                                   
284 The journal compat features are any combinatio    
285                                                   
286 .. list-table::                                   
287    :widths: 16 64                                 
288    :header-rows: 1                                
289                                                   
290    * - Value                                      
291      - Description                                
292    * - 0x1                                        
293      - Journal maintains checksums on the data    
294        (JBD2_FEATURE_COMPAT_CHECKSUM)             
295                                                   
296 .. _jbd2_incompat:                                
297                                                   
298 The journal incompat features are any combinat    
299                                                   
300 .. list-table::                                   
301    :widths: 16 64                                 
302    :header-rows: 1                                
303                                                   
304    * - Value                                      
305      - Description                                
306    * - 0x1                                        
307      - Journal has block revocation records. (    
308    * - 0x2                                        
309      - Journal can deal with 64-bit block numb    
310        (JBD2_FEATURE_INCOMPAT_64BIT)              
311    * - 0x4                                        
312      - Journal commits asynchronously. (JBD2_F    
313    * - 0x8                                        
314      - This journal uses v2 of the checksum on    
315        metadata block gets its own checksum, a    
316        descriptor table contain checksums for     
317        journal. (JBD2_FEATURE_INCOMPAT_CSUM_V2    
318    * - 0x10                                       
319      - This journal uses v3 of the checksum on    
320        v2, but the journal block tag size is f    
321        block numbers. (JBD2_FEATURE_INCOMPAT_C    
322    * - 0x20                                       
323      - Journal has fast commit blocks. (JBD2_F    
324                                                   
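A small Python sketch of how the incompat bitmask above might be decoded. The lowercase labels are shorthand chosen for this example rather than the kernel's constant names, and ``decode_incompat`` is a hypothetical helper.

```python
# Bit values from the incompat feature table; labels are shorthand only.
INCOMPAT_FLAGS = {
    0x1: "revoke",
    0x2: "64bit",
    0x4: "async_commit",
    0x8: "csum_v2",
    0x10: "csum_v3",
    0x20: "fast_commit",
}

def decode_incompat(value: int):
    """Return (list of known feature labels, mask of unrecognised bits)."""
    names = [name for bit, name in INCOMPAT_FLAGS.items() if value & bit]
    unknown = value & ~sum(INCOMPAT_FLAGS)  # the keys sum to 0x3F
    return names, unknown

print(decode_incompat(0x12))  # (['64bit', 'csum_v3'], 0)
print(decode_incompat(0x41))  # (['revoke'], 64)
```

Returning the unrecognised bits separately lets a caller refuse to touch a journal whose incompat set it does not fully understand, which is the usual meaning of an "incompatible" feature flag.
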
325 .. _jbd2_checksum_type:                           
326                                                   
327 Journal checksum type codes are one of the fol    
328 most likely choices.                              
329                                                   
330 .. list-table::                                   
331    :widths: 16 64                                 
332    :header-rows: 1                                
333                                                   
334    * - Value                                      
335      - Description                                
336    * - 1                                          
337      - CRC32                                      
338    * - 2                                          
339      - MD5                                        
340    * - 3                                          
341      - SHA1                                       
342    * - 4                                          
343      - CRC32C                                     
344                                                   
345 Descriptor Block                                  
346 ~~~~~~~~~~~~~~~~                                  
347                                                   
348 The descriptor block contains an array of jour    
349 describe the final locations of the data block    
350 journal. Descriptor blocks are open-coded inst    
351 described by a data structure, but here is the    
352 Descriptor blocks consume at least 36 bytes, b    
353                                                   
354 .. list-table::                                   
355    :widths: 8 8 24 40                             
356    :header-rows: 1                                
357                                                   
358    * - Offset                                     
359      - Type                                       
360      - Name                                       
 361      - Description
362    * - 0x0                                        
363      - journal_header_t                           
364      - (open coded)                               
365      - Common block header.                       
366    * - 0xC                                        
367      - struct journal_block_tag_s                 
368      - open coded array[]                         
369      - Enough tags either to fill up the block    
370        blocks that follow this descriptor bloc    
371                                                   
372 Journal block tags have any of the following f    
373 journal feature and block tag flags are set.      
374                                                   
375 If JBD2_FEATURE_INCOMPAT_CSUM_V3 is set, the j    
376 defined as ``struct journal_block_tag3_s``, wh    
377 following. The size is 16 or 32 bytes.            
378                                                   
379 .. list-table::                                   
380    :widths: 8 8 24 40                             
381    :header-rows: 1                                
382                                                   
383    * - Offset                                     
384      - Type                                       
385      - Name                                       
 386      - Description
387    * - 0x0                                        
388      - __be32                                     
389      - t_blocknr                                  
390      - Lower 32-bits of the location of where     
391        should end up on disk.                     
392    * - 0x4                                        
393      - __be32                                     
394      - t_flags                                    
395      - Flags that go with the descriptor. See     
396        more info.                                 
397    * - 0x8                                        
398      - __be32                                     
399      - t_blocknr_high                             
400      - Upper 32-bits of the location of where     
401        should end up on disk. This is zero if     
402        not enabled.                               
403    * - 0xC                                        
404      - __be32                                     
405      - t_checksum                                 
406      - Checksum of the journal UUID, the seque    
407    * -                                            
408      -                                            
409      -                                            
410      - This field appears to be open coded. It    
411        tag, after t_checksum. This field is no    
412        is set.                                    
 413    * - 0x10
414      - char                                       
415      - uuid[16]                                   
416      - A UUID to go with this tag. This field     
417        ``j_uuid`` field in ``struct journal_s`    
418        field.                                     
419                                                   
420 .. _jbd2_tag_flags:                               
421                                                   
422 The journal tag flags are any combination of t    
423                                                   
424 .. list-table::                                   
425    :widths: 16 64                                 
426    :header-rows: 1                                
427                                                   
428    * - Value                                      
429      - Description                                
430    * - 0x1                                        
431      - On-disk block is escaped. The first fou    
432        happened to match the jbd2 magic number    
433    * - 0x2                                        
434      - This block has the same UUID as previou    
435        omitted.                                   
436    * - 0x4                                        
437      - The data block was deleted by the trans    
438    * - 0x8                                        
439      - This is the last tag in this descriptor    
440                                                   
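The tag flags drive how a descriptor block is walked: the "same UUID" flag decides whether a 16-byte UUID follows the tag, and the "last tag" flag ends the walk. The following Python sketch does this for the CSUM_V3 tag layout shown earlier. It is a simplified reader with hypothetical names (``walk_tags_v3`` and the ``FLAG_*`` constants); it assumes a 4-byte block tail at the end of the block and does not verify any checksums.

```python
import struct

# Tag flag values from the table above; the names are illustrative.
FLAG_ESCAPE, FLAG_SAME_UUID, FLAG_DELETED, FLAG_LAST_TAG = 0x1, 0x2, 0x4, 0x8

# t_blocknr, t_flags, t_blocknr_high, t_checksum
TAG3 = struct.Struct(">IIII")

def walk_tags_v3(block: bytes):
    """Return [(final block number, flags), ...] for one descriptor block."""
    tags = []
    off = 12                 # tags start right after the common header
    end = len(block) - 4     # assumed 4-byte checksum tail at the end
    while off + TAG3.size <= end:
        lo, flags, hi, _csum = TAG3.unpack_from(block, off)
        off += TAG3.size
        if not flags & FLAG_SAME_UUID:
            off += 16        # skip the UUID that follows this tag
        tags.append(((hi << 32) | lo, flags))
        if flags & FLAG_LAST_TAG:
            break
    return tags

# Synthetic descriptor block: header, a tag with UUID, then a final tag.
header = struct.pack(">III", 0xC03B3998, 1, 7)
tag1 = TAG3.pack(100, 0, 0, 0) + bytes(16)
tag2 = TAG3.pack(200, FLAG_SAME_UUID | FLAG_LAST_TAG, 1, 0)
block = (header + tag1 + tag2).ljust(1024, b"\0")
print(walk_tags_v3(block))  # [(100, 0), (4294967496, 10)]
```
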
441 If JBD2_FEATURE_INCOMPAT_CSUM_V3 is NOT set, t    
442 is defined as ``struct journal_block_tag_s``,     
443 following. The size is 8, 12, 24, or 28 bytes:    
444                                                   
445 .. list-table::                                   
446    :widths: 8 8 24 40                             
447    :header-rows: 1                                
448                                                   
449    * - Offset                                     
450      - Type                                       
451      - Name                                       
 452      - Description
453    * - 0x0                                        
454      - __be32                                     
455      - t_blocknr                                  
456      - Lower 32-bits of the location of where     
457        should end up on disk.                     
458    * - 0x4                                        
459      - __be16                                     
460      - t_checksum                                 
461      - Checksum of the journal UUID, the seque    
462        Note that only the lower 16 bits are st    
463    * - 0x6                                        
464      - __be16                                     
465      - t_flags                                    
466      - Flags that go with the descriptor. See     
467        more info.                                 
468    * -                                            
469      -                                            
470      -                                            
471      - This next field is only present if the     
472        64-bit block numbers.                      
473    * - 0x8                                        
474      - __be32                                     
475      - t_blocknr_high                             
476      - Upper 32-bits of the location of where     
477        should end up on disk.                     
478    * -                                            
479      -                                            
480      -                                            
481      - This field appears to be open coded. It    
482        tag, after t_flags or t_blocknr_high. T    
483        "same UUID" flag is set.                   
484    * - 0x8 or 0xC                                 
485      - char                                       
486      - uuid[16]                                   
487      - A UUID to go with this tag. This field     
488        ``j_uuid`` field in ``struct journal_s`    
489        field.                                     
490                                                   
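The tag sizes quoted in this section (16 or 32 bytes with CSUM_V3; 8, 12, 24, or 28 bytes without) follow from three inputs: the checksum version, 64-bit support, and the "same UUID" tag flag. The following Python sketch reproduces only the sizes stated in this document. The function name and constants are illustrative, and the jbd2 code remains the authority for any case not listed here.

```python
INCOMPAT_64BIT = 0x2      # from the incompat feature table
INCOMPAT_CSUM_V3 = 0x10   # from the incompat feature table
TAG_FLAG_SAME_UUID = 0x2  # from the tag flags table

def tag_size(incompat: int, tag_flags: int) -> int:
    """Bytes consumed by one tag, including its UUID when present."""
    if incompat & INCOMPAT_CSUM_V3:
        size = 16   # fixed-size tag regardless of block number width
    elif incompat & INCOMPAT_64BIT:
        size = 12   # t_blocknr_high is present
    else:
        size = 8
    if not tag_flags & TAG_FLAG_SAME_UUID:
        size += 16  # a UUID follows the tag
    return size

print(tag_size(0x10, 0x2))  # 16
print(tag_size(0x10, 0x0))  # 32
print(tag_size(0x0, 0x2))   # 8
print(tag_size(0x2, 0x0))   # 28
```
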
491 If JBD2_FEATURE_INCOMPAT_CSUM_V2 or               
 492 JBD2_FEATURE_INCOMPAT_CSUM_V3 is set, the end
493 ``struct jbd2_journal_block_tail``, which look    
494                                                   
495 .. list-table::                                   
496    :widths: 8 8 24 40                             
497    :header-rows: 1                                
498                                                   
499    * - Offset                                     
500      - Type                                       
501      - Name                                       
 502      - Description
503    * - 0x0                                        
504      - __be32                                     
505      - t_checksum                                 
506      - Checksum of the journal UUID + the desc    
507        to zero.                                   
508                                                   
509 Data Block                                        
510 ~~~~~~~~~~                                        
511                                                   
512 In general, the data blocks being written to d    
513 are written verbatim into the journal file aft    
514 However, if the first four bytes of the block     
515 number then those four bytes are replaced with    
516 flag is set in the descriptor block tag.          
517                                                   
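The escaping rule can be sketched as a pair of Python helpers. This assumes the four colliding bytes are replaced with zeroes in the journal copy and restored on replay; ``escape`` and ``unescape`` are hypothetical names used only for this illustration.

```python
MAGIC_BYTES = bytes.fromhex("c03b3998")  # jbd2 magic as it appears on disk

def escape(block: bytes):
    """Return (journal copy, escaped flag) for a data block."""
    if block[:4] == MAGIC_BYTES:
        # Assumption: the colliding bytes are zeroed in the journal copy.
        return bytes(4) + block[4:], True
    return block, False

def unescape(block: bytes, escaped: bool) -> bytes:
    """Restore the original contents when replaying an escaped block."""
    return MAGIC_BYTES + block[4:] if escaped else block

data = MAGIC_BYTES + b"payload"
journal_copy, escaped = escape(data)
print(journal_copy[:4].hex(), escaped)          # 00000000 True
print(unescape(journal_copy, escaped) == data)  # True
```

Without this step, a data block that happens to begin with the magic number could be mistaken for a journal metadata block while the log is scanned.
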
518 Revocation Block                                  
519 ~~~~~~~~~~~~~~~~                                  
520                                                   
521 A revocation block is used to prevent replay o    
522 transaction. This is used to mark blocks that     
523 time but are no longer journalled. Typically t    
524 block is freed and re-allocated as a file data    
525 journal replay after the file block was writte    
526 corruption.                                       
527                                                   
528 **NOTE**: This mechanism is NOT used to expres    
 529 superseded by this other journal block”, as
530 mistakenly thought. Any block being added to a    
531 the removal of all existing revocation records    
532                                                   
533 Revocation blocks are described in                
534 ``struct jbd2_journal_revoke_header_s``, are a    
535 length, but use a full block:                     
536                                                   
537 .. list-table::                                   
538    :widths: 8 8 24 40                             
539    :header-rows: 1                                
540                                                   
541    * - Offset                                     
542      - Type                                       
543      - Name                                       
544      - Description                                
545    * - 0x0                                        
546      - journal_header_t                           
547      - r_header                                   
548      - Common block header.                       
549    * - 0xC                                        
550      - __be32                                     
551      - r_count                                    
552      - Number of bytes used in this block.        
553    * - 0x10                                       
554      - __be32 or __be64                           
555      - blocks[0]                                  
556      - Blocks to revoke.                          
557                                                   
558 After r_count is a linear array of block numbe    
559 revoked by this transaction. The size of each     
560 the superblock advertises 64-bit block number     
561 otherwise.                                        
562                                                   
563 If JBD2_FEATURE_INCOMPAT_CSUM_V2 or               
564 JBD2_FEATURE_INCOMPAT_CSUM_V3 are set, the end    
565 block is a ``struct jbd2_journal_revoke_tail``    
566                                                   
567 .. list-table::                                   
568    :widths: 8 8 24 40                             
569    :header-rows: 1                                
570                                                   
571    * - Offset                                     
572      - Type                                       
573      - Name                                       
574      - Description                                
575    * - 0x0                                        
576      - __be32                                     
577      - r_checksum                                 
578      - Checksum of the journal UUID + revocati    
579                                                   
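The revocation block layout above can be walked with a few lines of C. What follows is a minimal userspace sketch, not kernel code: the helper names (``rd_be32``, ``rd_be64``, ``revoke_block_numbers``) and the ``has_64bit`` parameter are hypothetical. It assumes that r_count counts bytes from the start of the block, so it includes the 16 bytes that precede blocks[0], and it does not verify the checksum tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value from a possibly unaligned buffer. */
static uint32_t rd_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 64-bit value from a possibly unaligned buffer. */
static uint64_t rd_be64(const unsigned char *p)
{
	return ((uint64_t)rd_be32(p) << 32) | rd_be32(p + 4);
}

/*
 * Hypothetical helper: copy the revoked block numbers of one raw
 * revocation block into out[] (at most max entries are stored).
 * has_64bit selects 8-byte records, otherwise 4-byte records are used.
 * Returns the number of records found, or -1 if r_count is implausible.
 * Assumption: r_count (offset 0xC) includes the 0x10 bytes before blocks[0].
 */
static long revoke_block_numbers(const unsigned char *blk, size_t blk_size,
				 int has_64bit, uint64_t *out, size_t max)
{
	const size_t first = 0x10;		/* offset of blocks[0] */
	const size_t rec = has_64bit ? 8 : 4;	/* size of one block number */
	size_t used, off, n = 0;

	assert(blk != NULL && out != NULL);
	if (blk_size < first)
		return -1;
	used = rd_be32(blk + 0xC);
	if (used < first || used > blk_size)
		return -1;

	for (off = first; off + rec <= used; off += rec) {
		if (n < max)
			out[n] = has_64bit ? rd_be64(blk + off)
					   : rd_be32(blk + off);
		n++;
	}
	return (long)n;
}
```

A caller would read one journal block into a buffer, check that the common header identifies it as a revocation block, and then pass the buffer together with the 64-bit feature state taken from the journal superblock.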
Commit Block
~~~~~~~~~~~~

The commit block is a sentry that indicates th
completely written to the journal. Once this c
journal, the data stored with this transaction
final locations on disk.

The commit block is described by ``struct comm
bytes long (but uses a full block):

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Type
     - Name
     - Description
   * - 0x0
     - journal_header_s
     - (open coded)
     - Common block header.
   * - 0xC
     - unsigned char
     - h_chksum_type
     - The type of checksum to use to verify t
       in the transaction. See jbd2_checksum_t
   * - 0xD
     - unsigned char
     - h_chksum_size
     - The number of bytes used by the checksu
   * - 0xE
     - unsigned char
     - h_padding[2]
     -
   * - 0x10
     - __be32
     - h_chksum[JBD2_CHECKSUM_BYTES]
     - 32 bytes of space to store checksums. I
       JBD2_FEATURE_INCOMPAT_CSUM_V2 or JBD2_F
       is set, the first ``__be32`` is the ch
       the entire commit block, with this fiel
       JBD2_FEATURE_COMPAT_CHECKSUM is set, th
       crc32 of all the blocks already written
   * - 0x30
     - __be64
     - h_commit_sec
     - The time that the transaction was commi
   * - 0x38
     - __be32
     - h_commit_nsec
     - Nanoseconds component of the above time

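The offsets in the commit block table can be cross-checked with a userspace mirror of the structure. This is only a sketch: ``struct commit_header_sketch``, ``SKETCH_CHECKSUM_WORDS``, ``rd_be`` and ``commit_time`` are hypothetical names, the three leading fields stand in for the open-coded common header and are named here on the assumption that they match the common block header described elsewhere in this document, and fixed-width integers replace the kernel's ``__be32``/``__be64`` types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32 bytes of checksum space, stored as 32-bit words. */
#define SKETCH_CHECKSUM_WORDS (32 / sizeof(uint32_t))

/* Userspace mirror of the on-disk commit block; all fields are big-endian. */
struct commit_header_sketch {
	uint32_t h_magic;	/* common block header, open coded (assumed names) */
	uint32_t h_blocktype;
	uint32_t h_sequence;
	unsigned char h_chksum_type;
	unsigned char h_chksum_size;
	unsigned char h_padding[2];
	uint32_t h_chksum[SKETCH_CHECKSUM_WORDS];
	uint64_t h_commit_sec;
	uint32_t h_commit_nsec;
};

/* Compile-time checks that the mirror matches the documented offsets. */
static_assert(offsetof(struct commit_header_sketch, h_chksum_type) == 0x0C, "h_chksum_type");
static_assert(offsetof(struct commit_header_sketch, h_chksum_size) == 0x0D, "h_chksum_size");
static_assert(offsetof(struct commit_header_sketch, h_padding) == 0x0E, "h_padding");
static_assert(offsetof(struct commit_header_sketch, h_chksum) == 0x10, "h_chksum");
static_assert(offsetof(struct commit_header_sketch, h_commit_sec) == 0x30, "h_commit_sec");
static_assert(offsetof(struct commit_header_sketch, h_commit_nsec) == 0x38, "h_commit_nsec");

/* Read an n-byte big-endian integer (n <= 8) from a raw buffer. */
static uint64_t rd_be(const unsigned char *p, size_t n)
{
	uint64_t v = 0;

	for (size_t i = 0; i < n; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Extract the commit timestamp from a raw commit block. */
static void commit_time(const unsigned char *blk, uint64_t *sec, uint32_t *nsec)
{
	*sec = rd_be(blk + offsetof(struct commit_header_sketch, h_commit_sec), 8);
	*nsec = (uint32_t)rd_be(blk + offsetof(struct commit_header_sketch, h_commit_nsec), 4);
}
```

Reading the fields byte by byte, rather than casting the buffer to the structure, keeps the sketch independent of host endianness and alignment.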
Fast commits
~~~~~~~~~~~~

The fast commit area is organized as a log of tag
a ``struct ext4_fc_tl`` in the beginning which
of the entire field. It is followed by variabl
Here is the list of supported tags and their m

.. list-table::
   :widths: 8 20 20 32
   :header-rows: 1

   * - Tag
     - Meaning
     - Value struct
     - Description
   * - EXT4_FC_TAG_HEAD
     - Fast commit area header
     - ``struct ext4_fc_head``
     - Stores the TID of the transaction after
       be applied.
   * - EXT4_FC_TAG_ADD_RANGE
     - Add extent to inode
     - ``struct ext4_fc_add_range``
     - Stores the inode number and extent to b
   * - EXT4_FC_TAG_DEL_RANGE
     - Remove logical offsets from inode
     - ``struct ext4_fc_del_range``
     - Stores the inode number and the logical
       removed.
   * - EXT4_FC_TAG_CREAT
     - Create directory entry for a newly crea
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode n
       newly created file.
   * - EXT4_FC_TAG_LINK
     - Link a directory entry to an inode
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode n
   * - EXT4_FC_TAG_UNLINK
     - Unlink a directory entry of an inode
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode n
   * - EXT4_FC_TAG_PAD
     - Padding (unused area)
     - None
     - Unused bytes in the fast commit area.
   * - EXT4_FC_TAG_TAIL
     - Mark the end of a fast commit
     - ``struct ext4_fc_tail``
     - Stores the TID of the commit, CRC of th
       represents the end of.

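A log of tag-length-value records like the one described above is typically consumed with a simple walker. The sketch below is illustrative only and rests on assumptions that this section does not spell out: that the ``struct ext4_fc_tl`` header is a 16-bit little-endian tag followed by a 16-bit little-endian length, and that the length counts only the value bytes that follow the header. The names ``rd_le16`` and ``fc_count_tags`` are hypothetical, and the numeric value of the tail tag is supplied by the caller rather than hard-coded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from a possibly unaligned buffer. */
static uint16_t rd_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical walker over a fast commit area held in buf[0..len).
 * Assumed record layout: 2-byte tag, 2-byte length, then length value bytes.
 * Stops after the first record whose tag equals tail_tag.
 * Returns the number of records visited, or -1 if a value overruns buf.
 */
static long fc_count_tags(const unsigned char *buf, size_t len,
			  uint16_t tail_tag)
{
	size_t off = 0;
	long n = 0;

	assert(buf != NULL);
	while (off + 4 <= len) {
		uint16_t tag = rd_le16(buf + off);
		uint16_t vlen = rd_le16(buf + off + 2);

		off += 4;			/* skip the tag/length header */
		if (vlen > len - off)
			return -1;		/* value does not fit in buf */
		off += vlen;			/* skip the tag-specific value */
		n++;
		if (tag == tail_tag)
			break;			/* end of this fast commit */
	}
	return n;
}
```

A real parser would dispatch on the tag to decode the value struct listed in the table, and would verify the CRC stored with the tail tag before trusting any of the records.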
Fast Commit Replay Idempotence
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fast commit tags are idempotent in nature pro
certain rules. The guiding principle that the
committing is that it stores the result of a p
storing the procedure.

Let's consider this rename operation: 'mv /a /
was associated with inode 10. During fast comm
operation as a procedure "rename a to b", we s
state as a "series" of outcomes:

- Link dirent b to inode 10
- Unlink dirent a
- Inode 10 with valid refcount

Now when recovery code runs, it needs to "enforce
system. This is what guarantees idempotence of

Let's take an example of a procedure that is n
commits make it idempotent. Consider the following

1) rm A
2) mv B A
3) read A

If we store this sequence of operations as is
Let's say while in replay, we crash after (2).
file A (which was actually created as a result
deleted. Thus, the file named A would be absent wh
sequence of operations is not idempotent. Howe
of storing the procedure fast commits store th
the fast commit log for the above procedure would

(Let's assume dirent A was linked to inode 10
inode 11 before the replay)

1) Unlink A
2) Link A to inode 11
3) Unlink B
4) Inode 11

If we crash after (3) we will have file A link
replay, we will remove file A (inode 11). But
it point to inode 11. We won't find B, so we'l
point, the refcount for inode 11 is not reliab
replay of the last inode 11 tag. Thus, by converti
into a series of idempotent outcomes, fast com
the replay.

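The second example can be simulated with a toy directory to see why replaying outcomes is safe to repeat. This is a sketch under simplifying assumptions, not the recovery code: the directory is a small in-memory table, all names (``dir_sketch``, ``replay_link``, ``replay_unlink``, ``replay_log``) are hypothetical, and step (4), which restores the inode and its refcount, is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DIRENTS 8

/* Toy directory: each used slot maps a short name to an inode number. */
struct dirent_sketch { char name[8]; unsigned ino; int used; };
struct dir_sketch { struct dirent_sketch e[MAX_DIRENTS]; };

static struct dirent_sketch *lookup(struct dir_sketch *d, const char *name)
{
	for (int i = 0; i < MAX_DIRENTS; i++)
		if (d->e[i].used && strcmp(d->e[i].name, name) == 0)
			return &d->e[i];
	return NULL;
}

/* Outcome "Link name to ino": afterwards name points at ino, whatever it was before. */
static void replay_link(struct dir_sketch *d, const char *name, unsigned ino)
{
	struct dirent_sketch *e = lookup(d, name);

	for (int i = 0; i < MAX_DIRENTS && e == NULL; i++)
		if (!d->e[i].used)
			e = &d->e[i];	/* take the first free slot */
	assert(e != NULL);
	strncpy(e->name, name, sizeof(e->name) - 1);
	e->name[sizeof(e->name) - 1] = '\0';
	e->ino = ino;
	e->used = 1;
}

/* Outcome "Unlink name": afterwards name is absent; a missing name is not an error. */
static void replay_unlink(struct dir_sketch *d, const char *name)
{
	struct dirent_sketch *e = lookup(d, name);

	if (e != NULL)
		e->used = 0;
}

/* Apply the first steps entries of the log: Unlink A, Link A to 11, Unlink B. */
static void replay_log(struct dir_sketch *d, int steps)
{
	if (steps > 0)
		replay_unlink(d, "A");
	if (steps > 1)
		replay_link(d, "A", 11);
	if (steps > 2)
		replay_unlink(d, "B");
}
```

Starting from A linked to inode 10 and B linked to inode 11, calling ``replay_log(&d, 3)`` once, or calling it again after an interrupted replay, leaves A linked to inode 11 and B absent in both cases, because each step describes an end state rather than an action relative to the current state.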
Journal Checkpoint
~~~~~~~~~~~~~~~~~~

Checkpointing the journal ensures all transact
are submitted to the disk. In-progress transac
in the checkpoint. Checkpointing is used inter
the filesystem including journal recovery, fil
the journal_t structure.

A journal checkpoint can be triggered from use
EXT4_IOC_CHECKPOINT. This ioctl takes a single
Currently, three flags are supported. First, E
can be used to verify input to the ioctl. It r
invalid input, otherwise it returns success wi
any checkpointing. This can be used to check w
system and to verify there are no issues with
other two flags are EXT4_IOC_CHECKPOINT_FLAG_D
EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT. These flags
discarded or zero-filled, respectively, after
complete. EXT4_IOC_CHECKPOINT_FLAG_DISCARD and
cannot both be set. The ioctl may be useful wh
complying with content deletion SLOs.
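
The flag rules above can be checked in userspace before the ioctl is issued. The following is a hedged sketch rather than a reference implementation: the ``SKETCH_FLAG_*`` constants are local stand-ins whose numeric values are assumptions (real code should take the ``EXT4_IOC_CHECKPOINT_FLAG_*`` values and the ``EXT4_IOC_CHECKPOINT`` request code from the kernel's headers), the flags word is assumed to be 32 bits wide and passed by pointer, and ``checkpoint_flags_ok`` and ``checkpoint_journal`` are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Local stand-ins for the three flags; the values are assumptions. */
#define SKETCH_FLAG_DISCARD 0x1u	/* discard journal blocks afterwards */
#define SKETCH_FLAG_ZEROOUT 0x2u	/* zero-fill journal blocks afterwards */
#define SKETCH_FLAG_DRY_RUN 0x4u	/* validate input only, no checkpoint */
#define SKETCH_FLAG_VALID \
	(SKETCH_FLAG_DISCARD | SKETCH_FLAG_ZEROOUT | SKETCH_FLAG_DRY_RUN)

static_assert((SKETCH_FLAG_DISCARD & SKETCH_FLAG_ZEROOUT) == 0,
	      "discard and zeroout must be distinct bits");

/* Return 1 if flags obey the documented rules, 0 otherwise. */
static int checkpoint_flags_ok(uint32_t flags)
{
	if (flags & ~SKETCH_FLAG_VALID)
		return 0;	/* unknown flag bit */
	if ((flags & SKETCH_FLAG_DISCARD) && (flags & SKETCH_FLAG_ZEROOUT))
		return 0;	/* discard and zeroout cannot both be set */
	return 1;
}

/*
 * Hypothetical wrapper: fd is an open descriptor on the ext4 filesystem and
 * request is the EXT4_IOC_CHECKPOINT code obtained from the kernel headers.
 * Returns the ioctl result, or -1 with errno set to EINVAL if the flags
 * fail the local check.
 */
static int checkpoint_journal(int fd, unsigned long request, uint32_t flags)
{
	if (!checkpoint_flags_ok(flags)) {
		errno = EINVAL;
		return -1;
	}
	return ioctl(fd, request, &flags);
}
```

Issuing the call first with only the dry-run flag set is a cheap way to find out whether the running kernel and filesystem accept the ioctl before requesting a discard or zero-fill.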
                                                      
