TOMOYO Linux Cross Reference
Linux/Documentation/admin-guide/ext4.rst

Differences between /Documentation/admin-guide/ext4.rst (Version linux-6.12-rc7) and /Documentation/admin-guide/ext4.rst (Version linux-4.10.17)


  1 .. SPDX-License-Identifier: GPL-2.0               
  2                                                   
  3 ========================                          
  4 ext4 General Information                          
  5 ========================                          
  6                                                   
  7 Ext4 is an advanced level of the ext3 filesyst    
  8 scalability and reliability enhancements for s    
  9 (64 bit) in keeping with increasing disk capac    
 10 feature requirements.                             
 11                                                   
 12 Mailing list:   linux-ext4@vger.kernel.org        
 13 Web site:       http://ext4.wiki.kernel.org       
 14                                                   
 15                                                   
 16 Quick usage instructions                          
 17 ========================                          
 18                                                   
 19 Note: More extensive information for getting s    
 20 found at the ext4 wiki site at the URL:           
 21 http://ext4.wiki.kernel.org/index.php/Ext4_How    
 22                                                   
 23   - The latest version of e2fsprogs can be fou    
 24                                                   
 25     https://www.kernel.org/pub/linux/kernel/pe    
 26                                                   
 27         or                                        
 28                                                   
 29     http://sourceforge.net/project/showfiles.p    
 30                                                   
 31         or grab the latest git repository from    
 32                                                   
 33    https://git.kernel.org/pub/scm/fs/ext2/e2fs    
 34                                                   
 35   - Create a new filesystem using the ext4 fil    
 36                                                   
 37         # mke2fs -t ext4 /dev/hda1                
 38                                                   
 39     Or to configure an existing ext3 filesyste    
 40                                                   
 41         # tune2fs -O extents /dev/hda1            
 42                                                   
 43     If the filesystem was created with 128 byt    
 44     converted to use 256 byte for greater effi    
 45                                                   
 46         # tune2fs -I 256 /dev/hda1                
 47                                                   
 48   - Mounting:                                     
 49                                                   
 50         # mount -t ext4 /dev/hda1 /wherever       
 51                                                   
 52   - When comparing performance with other file    
 53     important to try multiple workloads; very     
 54     workload parameter can completely change t    
 55     filesystems do well compared to others.  W    
 56     note that ext4 enables write barriers by d    
 57     not enable write barriers by default.  So     
 58     explicitly specify whether barriers are en    
  59     '-o barrier=[0|1]' mount option for both
 60     for a fair comparison.  When tuning ext3 f    
 61     it is often worthwhile to try changing the    
 62     data=writeback' can be faster for some wor    
 63     running mounted with data=writeback can po    
 64     exposed in recently written files in case     
 65     which could be a security exposure in some    
 66     the filesystem with a large journal can al    
 67     metadata-intensive workloads.                 
 68                                                   
 69 Features                                          
 70 ========                                          
 71                                                   
 72 Currently Available                               
 73 -------------------                               
 74                                                   
 75 * ability to use filesystems > 16TB (e2fsprogs    
 76 * extent format reduces metadata overhead (RAM    
 77 * extent format more robust in face of on-disk    
 78 * internal redundancy in tree                     
 79 * improved file allocation (multi-block alloc)    
 80 * lift 32000 subdirectory limit imposed by i_l    
 81 * nsec timestamps for mtime, atime, ctime, cre    
 82 * inode version field on disk (NFSv4, Lustre)     
 83 * reduced e2fsck time via uninit_bg feature       
 84 * journal checksumming for robustness, perform    
  85 * persistent file preallocation (e.g. for strea
 86 * ability to pack bitmaps and inode tables int    
 87   flex_bg feature                                 
 88 * large file support                              
 89 * inode allocation using large virtual block g    
 90 * delayed allocation                              
 91 * large block (up to pagesize) support            
 92 * efficient new ordered mode in JBD2 and ext4     
 93   the ordering)                                   
 94 * Case-insensitive file name lookups              
 95 * file-based encryption support (fscrypt)         
 96 * file-based verity support (fsverity)            
 97                                                   
 98 [1] Filesystems with a block size of 1k may se    
 99 directory hash tree having a maximum depth of     
100                                                   
 101 Case-insensitive file name lookups
102 ==============================================    
103                                                   
104 The case-insensitive file name lookup feature     
105 per-directory basis, allowing the user to mix     
106 case-sensitive directories in the same filesys    
107 flipping the +F inode attribute of an empty di    
108 case-insensitive string match operation is onl    
 109 text is encoded in a byte sequence.  For that
110 case-insensitive directories, the filesystem m    
111 casefold feature, which stores the filesystem-    
112 model used.  By default, the charset adopted i    
113 Unicode (12.1.0, by the time of this writing),    
114 form.  The comparison algorithm is implemented    
115 strings to the Canonical decomposition form, a    
116 followed by a byte per byte comparison.           
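
The comparison described above (canonical decomposition, case folding, then a byte-per-byte comparison) can be sketched in userspace as follows.  This is only an approximation under stated assumptions: it relies on Python's ``unicodedata`` tables, whose Unicode version may differ from the one stored in the superblock, and the helper names ``fold_name`` and ``names_match`` are hypothetical, not kernel interfaces.

```python
import unicodedata


def fold_name(name: str) -> str:
    """Return an approximate comparison key: NFD plus case folding."""
    decomposed = unicodedata.normalize("NFD", name)
    # Case folding may reintroduce composed characters, so decompose again.
    return unicodedata.normalize("NFD", decomposed.casefold())


def names_match(a: str, b: str) -> bool:
    """Byte-per-byte comparison of the folded, decomposed names."""
    return fold_name(a).encode("utf-8") == fold_name(b).encode("utf-8")
```

With this sketch, a precomposed "é" and an "E" followed by a combining acute accent compare equal, while "cafe" and "café" do not.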
117                                                   
118 The case-awareness is name-preserving on the d    
119 name provided by userspace is a byte-per-byte     
 120 written to the disk.  The Unicode normalizatio
121 kernel is thus an internal representation, and    
122 userspace nor to the disk, with the important     
123 used on large case-insensitive directories wit    
124 directories, the hash must be calculated using    
125 the filename, meaning that the normalization f    
126 impact on where the directory entry is stored.    
127                                                   
128 When we change from viewing filenames as opaqu    
129 them as encoded strings we need to address wha    
130 tries to create a file with an invalid name.      
131 within the kernel leaves the decision of what     
 132 filesystem, which selects its preferred behavio
133 the strict mode.  When Ext4 encounters one of     
134 filesystem did not require strict mode, it fal    
135 entire string as an opaque byte sequence, whic    
136 operate on that file, but the case-insensitive    
137                                                   
138 Options                                           
139 =======                                           
140                                                   
141 When mounting an ext4 filesystem, the followin    
142 (*) == default                                    
143                                                   
144   ro                                              
145         Mount filesystem read only. Note that     
146         thus write to the partition) even when    
147         options "ro,noload" can be used to pre    
148                                                   
149   journal_checksum                                
150         Enable checksumming of the journal tra    
151         recovery code in e2fsck and the kernel    
152         kernel.  It is a compatible change and    
153         kernels.                                  
154                                                   
155   journal_async_commit                            
156         Commit block can be written to disk wi    
157         blocks. If enabled older kernels canno    
158         enable 'journal_checksum' internally.     
159                                                   
160   journal_path=path, journal_dev=devnum           
161         When the external journal device's maj    
162         these options allow the user to specif    
163         journal device is identified through e    
164         encoded in devnum, or via a path to th    
165                                                   
166   norecovery, noload                              
167         Don't load the journal on mounting.  N    
168         not unmounted cleanly, skipping the jo    
169         filesystem containing inconsistencies     
170         problems.                                 
171                                                   
172   data=journal                                    
173         All data are committed into the journa    
174         main file system.  Enabling this mode     
175         and O_DIRECT support.                     
176                                                   
177   data=ordered  (*)                               
178         All data are forced directly out to th    
179         metadata being committed to the journa    
180                                                   
181   data=writeback                                  
182         Data ordering is not preserved, data m    
183         system after its metadata has been com    
184                                                   
185   commit=nrsec  (*)                               
186         This setting limits the maximum age of    
187         'nrsec' seconds.  The default value is    
188         you lose your power, you will lose as     
189         metadata changes (your filesystem will    
190         to the journaling). This default value    
191         performance, but it's good for data-sa    
192         the same effect as leaving it at the d    
193         to very large values will improve perf    
194         delayed allocation even older data can    
195         writeback of those data begins only af    
196         /proc/sys/vm/dirty_expire_centisecs.      
197                                                   
198   barrier=<0|1(*)>, barrier(*), nobarrier         
199         This enables/disables the use of write    
200         barrier=0 disables, barrier=1 enables.    
201         which can support barriers, and if jbd    
202         write, it will disable again with a wa    
203         proper on-disk ordering of journal com    
204         caches safe to use, at some performanc    
205         battery-backed in one way or another,     
206         improve performance.  The mount option    
207         also be used to enable or disable barr    
208         ext4 mount options.                       
209                                                   
210   inode_readahead_blks=n                          
211         This tuning parameter controls the max    
212         that ext4's inode table readahead algo    
213         buffer cache.  The default value is 32    
214                                                   
215   bsddf (*)                                       
216         Make 'df' act like BSD.                   
217                                                   
218   minixdf                                         
219         Make 'df' act like Minix.                 
220                                                   
221   debug                                           
222         Extra debugging information is sent to    
223                                                   
224   abort                                           
225         Simulate the effects of calling ext4_a    
226         This is normally used while remounting    
227         mounted.                                  
228                                                   
229   errors=remount-ro                               
230         Remount the filesystem read-only on an    
231                                                   
232   errors=continue                                 
233         Keep going on a filesystem error.         
234                                                   
235   errors=panic                                    
236         Panic and halt the machine if an error    
237         override the errors behavior specified    
238         configured using tune2fs)                 
239                                                   
240   data_err=ignore(*)                              
241         Just print an error message if an erro    
242         ordered mode.                             
243   data_err=abort                                  
244         Abort the journal if an error occurs i    
245         mode.                                     
246                                                   
247   grpid | bsdgroups                               
248         New objects have the group ID of their    
249                                                   
250   nogrpid (*) | sysvgroups                        
251         New objects have the group ID of their    
252                                                   
253   resgid=n                                        
254         The group ID which may use the reserve    
255                                                   
256   resuid=n                                        
257         The user ID which may use the reserved    
258                                                   
259   sb=                                             
260         Use alternate superblock at this locat    
261                                                   
262   quota, noquota, grpquota, usrquota              
263         These options are ignored by the files    
264         quota tools to recognize volumes where    
265         documentation in the quota-tools packa    
266         (http://sourceforge.net/projects/linux    
267                                                   
268   jqfmt=<quota type>, usrjquota=<file>, grpjqu    
269         These options tell filesystem details     
270         information can be properly updated du    
271         the above quota options. See documenta    
272         for more details (http://sourceforge.n    
273                                                   
274   stripe=n                                        
275         Number of filesystem blocks that mball    
276         size and alignment. For RAID5/6 system    
277         data disks *  RAID chunk size in file     
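
The stripe arithmetic above (number of data disks multiplied by the RAID chunk size expressed in filesystem blocks) can be sketched as below.  The function name ``stripe_blocks`` and the 4096-byte default block size are assumptions made for illustration; check the actual block size of the filesystem before using the result.

```python
def stripe_blocks(data_disks: int, chunk_size_bytes: int,
                  block_size_bytes: int = 4096) -> int:
    """Return a candidate value for the stripe= mount option.

    data_disks counts only disks holding data (parity disks excluded).
    """
    if data_disks <= 0:
        raise ValueError("data_disks must be positive")
    if chunk_size_bytes % block_size_bytes != 0:
        raise ValueError("chunk size must be a multiple of the block size")
    # Chunk size in filesystem blocks, times the number of data disks.
    return data_disks * (chunk_size_bytes // block_size_bytes)
```

For example, a five-disk RAID5 array has four data disks; with a 64 KiB chunk and 4 KiB blocks, ``stripe_blocks(4, 64 * 1024)`` gives 64.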
278                                                   
279   delalloc      (*)                               
280         Defer block allocation until just befo    
281         in question.  This allows ext4 to bett    
282         efficiently.                              
283                                                   
284   nodelalloc                                      
285         Disable delayed allocation.  Blocks ar    
286         copied from userspace to the page cach    
287         call or when an mmap'ed page which was    
288         written for the first time.               
289                                                   
290   max_batch_time=usec                             
291         Maximum amount of time ext4 should wai    
 292         operations to be batched together with a
293         Since a synchronous write operation is    
 294         a wait for the I/O to complete, it doesn'
295         throughput win, we wait for a small am    
296         transactions can piggyback on the sync    
297         used is designed to automatically tune    
298         measuring the amount of time (on avera    
299         committing a transaction.  Call this t    
300         time that the transaction has been run    
301         time, ext4 will try sleeping for the c    
302         operations will join the transaction.     
303         the max_batch_time, which defaults to     
304         optimization can be turned off entirel    
305                                                   
306   min_batch_time=usec                             
307         This parameter sets the commit time (a    
308         min_batch_time.  It defaults to zero m    
309         parameter may improve the throughput o    
310         workloads on very fast disks, at the c    
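
The interaction between the measured commit time, min_batch_time and max_batch_time described above can be modelled roughly as follows.  This is a simplified sketch, not the jbd2 code: the function name and parameters are hypothetical, all times are in microseconds, and treating a max_batch_time of zero as "never wait" is an assumption of this model.

```python
def batch_sleep_us(avg_commit_us: int, running_us: int,
                   min_batch_us: int, max_batch_us: int) -> int:
    """Return how long a synchronous writer would wait, in microseconds."""
    if max_batch_us == 0:
        # Assumption of this sketch: zero disables the batching wait.
        return 0
    # The commit time is raised to at least min_batch_time ...
    commit_time = max(avg_commit_us, min_batch_us)
    # ... and capped by max_batch_time.
    commit_time = min(commit_time, max_batch_us)
    # Sleep only if the transaction has been running for less than that.
    return commit_time if running_us < commit_time else 0
```

The cap is applied last, so a min_batch_time larger than max_batch_time cannot push the wait beyond max_batch_time in this model.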
311                                                   
312   journal_ioprio=prio                             
313         The I/O priority (from 0 to 7, where 0    
314         should be used for I/O operations subm    
315         commit operation.  This defaults to 3,    
316         priority than the default I/O priority    
317                                                   
318   auto_da_alloc(*), noauto_da_alloc               
319         Many broken applications don't use fsy    
320         files via patterns such as fd = open("    
321         rename("foo.new", "foo"), or worse yet    
322         O_TRUNC)/write(fd,..)/close(fd).  If a    
323         will detect the replace-via-rename and    
324         and force that any delayed allocation     
325         the next journal commit, in the defaul    
326         blocks of the new file are forced to d    
327         is committed.  This provides roughly t    
328         ext3, and avoids the "zero-length" pro    
329         system crashes before the delayed allo    
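
For contrast with the patterns above, an application that does not want to depend on auto_da_alloc can flush the new file itself before renaming it.  The following is a minimal sketch of that approach; the function name ``replace_file_durably`` and the ".new" suffix are illustrative choices, and error handling is kept to the minimum.

```python
import os


def replace_file_durably(path: str, data: bytes) -> None:
    """Replace path with data using write, fsync, rename, fsync(dir)."""
    tmp = path + ".new"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Force the data blocks to disk before the rename becomes visible.
        os.fsync(fd)
    finally:
        os.close(fd)
    os.rename(tmp, path)
    # Sync the directory so that the rename itself is persistent.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

After a crash, a reader of ``path`` should then see either the complete old contents or the complete new contents, rather than a zero-length file.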
330                                                   
331   noinit_itable                                   
332         Do not initialize any uninitialized in    
333         background.  This feature may be used     
334         install process can complete as quickl    
335         initialization process would then be d    
336         file system is unmounted.                 
337                                                   
338   init_itable=n                                   
339         The lazy itable init code will wait n     
340         it took to zero out the previous block    
341         minimizes the impact on the system per    
342         inode table is being initialized.         
343                                                   
344   discard, nodiscard(*)                           
345         Controls whether ext4 should issue dis    
346         underlying block device when blocks ar    
347         devices and sparse/thinly-provisioned     
348         until sufficient testing has been done    
349                                                   
350   nouid32                                         
351         Disables 32-bit UIDs and GIDs.  This i    
352         older kernels which only store and exp    
353                                                   
354   block_validity(*), noblock_validity             
355         These options enable or disable the in    
356         filesystem metadata blocks within inte    
 357         allows the multi-block allocator and othe
358         corrupted allocation bitmaps which cau    
359         overlap with filesystem metadata block    
360                                                   
361   dioread_lock, dioread_nolock                    
362         Controls whether or not ext4 should us    
363         dioread_nolock option is specified ext    
364         extent before buffer write and convert    
365         IO completes. This approach allows ext    
366         mutex, which improves scalability on h    
367         does not work with data journaling and    
368         ignored with kernel warning. Note that    
369         used for extent-based files.  Because     
370         comprises it is off by default (e.g. d    
371                                                   
372   max_dir_size_kb=n                               
373         This limits the size of directories so    
374         beyond the specified limit in kilobyte    
375         This is useful in memory constrained e    
376         directory can cause severe performance    
377         Of Memory killer.  (For example, if th    
378         available, a 176mb directory may serio    
379                                                   
380   i_version                                       
381         Enable 64-bit inode version support. T    
382                                                   
383   dax                                             
384         Use direct access (no page cache).  Se    
385         Documentation/filesystems/dax.rst.  No    
386         incompatible with data=journal.           
387                                                   
388   inlinecrypt                                     
389         When possible, encrypt/decrypt the con    
390         blk-crypto framework rather than files    
391         allows the use of inline encryption ha    
392         unaffected. For more details, see         
393         Documentation/block/inline-encryption.    
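
To see which of the options above are in effect on a running system, the option string reported in /proc/mounts can be split into keys and values.  The sketch below assumes the usual whitespace-separated /proc/mounts layout (device, mount point, type, options, ...); the helper names are hypothetical, and options left at their default may not be listed there at all.

```python
def parse_mount_line(line: str) -> dict:
    """Split one /proc/mounts line into its fields and an options dict."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("not a /proc/mounts line: %r" % line)
    device, mount_point, fs_type, raw_options = fields[:4]
    options = {}
    for item in raw_options.split(","):
        key, sep, value = item.partition("=")
        # Flags such as "ro" carry no value; record them as True.
        options[key] = value if sep else True
    return {"device": device, "mount_point": mount_point,
            "fs_type": fs_type, "options": options}


def ext4_mounts(path: str = "/proc/mounts") -> list:
    """Return parsed entries for every mounted ext4 filesystem."""
    with open(path) as handle:
        entries = [parse_mount_line(line) for line in handle if line.strip()]
    return [entry for entry in entries if entry["fs_type"] == "ext4"]
```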
394                                                   
395 Data Mode                                         
396 =========                                         
397 There are 3 different data modes:                 
398                                                   
399 * writeback mode                                  
400                                                   
401   In data=writeback mode, ext4 does not journa    
402   a similar level of journaling as that of XFS    
403   mode - metadata journaling.  A crash+recover    
404   appear in files which were written shortly b    
405   typically provide the best ext4 performance.    
406                                                   
407 * ordered mode                                    
408                                                   
409   In data=ordered mode, ext4 only officially j    
410   groups metadata information related to data     
411   a single unit called a transaction.  When it    
412   out to disk, the associated data blocks are     
413   mode performs slightly slower than writeback    
414   journal mode.                                   
415                                                   
416 * journal mode                                    
417                                                   
418   data=journal mode provides full data and met    
419   written to the journal first, and then to it    
420   a crash, the journal can be replayed, bringi    
421   consistent state.  This mode is the slowest     
422   from and written to disk at the same time wh    
423   modes.  Enabling this mode will disable dela    
424   support.                                        
425                                                   
426 /proc entries                                     
427 =============                                     
428                                                   
429 Information about mounted ext4 file systems ca    
430 /proc/fs/ext4.  Each mounted filesystem will h    
431 /proc/fs/ext4 based on its device name (i.e.,     
432 /proc/fs/ext4/dm-0).   The files in each per-d    
 433 in the table below.
434                                                   
435 Files in /proc/fs/ext4/<devname>                  
436                                                   
437   mb_groups                                       
438         details of multiblock allocator buddy     
439                                                   
440 /sys entries                                      
441 ============                                      
442                                                   
443 Information about mounted ext4 file systems ca    
444 /sys/fs/ext4.  Each mounted filesystem will ha    
445 /sys/fs/ext4 based on its device name (i.e., /    
446 /sys/fs/ext4/dm-0).   The files in each per-de    
 447 in the table below.
448                                                   
449 Files in /sys/fs/ext4/<devname>:                  
450                                                   
451 (see also Documentation/ABI/testing/sysfs-fs-e    
452                                                   
453   delayed_allocation_blocks                       
454         This file is read-only and shows the n    
455         the page cache, but which do not have     
456         allocated yet.                            
457                                                   
458   inode_goal                                      
459         Tuning parameter which (if non-zero) c    
460         the inode allocator in preference to a    
461         This is intended for debugging use onl    
462         systems.                                  
463                                                   
464   inode_readahead_blks                            
465         Tuning parameter which controls the ma    
466         blocks that ext4's inode table readahe    
467         the buffer cache.                         
468                                                   
469   lifetime_write_kbytes                           
470         This file is read-only and shows the n    
471         have been written to this filesystem s    
472                                                   
473   max_writeback_mb_bump                           
474         The maximum number of megabytes the wr    
 475         out before moving on to another inode.
476                                                   
477   mb_group_prealloc                               
478         The multiblock allocator will round up    
479         multiple of this tuning parameter if t    
480         ext4 superblock                           
481                                                   
482   mb_max_to_scan                                  
483         The maximum number of extents the mult    
484         find the best extent.                     
485                                                   
486   mb_min_to_scan                                  
487         The minimum number of extents the mult    
488         find the best extent.                     
489                                                   
490   mb_order2_req                                   
491         Tuning parameter which controls the mi    
492         power of 2) where the buddy cache is u    
493                                                   
494   mb_stats                                        
495         Controls whether the multiblock alloca    
496         which are shown during the unmount. 1     
497         means not to collect statistics.          
498                                                   
499   mb_stream_req                                   
500         Files which have fewer blocks than thi    
501         their blocks allocated out of a block     
502         pool, so that small files are packed c    
503         will have its blocks allocated out of     
504         pool.                                     
505                                                   
506   session_write_kbytes                            
507         This file is read-only and shows the n    
508         have been written to this filesystem s    
509                                                   
510   reserved_clusters                               
 511         This is a RW file and contains the number of
512         system which will be used in the speci    
513         zeroout, unexpected ENOSPC, or possibl    
514         4096 clusters, whichever is smaller an    
515         can never exceed number of clusters in    
516         enough space for the reserved space wh    
517         _not_ fail.                               
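
The per-device files listed above are plain text and can be read like any other file.  A minimal sketch, assuming the /sys/fs/ext4/<devname> layout described in this section; the function name ``read_ext4_tunables`` and its ``base`` parameter (useful for testing against a scratch directory) are illustrative only.

```python
import os


def read_ext4_tunables(devname: str, names, base: str = "/sys/fs/ext4") -> dict:
    """Read the named sysfs files for one device; missing files map to None."""
    values = {}
    for name in names:
        path = os.path.join(base, devname, name)
        try:
            with open(path) as handle:
                values[name] = handle.read().strip()
        except OSError:
            # The file may not exist on this kernel or may not be readable.
            values[name] = None
    return values
```

For example, ``read_ext4_tunables("dm-0", ["mb_stats", "inode_readahead_blks"])`` returns a dictionary with one entry per requested file.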
518                                                   
519 Ioctls                                            
520 ======                                            
521                                                   
522 Ext4 implements various ioctls which can be us    
523 ext4-specific functionality. An incomplete lis    
524 table below. This list includes truly ext4-spe    
525 well as ioctls that may have been ext4-specifi    
526 by some other filesystem(s) too (``FS_IOC_*``)    
527                                                   
528 Table of Ext4 ioctls                              
529                                                   
530   FS_IOC_GETFLAGS                                 
531         Get additional attributes associated w    
532         an integer bitfield, with bit values d    
533                                                   
534   FS_IOC_SETFLAGS                                 
535         Set additional attributes associated w    
536         an integer bitfield, with bit values d    
537                                                   
538   EXT4_IOC_GETVERSION, EXT4_IOC_GETVERSION_OLD    
539         Get the inode i_generation number stor    
540         i_generation number is normally change    
541         and it is particularly useful for netw    
542         version of this ioctl is an alias for     
543                                                   
544   EXT4_IOC_SETVERSION, EXT4_IOC_SETVERSION_OLD    
545         Set the inode i_generation number stor    
546         version of this ioctl is an alias for     
547                                                   
548   EXT4_IOC_GROUP_EXTEND                           
549         This ioctl has the same purpose as the    
550         to resize filesystem to the end of the    
551         further resize has to be done with res    
552         offline. The argument points to the un    
553         the filesystem new block count.           
554                                                   
555   EXT4_IOC_MOVE_EXT                               
556         Move the block extents from orig_fd (t    
557         to the donor_fd (the one specified in     
558         an argument to this ioctl). Then, exch    
559         orig_fd and donor_fd.  This is especia    
560         defragmentation, because the allocator    
561         moved blocks better, ideally into one     
562                                                   
563   EXT4_IOC_GROUP_ADD                              
564         Add a new group descriptor to an exist    
565         block. The new group descriptor is des    
566         structure, which is passed as an argum    
567         especially useful in conjunction with     
568         allows online resize of the filesystem    
569         block group.  Those two ioctls combine    
570         resize tool (e.g. resize2fs).             
571                                                   
572   EXT4_IOC_MIGRATE                                
573         This ioctl operates on the filesystem     
574         ext3 indirect block mapped inode to ex    
575         through indirect block mapping of the     
576         contiguous block ranges into ext4 exte    
577         inodes are swapped. This ioctl might h    
578         ext4 filesystem, however suggestion is    
579         and copy data from the backup. Note, t    
580         extents for this ioctl to work.           
581                                                   
582   EXT4_IOC_ALLOC_DA_BLKS                          
583         Force all of the delay allocated block    
584         application-expected ext3 behaviour. N    
585         triggering a write of the data blocks,    
586         the future as it is not necessary and     
587         sake of simplicity.                       
588                                                   
589   EXT4_IOC_RESIZE_FS                              
590         Resize the filesystem to a new size.      
591         filesystem is passed in via 64 bit int    
592         allocates bitmaps and inode table, the    
593         the new number of blocks.                 
594                                                   
595   EXT4_IOC_SWAP_BOOT                              
596         Swap i_blocks and associated attribute    
597         i_flags, ...) from the specified inode    
598         (#5). This is typically used to store     
599         the filesystem, where it can't be chan    
600         The data blocks of the previous boot l    
601         given inode.                              
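
As an illustration of calling one of the ioctls in the table above, the sketch below issues FS_IOC_GETFLAGS from userspace.  It rests on several assumptions that are not stated in this document: the request number is built with the generic _IOR encoding (some architectures encode the direction bits differently), the kernel fills in a 32-bit value even though the request is declared with a long argument, and FS_CASEFOLD_FL (0x40000000, from linux/fs.h) is taken to be the flag behind the +F attribute.  The helper names are hypothetical.

```python
import fcntl
import os
import struct

_IOC_READ = 2  # direction value in the generic ioctl encoding (assumption)


def ior(type_char: str, nr: int, size: int) -> int:
    """Build a read-direction ioctl request number (generic _IOR layout)."""
    return (_IOC_READ << 30) | (size << 16) | (ord(type_char) << 8) | nr


# Declared in linux/fs.h as _IOR('f', 1, long).
FS_IOC_GETFLAGS = ior("f", 1, struct.calcsize("l"))
FS_CASEFOLD_FL = 0x40000000  # assumed to correspond to the +F attribute


def get_inode_flags(path: str) -> int:
    """Return the inode flag bitfield for path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(struct.calcsize("l"))
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)
        # Only the first 32 bits are read back (see the assumptions above).
        return struct.unpack_from("I", buf)[0]
    finally:
        os.close(fd)


def is_casefolded(path: str) -> bool:
    """True if the casefold flag is set on the given directory."""
    return bool(get_inode_flags(path) & FS_CASEFOLD_FL)
```

Filesystems that do not implement this ioctl make ``get_inode_flags`` raise OSError, so callers should be prepared to handle that.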
602                                                   
603 References                                        
604 ==========                                        
605                                                   
606 kernel source:  <file:fs/ext4/>                   
607                 <file:fs/jbd2/>                   
608                                                   
609 programs:       http://e2fsprogs.sourceforge.n    
610                                                   
611 useful links:   https://fedoraproject.org/wiki    
612                 http://www.bullopensource.org/    
613                 http://ext4.wiki.kernel.org/in    
614                 https://fedoraproject.org/wiki    
                                                      
