TOMOYO Linux Cross Reference
Linux/Documentation/filesystems/ext4/blockgroup.rst


.. SPDX-License-Identifier: GPL-2.0

Layout
------

The layout of a standard block group is approximately as follows (each
of these fields is discussed in a separate section below):

.. list-table::
   :widths: 1 1 1 1 1 1 1 1
   :header-rows: 1

   * - Group 0 Padding
     - ext4 Super Block
     - Group Descriptors
     - Reserved GDT Blocks
     - Data Block Bitmap
     - inode Bitmap
     - inode Table
     - Data Blocks
   * - 1024 bytes
     - 1 block
     - many blocks
     - many blocks
     - 1 block
     - 1 block
     - many blocks
     - many more blocks

For the special case of block group 0, the first 1024 bytes are unused,
to allow for the installation of x86 boot sectors and other oddities.
The superblock will start at offset 1024 bytes, whichever block that
happens to be (usually 0). However, if for some reason the block size is
1024, then block 0 is marked in use and the superblock goes in block 1.
For all other block groups, there is no padding.
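
The placement rule above can be checked with a little arithmetic. The
following is only a sketch; the helper name ``superblock_location`` is
made up for illustration and is not part of any ext4 tool or API.

```python
SUPERBLOCK_OFFSET = 1024  # byte offset of the primary superblock on disk


def superblock_location(block_size):
    """Return (block number, byte offset inside that block) of the
    primary superblock for the given block size in bytes."""
    return divmod(SUPERBLOCK_OFFSET, block_size)


for block_size in (1024, 2048, 4096):
    block, offset = superblock_location(block_size)
    print(f"block size {block_size}: block {block}, offset {offset}")
```

With a 1024-byte block size the superblock lands in block 1; with any
larger block size it sits at offset 1024 inside block 0.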

The ext4 driver primarily works with the superblock and the group
descriptors that are found in block group 0. Redundant copies of the
superblock and group descriptors are written to some of the block groups
across the disk in case the beginning of the disk gets trashed, though
not all block groups necessarily host a redundant copy (see following
paragraph for more details). If the group does not have a redundant
copy, the block group begins with the data block bitmap. Note also that
when the filesystem is freshly formatted, mkfs will allocate “reserve
GDT block” space after the block group descriptors and before the start
of the block bitmaps to allow for future expansion of the filesystem. By
default, a filesystem is allowed to increase in size by a factor of
1024 over the original filesystem size.

The location of the inode table is given by ``grp.bg_inode_table_*``. It
is a contiguous range of blocks large enough to contain
``sb.s_inodes_per_group * sb.s_inode_size`` bytes.
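
As a rough illustration of the size calculation, the sketch below
computes how many blocks an inode table occupies and combines the low
and high halves of the table's start block. It assumes that
``bg_inode_table_*`` stands for a ``bg_inode_table_lo`` /
``bg_inode_table_hi`` pair of 32-bit fields; the function names and the
sample numbers are hypothetical.

```python
def inode_table_start(bg_inode_table_lo, bg_inode_table_hi=0):
    """Combine the two 32-bit halves into one 64-bit block number.
    The high half is assumed to be zero when it is not in use."""
    return (bg_inode_table_hi << 32) | bg_inode_table_lo


def inode_table_blocks(inodes_per_group, inode_size, block_size):
    """Number of blocks needed to hold one group's inode table."""
    table_bytes = inodes_per_group * inode_size
    return -(-table_bytes // block_size)  # ceiling division


# Illustrative values only: 8192 inodes of 256 bytes, 4 KiB blocks.
print(inode_table_blocks(8192, 256, 4096))
print(hex(inode_table_start(0x421, 0x1)))
```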

As for the ordering of items in a block group, it is generally
established that the superblock and the group descriptor table, if
present, will be at the beginning of the block group. The bitmaps and
the inode table can be anywhere, and it is quite possible for the
bitmaps to come after the inode table, or for both to be in different
groups (flex_bg). Leftover space is used for file data blocks, indirect
block maps, extent tree blocks, and extended attributes.

Flexible Block Groups
---------------------

Starting in ext4, there is a new feature called flexible block groups
(flex_bg). In a flex_bg, several block groups are tied together as one
logical block group; the bitmap spaces and the inode table space in the
first block group of the flex_bg are expanded to include the bitmaps
and inode tables of all other block groups in the flex_bg. For example,
if the flex_bg size is 4, then group 0 will contain (in order) the
superblock, group descriptors, data block bitmaps for groups 0-3, inode
bitmaps for groups 0-3, inode tables for groups 0-3, and the remaining
space in group 0 is for file data. The effect of this is to group the
block group metadata close together for faster loading, and to enable
large files to be contiguous on disk. Backup copies of the superblock
and group descriptors are always at the beginning of block groups, even
if flex_bg is enabled. The number of block groups that make up a
flex_bg is given by 2 ^ ``sb.s_log_groups_per_flex``.
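
The relationship between ``sb.s_log_groups_per_flex`` and group
membership can be sketched as follows. The helper names are invented
for this example, and the mapping of a block group to its flex_bg by a
simple shift is an assumption that follows from the power-of-two size
described above.

```python
def groups_per_flex(s_log_groups_per_flex):
    """Number of block groups in one flex_bg: 2 ^ s_log_groups_per_flex."""
    return 1 << s_log_groups_per_flex


def flex_group_of(group, s_log_groups_per_flex):
    """Index of the flex_bg that a block group belongs to."""
    return group >> s_log_groups_per_flex


def first_group_of_flex(flex_group, s_log_groups_per_flex):
    """First block group of a flex_bg, where the packed bitmaps and
    inode tables of its member groups are placed."""
    return flex_group << s_log_groups_per_flex


log = 2  # a flex_bg size of 4, as in the example in the text
print(groups_per_flex(log))
print(flex_group_of(5, log), first_group_of_flex(flex_group_of(5, log), log))
```

With a flex_bg size of 4, block group 5 belongs to flex_bg 1, whose
metadata is packed into block group 4.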

Meta Block Groups
-----------------

Without the option META_BG, for safety concerns, all block group
descriptor copies are kept in the first block group. Given the default
128 MiB (2^27 bytes) block group size and 64-byte group descriptors,
ext4 can have at most 2^27/64 = 2^21 block groups. This limits the
entire filesystem size to 2^21 * 2^27 = 2^48 bytes or 256 TiB.
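
The limit quoted above is plain arithmetic and can be reproduced
directly. The constant names below are chosen for this sketch only.

```python
GROUP_SIZE = 2**27  # default block group size in bytes (128 MiB)
DESC_SIZE = 64      # size of one group descriptor in bytes
TIB = 2**40

# All descriptors must fit inside the first block group.
max_groups = GROUP_SIZE // DESC_SIZE
max_fs_bytes = max_groups * GROUP_SIZE

print(max_groups == 2**21)
print(max_fs_bytes // TIB, "TiB")
```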

The solution to this problem is to use the metablock group feature
(META_BG), which is already in ext3 for all 2.6 releases. With the
META_BG feature, ext4 filesystems are partitioned into many metablock
groups. Each metablock group is a cluster of block groups whose group
descriptor structures can be stored in a single disk block. For ext4
filesystems with 4 KB block size, a single metablock group partition
includes 64 block groups, or 8 GiB of disk space. The metablock group
feature moves the location of the group descriptors from the congested
first block group of the whole filesystem into the first group of each
metablock group itself. The backups are in the second and last group of
each metablock group. This increases the 2^21 maximum block groups limit
to the hard limit 2^32, allowing support for a 512 PiB filesystem.
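
The numbers in this paragraph can be worked through as a short sketch.
The function names are hypothetical, and the defaults assume a 4 KB
block size, 64-byte descriptors and 128 MiB block groups as stated in
the text.

```python
GROUP_SIZE = 2**27  # bytes per block group with 4 KB blocks
GIB = 2**30
PIB = 2**50


def groups_per_meta_bg(block_size=4096, desc_size=64):
    """Block groups whose descriptors fit in a single disk block."""
    return block_size // desc_size


def meta_bg_descriptor_groups(meta_bg, block_size=4096, desc_size=64):
    """Block groups holding the descriptor block of a metablock group:
    the primary in the first group, backups in the second and last."""
    per = groups_per_meta_bg(block_size, desc_size)
    first = meta_bg * per
    return first, first + 1, first + per - 1


print(groups_per_meta_bg())                      # groups per metablock group
print(groups_per_meta_bg() * GROUP_SIZE // GIB)  # GiB per metablock group
print(meta_bg_descriptor_groups(2))              # groups for metablock group 2
print(2**32 * GROUP_SIZE // PIB)                 # PiB at the 2^32 group limit
```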

The change in the filesystem format replaces the current scheme where
the superblock is followed by a variable-length set of block group
descriptors. Instead, the superblock and a single block group descriptor
block are placed at the beginning of the first, second, and last block
groups in a meta-block group. A meta-block group is a collection of
block groups which can be described by a single block group descriptor
block. Since the size of the block group descriptor structure is 64
bytes, a meta-block group contains 16 block groups for filesystems with
a 1 KB block size, and 64 block groups for filesystems with a 4 KB
block size. Filesystems can either be created using this new block group
descriptor layout, or existing filesystems can be resized on-line, and
the field ``s_first_meta_bg`` in the superblock will indicate the first
block group using this new layout.

Please see an important note about ``BLOCK_UNINIT`` in the section about
block and inode bitmaps.

Lazy Block Group Initialization
-------------------------------

New in ext4 are three block group descriptor flags that enable mkfs to
skip initializing other parts of the block group metadata. Specifically,
the INODE_UNINIT and BLOCK_UNINIT flags mean that the inode and block
bitmaps for that group can be calculated and therefore the on-disk
bitmap blocks are not initialized. This is generally the case for an
empty block group or a block group containing only fixed-location block
group metadata. The INODE_ZEROED flag means that the inode table has
been initialized; mkfs will unset this flag and rely on the kernel to
initialize the inode tables in the background.
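
A minimal sketch of how these three flags might be decoded from a
descriptor's flags field is shown below. The numeric values (0x1, 0x2,
0x4) are not given in this section and are assumed here from the ext4
header definitions; the class and function names are invented for the
example.

```python
from enum import IntFlag


class BgFlags(IntFlag):
    # Assumed bit values; check them against the ext4 headers.
    INODE_UNINIT = 0x1  # inode bitmap not initialized on disk
    BLOCK_UNINIT = 0x2  # block bitmap not initialized on disk
    INODE_ZEROED = 0x4  # inode table has been zeroed


def describe(bg_flags):
    """Summarize which parts of a group's metadata are valid on disk."""
    flags = BgFlags(bg_flags)
    return {
        "inode_bitmap_initialized": not (flags & BgFlags.INODE_UNINIT),
        "block_bitmap_initialized": not (flags & BgFlags.BLOCK_UNINIT),
        "inode_table_zeroed": bool(flags & BgFlags.INODE_ZEROED),
    }


# A freshly formatted, still-empty group: both bitmaps uninitialized,
# inode table not yet zeroed.
print(describe(0x3))
```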

By not writing zeroes to the bitmaps and inode table, mkfs time is
reduced considerably. Note the feature flag is RO_COMPAT_GDT_CSUM,
but the dumpe2fs output prints this as “uninit_bg”. They are the same
thing.
                                                      
