.. SPDX-License-Identifier: GPL-2.0


==============================
The Second Extended Filesystem
==============================

ext2 was originally released in January 1993.
Theodore Ts'o and Stephen Tweedie, it was a ma
Extended Filesystem. It is currently still (A
filesystem in use by Linux. There are also im
for NetBSD, FreeBSD, the GNU HURD, Windows 95/

Options
=======

Most defaults are determined by the filesystem
set using tune2fs(8). Kernel-determined defaul

====================    ===     ==============
bsddf                   (*)     Makes ``df`` a
minixdf                         Makes ``df`` a

check=none, nocheck     (*)     Don't do extra
                                (check=normal

dax                             Use direct acc
                                Documentation/

debug                           Extra debuggin
                                kernel syslog.

errors=continue                 Keep going on
errors=remount-ro               Remount the fi
errors=panic                    Panic and halt

grpid, bsdgroups                Give objects t
nogrpid, sysvgroups             New objects ha

nouid32                         Use 16-bit UID

oldalloc                        Enable the old
                                have better pe
                                feedback if it
orlov                   (*)     Use the Orlov
                                (See http://lw
                                http://lwn.net

resuid=n                        The user ID wh
resgid=n                        The group ID w

sb=n                            Use alternate

user_xattr                      Enable "user."
                                (requires CONF
nouser_xattr                    Don't support

acl                             Enable POSIX A
                                (requires CONF
noacl                           Don't support

quota, usrquota                 Enable user di
                                (requires CONF

grpquota                        Enable group d
                                (requires CONF
====================    ===     ==============

The noquota option is silently ignored by ext2.


Specification
=============

ext2 shares many properties with traditional U
the concepts of blocks, inodes and directories
specification for Access Control Lists (ACLs),
compression though these are not yet implement
separate patches). There is also a versioning
features (such as journalling) to be added in
manner.

Blocks
------

The space in the device or file is split up in
a fixed size, of 1024, 2048 or 4096 bytes (819
which is decided when the filesystem is create
less wasted space per file, but require slight
and also impose other limits on the size of fi

Block Groups
------------

Blocks are clustered into block groups in orde
and minimise the amount of head seeking when r
of consecutive data. Information about each b
descriptor table stored in the block(s) immedi
Two blocks near the start of each group are re
bitmap and the inode usage bitmap which show w
are in use. Since each bitmap is limited to a
that the maximum size of a block group is 8 ti

The block(s) following the bitmaps in each blo
as the inode table for that block group and th
blocks. The block allocation algorithm attemp
in the same block group as the inode which con

The Superblock
--------------

The superblock contains all the information ab
the filing system. The primary copy of the su
offset of 1024 bytes from the start of the dev
to mounting the filesystem. Since it is so im
the superblock are stored in block groups thro
The first version of ext2 (revision 0) stores
every block group, along with backups of the g
Because this can consume a considerable amount
filesystems, later revisions can optionally re
copies by only putting backups in specific gro
superblock feature). The groups chosen are 0,

The information in the superblock contains fie
number of inodes and blocks in the filesystem
how many inodes and blocks are in each block g
was mounted (and if it was cleanly unmounted),
what version of the filesystem it is (see the
and which OS created it.
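
As a rough illustration of the layout described above, the following Python sketch decodes a handful of superblock fields from a raw image. The 1024-byte offset and the little-endian encoding come from this document; the field offsets (0, 4, 24, 32, 40, 56, 58, 76), the magic value ``0xEF53`` and the helper name ``parse_superblock`` are assumptions taken from the commonly documented on-disk structure, not from the text here, so treat this as a sketch rather than a reference decoder.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # stated in the text: primary copy sits 1024 bytes in
SUPERBLOCK_SIZE = 1024    # assumption: size of the on-disk superblock structure
EXT2_MAGIC = 0xEF53       # assumption: magic number, not given in the text


def parse_superblock(image: bytes) -> dict:
    """Decode a few little-endian superblock fields from a raw image.

    Hypothetical helper for illustration; field offsets are assumptions.
    """
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE]
    if len(sb) < SUPERBLOCK_SIZE:
        raise ValueError("image too short to contain a superblock")

    # "<" selects little-endian; I = 32-bit unsigned, H = 16-bit unsigned.
    inodes_count, blocks_count = struct.unpack_from("<II", sb, 0)
    (log_block_size,) = struct.unpack_from("<I", sb, 24)
    (blocks_per_group,) = struct.unpack_from("<I", sb, 32)
    (inodes_per_group,) = struct.unpack_from("<I", sb, 40)
    magic, state = struct.unpack_from("<HH", sb, 56)
    (rev_level,) = struct.unpack_from("<I", sb, 76)

    if magic != EXT2_MAGIC:
        raise ValueError(f"not an ext2 superblock (magic 0x{magic:04x})")

    return {
        "inodes_count": inodes_count,
        "blocks_count": blocks_count,
        # Block size is stored as a shift relative to 1024 bytes.
        "block_size": 1024 << log_block_size,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        # Assumption: bit 0 of the state field marks a clean unmount.
        "cleanly_unmounted": bool(state & 0x0001),
        "rev_level": rev_level,
    }
```

Passing the first few kilobytes of a filesystem image to ``parse_superblock`` should be enough, since only bytes 1024 to 2047 are read.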

If the filesystem is revision 1 or higher, the
such as a volume name, a unique identification
and space for optional filesystem features to

All fields in the superblock (as in all other
on the disc in little endian format, so a file
machines without having to know what machine i

Inodes
------

The inode (index node) is a fundamental concep
Each object in the filesystem is represented b
structure contains pointers to the filesystem
data held in the object and all of the metadat
its name. The metadata about an object includ
group, flags, size, number of blocks used, acc
modification time, deletion time, number of li
(for NFS) and extended attributes (EAs) and/or

There are some reserved fields which are curre
structure and several which are overloaded. O
directory ACL if the inode is a directory and
bits of the file size if the inode is a regula
larger than 2GB). The translator field is unu
by the HURD to reference the inode of a progra
interpret this object. Most of the remaining
used up for both Linux and the HURD for larger
The HURD also has a larger mode field so it us
fields to store the extra mode bits.

There are pointers to the first 12 blocks whic
in the inode. There is a pointer to an indire
pointers to the next set of blocks), a pointer
block (which contains pointers to indirect blo
trebly-indirect block (which contains pointers

The flags field contains some ext2-specific fl
for by the standard chmod flags. These flags
and changed with the chattr command, and allow
behaviour on a per-file basis. There are flag
undeletable, compression, synchronous updates,
dumpable, no-atime, indexed directories, and d
of these are supported yet.
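
The block pointer scheme described in this section (12 direct pointers plus one indirect, one doubly-indirect and one trebly-indirect block) bounds how many data blocks a single inode can address. The Python sketch below works through that arithmetic. It assumes 4-byte block pointers, which this document does not state, and the function name ``max_mapped_blocks`` is hypothetical. It covers only the addressing capacity of the pointer tree; other limits may cap the file size at a lower value.

```python
def max_mapped_blocks(block_size: int, pointer_size: int = 4, direct: int = 12) -> int:
    """Number of data blocks reachable through an inode's block pointers.

    Assumption: each block pointer occupies `pointer_size` bytes, so one
    indirect block holds block_size // pointer_size pointers.
    """
    ptrs = block_size // pointer_size
    # direct + single indirect + double indirect + triple indirect
    return direct + ptrs + ptrs ** 2 + ptrs ** 3


def max_mapped_bytes(block_size: int) -> int:
    """Upper bound on file size implied by the pointer tree alone."""
    return max_mapped_blocks(block_size) * block_size


if __name__ == "__main__":
    for bs in (1024, 2048):
        gib = max_mapped_bytes(bs) / 2 ** 30
        print(f"{bs}-byte blocks: about {gib:.1f} GiB addressable")
```

Under these assumptions the results for 1kB and 2kB blocks land just above 16 GiB and 256 GiB, in line with the file size limits tabulated under Limitations.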

Directories
-----------

A directory is a filesystem object and has an
It is a specially formatted file containing re
each name with an inode number. Later revisio
encode the type of the object (file, directory
socket) to avoid the need to check the inode i
(support for taking advantage of this feature
Glibc 2.2).

The inode allocation code tries to assign inod
block group as the directory in which they are

The current implementation of ext2 uses a sing
the filenames in the directory; a pending enha
filenames to allow lookup without the need to

The current implementation never removes empty
have been allocated to hold more files.

Special files
-------------

Symbolic links are also filesystem objects wit
special mention because the data for them is s
itself if the symlink is less than 60 bytes lo
which would normally be used to store the poin
This is a worthwhile optimisation as we avo
block for the symlink, and most symlinks are l

Character and block special devices never have
them. Instead, their device number is stored
the fields which would be used to point to the

Reserved Space
--------------

In ext2, there is a mechanism for reserving a
for a particular user (normally the super-user
allow for the system to continue functioning e
fill up all the space available to them (this
quotas). It also keeps the filesystem from fi
helps combat fragmentation.

Filesystem check
----------------

At boot time, most systems run a consistency c
filesystems. The superblock of the ext2 files
fields which indicate whether fsck should actu
the filesystem at boot can take a long time if
run if the filesystem was not cleanly unmounte
count has been exceeded or if the maximum time
exceeded.
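
The three conditions named above (unclean unmount, mount count exceeded, time between checks exceeded) can be sketched as a small decision function. This is an illustration of the rule as described here, not the logic of any particular fsck implementation. The ``CheckState`` type, its field names, and the convention that a non-positive limit disables the corresponding test are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CheckState:
    """Hypothetical summary of the check-related superblock fields."""
    cleanly_unmounted: bool
    mount_count: int       # mounts since the last full check
    max_mount_count: int   # assumption: <= 0 disables the mount-count test
    last_check: int        # time of last check, seconds since the epoch
    check_interval: int    # assumption: <= 0 disables the time-based test


def needs_full_check(state: CheckState, now: int) -> bool:
    """Return True if any of the three conditions from the text holds."""
    if not state.cleanly_unmounted:
        return True
    if state.max_mount_count > 0 and state.mount_count >= state.max_mount_count:
        return True
    if state.check_interval > 0 and now - state.last_check >= state.check_interval:
        return True
    return False
```

Keeping the clean-unmount test first mirrors the order in the text; the result does not depend on the order, since any single condition is sufficient.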

Feature Compatibility
---------------------

The compatibility feature mechanism used in ex
It safely allows features to be added to the f
unnecessarily sacrificing compatibility with o
filesystem code. The feature compatibility me
the original revision 0 (EXT2_GOOD_OLD_REV) of
revision 1. There are three 32-bit fields, on
(COMPAT), one for read-only compatible (RO_COM
incompatible (INCOMPAT) features.

These feature flags have specific meanings for

A COMPAT flag indicates that a feature is pres
but the on-disk format is 100% compatible with
a kernel which didn't know anything about this
the filesystem without any chance of corruptin
making it inconsistent). This is essentially
"this filesystem has a (hidden) feature" that
want to be aware of (more on e2fsck and featur
HAS_JOURNAL feature is a COMPAT flag because t
a regular file with data blocks in it so the k
take any special notice of it if it doesn't un

An RO_COMPAT flag indicates that the on-disk f
with older on-disk formats for reading (i.e. t
the visible on-disk format). However, an old
filesystem would/could corrupt the filesystem,
most common such feature, SPARSE_SUPER, is an
sparse groups allow file data blocks where sup
backups used to live, and ext2_free_blocks() r
which would lead to inconsistent bitmaps.
get an error if it tried to free a series of b
boundary, but this is a legitimate layout in a

An INCOMPAT flag indicates the on-disk format
way that makes it unreadable by older kernels,
cause a problem if an old kernel tried to moun
INCOMPAT flag because older kernels would thin
than 256 characters, which would lead to corru
The COMPRESSION flag is an obvious INCOMPAT fl
doesn't understand compression, you would just
read() instead of it automatically decompressi
RECOVER flag is needed to prevent a kernel whi
ext3 journal from mounting the filesystem with

For e2fsck, it needs to be more strict with th
flags than the kernel. If it doesn't understa
RO_COMPAT, or INCOMPAT flags it will refuse to
because it has no way of verifying whether a g
or not. Allowing e2fsck to succeed on a files
feature is a false sense of security for the u
a filesystem with unknown features is a good i
update to the latest e2fsck. This also means
flags to ext2 also needs to update e2fsck to v

Metadata
--------

It is frequently claimed that the ext2 impleme
asynchronous metadata is faster than the ffs s
scheme but less reliable. Both methods are eq
respective fsck programs.

If you're exceptionally paranoid, there are 3
writes synchronous on ext2:

- per-file if you have the program source: use
- per-file if you don't have the source: use "
- per-filesystem: add the "sync" option to mou

The first and last are not ext2 specific but d
be written synchronously.
See also Journaling

Limitations
-----------

There are various limits imposed by the on-dis
limits are imposed by the current implementati
Many of the limits are determined at the time
created, and depend upon the block size chosen
data blocks is fixed at filesystem creation ti
increase the number of inodes is to increase t
No tools currently exist which can change the

Most of these limits could be overcome with sl
format and using a compatibility flag to signa
the expense of some compatibility).

=====================  =======  =======  =
Filesystem block size  1kB      2kB
=====================  =======  =======  =
File size limit        16GB     256GB
Filesystem size limit  2047GB   8192GB   1
=====================  =======  =======  =

There is a 2.4 kernel limit of 2048GB for a si
filesystem larger than that can be created at
an upper limit on the block size imposed by th
so 8kB blocks are only allowed on Alpha system
which support larger pages).

There is an upper limit of 32000 subdirectorie

There is a "soft" upper limit of about 10-15k
with the current linear linked-list directory
stems from performance problems when creating
finding) files in such large directories. Usi
(under development) allows 100k-1M+ files in a
performance problems (although RAM size become

The (meaningless) absolute upper limit of file
(imposed by the file size, the realistic limit
is over 130 trillion files. It would be highe
enough 4-character names to make up unique dir
have to be 8 character filenames, even then we
running out of unique filenames.

Journaling
----------

A journaling extension to the ext2 code has be
Tweedie. It avoids the risks of metadata corr
wait for e2fsck to complete after a crash, wit
to the on-disk ext2 layout.
In a nutshell, th
file which stores whole metadata (and optional
been modified, prior to writing them into the
it is possible to add a journal to an existing
the need for data conversion.

When changes to the filesystem (e.g. a file is
a transaction in the journal and can either be
the time of a crash. If a transaction is comp
(or in the normal case where the system does n
in that transaction are guaranteed to represen
and are copied into the filesystem. If a tran
the time of the crash, then there is no guaran
the blocks in that transaction so they are dis
filesystem changes they represent are also los
Check Documentation/filesystems/ext4/ if you w
ext4 and journaling.

References
==========

======================= ======================
The kernel source       file:/usr/src/linux/fs
e2fsprogs (e2fsck)      http://e2fsprogs.sourc
Design & Implementation http://e2fsprogs.sourc
Journaling (ext3)       ftp://ftp.uk.linux.org
Filesystem Resizing     http://ext2resize.sour
Compression [1]_        http://e2compr.sourcef
======================= ======================

Implementations for:

======================= ======================
Windows 95/98/NT/2000   http://www.chrysocome.
Windows 95 [1]_         http://www.yipton.net/
DOS client [1]_         ftp://metalab.unc.edu/
OS/2 [2]_               ftp://metalab.unc.edu/
RISC OS client          http://www.esw-heim.tu
======================= ======================

.. [1] no longer actively developed/supported
.. [2] no longer actively developed/supported