.. SPDX-License-Identifier: GPL-2.0

==========================================
WHAT IS Flash-Friendly File System (F2FS)?
==========================================

NAND flash memory-based storage devices, such
been equipped on a variety of systems ranging fro
they are known to have different characteristi
disks, a file system, an upper layer to the st
changes from the sketch in the design level.

F2FS is a file system exploiting NAND flash me
is based on Log-structured File System (LFS).
addressing the fundamental issues in LFS, whic
tree and high cleaning overhead.

Since a NAND flash memory-based storage device
according to its internal geometry or flash me
F2FS and its tools support various parameters
layout, but also for selecting allocation and

The following git tree provides the file syste
a consistency checking tool (fsck.f2fs), and a

- git://git.kernel.org/pub/scm/linux/kernel/gi

For sending patches, please use the following

- linux-f2fs-devel@lists.sourceforge.net

For reporting bugs, please use the following f

- https://bugzilla.kernel.org/enter_bug.cgi?pr

Background and Design issues
============================

Log-structured File System (LFS)
--------------------------------
"A log-structured file system writes all modif
a log-like structure, thereby speeding up bot
The log is the only structure on disk; it cont
files can be read back from the log efficientl
areas on disk for fast writing, we divide the
segment cleaner to compress the live informati
segments." from Rosenblum, M. and Ousterhout,
implementation of a log-structured file system
10, 1, 26–52.

Wandering Tree Problem
----------------------
In LFS, when file data is updated and writte
pointer block is updated due to the changed lo
block is also updated due to the direct pointe
the upper index structures such as inode, inod
also updated recursively. This problem is call
and in order to enhance the performance, it sh
propagation as much as possible.

[1] Bityutskiy, A. 2005. JFFS3 design issues.

Cleaning Overhead
-----------------
Since LFS is based on out-of-place writes, it
scattered across the whole storage. In order t
needs to reclaim these obsolete blocks seamles
as a cleaning process.

The process consists of four operations as fo

1. A victim segment is selected through refere
2. It loads parent index structures of all the
   segment summary blocks.
3. It checks the cross-reference between the d
4. It moves valid data selectively.

This cleaning job may cause unexpected long de
is to hide the latencies to users. And also de
amount of valid data to be moved, and move the

Key Features
============

Flash Awareness
---------------
- Enlarge the random write area for better per
  spatial locality
- Align FS data structures to the operational

Wandering Tree Problem
----------------------
- Use a term, “node”, that represents inod
- Introduce Node Address Table (NAT) containin
  blocks; this will cut off the update propaga

Cleaning Overhead
-----------------
- Support a background cleaning process
- Support greedy and cost-benefit algorithms f
- Support multi-head logs for static/dynamic h
- Introduce adaptive logging for efficient blo

Mount Options
=============


======================== =====================
background_gc=%s         Turn on/off cleaning
                         collection, triggered
                         idle. If background_g
                         collection and if bac
                         will be turned off.
                         I
                         on synchronous garbag
                         Default value for thi
                         collection is on by d
gc_merge                 When background_gc is
                         let background GC thr
                         it can eliminate the
                         GC operation when GC
                         I/O and CPU resources
nogc_merge               Disable GC merge feat
disable_roll_forward     Disable the roll-forw
norecovery               Disable the roll-forw
                         only (i.e., -o ro,dis
discard/nodiscard        Enable/disable real-t
                         enabled, f2fs will is
                         segment is cleaned.
heap/no_heap             Deprecated.
nouser_xattr             Disable Extended User
                         by default if CONFIG_
noacl                    Disable POSIX Access
                         by default if CONFIG_
active_logs=%u           Support configuring t
                         current design, f2fs
                         Default number is 6.
disable_ext_identify     Disable the extension
                         is not aware of cold
inline_xattr             Enable the inline xat
noinline_xattr           Disable the inline xa
inline_xattr_size=%u     Support configuring i
                         flexible inline xattr
inline_data              Enable the inline dat
                         files can be written
inline_dentry            Enable the inline dir
                         directory entries can
                         space of inode block
                         dentries is limited t
noinline_dentry          Disable the inline de
flush_merge              Merge concurrent cach
                         to eliminate redundan
                         device handles the ca
                         recommend to enable t
nobarrier                This option can be us
                         its cached data shoul
                         If this option is set
                         but f2fs still guaran
                         data writes.
barrier                  If this option is set
                         issued.
fastboot                 This option is used w
                         time as much as possi
                         can be sacrificed.
extent_cache             Enable an extent cach
                         as many as extent whi
                         address and physical
                         increasing the cache
noextent_cache           Disable an extent cac
                         the above extent_cach
noinline_data            Disable the inline da
                         enabled by default.
data_flush               Enable data flushing
                         persist data of regul
reserve_root=%d          Support configuring r
                         allocation from a pri
                         gid, unit: 4KB, the d
resuid=%d                The user ID which may
resgid=%d                The group ID which ma
fault_injection=%d       Enable fault injectio
                         specified injection r
fault_type=%d            Support configuring f
                         enabled with fault_in
                         is shown below, it su

                         =====================
                         Type_Name
                         =====================
                         FAULT_KMALLOC
                         FAULT_KVMALLOC
                         FAULT_PAGE_ALLOC
                         FAULT_PAGE_GET
                         FAULT_ALLOC_BIO
                         FAULT_ALLOC_NID
                         FAULT_ORPHAN
                         FAULT_BLOCK
                         FAULT_DIR_DEPTH
                         FAULT_EVICT_INODE
                         FAULT_TRUNCATE
                         FAULT_READ_IO
                         FAULT_CHECKPOINT
                         FAULT_DISCARD
                         FAULT_WRITE_IO
                         FAULT_SLAB_ALLOC
                         FAULT_DQUOT_INIT
                         FAULT_LOCK_OP
                         FAULT_BLKADDR_VALIDIT
                         FAULT_BLKADDR_CONSIST
                         FAULT_NO_SEGMENT
                         =====================
mode=%s                  Control block allocat
                         and "lfs". In "lfs" m
                         writes towards main a
                         "fragment:segment" an
                         These are developer o
                         fragmentation/after-G
                         modes to understand f
                         and eventually get so
                         In "fragment:segment"
                         position. With this,
                         In "fragment:block",
                         "max_fragment_chunk"
                         We added some randomn
                         it close to realistic
                         1..<max_fragment_chun
                         length of 1..<max_fra
                         allocated blocks will
                         Note that "fragment:b
                         option for more rando
                         Please, use these opt
                         recommend to re-forma
usrquota                 Enable plain user dis
grpquota                 Enable plain group di
prjquota                 Enable plain project
usrjquota=<file>         Appoint specified fil
grpjquota=<file>         information can be pr
prjjquota=<file>         <quota file>: must be
jqfmt=<quota type>       <quota type>: [vfsold
offusrjquota             Turn off user journal
offgrpjquota             Turn off group journa
offprjjquota             Turn off project jour
quota                    Enable plain user dis
noquota                  Disable all plain dis
alloc_mode=%s            Adjust block allocati
                         and "default".
fsync_mode=%s            Control the policy of
                         "strict", and "nobarr
                         default, fsync will f
                         light operation to im
                         In "strict" mode, fsy
                         with xfs, ext4 and bt
                         pass, but the perform
                         based on "posix", but
                         non-atomic files like
test_dummy_encryption
test_dummy_encryption=%s
                         Enable dummy encrypti
                         context. The fake fsc
                         The argument may be e
                         select the correspond
checkpoint=%s[:%u[%]]    Set to "disable" to t
                         to reenable checkpoin
                         disabled, any unmount
                         the filesystem conten
                         filesystem was mounte
                         While mounting with c
                         run garbage collectio
                         be used. If this take
                         EAGAIN. You may optio
                         of the disk you would
                         avoid additional garb
                         number of blocks, or
                         with checkpoint=disab
                         hide up to all remain
                         would be unusable can
                         This space is reclaim
checkpoint_merge         When checkpoint is en
                         daemon and make it to
                         much as possible to e
                         we can eliminate the
                         operation when the ch
                         a cgroup having low i
                         do better, we set the
                         to "3", to give one h
                         This is the same way
                         journaling thread of
nocheckpoint_merge       Disable checkpoint me
compress_algorithm=%s    Control compress algo
                         "lz4", "zstd" and "lz
compress_algorithm=%s:%d Control compress algo
                         "lz4" and "zstd" supp
                         algorithm       level
                         lz4             3 - 16
                         zstd            1 - 22
compress_log_size=%u     Support configuring c
                         be 4KB * (1 << %u).
                         T
compress_extension=%s    Support adding specif
                         compression on those
                         with '.ext' has high
                         on compression extens
                         these files by default
                         For other files, we c
                         Note that, there is o
                         can be set to enable
nocompress_extension=%s  Support adding specif
                         compression on those
                         If you know exactly w
                         The same extension na
                         extension at the same
                         If the compress exten
                         nocompress extension
                         Don't allow use '*' t
                         After add nocompress_
                         dir_flag < comp_exten
                         See more in compressi

compress_chksum          Support verifying chk
compress_mode=%s         Control file compress
                         modes. In "fs" mode (
                         on the compression en
                         the automatic compress
                         choosing the target f
                         compression/decompres
                         ioctls.
compress_cache           Support to use addres
                         cache compressed bloc
                         random read.
inlinecrypt              When possible, encryp
                         files using the blk-c
                         filesystem-layer encr
                         inline encryption har
                         unaffected. For more
                         Documentation/block/i
atgc                     Enable age-threshold
                         effectiveness and eff
discard_unit=%s          Control discard unit,
                         and "section", issued
                         aligned to the unit,
                         so that small discard
                         For blkzoned device,
                         default, it is helpfu
                         reduce memory cost by
                         discard.
memory=%s                Control memory mode.
                         "low" mode is introdu
                         Because of the nature
                         will try to save memo
                         "normal" mode is the
age_extent_cache         Enable an age extent
                         data block update fre
                         order to provide bett
                         allocation.
errors=%s                Specify f2fs behavior
                         "panic", "continue" a
                         panic immediately, co
                         the partition in read
                         mode.
                         =====================
                         mode
                         =====================
                         access ops
                         syscall errors
                         mount option
                         pending dir write
                         pending non-dir write
                         pending node write
                         pending meta write
                         =====================
======================== =====================

Debugfs Entries
===============

/sys/kernel/debug/f2fs/ contains information a
f2fs. Each file shows the whole f2fs informati

/sys/kernel/debug/f2fs/status includes:

- major file system information managed by f2
- average SIT information about whole segment
- current memory footprint consumed by f2fs.

Sysfs Entries
=============

Information about mounted f2fs file systems ca
/sys/fs/f2fs. Each mounted filesystem will ha
/sys/fs/f2fs based on its device name (i.e., /
The files in each per-device directory are sho

Files in /sys/fs/f2fs/<devname>
(see also Documentation/ABI/testing/sysfs-fs-f

Usage
=====

1. Download userland tools and compile them.

2. Skip, if f2fs was compiled statically insid
   Otherwise, insert the f2fs.ko module::

        # insmod f2fs.ko

3. Create a directory to use when mounting::

        # mkdir /mnt/f2fs

4. Format the block device, and then mount as

        # mkfs.f2fs -l label /dev/block_device
        # mount -t f2fs /dev/block_device /mnt

mkfs.f2fs
---------
The mkfs.f2fs is for the use of formatting a p
which builds a basic on-disk layout.

The quick options consist of:

===============    ===========================
``-l [label]``     Give a volume label, up to
``-a [0 or 1]``    Split start location of eac

                   1 is set by default, which
``-o [int]``       Set overprovision ratio in

                   5 is set by default.
``-s [int]``       Set the number of segments

                   1 is set by default.
``-z [int]``       Set the number of sections

                   1 is set by default.
``-e [str]``       Set basic extension list.
                   e
``-t [0 or 1]``    Disable discard command or

                   1 is set by default, which
===============    ===========================

Note: please refer to the manpage of mkfs.f2fs

fsck.f2fs
---------
The fsck.f2fs is a tool to check the consisten
partition, which examines whether the filesyst
are cross-referenced correctly or not.
Note that, initial version of the tool does no

The quick options consist of::

  -d debug level                  [default:0]

Note: please refer to the manpage of fsck.f2fs

dump.f2fs
---------
The dump.f2fs shows the information of specifi
file. Each file is dump_ssa and dump_sit.

The dump.f2fs is used to debug on-disk data st
It shows on-disk inode information recognized
able to dump all the SSA and SIT entries into
./dump_sit respectively.

The options consist of::

  -d debug level                  [default:0]
  -i inode no (hex)
  -s [SIT dump segno from #1~#2 (decimal), for
  -a [SSA dump segno from #1~#2 (decimal), for

Examples::

    # dump.f2fs -i [ino] /dev/sdx
    # dump.f2fs -s 0~-1 /dev/sdx (SIT dump)
    # dump.f2fs -a 0~-1 /dev/sdx (SSA dump)

Note: please refer to the manpage of dump.f2fs

sload.f2fs
----------
The sload.f2fs gives a way to insert files and
image. This tool is useful when building f2fs

Note: please refer to the manpage of sload.f2f

resize.f2fs
-----------
The resize.f2fs lets a user resize the f2fs-fo
all the files and directories stored in the im

Note: please refer to the manpage of resize.f2

defrag.f2fs
-----------
The defrag.f2fs can be used to defragment scat
filesystem metadata across the disk. This can
more free consecutive space.

Note: please refer to the manpage of defrag.f2

f2fs_io
-------
The f2fs_io is a simple tool to issue various
f2fs-specific ones, which is very useful for Q

Note: please refer to the manpage of f2fs_io(8

Design
======

On-disk Layout
--------------

F2FS divides the whole volume into a number of
to 2MB in size. A section is composed of conse
consists of a set of sections. By default, sec
segment size identically, but users can easily

F2FS splits the entire volume into six areas,
consist of multiple segments as described belo

                                            al
                 |-> align with the segment si
     _________________________________________
    |            |            |   Segment   |
    | Superblock | Checkpoint |    Info.    |
    |    (SB)    |    (CP)    | Table (SIT) |
    |____________|_____2______|______N______|_



                                    ._________
                                    |_Segment_
                                    .
                                    ._________
                                    |_section_
                                    .
                                    .________.
                                    |__zone__|

- Superblock (SB)
  It is located at the beginning of the parti
  to avoid file system crash. It contains bas
  default parameters of f2fs.

- Checkpoint (CP)
  It contains file system information, bitmap
  inode lists, and summary entries of current

- Segment Information Table (SIT)
  It contains segment information such as val
  validity of all the blocks.

- Node Address Table (NAT)
  It is composed of a block address table for
  Main area.

- Segment Summary Area (SSA)
  It contains summary entries which contain
  data and node blocks stored in Main area.

- Main Area
  It contains file and directory data includi

In order to avoid misalignment between file sy
aligns the start block address of CP with the
start block address of Main area with the zone
in SSA area.
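
To make the relationship between blocks, segments, sections and zones concrete,
the following is a minimal sketch and not F2FS code. It assumes 4KB blocks and
2MB segments as stated in this document. The names ``segs_per_sec`` and
``secs_per_zone`` are illustrative stand-ins for the mkfs.f2fs ``-s`` and ``-z``
values, and ``align_up`` is a hypothetical helper that shows how a start address
can be rounded up to a unit boundary.

```python
BLOCK_SIZE = 4 * 1024            # bytes per block (4KB, as used in this document)
SEGMENT_SIZE = 2 * 1024 * 1024   # bytes per segment (2MB, as used in this document)


def layout_units(segs_per_sec=1, secs_per_zone=1):
    """Return the unit sizes implied by the given (illustrative) parameters."""
    section_bytes = SEGMENT_SIZE * segs_per_sec
    zone_bytes = section_bytes * secs_per_zone
    return {
        "blocks_per_segment": SEGMENT_SIZE // BLOCK_SIZE,
        "section_bytes": section_bytes,
        "zone_bytes": zone_bytes,
    }


def align_up(addr, unit):
    """Round addr up to the next multiple of unit (hypothetical helper)."""
    return (addr + unit - 1) // unit * unit


if __name__ == "__main__":
    units = layout_units(segs_per_sec=4, secs_per_zone=2)
    print(units)
    # An area that would start at 5MB is pushed to the next zone boundary.
    print(align_up(5 * 1024 * 1024, units["zone_bytes"]))
```

With the default values of 1 for both parameters, a section and a zone are each
exactly one segment, which matches the default described above.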

Reference the following survey for additional
https://wiki.linaro.org/WorkingGroups/Kernel/P

File System Metadata Structure
------------------------------

F2FS adopts the checkpointing scheme to mainta
mount time, F2FS first tries to find the last
CP area. In order to reduce the scanning time,
One of them always indicates the last valid da
mechanism. In addition to CP, NAT and SIT also

For file system consistency, each CP points to
valid, as shown below::

  +--------+----------+---------+
  |   CP   |    SIT   |   NAT   |
  +--------+----------+---------+
  .         .          .          .
  .            .              .              .
  .               .                 .
  +-------+-------+--------+--------+--------+
  | CP #0 | CP #1 | SIT #0 | SIT #1 | NAT #0 |
  +-------+-------+--------+--------+--------+
     |             ^
     |             |
     `----------------------------------------

Index Structure
---------------

The key data structure to manage the data loca
traditional file structures, F2FS has three ty
indirect node. F2FS assigns 4KB to an inode bl
indices, two direct node pointers, two indirec
indirect node pointer as described below. One
data blocks, and one indirect node block conta
one inode block (i.e., a file) covers::

  4KB * (923 + 2 * 1018 + 2 * 1018 * 1018 + 10

   Inode block (4KB)
     |- data (923)
     |- direct node (2)
     |          `- data (1018)
     |- indirect node (2)
     |            `- direct node (1018)
     |                       `- data (1018)
     `- double indirect node (1)
                         `- indirect node (101
                                      `- direc


Note that all the node blocks are mapped by NA
each node is translated by the NAT table. In t
tree problem, F2FS is able to cut off the prop
leaf data writes.
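
The coverage of one inode block can be worked out from the pointer counts in the
tree above. This is a minimal sketch under stated assumptions: 4KB blocks, 923
data pointers in the inode, and 1018 entries in every direct or indirect node
block. The double indirect term is cut off in the formula above, so the sketch
assumes it fans out as 1018 * 1018 * 1018, following the pattern of the other
levels. The constant and function names are illustrative only.

```python
BLOCK_SIZE = 4 * 1024     # bytes per block (4KB)
ADDRS_PER_INODE = 923     # data pointers stored directly in the inode block
ADDRS_PER_NODE = 1018     # entries per direct or indirect node block


def max_file_blocks():
    """Count the data blocks reachable from a single inode block."""
    in_inode = ADDRS_PER_INODE
    # Two direct node pointers, each covering 1018 data blocks.
    via_direct = 2 * ADDRS_PER_NODE
    # Two indirect node pointers, each covering 1018 direct nodes.
    via_indirect = 2 * ADDRS_PER_NODE * ADDRS_PER_NODE
    # Assumption: one double indirect pointer covering 1018 indirect nodes.
    via_double_indirect = ADDRS_PER_NODE ** 3
    return in_inode + via_direct + via_indirect + via_double_indirect


def max_file_bytes():
    """Convert the reachable block count into bytes."""
    return BLOCK_SIZE * max_file_blocks()


if __name__ == "__main__":
    size = max_file_bytes()
    print(f"{max_file_blocks()} blocks, {size} bytes, "
          f"about {size / 2**40:.2f} TiB")
```

Under these assumptions the double indirect term dominates: it contributes more
than 99% of the reachable blocks.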

Directory Structure
-------------------

A directory entry occupies 11 bytes, which con

- hash          hash value of the file name
- ino           inode number
- len           the length of file name
- type          file type such as directory, s

A dentry block consists of 214 dentry slots an
used to represent whether each dentry is valid
4KB with the following composition.

::

  Dentry Block(4 K) = bitmap (27 bytes) + rese
                      dentries(11 * 214 bytes)

                         [Bucket]
           +--------------------------------
           |dentry block 1 | dentry block 2
           +--------------------------------
           .               .
     .                             .
  .       [Dentry Block Structure: 4KB]
  +--------+----------+----------+------------
  | bitmap | reserved | dentries | file names
  +--------+----------+----------+------------
  [Dentry Block: 4KB] .   .
                 .               .
          .                          .
          +------+------+-----+------+
          | hash | ino  | len | type |
          +------+------+-----+------+
          [Dentry Structure: 11 bytes]

F2FS implements multi-level hash tables for di
a hash table with dedicated number of hash buc
"A(2B)" means a bucket includes 2 data blocks.

::

    ----------------------
    A : bucket
    B : block
    N : MAX_DIR_HASH_DEPTH
    ----------------------

    level #0   | A(2B)
               |
    level #1   | A(2B) - A(2B)
               |
    level #2   | A(2B) - A(2B) - A(2B) - A(2B)
         .     |   .       .       .       .
    level #N/2 | A(2B) - A(2B) - A(2B) - A(2B)
         .     |   .       .       .       .
    level #N   | A(4B) - A(4B) - A(4B) - A(4B)

The number of blocks and buckets are determine

                            ,- 2, if n < MAX_D
  # of blocks in level #n = |
                            `- 4, Otherwise

                             ,- 2^(n + dir_lev
                             |        if n + d
  # of buckets in level #n = |
                             `- 2^((MAX_DIR_HA
                                      Otherwis

When F2FS finds a file name in a directory, at
name is calculated. Then, F2FS scans the hash
dentry consisting of the file name and its ino
scans the next hash table in level #1.
In this
each level incrementally from 1 to N. In each
one bucket determined by the following equatio
complexity::

  bucket number to scan in level #n = (hash va

In the case of file creation, F2FS finds empty
file name. F2FS searches the empty slots in th
1 to N in the same way as the lookup operation

The following figure shows an example of two c

       --------------> Dir <--------------
       |                                 |
    child                             child

    child - child                     [hole] -

    child - child - child             [hole] -

   Case 1:                           Case 2:
   Number of children = 6,           Number of
   File size = 7                     File size

Default Block Allocation
------------------------

At runtime, F2FS manages six active logs insid
and Hot/Warm/Cold data.

- Hot node      contains direct node blocks of
- Warm node     contains direct node blocks ex
- Cold node     contains indirect node blocks
- Hot data      contains dentry blocks
- Warm data     contains data blocks except ho
- Cold data     contains multimedia data or mi

LFS has two schemes for free space management:
tion. The copy-and-compaction scheme which is
for devices showing very good sequential write
are served all the time for writing new data.
overhead under high utilization. Contrarily, t
from random writes, but no cleaning process is
scheme where the copy-and-compaction scheme is
policy is dynamically changed to the threaded
system status.

In order to align F2FS with underlying flash-b
segment in a unit of section.
F2FS expects tha
same as the unit size of garbage collection in
to the mapping granularity in FTL, F2FS alloca
logs from different zones as much as possible,
the active logs into one allocation unit accor

Cleaning process
----------------

F2FS does cleaning both on demand and in the b
triggered when there are not enough free segme
cleaner is operated by a kernel thread, and tr
system is idle.

F2FS supports two victim selection policies: g
In the greedy algorithm, F2FS selects a victim
of valid blocks. In the cost-benefit algorithm
according to the segment age and the number of
log block thrashing problem in the greedy algo
algorithm for on-demand cleaner, while backgro
algorithm.

In order to identify whether the data in the v
F2FS manages a bitmap. Each bit represents the
bitmap is composed of a bit stream covering wh

Write-hint Policy
-----------------

F2FS sets the whint all the time with the belo

===================== ========================
User                  F2FS
===================== ========================
N/A                   META
N/A                   HOT_NODE
N/A                   WARM_NODE
N/A                   COLD_NODE
ioctl(COLD)           COLD_DATA
extension list        "

-- buffered io
N/A                   COLD_DATA
N/A                   HOT_DATA
N/A                   WARM_DATA

-- direct io
WRITE_LIFE_EXTREME    COLD_DATA
WRITE_LIFE_SHORT      HOT_DATA
WRITE_LIFE_NOT_SET    WARM_DATA
WRITE_LIFE_NONE       "
WRITE_LIFE_MEDIUM     "
WRITE_LIFE_LONG       "
===================== ========================

Fallocate(2) Policy
-------------------

The default policy follows the below POSIX rul

Allocating disk space
    The default operation (i.e., mode is zero)
    the disk space within the range specified
    file size (as reported by stat(2)) will be
    greater than the file size. Any subregion
    by offset and len that did not contain dat
    initialized to zero.
    This default behavio
    behavior of the posix_fallocate(3) library
    as a method of optimally implementing that

However, once F2FS receives ioctl(fd, F2FS_IOC
fallocate(fd, DEFAULT_MODE), it allocates on-d
zero or random data, which is useful to the be

1. create(fd)
2. ioctl(fd, F2FS_IOC_SET_PIN_FILE)
3. fallocate(fd, 0, 0, size)
4. address = fibmap(fd, offset)
5. open(blkdev)
6. write(blkdev, address)

Compression implementation
--------------------------

- New term named cluster is defined as basic u
  be divided into multiple clusters logically.
  (n >= 0) logical pages, compression size is
  cluster can be compressed or not.

- In cluster metadata layout, one special bloc
  a cluster is a compressed one or normal one;
  metadata maps cluster to [1, 4 << n - 1] phy
  stores data including compress header and co

- In order to eliminate write amplification du
  support compression on write-once file, data
  all logical blocks in cluster contain valid
  cluster data is lower than specified thresho

- To enable compression on regular inode, ther

  * chattr +c file
  * chattr +c dir; touch dir/file
  * mount w/ -o compress_extension=ext; touch
  * mount w/ -o compress_extension=*; touch an

- To disable compression on regular inode, the

  * chattr -c file
  * mount w/ -o nocompress_extension=ext; touc

- Priority in between FS_COMPR_FL, FS_NOCOMP_F

  * compress_extension=so; nocompress_extensio
    dir/foo.so; touch dir/bar.zip; touch dir/b
    should be compressed, bar.zip should be non
    can enable compress on bar.zip.
  * compress_extension=so; nocompress_extensio
    dir/foo.so; touch dir/bar.zip; touch dir/b
    compressed, bar.zip and baz.txt should be n
    chattr +c dir/bar.zip; chattr +c dir/baz.txt
    and baz.txt.

- At this point, compression feature doesn't e
  directly in order to guarantee potential dat
  Instead, the main goal is to reduce data wri
  possible, resulting in extending disk life t
  congestion. Alternatively, we've added ioctl
  interface to reclaim compressed space and sh
  special flag to the inode. Once the compress
  will block writing data to the file until ei
  reserved via ioctl(F2FS_IOC_RESERVE_COMPRESS
  truncated to zero.

Compress metadata layout::

                                [Dnode Structu
                +-----------------------------
                | cluster 1 | cluster 2 | ....
                +-----------------------------
                .           .
          .                      .
    .         Compressed Cluster       .
    +----------+---------+---------+---------+
    |compr flag| block 1 | block 2 | block 3 |
    +----------+---------+---------+---------+
               .                             .
            .
        .
        +-------------+-------------+---------
        | data length | data chksum | reserved
        +-------------+-------------+---------

Compression mode
----------------

f2fs supports "fs" and "user" compression mode
With this option, f2fs provides a choice to se
compression enabled files (refer to "Compressi
enable compression on a regular inode).

1) compress_mode=fs
This is the default option. f2fs does automati
compression enabled files.

2) compress_mode=user
This disables the automatic compression and gi
target file and the timing. The user can do ma
compression enabled files using F2FS_IOC_DECOM
ioctls like the below.

To decompress a file,

fd = open(filename, O_WRONLY, 0);
ret = ioctl(fd, F2FS_IOC_DECOMPRESS_FILE);

To compress a file,

fd = open(filename, O_WRONLY, 0);
ret = ioctl(fd, F2FS_IOC_COMPRESS_FILE);

NVMe Zoned Namespace devices
----------------------------

- ZNS defines a per-zone capacity which can be
  zone-size.
  Zone-capacity is the number of us
  F2FS checks if zone-capacity is less than zo
  segment which starts after the zone-capacity
  the free segment bitmap at initial mount tim
  as permanently used so they are not allocate
  consequently are not needed to be garbage co
  zone-capacity is not aligned to default segm
  can start before the zone-capacity and span
  Such spanning segments are also considered a
  past the zone-capacity are considered unusab