.. SPDX-License-Identifier: GPL-2.0

===================================
Cache on Already Mounted Filesystem
===================================

.. Contents:

 (*) Overview.

 (*) Requirements.

 (*) Configuration.

 (*) Starting the cache.

 (*) Things to avoid.

 (*) Cache culling.

 (*) Cache structure.

 (*) Security model and SELinux.

 (*) A note on security.

 (*) Statistical information.

 (*) Debugging.

 (*) On-demand Read.


Overview
========

CacheFiles is a caching backend that's meant t
an already mounted filesystem of a local type

CacheFiles uses a userspace daemon to do some
reaping stale nodes and culling. This is call
/sbin.

The filesystem and data integrity of the cache
filesystem providing the backing services. No
attempt to journal anything since the journall
filesystems are very specific in nature.

CacheFiles creates a misc character device - "
to communicate with the daemon. Only one th
and while it is open, a cache is at least part
opens this and sends commands down it to contr

CacheFiles is currently limited to a single ca

CacheFiles attempts to maintain at least a cer
the filesystem, shrinking the cache by culling
space if necessary - see the "Cache Culling" s
placed on the same medium as a live set of dat
spare space and automatically contract when th
space.



Requirements
============

The use of CacheFiles and its daemon requires
available in the system and in the cache files

        - dnotify.

        - extended attributes (xattrs).

        - openat() and friends.

        - bmap() support on files in the files

        - The use of bmap() to detect a partia

It is strongly recommended that the "dir_index
filesystems being used as a cache.


Configuration
=============

The cache is configured by a script in /etc/ca
set up cache ready for use.
The following scr

 brun <N>%, bcull <N>%, bstop <N>%, frun <N>%,
        Configure the culling limits. Optiona
        The defaults are 7% (run), 5% (cull) a

        The commands beginning with a 'b' are
        beginning with an 'f' are file count l

 dir <path>
        Specify the directory containing the r

 tag <name>
        Specify a tag to FS-Cache to use in di
        Optional. The default is "CacheFiles"

 debug <mask>
        Specify a numeric bitmask to control d
        Optional. The default is zero (all of
        OR'd into the mask to collect various

                ==      ======================
                1       Turn on trace of funct
                2       Turn on trace of funct
                4       Turn on trace of inter
                ==      ======================

        This mask can also be set through sysf

                echo 5 > /sys/module/cachefile


Starting the Cache
==================

The cache is started by running the daemon. T
configures the cache and tells it to begin cac
binds to fscache and the cache becomes live.

The daemon is run as follows::

        /sbin/cachefilesd [-d]* [-s] [-n] [-f

The flags are:

 ``-d``
        Increase the debugging level. This ca
        is cumulative with itself.

 ``-s``
        Send messages to stderr instead of sys

 ``-n``
        Don't daemonise and go into background

 ``-f <configfile>``
        Use an alternative configuration file


Things to Avoid
===============

Do not mount other things within the cache as
kernel module contains its own very cut-down p
mountpoints, but the daemon can't avoid them.

Do not create, rename or unlink files and dire
cache is active, as this may cause the state t

Renaming files in the cache might make objects
filename is part of the lookup key).

Do not change or remove the extended attribute
cache as this will cause the cache state manag

Do not create files or directories in the cach
serve incorrect data.

Do not chmod files in the cache. The module c
permissions to prevent random users being able


Cache Culling
=============

The cache may need culling occasionally to mak
discarding objects from the cache that have be
anything else. Culling is based on the access
directories are culled if not in use.

Cache culling is done on the basis of the perc
percentage of files available in the underlyin
"limits":

 brun, frun
     If the amount of free space and the numbe
     rises above both these limits, then culli

 bcull, fcull
     If the amount of available space or the n
     cache falls below either of these limits,

 bstop, fstop
     If the amount of available space or the n
     cache falls below either of these limits,
     disk space or files is permitted until cu
     these limits again.

These must be configured thusly::

        0 <= bstop < bcull < brun < 100
        0 <= fstop < fcull < frun < 100

Note that these are percentages of available s
_not_ appear as 100 minus the percentage displ

The userspace daemon scans the cache to build
These are then culled in least recently used o
started as soon as space is made in the table.
their atimes have changed or if the kernel mod


Cache Structure
===============

The CacheFiles module will create two director
given:

 * cache/
 * graveyard/

The active cache objects all reside in the fir
kernel module moves any retired or culled obje
to the graveyard from which the daemon will ac

The daemon uses dnotify to monitor the graveya
anything that appears therein.


The module represents index objects as directo
"J...". Note that the "cache/" directory is i

Data objects are represented as files if they
if they do. Their filenames all begin "D..."
directory, data objects will have a file in th
actually holds the data.

Special objects are similar to data objects, e
"S..." or "T...".


If an object has children, then it will be rep
Immediately in the representative directory ar
named for hash values of the child object keys
this directory, if possible, will be placed th
objects::

         /INDEX    /INDEX     /INDEX
        /=========/==========/================
        cache/@4a/I03nfs/@30/Ji000000000000000
        cache/@4a/I03nfs/@30/Ji000000000000000
        cache/@4a/I03nfs/@30/Ji000000000000000
        cache/@4a/I03nfs/@30/Ji000000000000000


If the key is so long that it exceeds NAME_MAX
it, then it will be cut into pieces, the first
make a nest of directories, and the last one o
inside the last directory. The names of the i
'+' prepended::

        J1223/@23/+xy...z/+kl...m/Epqr


Note that keys are raw data, and not only may
they may also contain things like '/' and NUL
be suitable for turning directly into a filena

To handle this, CacheFiles will use a suitably
"base-64" encode ones that aren't directly sui
object filenames indicate the encoding:

        =============== =============== ======
        OBJECT TYPE     PRINTABLE       ENCODE
        =============== =============== ======
        Index           "I..."          "J..."
        Data            "D..."          "E..."
        Special         "S..."          "T..."
        =============== =============== ======

Intermediate directories are always "@" or "+"


Each object in the cache has an extended attri
type ID (required to distinguish special objec
the netfs. The latter is used to detect stale
or retire them.


Note that CacheFiles will erase from the cache
any file of an incorrect type (such as a FIFO


Security Model and SELinux
==========================

CacheFiles is implemented to deal properly wit
the Linux kernel and the SELinux facility.

One of the problems that CacheFiles faces is t
behalf of a process, and running in that proce
security context that is not appropriate for a
because the files in the cache are inaccessibl
the process creates a file in the cache, that
processes.

The way CacheFiles works is to temporarily cha
fsgid and actor security label) that the proce
security context of the process when it is the ta
some other process (so signalling and suchlike


When the CacheFiles module is asked to bind to

 (1) Finds the security label attached to the
     that as the security label with which it
     this is::

        cachefiles_var_t

 (2) Finds the security label of the process w
     (presumed to be the cachefilesd daemon),

        cachefilesd_t

     and asks LSM to supply a security ID as w
     daemon's label. By default, this will be

        cachefiles_kernel_t

     SELinux transitions the daemon's security
     based on a rule of this form in the polic

        type_transition <daemon's-ID> kernel_t

     For instance::

        type_transition cachefilesd_t kernel_t


The module's security ID gives it permission t
and directories in the cache, to find and acce
cache, to set and access extended attributes o
write files in the cache.

The daemon's security ID gives it only a very
may scan directories, stat files and erase fil
not read or write files in the cache, and so i
data cached therein; nor is it permitted to cr


There are policy source files available in:

        https://people.redhat.com/~dhowells/fs

and later versions.
In that tarball, see the

        cachefilesd.te
        cachefilesd.fc
        cachefilesd.if

They are built and installed directly by the R

If a non-RPM based system is being used, then
directory and run::

        make -f /usr/share/selinux/devel/Makef
        semodule -i cachefilesd.pp

You will need checkpolicy and selinux-policy-d
build.


By default, the cache is located in /var/fscac
it should be elsewhere, then either the above
an auxiliary policy must be installed to label
cache.

For instructions on how to add an auxiliary po
located elsewhere when SELinux is in enforcing

        /usr/share/doc/cachefilesd-*/move-cach

When the cachefilesd rpm is installed; alterna
in the sources.


A Note on Security
==================

CacheFiles makes use of the split security in
its own task_security structure, and redirects
when it acts on behalf of another process, in

The reason it does this is that it calls vfs_m
bypassing security and calling inode ops direc
may deny the CacheFiles access to the cache da
circumstances the caching code is running in t
process issued the original syscall on the net

Furthermore, should CacheFiles create a file o
parameters with which that object is created (UID, G
derived from that process that issued the syst
preventing other processes from accessing the
cache management daemon (cachefilesd).

What is required is to temporarily override th
issued the system call. We can't, however, ju
security data as that affects the process as a
This means it may lose signals or ptrace event
the process looks like in /proc.

So CacheFiles makes use of a logical split in
objective security (task->real_cred) and the s
The objective security holds the intrinsic sec
is never overridden. This is what appears in
process is the target of an operation by some
example).

The subjective security holds the active secur
may be overridden. This is not seen externall
acts upon another object, for example SIGKILLi
file.

LSM hooks exist that allow SELinux (or Smack o
for CacheFiles to run in a context of a specif
files and directories with another security la


Statistical Information
=======================

If FS-Cache is compiled with the following opt

        CONFIG_CACHEFILES_HISTOGRAM=y

then it will gather certain statistics and dis

 /proc/fs/cachefiles/histogram

     ::

        cat /proc/fs/cachefiles/histogram
        JIFS  SECS  LOOKUPS   MKDIRS    CREATE
        ===== ===== ========= ========= ======

     This shows the breakdown of the number of
     between 0 jiffies and HZ-1 jiffies a vari
     columns are as follows:

        =======         ======================
        COLUMN          TIME MEASUREMENT
        =======         ======================
        LOOKUPS         Length of time to perf
        MKDIRS          Length of time to perf
        CREATES         Length of time to perf
        =======         ======================

     Each row shows the number of events that
     Each step is 1 jiffy in size. The JIFS c
     jiffy range covered, and the SECS field t


Debugging
=========

If CONFIG_CACHEFILES_DEBUG is enabled, the Cac
debugging enabled by adjusting the value in::

        /sys/module/cachefiles/parameters/debu

This is a bitmask of debugging streams to enab

        ======= ======= ======================
        BIT     VALUE   STREAM
        ======= ======= ======================
        0       1       General
        1       2
        2       4
        ======= ======= ======================

The appropriate set of values should be OR'd t
the control file. For example::

        echo $((1|4|8)) >/sys/module/cachefile

will turn on all function entry debugging.


On-demand Read
==============

When working in its original mode, CacheFiles
remote networking fs - while in on-demand read
scenario where on-demand read semantics are ne
distribution.

The essential difference between these two mod
occurs: In the original mode, the netfs will f
server and then write it to the cache file; in
the data and writing it into the cache is dele

``CONFIG_CACHEFILES_ONDEMAND`` should be enabl


Protocol Communication
----------------------

The on-demand read mode uses a simple protocol
and user daemon. The protocol can be modeled a

        kernel --[request]--> user daemon --[r

CacheFiles will send requests to the user daem
should poll the devnode ('/dev/cachefiles') to
request to be processed. A POLLIN event will
request.

The user daemon then reads the devnode to fetc
be noted that each read only gets one request.
the request, the user daemon should write the

Each request starts with a message header of t

        struct cachefiles_msg {
                __u32 msg_id;
                __u32 opcode;
                __u32 len;
                __u32 object_id;
                __u8  data[];
        };

where:

        * ``msg_id`` is a unique ID identifyin
          requests.

        * ``opcode`` indicates the type of thi

        * ``object_id`` is a unique ID identif

        * ``data`` indicates the payload of th

        * ``len`` indicates the whole length o
          header and following type-specific p


Turning on On-demand Mode
-------------------------

An optional parameter becomes available to the

        bind [ondemand]

When the "bind" command is given no argument,
When it is given the "ondemand" argument, i.e.
mode will be enabled.


The OPEN Request
----------------

When the netfs opens a cache file for the firs
CACHEFILES_OP_OPEN opcode, a.k.a an OPEN reque
daemon.
The payload format is of the form::

        struct cachefiles_open {
                __u32 volume_key_size;
                __u32 cookie_key_size;
                __u32 fd;
                __u32 flags;
                __u8  data[];
        };

where:

        * ``data`` contains the volume_key fol
          The volume key is a NUL-terminated s
          data.

        * ``volume_key_size`` indicates the si

        * ``cookie_key_size`` indicates the si

        * ``fd`` indicates an anonymous fd ref
          which the user daemon can perform wr
          cache file.


The user daemon can use the given (volume_key,
the requested cache file. With the given anon
fetch the data and write it to the cache file
kernel has not triggered a cache miss yet.

Note that each cache file has a unique obj
anonymous fds. The user daemon may duplicate
anonymous fd indicated by the @fd field throug
be mapped to multiple anonymous fds, while the
maintain the mapping.

When implementing a user daemon, please be car
``/proc/sys/fs/nr_open`` and ``/proc/sys/fs/fi
be huge since they're related to the number of
open files of each individual filesystem.

The user daemon should reply to the OPEN request
open) command on the devnode::

        copen <msg_id>,<cache_size>

where:

        * ``msg_id`` must match the msg_id fie

        * When >= 0, ``cache_size`` indicates
          when < 0, ``cache_size`` indicates a
          user daemon.


The CLOSE Request
-----------------

When a cookie is withdrawn, a CLOSE request (opco
sent to the user daemon. This tells the user
associated with the given object_id. The CLOS
and shouldn't be replied to.
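
As an illustration of the message header and OPEN payload layouts described
above, the following is a minimal sketch of how a user daemon might decode a
message it has read from the devnode and format the matching copen reply. It
is not the cachefilesd implementation. The helper names (``parse_msg``,
``parse_open``, ``copen_reply``) are hypothetical, and little-endian byte
order is assumed only to keep the example deterministic; a real daemon would
use the host's native byte order. Stripping a trailing NUL from the volume key
is likewise an assumption of this sketch.

```python
import struct

# Field order is taken from the structures above. "<" (little-endian) is an
# assumption made for this example only.
MSG_HEADER = struct.Struct("<IIII")    # msg_id, opcode, len, object_id
OPEN_PAYLOAD = struct.Struct("<IIII")  # volume_key_size, cookie_key_size, fd, flags


def parse_msg(buf):
    """Split one message read from the devnode into header fields and payload."""
    msg_id, opcode, length, object_id = MSG_HEADER.unpack_from(buf, 0)
    # len covers the header plus the type-specific payload.
    if length < MSG_HEADER.size or length > len(buf):
        raise ValueError("inconsistent len field")
    return {
        "msg_id": msg_id,
        "opcode": opcode,
        "object_id": object_id,
        "payload": buf[MSG_HEADER.size:length],
    }


def parse_open(payload):
    """Decode an OPEN payload: fixed fields, then volume key, then cookie key."""
    vk_size, ck_size, fd, flags = OPEN_PAYLOAD.unpack_from(payload, 0)
    keys = payload[OPEN_PAYLOAD.size:]
    if vk_size + ck_size > len(keys):
        raise ValueError("key sizes exceed payload")
    return {
        # The volume key is a NUL-terminated string; drop the terminator.
        "volume_key": keys[:vk_size].rstrip(b"\0"),
        # The cookie key is binary data and is kept as raw bytes.
        "cookie_key": keys[vk_size:vk_size + ck_size],
        "fd": fd,
        "flags": flags,
    }


def copen_reply(msg_id, cache_size):
    """Format the 'copen <msg_id>,<cache_size>' command to write to the devnode."""
    return f"copen {msg_id},{cache_size}".encode()
```

A daemon loop would call ``parse_msg()`` on each buffer returned by a read of
the devnode, dispatch on ``opcode``, and for an OPEN request write the result
of ``copen_reply()`` back to the devnode.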


The READ Request
----------------

When a cache miss is encountered in on-demand
READ request (opcode CACHEFILES_OP_READ) to th
daemon to fetch the contents of the requested
form::

        struct cachefiles_read {
                __u64 off;
                __u64 len;
        };

where:

        * ``off`` indicates the starting offse

        * ``len`` indicates the length of the


When it receives a READ request, the user daem
and write it to the cache file identified by o

When it has finished processing the READ reque
by using the CACHEFILES_IOC_READ_COMPLETE ioct
associated with the object_id given in the REA
form::

        ioctl(fd, CACHEFILES_IOC_READ_COMPLETE

where:

        * ``fd`` is one of the anonymous fds a
          given.

        * ``msg_id`` must match the msg_id fie
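
The READ handling described above can be sketched as follows. This is only an
illustration under stated assumptions: the helper names (``parse_read``,
``fill_range``) are hypothetical, little-endian byte order is assumed for the
example, and the data source is any seekable binary file object standing in
for wherever the daemon fetches data from. The final
CACHEFILES_IOC_READ_COMPLETE ioctl is deliberately left out because its
request number is not given in this document.

```python
import io
import struct

# Layout of struct cachefiles_read: two 64-bit fields, off then len.
# "<" (little-endian) is an assumption made for this example only.
READ_PAYLOAD = struct.Struct("<QQ")


def parse_read(payload):
    """Return (off, len) from the payload of a READ request."""
    off, length = READ_PAYLOAD.unpack_from(payload, 0)
    return off, length


def fill_range(cache_file: io.BufferedIOBase, source: io.BufferedIOBase,
               off: int, length: int) -> int:
    """Copy the requested range from source into the cache file.

    cache_file stands for a file object wrapping one of the anonymous fds
    associated with the object_id; source is a hypothetical data origin.
    Returns the number of bytes written.
    """
    source.seek(off)
    data = source.read(length)
    cache_file.seek(off)
    cache_file.write(data)
    return len(data)
```

After ``fill_range()`` succeeds, the daemon would notify the kernel with the
completion ioctl on the same anonymous fd, passing the msg_id of the READ
request.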