.. SPDX-License-Identifier: GPL-2.0

===============================
FS-Cache Network Filesystem API
===============================

There's an API by which a network filesystem can make use of the FS-Cache
facilities.  This is based around a number of principles:

 (1) Caches can store a number of different object types.  There are two main
     object types: indices and files.  The first is a special type used by
     FS-Cache to make finding objects faster and to make retiring of groups of
     objects easier.

 (2) Every index, file or other object is represented by a cookie.  This cookie
     may or may not have anything associated with it, but the netfs doesn't
     need to care.

 (3) Barring the top-level index (one entry per cached netfs), the index
     hierarchy for each netfs is structured according to the whim of the netfs.

This API is declared in <linux/fscache.h>.

.. This document contains the following sections:

	 (1) Network filesystem definition
	 (2) Index definition
	 (3) Object definition
	 (4) Network filesystem (un)registration
	 (5) Cache tag lookup
	 (6) Index registration
	 (7) Data file registration
	 (8) Miscellaneous object registration
	 (9) Setting the data file size
	(10) Page alloc/read/write
	(11) Page uncaching
	(12) Index and data file consistency
	(13) Cookie enablement
	(14) Miscellaneous cookie operations
	(15) Cookie unregistration
	(16) Index invalidation
	(17) Data file invalidation
	(18) FS-Cache specific page flags.


Network Filesystem Definition
=============================

FS-Cache needs a description of the network filesystem.  This is specified
using a record of the following structure::

	struct fscache_netfs {
		uint32_t			version;
		const char			*name;
		struct fscache_cookie		*primary_index;
		...
	};

The first two fields should be filled in before registration, and the third
will be filled in by the registration function; any other fields should just be
ignored and are for internal use only.

The fields are:


 (1) The name of the netfs (used as the key in the toplevel index).

 (2) The version of the netfs (if the name matches but the version doesn't, the
     entire in-cache hierarchy for this netfs will be scrapped and begun
     afresh).

 (3) The cookie representing the primary index will be allocated according to
     another parameter passed into the registration function.

For example, kAFS (linux/fs/afs/) uses the following definitions to describe
itself::

	struct fscache_netfs afs_cache_netfs = {
		.version	= 0,
		.name		= "afs",
	};


Index Definition
================

Indices are used for two purposes:

 (1) To aid the finding of a file based on a series of keys (such as AFS's
     "cell", "volume ID", "vnode ID").

 (2) To make it easier to discard a subset of all the files cached based around
     a particular key - for instance to mirror the removal of an AFS volume.

However, since it's unlikely that any two netfs's are going to want to define
their index hierarchies in quite the same way, FS-Cache tries to impose as few
restraints as possible on how an index is structured and where it is placed in
the tree.  The netfs can even mix indices and data files at the same level, but
it's not recommended.

Each index entry consists of a key of indeterminate length plus some auxiliary
data, also of indeterminate length.

There are some limits on indices:

 (1) Any index containing non-index objects should be restricted to a single
     cache.  Any such objects created within an index will be created in the
     first cache only.  The cache in which an index is created can be
     controlled by cache tags (see below).

 (2) The entry data must be atomically journallable, so it is limited to about
     400 bytes at present.  At least 400 bytes will be available.

 (3) The depth of the index tree should be judged with care as the search
     function is recursive.  Too many layers will run the kernel out of stack.


Object Definition
=================

To define an object, a structure of the following type should be filled out::

	struct fscache_cookie_def
	{
		uint8_t name[16];
		uint8_t type;

		struct fscache_cache_tag *(*select_cache)(
			const void *parent_netfs_data,
			const void *cookie_netfs_data);

		enum fscache_checkaux (*check_aux)(void *cookie_netfs_data,
						   const void *data,
						   uint16_t datalen,
						   loff_t object_size);

		void (*get_context)(void *cookie_netfs_data, void *context);

		void (*put_context)(void *cookie_netfs_data, void *context);

		void (*mark_pages_cached)(void *cookie_netfs_data,
					  struct address_space *mapping,
					  struct pagevec *cached_pvec);
	};

This has the following fields:

 (1) The type of the object [mandatory].

     This is one of the following values:

	FSCACHE_COOKIE_TYPE_INDEX
	    This defines an index, which is a special FS-Cache type.

	FSCACHE_COOKIE_TYPE_DATAFILE
	    This defines an ordinary data file.

	Any other value between 2 and 255
	    This defines an extraordinary object such as an XATTR.

 (2) The name of the object type (NUL terminated unless all 16 chars are used)
     [optional].

 (3) A function to select the cache in which to store an index [optional].

     This function is invoked when an index needs to be instantiated in a cache
     during the instantiation of a non-index object.  Only the immediate index
     parent for the non-index object will be queried.  Any indices above that
     in the hierarchy may be stored in multiple caches.
     This function does not
     need to be supplied for any non-index object or any index that will only
     have index children.

     If this function is not supplied or if it returns NULL then the first
     cache in the parent's list will be chosen, or failing that, the first
     cache in the master list.

 (4) A function to check the auxiliary data [optional].

     This function will be called to check that a match found in the cache for
     this object is valid.  For instance with AFS it could check the auxiliary
     data against the data version number returned by the server to determine
     whether the index entry in a cache is still valid.

     If this function is absent, it will be assumed that matching objects in a
     cache are always valid.

     The function is also passed the cache's idea of the object size and may
     use this to manage coherency also.

     If present, the function should return one of the following values:

	FSCACHE_CHECKAUX_OKAY
	    - the entry is okay as is

	FSCACHE_CHECKAUX_NEEDS_UPDATE
	    - the entry requires update

	FSCACHE_CHECKAUX_OBSOLETE
	    - the entry should be deleted

     This function can also be used to extract data from the auxiliary data in
     the cache and copy it into the netfs's structures.

 (5) A pair of functions to manage contexts for the completion callback
     [optional].

     The cache read/write functions are passed a context which is then passed
     to the I/O completion callback function.  To ensure this context remains
     valid until after the I/O completion is called, two functions may be
     provided: one to get an extra reference on the context, and one to drop a
     reference to it.

     If the context is not used or is a type of object that won't go out of
     scope, then these functions are not required.  These functions are not
     required for indices as indices may not contain data.  These functions may
     be called in interrupt context and so may not sleep.

 (6) A function to mark a page as retaining cache metadata [optional].

     This is called by the cache to indicate that it is retaining in-memory
     information for this page and that the netfs should uncache the page when
     it has finished.  This does not indicate whether there's data on the disk
     or not.  Note that several pages at once may be presented for marking.

     The PG_fscache bit is set on the pages before this function would be
     called, so the function need not be provided if this is sufficient.

     This function is not required for indices as they're not permitted data.

 (7) A function to unmark all the pages retaining cache metadata [mandatory].

     This is called by FS-Cache to indicate that a backing store is being
     unbound from a cookie and that all the marks on the pages should be
     cleared to prevent confusion.  Note that the cache will have torn down all
     its tracking information so that the pages don't need to be explicitly
     uncached.

     This function is not required for indices as they're not permitted data.


Network Filesystem (Un)registration
===================================

The first step is to declare the network filesystem to the cache.  This also
involves specifying the layout of the primary index (for AFS, this would be the
"cell" level).

The registration function is::

	int fscache_register_netfs(struct fscache_netfs *netfs);

It just takes a pointer to the netfs definition.
It returns 0 or an error as
appropriate.

For kAFS, registration is done as follows::

	ret = fscache_register_netfs(&afs_cache_netfs);

The last step is, of course, unregistration::

	void fscache_unregister_netfs(struct fscache_netfs *netfs);


Cache Tag Lookup
================

FS-Cache permits the use of more than one cache.  To permit particular index
subtrees to be bound to particular caches, the second step is to look up cache
representation tags.  This step is optional; it can be left entirely up to
FS-Cache as to which cache should be used.  The problem with doing that is that
FS-Cache will always pick the first cache that was registered.

To get the representation for a named tag::

	struct fscache_cache_tag *fscache_lookup_cache_tag(const char *name);

This takes a text string as the name and returns a representation of a tag.  It
will never return an error.  It may return a dummy tag, however, if it runs out
of memory; this will inhibit caching with this tag.

Any representation so obtained must be released by passing it to this function::

	void fscache_release_cache_tag(struct fscache_cache_tag *tag);

The tag will be retrieved by FS-Cache when it calls the object definition
operation select_cache().


Index Registration
==================

The third step is to inform FS-Cache about part of an index hierarchy that can
be used to locate files.  This is done by requesting a cookie for each index in
the path to the file::

	struct fscache_cookie *
	fscache_acquire_cookie(struct fscache_cookie *parent,
			       const struct fscache_cookie_def *def,
			       const void *index_key,
			       size_t index_key_len,
			       const void *aux_data,
			       size_t aux_data_len,
			       void *netfs_data,
			       loff_t object_size,
			       bool enable);

This function creates an index entry in the index represented by parent,
filling in the index entry by calling the operations pointed to by def.

A unique key that represents the object within the parent must be pointed to by
index_key and is of length index_key_len.

An optional blob of auxiliary data that is to be stored within the cache can be
pointed to with aux_data and should be of length aux_data_len.  This would
typically be used for storing coherency data.

The netfs may pass an arbitrary value in netfs_data and this will be presented
to it in the event of any calling back.  This may also be used in tracing or
logging of messages.

The cache tracks the size of the data attached to an object and this is set to
be object_size.  For indices, this should be 0.  This value will be passed to
the ->check_aux() callback.

Note that this function never returns an error - all errors are handled
internally.  It may, however, return NULL to indicate no cookie.  It is quite
acceptable to pass this token back to this function as the parent to another
acquisition (or even to the relinquish cookie, read page and write page
functions - see below).

Note also that no indices are actually created in a cache until a non-index
object needs to be created somewhere down the hierarchy.  Furthermore, an index
may be created in several different caches independently at different times.
This is all handled transparently, and the netfs doesn't see any of it.

A cookie will be created in the disabled state if enable is false.  A cookie
must be enabled to do anything with it.  A disabled cookie can be enabled by
calling fscache_enable_cookie() (see below).

For example, with AFS, a cell would be added to the primary index.  This index
entry would have a dependent inode containing volume mappings within this cell::

	cell->cache =
		fscache_acquire_cookie(afs_cache_netfs.primary_index,
				       &afs_cell_cache_index_def,
				       cell->name, strlen(cell->name),
				       NULL, 0,
				       cell, 0, true);

And then a particular volume could be added to that index by ID, creating
another index for vnodes (AFS inode equivalents)::

	volume->cache =
		fscache_acquire_cookie(volume->cell->cache,
				       &afs_volume_cache_index_def,
				       &volume->vid, sizeof(volume->vid),
				       NULL, 0,
				       volume, 0, true);


Data File Registration
======================

The fourth step is to request a data file be created in the cache.  This is
identical to index cookie acquisition.  The only difference is that the type in
the object definition should be something other than index type::

	vnode->cache =
		fscache_acquire_cookie(volume->cache,
				       &afs_vnode_cache_object_def,
				       &key, sizeof(key),
				       &aux, sizeof(aux),
				       vnode, vnode->status.size, true);
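
The auxiliary data handed over at acquisition time is what the ->check_aux()
operation later gets to inspect.  The following is only a userspace sketch of
that idea, assuming the coherency blob is a single data version number.  Every
``sketch_`` name, the blob layout and the enum are hypothetical stand-ins, not
the kernel definitions, and the object_size argument is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the FSCACHE_CHECKAUX_* return values. */
enum sketch_checkaux {
	SKETCH_CHECKAUX_OKAY,		/* the entry is okay as is */
	SKETCH_CHECKAUX_NEEDS_UPDATE,	/* the entry requires update */
	SKETCH_CHECKAUX_OBSOLETE,	/* the entry should be deleted */
};

/* Hypothetical coherency blob stored in the cache alongside the object. */
struct sketch_vnode_aux {
	uint64_t data_version;
};

/* Hypothetical netfs-side state, passed as netfs_data at acquisition. */
struct sketch_vnode {
	uint64_t data_version;	/* version last reported by the server */
};

/*
 * Compare the blob found in the cache against what the netfs currently
 * believes.  A blob of the wrong length or with a stale version is
 * treated as obsolete.
 */
enum sketch_checkaux sketch_check_aux(void *cookie_netfs_data,
				      const void *data, uint16_t datalen)
{
	const struct sketch_vnode *vnode = cookie_netfs_data;
	struct sketch_vnode_aux aux;

	assert(vnode != NULL);

	if (data == NULL || datalen != sizeof(aux))
		return SKETCH_CHECKAUX_OBSOLETE;

	memcpy(&aux, data, sizeof(aux));
	if (aux.data_version != vnode->data_version)
		return SKETCH_CHECKAUX_OBSOLETE;

	return SKETCH_CHECKAUX_OKAY;
}
```

A netfs that can repair a stale entry in place, rather than discard it, would
presumably return the NEEDS_UPDATE value instead; the sketch doesn't model
that case.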


Miscellaneous Object Registration
=================================

An optional step is to request an object of miscellaneous type be created in
the cache.  This is almost identical to index cookie acquisition.  The only
difference is that the type in the object definition should be something other
than index type.  While the parent object could be an index, it's more likely
it would be some other type of object such as a data file::

	xattr->cache =
		fscache_acquire_cookie(vnode->cache,
				       &afs_xattr_cache_object_def,
				       &xattr->name, strlen(xattr->name),
				       NULL, 0,
				       xattr, strlen(xattr->val), true);

Miscellaneous objects might be used to store extended attributes or directory
entries for example.


Setting the Data File Size
==========================

The fifth step is to set the physical attributes of the file, such as its size.
This doesn't automatically reserve any space in the cache, but permits the
cache to adjust its metadata for data tracking appropriately::

	int fscache_attr_changed(struct fscache_cookie *cookie);

The cache will return -ENOBUFS if there is no backing cache or if there is no
space to allocate any extra metadata required in the cache.

Note that attempts to read or write data pages in the cache over this size may
be rebuffed with -ENOBUFS.

This operation schedules an attribute adjustment to happen asynchronously at
some point in the future, and as such, it may happen after the function returns
to the caller.  The attribute adjustment excludes read and write operations.


Page alloc/read/write
=====================

And the sixth step is to store and retrieve pages in the cache.  There are
three functions that are used to do this.

Note:

 (1) A page should not be re-read or re-allocated without uncaching it first.

 (2) A read or allocated page must be uncached when the netfs page is released
     from the pagecache.

 (3) A page should only be written to the cache if previously read or allocated.

This permits the cache to maintain its page tracking in proper order.


Page Read
---------

Firstly, the netfs should ask FS-Cache to examine the caches and read the
contents cached for a particular page of a particular file if present, or else
allocate space to store the contents if not::

	typedef
	void (*fscache_rw_complete_t)(struct page *page,
				      void *context,
				      int error);

	int fscache_read_or_alloc_page(struct fscache_cookie *cookie,
				       struct page *page,
				       fscache_rw_complete_t end_io_func,
				       void *context,
				       gfp_t gfp);

The cookie argument must specify a cookie for an object that isn't an index,
the page specified will have the data loaded into it (and is also used to
specify the page number), and the gfp argument is used to control how any
memory allocations made are satisfied.

If the cookie indicates the inode is not cached:

 (1) The function will return -ENOBUFS.

Else if there's a copy of the page resident in the cache:

 (1) The mark_pages_cached() cookie operation will be called on that page.

 (2) The function will submit a request to read the data from the cache's
     backing device directly into the page specified.

 (3) The function will return 0.

 (4) When the read is complete, end_io_func() will be invoked with:

       * The netfs data supplied when the cookie was created.

       * The page descriptor.

       * The context argument passed to the above function.  This will be
         maintained with the get_context/put_context functions mentioned above.

       * An argument that's 0 on success or negative for an error code.

     If an error occurs, it should be assumed that the page contains no usable
     data.  fscache_readpages_cancel() may need to be called.

     end_io_func() will be called in process context if the read results in
     an error, but it might be called in interrupt context if the read is
     successful.

Otherwise, if there's not a copy available in cache, but the cache may be able
to store the page:

 (1) The mark_pages_cached() cookie operation will be called on that page.

 (2) A block may be reserved in the cache and attached to the object at the
     appropriate place.

 (3) The function will return -ENODATA.

This function may also return -ENOMEM or -EINTR, in which case it won't have
read any data from the cache.


Page Allocate
-------------

Alternatively, if there's not expected to be any data in the cache for a page
because the file has been extended, a block can simply be allocated instead::

	int fscache_alloc_page(struct fscache_cookie *cookie,
			       struct page *page,
			       gfp_t gfp);

This is similar to the fscache_read_or_alloc_page() function, except that it
never reads from the cache.  It will return 0 if a block has been allocated,
rather than -ENODATA as the other would.  One or the other must be performed
before writing to the cache.

The mark_pages_cached() cookie operation will be called on the page if
successful.
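
As a rough guide to how a netfs might act on the value returned by
fscache_read_or_alloc_page(), here is a userspace sketch that maps each
documented return code to a next step.  The enum and function names are
hypothetical, and the choice of action for each code is an interpretation of
the text above rather than a rule laid down by FS-Cache.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical next steps for a netfs page-read path. */
enum sketch_next_step {
	SKETCH_WAIT_FOR_CACHE,	 /* read submitted; end_io_func() finishes it */
	SKETCH_FETCH_THEN_WRITE, /* block reserved; fetch from the server, then
				  * write the page to the cache */
	SKETCH_FETCH_ONLY,	 /* object isn't cached; just use the server */
	SKETCH_ERROR,		 /* nothing was read from the cache; what to
				  * do next is up to the netfs */
};

/* Map a (zero or negative) return value onto a next step. */
enum sketch_next_step sketch_after_read_or_alloc(int ret)
{
	assert(ret <= 0);

	switch (ret) {
	case 0:
		return SKETCH_WAIT_FOR_CACHE;
	case -ENODATA:
		return SKETCH_FETCH_THEN_WRITE;
	case -ENOBUFS:
		return SKETCH_FETCH_ONLY;
	default:		/* for example -ENOMEM or -EINTR */
		return SKETCH_ERROR;
	}
}
```

The same mapping would apply after fscache_alloc_page(), except that a return
of 0 there means a block was allocated and so corresponds to the
fetch-then-write step.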


Page Write
----------

Secondly, if the netfs changes the contents of the page (either due to an
initial download or if a user performs a write), then the page should be
written back to the cache::

	int fscache_write_page(struct fscache_cookie *cookie,
			       struct page *page,
			       loff_t object_size,
			       gfp_t gfp);

The cookie argument must specify a data file cookie, the page specified should
contain the data to be written (and is also used to specify the page number),
object_size is the revised size of the object and the gfp argument is used to
control how any memory allocations made are satisfied.

The page must have first been read or allocated successfully and must not have
been uncached before writing is performed.

If the cookie indicates the inode is not cached then:

 (1) The function will return -ENOBUFS.

Else if space can be allocated in the cache to hold this page:

 (1) PG_fscache_write will be set on the page.

 (2) The function will submit a request to write the data to the cache's
     backing device directly from the page specified.

 (3) The function will return 0.

 (4) When the write is complete PG_fscache_write is cleared on the page and
     anyone waiting for that bit will be woken up.

Else if there's no space available in the cache, -ENOBUFS will be returned.  It
is also possible for the PG_fscache_write bit to be cleared when no write took
place if unforeseen circumstances arose (such as a disk error).

Writing takes place asynchronously.


Multiple Page Read
------------------

A facility is provided to read several pages at once, as requested by the
readpages() address space operation::

        int fscache_read_or_alloc_pages(struct fscache_cookie *cookie,
                                        struct address_space *mapping,
                                        struct list_head *pages,
                                        int *nr_pages,
                                        fscache_rw_complete_t end_io_func,
                                        void *context,
                                        gfp_t gfp);

This works in a similar way to fscache_read_or_alloc_page(), except:

 (1) Any page it can retrieve data for is removed from pages and nr_pages and
     dispatched for reading to the disk.  Reads of adjacent pages on disk may
     be merged for greater efficiency.

 (2) The mark_pages_cached() cookie operation will be called on several pages
     at once if they're being read or allocated.

 (3) If there was a general error, then that error will be returned.

     Else if some pages couldn't be allocated or read, then -ENOBUFS will be
     returned.

     Else if some pages couldn't be read but were allocated, then -ENODATA will
     be returned.

     Otherwise, if all pages had reads dispatched, then 0 will be returned, the
     list will be empty and ``*nr_pages`` will be 0.

 (4) end_io_func will be called once for each page being read as the reads
     complete.  It will be called in process context if error != 0, but it may
     be called in interrupt context if there is no error.

Note that a return of -ENODATA, -ENOBUFS or any other error does not preclude
some of the pages being read and some being allocated.  Those pages will have
been marked appropriately and will need uncaching.
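
The return-value rules for fscache_read_or_alloc_pages() form a strict
precedence order.  The following userspace sketch models only that ordering.
It is not kernel code; the structure and function names are hypothetical and
exist purely for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical summary of what happened during one multi-page read call. */
struct model_readpages_outcome {
	int general_error;	/* negative errno, or 0 if there was none */
	int nr_rejected;	/* pages that could be neither read nor allocated */
	int nr_allocated;	/* pages allocated in the cache but not read */
};

/* Apply the documented precedence: error, then -ENOBUFS, then -ENODATA, then 0. */
int model_readpages_result(const struct model_readpages_outcome *o)
{
	if (o->general_error)
		return o->general_error;
	if (o->nr_rejected > 0)
		return -ENOBUFS;
	if (o->nr_allocated > 0)
		return -ENODATA;
	return 0;	/* every page had a read dispatched */
}
```

Because -ENOBUFS outranks -ENODATA, a caller seeing -ENOBUFS cannot conclude
that no pages were allocated or dispatched, which is why the pages left on the
list still need uncaching.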


Cancellation of Unread Pages
----------------------------

If one or more pages are passed to fscache_read_or_alloc_pages() but not then
read from the cache and also not read from the underlying filesystem then
those pages will need to have any marks and reservations removed.  This can be
done by calling::

        void fscache_readpages_cancel(struct fscache_cookie *cookie,
                                      struct list_head *pages);

prior to returning to the caller.  The cookie argument should be as passed to
fscache_read_or_alloc_pages().  Every page in the pages list will be examined
and any that have PG_fscache set will be uncached.


Page Uncaching
==============

To uncache a page, this function should be called::

        void fscache_uncache_page(struct fscache_cookie *cookie,
                                  struct page *page);

This function permits the cache to release any in-memory representation it
might be holding for this netfs page.  This function must be called once for
each page on which the read or write page functions above have been called to
make sure the cache's in-memory tracking information gets torn down.

Note that pages can't be explicitly deleted from a data file.  The whole
data file must be retired (see the relinquish cookie function below).

Furthermore, note that this does not cancel the asynchronous read or write
operation started by the read/alloc and write functions, so the page
invalidation functions must use::

        bool fscache_check_page_write(struct fscache_cookie *cookie,
                                      struct page *page);

to see if a page is being written to the cache, and::

        void fscache_wait_on_page_write(struct fscache_cookie *cookie,
                                        struct page *page);

to wait for it to finish if it is.


When releasepage() is being implemented, a special FS-Cache function exists to
manage the heuristics of coping with vmscan trying to eject pages, which may
conflict with the cache trying to write pages to the cache (which may itself
need to allocate memory)::

        bool fscache_maybe_release_page(struct fscache_cookie *cookie,
                                        struct page *page,
                                        gfp_t gfp);

This takes the netfs cookie, and the page and gfp arguments as supplied to
releasepage().  It will return false if the page cannot be released yet for
some reason and if it returns true, the page has been uncached and can now be
released.

To make a page available for release, this function may wait for an outstanding
storage request to complete, or it may attempt to cancel the storage request -
in which case the page will not be stored in the cache this time.


Bulk Image Page Uncache
-----------------------

A convenience routine is provided to perform an uncache on all the pages
attached to an inode.  This assumes that the pages on the inode correspond on a
1:1 basis with the pages in the cache::

        void fscache_uncache_all_inode_pages(struct fscache_cookie *cookie,
                                             struct inode *inode);

This takes the netfs cookie that the pages were cached with and the inode that
the pages are attached to.  This function will wait for pages to finish being
written to the cache and for the cache to finish with the page generally.  No
error is returned.


Index and Data File consistency
===============================

To find out whether auxiliary data for an object is up to date within the
cache, the following function can be called::

        int fscache_check_consistency(struct fscache_cookie *cookie,
                                      const void *aux_data);

This will call back to the netfs to check whether the auxiliary data associated
with a cookie is correct; if aux_data is non-NULL, it will update the auxiliary
data buffer first.  It returns 0 if it is and -ESTALE if it isn't; it may also
return -ENOMEM or -ERESTARTSYS.

To request an update of the index data for an index or other object, the
following function should be called::

        void fscache_update_cookie(struct fscache_cookie *cookie,
                                   const void *aux_data);

This function will update the cookie's auxiliary data buffer from aux_data if
that is non-NULL and then schedule this to be stored on disk.  The update
method in the parent index definition will be called to transfer the data.

Note that partial updates may happen automatically at other times, such as when
data blocks are added to a data file object.


Cookie Enablement
=================

Cookies exist in one of two states: enabled and disabled.  If a cookie is
disabled, it ignores all attempts to acquire child cookies; to check, update or
invalidate its state; and to allocate, read or write backing pages, though it
is still possible to uncache pages and relinquish the cookie.

The initial enablement state is set by fscache_acquire_cookie(), but the cookie
can be enabled or disabled later.  To disable a cookie, call::

        void fscache_disable_cookie(struct fscache_cookie *cookie,
                                    const void *aux_data,
                                    bool invalidate);

If the cookie is not already disabled, this locks the cookie against other
enable and disable ops, marks the cookie as being disabled, discards or
invalidates any backing objects and waits for cessation of activity on any
associated object before unlocking the cookie.

All possible failures are handled internally.  The caller should consider
calling fscache_uncache_all_inode_pages() afterwards to make sure all page
markings are cleared up.

Cookies can be enabled or reenabled with::

        void fscache_enable_cookie(struct fscache_cookie *cookie,
                                   const void *aux_data,
                                   loff_t object_size,
                                   bool (*can_enable)(void *data),
                                   void *data);

If the cookie is not already enabled, this locks the cookie against other
enable and disable ops, invokes can_enable() and, if the cookie is not an index
cookie, will begin the procedure of acquiring backing objects.

The optional can_enable() function is passed the data argument and returns a
ruling as to whether or not enablement should actually be permitted to begin.

All possible failures are handled internally.  The cookie will only be marked
as enabled if provisional backing objects are allocated.

The object's data size is updated from object_size and is passed to the
->check_aux() function.

In both cases, the cookie's auxiliary data buffer is updated from aux_data if
that is non-NULL inside the enablement lock before proceeding.
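
The enablement rules above can be illustrated with a userspace model.  This is
a sketch under stated assumptions rather than kernel code: every ``model_*``
name is hypothetical, only data cookies are modelled, locking is omitted, and
the ``backing_allocated`` parameter stands in for whether the cache managed to
allocate provisional backing objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation classes, grouped as in the text above. */
enum model_op {
	MODEL_OP_ACQUIRE_CHILD,
	MODEL_OP_CHECK_UPDATE_INVALIDATE,
	MODEL_OP_PAGE_IO,		/* allocate, read or write backing pages */
	MODEL_OP_UNCACHE,
	MODEL_OP_RELINQUISH,
};

struct model_cookie {
	bool enabled;
};

/* A disabled cookie ignores everything except uncaching and relinquishment. */
bool model_cookie_accepts(const struct model_cookie *c, enum model_op op)
{
	if (op == MODEL_OP_UNCACHE || op == MODEL_OP_RELINQUISH)
		return true;
	return c->enabled;
}

/* Enable only if can_enable() (when supplied) agrees and backing exists. */
void model_enable_cookie(struct model_cookie *c,
			 bool (*can_enable)(void *data), void *data,
			 bool backing_allocated)
{
	if (c->enabled)
		return;
	if (can_enable && !can_enable(data))
		return;
	if (backing_allocated)
		c->enabled = true;
}

/* Example can_enable() callback that always refuses enablement. */
bool model_refuse_enable(void *data)
{
	(void)data;
	return false;
}
```

As in the real API, the model reports no error when enablement is refused or
fails; the caller only observes that the cookie continues to ignore
operations.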


Miscellaneous Cookie operations
===============================

There are a number of operations that can be used to control cookies:

  * Cookie pinning::

        int fscache_pin_cookie(struct fscache_cookie *cookie);
        void fscache_unpin_cookie(struct fscache_cookie *cookie);

    These operations permit data cookies to be pinned into the cache and to
    have the pinning removed.  They are not permitted on index cookies.

    The pinning function will return 0 if successful, -ENOBUFS if the cookie
    isn't backed by a cache, -EOPNOTSUPP if the cache doesn't support pinning,
    -ENOSPC if there isn't enough space to honour the operation, -ENOMEM or
    -EIO if there's any other problem.

  * Data space reservation::

        int fscache_reserve_space(struct fscache_cookie *cookie, loff_t size);

    This permits a netfs to request cache space be reserved to store up to the
    given amount of a file.  It is permitted to ask for more than the current
    size of the file to allow for future file expansion.

    If size is given as zero then the reservation will be cancelled.

    The function will return 0 if successful, -ENOBUFS if the cookie isn't
    backed by a cache, -EOPNOTSUPP if the cache doesn't support reservations,
    -ENOSPC if there isn't enough space to honour the operation, -ENOMEM or
    -EIO if there's any other problem.

    Note that this doesn't pin an object in a cache; it can still be culled to
    make space if it's not in use.


Cookie Unregistration
=====================

To get rid of a cookie, this function should be called::

        void fscache_relinquish_cookie(struct fscache_cookie *cookie,
                                       const void *aux_data,
                                       bool retire);

If retire is non-zero, then the object will be marked for recycling, and all
copies of it will be removed from all active caches in which it is present.
Not only that but all child objects will also be retired.

If retire is zero, then the object may be available again when next the
acquisition function is called.  Retirement here will overrule the pinning on a
cookie.

The cookie's auxiliary data will be updated from aux_data if that is non-NULL
so that the cache can lazily update it on disk.

One very important note - relinquish must NOT be called for a cookie unless all
the cookies for "child" indices, objects and pages have been relinquished
first.


Index Invalidation
==================

There is no direct way to invalidate an index subtree.  To do this, the caller
should relinquish and retire the cookie they have, and then acquire a new one.


Data File Invalidation
======================

Sometimes it will be necessary to invalidate an object that contains data.
Typically this will be necessary when the server tells the netfs of a foreign
change - at which point the netfs has to throw away all the state it had for an
inode and reload from the server.

To indicate that a cache object should be invalidated, the following function
can be called::

        void fscache_invalidate(struct fscache_cookie *cookie);

This can be called with spinlocks held as it defers the work to a thread pool.
All extant storage, retrieval and attribute change ops at this point are
cancelled and discarded.  Some future operations will be rejected until the
cache has had a chance to insert a barrier in the operations queue.  After
that, operations will be queued again behind the invalidation operation.

The invalidation operation will perform an attribute change operation and an
auxiliary data update operation as it is very likely these will have changed.

Using the following function, the netfs can wait for the invalidation operation
to have reached a point at which it can start submitting ordinary operations
once again::

        void fscache_wait_on_invalidate(struct fscache_cookie *cookie);


FS-Cache Specific Page Flag
===========================

FS-Cache makes use of a page flag, PG_private_2, for its own purpose.  This is
given the alternative name PG_fscache.

PG_fscache is used to indicate that the page is known by the cache, and that
the cache must be informed if the page is going to go away.  It's an indication
to the netfs that the cache has an interest in this page, where an interest may
be a pointer to it, resources allocated or reserved for it, or I/O in progress
upon it.

The netfs can use this information in methods such as releasepage() to
determine whether it needs to uncache a page or update it.

Furthermore, if this bit is set, releasepage() and invalidatepage() operations
will be called on a page to get rid of it, even if PG_private is not set.  This
allows caching to be attempted on a page before read_cache_pages() is called
after fscache_read_or_alloc_pages(), as the former will try to release pages it
was given under certain circumstances.

This bit does not overlap with others such as PG_private.  This means that
FS-Cache can be used with a filesystem that uses the block buffering code.

There are a number of operations defined on this flag::

        int PageFsCache(struct page *page);
        void SetPageFsCache(struct page *page);
        void ClearPageFsCache(struct page *page);
        int TestSetPageFsCache(struct page *page);
        int TestClearPageFsCache(struct page *page);

These functions are bit test, bit set, bit clear, bit test and set and bit
test and clear operations on PG_fscache.
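
The semantics of the five flag operations can be shown with a userspace model
on an ordinary flags word.  This is an illustrative sketch, not the kernel
implementation: the ``model_*`` names and the bit position are hypothetical,
and unlike the kernel's page flag helpers these functions are not atomic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit position standing in for PG_fscache. */
#define MODEL_PG_FSCACHE (1UL << 0)

struct model_flag_page {
	unsigned long flags;
};

/* Bit test: is the cache interested in this page? */
bool model_page_fscache(const struct model_flag_page *p)
{
	return (p->flags & MODEL_PG_FSCACHE) != 0;
}

/* Bit set. */
void model_set_page_fscache(struct model_flag_page *p)
{
	p->flags |= MODEL_PG_FSCACHE;
}

/* Bit clear. */
void model_clear_page_fscache(struct model_flag_page *p)
{
	p->flags &= ~MODEL_PG_FSCACHE;
}

/* Bit test and set: returns the previous value, then sets the bit. */
bool model_test_set_page_fscache(struct model_flag_page *p)
{
	bool old = model_page_fscache(p);

	model_set_page_fscache(p);
	return old;
}

/* Bit test and clear: returns the previous value, then clears the bit. */
bool model_test_clear_page_fscache(struct model_flag_page *p)
{
	bool old = model_page_fscache(p);

	model_clear_page_fscache(p);
	return old;
}
```

The test-and-modify variants return the old state, which lets a caller tell
whether it was the one that actually changed the flag.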