.. SPDX-License-Identifier: GPL-2.0

==============================
Network Filesystem Caching API
==============================

Fscache provides an API by which a network filesystem can make use of local
caching facilities.  The API is arranged around a number of principles:

 (1) A cache is logically organised into volumes and data storage objects
     within those volumes.

 (2) Volumes and data storage objects are represented by various types of
     cookie.

 (3) Cookies have keys that distinguish them from their peers.

 (4) Cookies have coherency data that allows a cache to determine if the
     cached data is still valid.

 (5) I/O is done asynchronously where possible.

To use this API, a filesystem should include::

        #include <linux/fscache.h>

.. This document contains the following sections:

         (1) Overview
         (2) Volume registration
         (3) Data file registration
         (4) Declaring a cookie to be in use
         (5) Resizing a data file (truncation)
         (6) Data I/O API
         (7) Data file coherency
         (8) Data file invalidation
         (9) Write back resource management
        (10) Caching of local modifications
        (11) Page release and invalidation


Overview
========

The fscache hierarchy is organised on two levels from a network filesystem's
point of view.  The upper level represents "volumes" and the lower level
represents "data storage objects".  These are represented by two types of
cookie, hereafter referred to as "volume cookies" and "cookies".

A network filesystem acquires a volume cookie for a volume using a volume key,
which represents all the information that defines that volume (e.g. cell name
or server address, volume ID or share name).  This must be rendered as a
printable string that can be used as a directory name (i.e. it must contain no
'/' characters and shouldn't begin with a '.').  The maximum name length is one
less than the maximum size of a filename component (allowing the cache backend
one char for its own purposes).

A filesystem would typically have a volume cookie for each superblock.

The filesystem then acquires a cookie for each file within that volume using an
object key.  Object keys are binary blobs and only need to be unique within
their parent volume.  The cache backend is responsible for rendering the binary
blob into something it can use and may employ hash tables, trees or whatever to
improve its ability to find an object.  This is transparent to the network
filesystem.

A filesystem would typically have a cookie for each inode, and would acquire it
in iget and relinquish it when evicting the inode.

Once it has a cookie, the filesystem needs to mark the cookie as being in use.
This causes fscache to send the cache backend off to look up/create resources
for the cookie in the background, to check its coherency and, if necessary, to
mark the object as being under modification.

A filesystem would typically "use" the cookie in its file open routine and
unuse it in file release, and it needs to use the cookie around calls that
truncate the file locally.  It *also* needs to use the cookie when the
pagecache becomes dirty and unuse it when writeback is complete.  This is
slightly tricky, and provision is made for it.

When performing a read, write or resize on a cookie, the filesystem must first
begin an operation.  This copies the resources into a holding struct and puts
extra pins into the cache to stop cache withdrawal from tearing down the
structures being used.  The actual operation can then be issued and conflicting
invalidations can be detected upon completion.

The filesystem is expected to use netfslib to access the cache, but that's not
actually required and it can use the fscache I/O API directly.


Volume Registration
===================

The first step for a network filesystem is to acquire a volume cookie for the
volume it wants to access::

        struct fscache_volume *
        fscache_acquire_volume(const char *volume_key,
                               const char *cache_name,
                               const void *coherency_data,
                               size_t coherency_len);

This function creates a volume cookie with the specified volume key as its name
and notes the coherency data.

The volume key must be a printable string with no '/' characters in it.  It
should begin with the name of the filesystem and should be no longer than 254
characters.  It should uniquely represent the volume and will be matched with
what's stored in the cache.
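
The key rules above can be checked before the key is handed to fscache.  The
following is a minimal user-space sketch, not kernel code:
``sketch_volume_key_ok()`` and ``SKETCH_VOLUME_KEY_MAX`` are hypothetical names
introduced here for illustration, and the check for a leading '.' is taken from
the directory-name requirement in the overview.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed limit, from the "no longer than 254 characters" rule in the text. */
#define SKETCH_VOLUME_KEY_MAX 254

/*
 * Hypothetical helper: return true if the key is a non-empty printable string
 * of at most 254 characters that contains no '/' and does not begin with '.'.
 */
bool sketch_volume_key_ok(const char *key)
{
        size_t len, i;

        assert(key != NULL);

        len = strlen(key);
        if (len == 0 || len > SKETCH_VOLUME_KEY_MAX)
                return false;
        if (key[0] == '.')
                return false;

        for (i = 0; i < len; i++) {
                unsigned char c = (unsigned char)key[i];

                if (c == '/' || !isprint(c))
                        return false;
        }
        return true;
}
```

The sketch cannot check that the key begins with the filesystem name or that
it uniquely identifies the volume; those remain the filesystem's
responsibility.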

The caller may also specify the name of the cache to use.  If specified,
fscache will look up or create a cache cookie of that name and will use a cache
of that name if it is online or comes online.  If no cache name is specified,
it will use the first cache that comes to hand and set the name to that.

The specified coherency data is stored in the cookie and will be matched
against coherency data stored on disk.  The data pointer may be NULL if no data
is provided.  If the coherency data doesn't match, the entire cache volume will
be invalidated.

This function can return errors such as EBUSY if the volume key is already in
use by an acquired volume or ENOMEM if an allocation failure occurred.  It may
also return a NULL volume cookie if fscache is not enabled.  It is safe to
pass a NULL cookie to any function that takes a volume cookie.  This will
cause that function to do nothing.


When the network filesystem has finished with a volume, it should relinquish it
by calling::

        void fscache_relinquish_volume(struct fscache_volume *volume,
                                       const void *coherency_data,
                                       bool invalidate);

This will cause the volume to be committed or, if ``invalidate`` is true,
removed.  If the volume is committed, the coherency data will be set to the
value supplied.  The amount of coherency data must match the length specified
when the volume was acquired.  Note that all data cookies obtained in this
volume must be relinquished before the volume is relinquished.


Data File Registration
======================

Once it has a volume cookie, a network filesystem can use it to acquire a
cookie for data storage::

        struct fscache_cookie *
        fscache_acquire_cookie(struct fscache_volume *volume,
                               u8 advice,
                               const void *index_key,
                               size_t index_key_len,
                               const void *aux_data,
                               size_t aux_data_len,
                               loff_t object_size);

This creates the cookie in the volume using the specified index key.  The index
key is a binary blob of the given length and must be unique for the volume.
This is saved into the cookie.  There are no restrictions on the content, but
its length shouldn't exceed about three quarters of the maximum filename length
to allow for encoding.
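
As an illustration of what an index key might look like, the user-space sketch
below packs a hypothetical (file ID, generation) pair into a fixed-layout
binary blob and checks the length guideline.  All names here are invented for
the example, the maximum filename length is assumed to be 255, and "about three
quarters" is taken literally as 191 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NAME_MAX         255     /* assumed maximum filename length */
#define SKETCH_INDEX_KEY_MAX    ((SKETCH_NAME_MAX * 3) / 4)

/* Hypothetical check of the "about three quarters" length guideline. */
bool sketch_index_key_len_ok(size_t index_key_len)
{
        return index_key_len <= SKETCH_INDEX_KEY_MAX;
}

/*
 * Pack a hypothetical (file ID, generation) pair into a 12-byte big-endian
 * blob so that the key is the same regardless of host byte order.  Returns
 * the number of bytes written, or 0 if the buffer is too small.
 */
size_t sketch_pack_index_key(uint64_t file_id, uint32_t generation,
                             uint8_t *buf, size_t buf_len)
{
        size_t i, n = 0;

        assert(buf != NULL);
        if (buf_len < 12)
                return 0;

        for (i = 0; i < 8; i++)
                buf[n++] = (uint8_t)(file_id >> (56 - 8 * i));
        for (i = 0; i < 4; i++)
                buf[n++] = (uint8_t)(generation >> (24 - 8 * i));
        return n;
}
```

A fixed layout matters because the blob is matched against what is stored in
the cache, so the same file must always produce the same bytes.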

The caller should also pass in a piece of coherency data in aux_data.  A buffer
of size aux_data_len will be allocated and the coherency data copied in.  It is
assumed that the size is invariant over time.  The coherency data is used to
check the validity of data in the cache.  Functions are provided by which the
coherency data can be updated.

The file size of the object being cached should also be provided.  This may be
used to trim the data and will be stored with the coherency data.

This function never returns an error, though it may return a NULL cookie on
allocation failure or if fscache is not enabled.  It is safe to pass in a NULL
volume cookie and pass the NULL cookie returned to any function that takes it.
This will cause that function to do nothing.


When the network filesystem has finished with a cookie, it should relinquish it
by calling::

        void fscache_relinquish_cookie(struct fscache_cookie *cookie,
                                       bool retire);

This will cause fscache to either commit the storage backing the cookie or
delete it.


Marking A Cookie In-Use
=======================

Once a cookie has been acquired by a network filesystem, the filesystem should
tell fscache when it intends to use the cookie (typically done on file open)
and should say when it has finished with it (typically on file close)::

        void fscache_use_cookie(struct fscache_cookie *cookie,
                                bool will_modify);
        void fscache_unuse_cookie(struct fscache_cookie *cookie,
                                  const void *aux_data,
                                  const loff_t *object_size);

The *use* function tells fscache that the filesystem will use the cookie and,
additionally, indicates whether the user intends to modify the contents
locally.  If not yet done, this will trigger the cache backend to go and gather
the resources it needs to access/store data in the cache.  This is done in the
background, and so may not be complete by the time the function returns.

The *unuse* function indicates that a filesystem has finished using a cookie.
It optionally updates the stored coherency data and object size and then
decreases the in-use counter.  When the last user unuses the cookie, it is
scheduled for garbage collection.  If not reused within a short time, the
resources will be released to reduce system resource consumption.

A cookie must be marked in-use before it can be accessed for read, write or
resize - and an in-use mark must be kept whilst there is dirty data in the
pagecache in order to avoid an oops due to trying to open a file during process
exit.

Note that in-use marks are cumulative.  For each time a cookie is marked
in-use, it must be unused.
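
The cumulative nature of in-use marks can be pictured as a simple counter.  The
following is a user-space model only; ``struct sketch_cookie``,
``sketch_use()`` and ``sketch_unuse()`` are hypothetical stand-ins and do not
reflect how fscache implements the counter internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cookie: only the in-use counter is represented. */
struct sketch_cookie {
        unsigned int n_active;          /* outstanding in-use marks */
};

/* Model of marking the cookie in-use: each call adds one mark. */
void sketch_use(struct sketch_cookie *cookie)
{
        assert(cookie != NULL);
        cookie->n_active++;
}

/*
 * Model of unusing the cookie: each call removes one mark.  Returns true if
 * this was the last mark, at which point the real cookie would become a
 * candidate for deferred resource release.
 */
bool sketch_unuse(struct sketch_cookie *cookie)
{
        assert(cookie != NULL && cookie->n_active > 0);
        return --cookie->n_active == 0;
}
```

In this model, a file opened twice holds two marks, and only the second release
drops the count to zero.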


Resizing A Data File (Truncation)
=================================

If a network filesystem file is resized locally by truncation, the following
should be called to notify the cache::

        void fscache_resize_cookie(struct fscache_cookie *cookie,
                                   loff_t new_size);

The caller must have first marked the cookie in-use.  The cookie and the new
size are passed in and the cache is synchronously resized.  This is expected to
be called from the ``->setattr()`` inode operation under the inode lock.


Data I/O API
============

To do data I/O operations directly through a cookie, the following functions
are available::

        int fscache_begin_read_operation(struct netfs_cache_resources *cres,
                                         struct fscache_cookie *cookie);
        int fscache_read(struct netfs_cache_resources *cres,
                         loff_t start_pos,
                         struct iov_iter *iter,
                         enum netfs_read_from_hole read_hole,
                         netfs_io_terminated_t term_func,
                         void *term_func_priv);
        int fscache_write(struct netfs_cache_resources *cres,
                          loff_t start_pos,
                          struct iov_iter *iter,
                          netfs_io_terminated_t term_func,
                          void *term_func_priv);

The *begin* function sets up an operation, attaching the resources required to
the cache resources block from the cookie.  Assuming it doesn't return an error
(for instance, it will return -ENOBUFS if given a NULL cookie, but otherwise do
nothing), then one of the other two functions can be issued.

The *read* and *write* functions initiate a direct-IO operation.  Both take the
previously set up cache resources block, an indication of the start file
position, and an I/O iterator that describes the buffer and indicates the
amount of data.

The read function also takes a parameter to indicate how it should handle a
partially populated region (a hole) in the disk content.  This may be to ignore
it, to skip over an initial hole and place zeros in the buffer, or to give an
error.

The read and write functions can be given an optional termination function that
will be run on completion::

        typedef
        void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error,
                                      bool was_async);

If a termination function is given, the operation will be run asynchronously
and the termination function will be called upon completion.  If not given, the
operation will be run synchronously.  Note that in the asynchronous case, it is
possible for the operation to complete before the function returns.
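
A termination function typically unpacks ``transferred_or_error`` into either a
byte count or a negative error code and records it somewhere reachable through
``priv``.  The sketch below shows one way this might look, written as
user-space C so that it is self-contained: the typedef is repeated from above,
while ``struct sketch_result`` and ``sketch_term_func()`` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>          /* ssize_t */

/* Repeated from the text so that this sketch builds outside the kernel. */
typedef void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error,
                                      bool was_async);

/* Hypothetical per-operation record that the caller passes as priv. */
struct sketch_result {
        bool    done;
        ssize_t transferred;    /* bytes moved, if error is 0 */
        int     error;          /* 0 or a negative errno value */
};

/* Hypothetical termination function: split the result and mark completion. */
void sketch_term_func(void *priv, ssize_t transferred_or_error, bool was_async)
{
        struct sketch_result *res = priv;

        (void)was_async;        /* not needed by this simple model */
        assert(res != NULL);

        if (transferred_or_error < 0) {
                res->error = (int)transferred_or_error;
                res->transferred = 0;
        } else {
                res->error = 0;
                res->transferred = transferred_or_error;
        }
        res->done = true;
}
```

Because the operation may complete before the issuing call returns, the record
pointed to by ``priv`` has to be fully initialised before the operation is
started.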

Both the read and write functions end the operation when they complete,
detaching any pinned resources.

The read operation will fail with -ESTALE if invalidation occurred whilst the
operation was ongoing.


Data File Coherency
===================

To request an update of the coherency data and file size on a cookie, the
following should be called::

        void fscache_update_cookie(struct fscache_cookie *cookie,
                                   const void *aux_data,
                                   const loff_t *object_size);

This will update the cookie's coherency data and/or file size.


Data File Invalidation
======================

Sometimes it will be necessary to invalidate an object that contains data.
Typically this will be necessary when the server informs the network filesystem
of a remote third-party change - at which point the filesystem has to throw
away the state and cached data that it had for a file and reload from the
server.

To indicate that a cache object should be invalidated, the following should be
called::

        void fscache_invalidate(struct fscache_cookie *cookie,
                                const void *aux_data,
                                loff_t size,
                                unsigned int flags);

This increases the invalidation counter in the cookie to cause outstanding
reads to fail with -ESTALE, sets the coherency data and file size from the
information supplied, blocks new I/O on the cookie and dispatches the cache to
go and get rid of the old data.

Invalidation runs asynchronously in a worker thread so that it doesn't block
too much.
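
One way to picture how an invalidation counter lets an outstanding read detect
a conflicting invalidation is the user-space model below.  The idea of taking a
snapshot of the counter when the operation begins is an assumption made for
illustration, and every name in the sketch is hypothetical; it is not a
description of the fscache implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a cookie: only the invalidation counter is kept. */
struct sketch_inval_cookie {
        unsigned int inval_counter;
};

/* Hypothetical model of an in-progress read operation. */
struct sketch_read_op {
        const struct sketch_inval_cookie *cookie;
        unsigned int inval_snapshot;    /* counter value when the op began */
};

/* Begin a read: remember the counter value seen at this point. */
void sketch_begin_read(struct sketch_read_op *op,
                       const struct sketch_inval_cookie *cookie)
{
        assert(op != NULL && cookie != NULL);
        op->cookie = cookie;
        op->inval_snapshot = cookie->inval_counter;
}

/* Invalidate: bump the counter so that reads already begun become stale. */
void sketch_invalidate(struct sketch_inval_cookie *cookie)
{
        assert(cookie != NULL);
        cookie->inval_counter++;
}

/* Complete a read: fail if the counter moved whilst the read was ongoing. */
int sketch_end_read(const struct sketch_read_op *op)
{
        assert(op != NULL && op->cookie != NULL);
        if (op->inval_snapshot != op->cookie->inval_counter)
                return -ESTALE;
        return 0;
}
```

In this model a read begun before the invalidation fails, whereas one begun
afterwards succeeds.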


Write-Back Resource Management
==============================

To write data to the cache from network filesystem writeback, the cache
resources required need to be pinned at the point the modification is made (for
instance when the page is marked dirty) as it's not possible to open a file in
a thread that's exiting.

The following facilities are provided to manage this:

 * An inode flag, ``I_PINNING_FSCACHE_WB``, is provided to indicate that an
   in-use mark is held on the cookie for this inode.  It can only be changed if
   the inode lock is held.

 * A flag, ``unpinned_fscache_wb``, is placed in the ``writeback_control``
   struct that gets set if ``__writeback_single_inode()`` clears
   ``I_PINNING_FSCACHE_WB`` because all the dirty pages were cleared.

To support this, the following functions are provided::

        bool fscache_dirty_folio(struct address_space *mapping,
                                 struct folio *folio,
                                 struct fscache_cookie *cookie);
        void fscache_unpin_writeback(struct writeback_control *wbc,
                                     struct fscache_cookie *cookie);
        void fscache_clear_inode_writeback(struct fscache_cookie *cookie,
                                           struct inode *inode,
                                           const void *aux);

The *dirty_folio* function is intended to be called from the filesystem's
``dirty_folio`` address space operation.  If ``I_PINNING_FSCACHE_WB`` is not
set, it sets that flag and increments the use count on the cookie (the caller
must already have called ``fscache_use_cookie()``).

The *unpin* function is intended to be called from the filesystem's
``write_inode`` superblock operation.  It cleans up after writing by unusing
the cookie if ``unpinned_fscache_wb`` is set in the ``writeback_control``
struct.

The *clear* function is intended to be called from the netfs's ``evict_inode``
superblock operation.  It must be called *after*
``truncate_inode_pages_final()``, but *before* ``clear_inode()``.  This cleans
up any hanging ``I_PINNING_FSCACHE_WB``.  It also allows the coherency data to
be updated.
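
The interplay between the inode flag, the ``writeback_control`` flag and the
cookie's use count can be modelled in user space as follows.  This is only a
sketch of the bookkeeping described above: the structures and functions are
hypothetical, locking is omitted, and ``sketch_writeback_done()`` stands in for
the part of writeback that clears the pin once all dirty pages are clean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical models of the three objects involved. */
struct sketch_wb_inode {
        bool pinning_fscache_wb;        /* models I_PINNING_FSCACHE_WB */
};

struct sketch_wb_cookie {
        unsigned int n_active;          /* models the in-use count */
};

struct sketch_wbc {
        bool unpinned_fscache_wb;       /* models the writeback_control flag */
};

/* Model of dirtying a folio: take one extra in-use mark on first dirtying. */
void sketch_dirty_folio(struct sketch_wb_inode *inode,
                        struct sketch_wb_cookie *cookie)
{
        assert(inode != NULL && cookie != NULL);
        assert(cookie->n_active > 0);   /* caller must already be using it */

        if (!inode->pinning_fscache_wb) {
                inode->pinning_fscache_wb = true;
                cookie->n_active++;
        }
}

/* Model of writeback finishing with no dirty pages left on the inode. */
void sketch_writeback_done(struct sketch_wb_inode *inode,
                           struct sketch_wbc *wbc)
{
        assert(inode != NULL && wbc != NULL);

        if (inode->pinning_fscache_wb) {
                inode->pinning_fscache_wb = false;
                wbc->unpinned_fscache_wb = true;
        }
}

/* Model of the unpin step: drop the extra mark if writeback cleared the pin. */
void sketch_unpin_writeback(const struct sketch_wbc *wbc,
                            struct sketch_wb_cookie *cookie)
{
        assert(wbc != NULL && cookie != NULL);

        if (wbc->unpinned_fscache_wb) {
                assert(cookie->n_active > 0);
                cookie->n_active--;
        }
}
```

However many folios are dirtied, only one extra mark is taken per inode, and it
is dropped once writeback reports that the pin was cleared.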


Caching of Local Modifications
==============================

If a network filesystem has locally modified data that it wants to write to the
cache, it needs to mark the pages to indicate that a write is in progress, and
if the mark is already present, it needs to wait for it to be removed first
(presumably due to an already in-progress operation).  This prevents multiple
competing DIO writes to the same storage in the cache.

Firstly, the netfs should determine if caching is available by doing something
like::

        bool caching = fscache_cookie_enabled(cookie);

If caching is to be attempted, pages should be waited for and then marked using
the following functions provided by the netfs helper library::

        void set_page_fscache(struct page *page);
        void wait_on_page_fscache(struct page *page);
        int wait_on_page_fscache_killable(struct page *page);

Once all the pages in the span are marked, the netfs can ask fscache to
schedule a write of that region::

        void fscache_write_to_cache(struct fscache_cookie *cookie,
                                    struct address_space *mapping,
                                    loff_t start, size_t len, loff_t i_size,
                                    netfs_io_terminated_t term_func,
                                    void *term_func_priv,
                                    bool caching);

And if an error occurs before that point is reached, the marks can be removed
by calling::

        void fscache_clear_page_bits(struct address_space *mapping,
                                     loff_t start, size_t len,
                                     bool caching);

In these functions, a pointer to the mapping to which the source pages are
attached is passed in and start and len indicate the size of the region that's
going to be written (it doesn't have to align to page boundaries necessarily,
but it does have to align to DIO boundaries on the backing filesystem).  The
caching parameter indicates whether caching is to be attempted; if it is false,
the functions do nothing.
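
The alignment requirement is a matter of simple arithmetic.  The user-space
sketch below checks whether a region is aligned to a given DIO block size and
rounds a region outwards to the enclosing aligned span.  The helper names are
hypothetical and the block size is an input that the caller is assumed to know.
Whether rounding outwards is appropriate for a particular filesystem is not
addressed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if both the start and the length are multiples of dio_block. */
bool sketch_dio_aligned(uint64_t start, uint64_t len, uint64_t dio_block)
{
        assert(dio_block > 0);
        return start % dio_block == 0 && len % dio_block == 0;
}

/*
 * Expand [*start, *start + *len) to the smallest enclosing region whose start
 * and end both fall on dio_block boundaries.
 */
void sketch_dio_round_out(uint64_t *start, uint64_t *len, uint64_t dio_block)
{
        uint64_t end;

        assert(start != NULL && len != NULL && dio_block > 0);

        end = *start + *len;
        *start -= *start % dio_block;           /* round start down */
        if (end % dio_block)
                end += dio_block - end % dio_block;     /* round end up */
        *len = end - *start;
}
```

For example, with a 4096-byte block, a region starting at 1000 with length 5000
is expanded to start at 0 with length 8192.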

The write function takes some additional parameters: the cookie representing
the cache object to be written to, i_size indicates the size of the netfs file
and term_func indicates an optional completion function, to which
term_func_priv will be passed, along with the error or amount written.

Note that the write function will always run asynchronously and will unmark all
the pages upon completion before calling term_func.


Page Release and Invalidation
=============================

Fscache keeps track of whether we have any data in the cache yet for a cache
object we've just created.  It knows it doesn't have to do any reading until it
has done a write and then the page it wrote from has been released by the VM,
after which it *has* to look in the cache.

To inform fscache that a page might now be in the cache, the following function
should be called from the ``release_folio`` address space op::

        void fscache_note_page_release(struct fscache_cookie *cookie);

if the page has been released (i.e. ``release_folio`` returned true).

Page release and page invalidation should also wait for any mark left on the
page to say that a DIO write is underway from that page::

        void wait_on_page_fscache(struct page *page);
        int wait_on_page_fscache_killable(struct page *page);


API Function Reference
======================

.. kernel-doc:: include/linux/fscache.h
