TOMOYO Linux Cross Reference
Linux/Documentation/filesystems/porting.rst

Differences between /Documentation/filesystems/porting.rst (Version linux-6.12-rc7) and /Documentation/filesystems/porting.rst (Version linux-5.3.18)


  1 ====================                              
  2 Changes since 2.5.0:                              
  3 ====================                              
  4                                                   
  5 ---                                               
  6                                                   
  7 **recommended**                                   
  8                                                   
  9 New helpers: sb_bread(), sb_getblk(), sb_find_    
 10 sb_set_blocksize() and sb_min_blocksize().        
 11                                                   
 12 Use them.                                         
 13                                                   
 14 (sb_find_get_block() replaces 2.4's get_hash_t    
 15                                                   
 16 ---                                               
 17                                                   
 18 **recommended**                                   
 19                                                   
 20 New methods: ->alloc_inode() and ->destroy_ino    
 21                                                   
 22 Remove inode->u.foo_inode_i                       
 23                                                   
 24 Declare::                                         
 25                                                   
 26         struct foo_inode_info {                   
 27                 /* fs-private stuff */            
 28                 struct inode vfs_inode;           
 29         };                                        
 30         static inline struct foo_inode_info *F    
 31         {                                         
 32                 return list_entry(inode, struc    
 33         }                                         
 34                                                   
 35 Use FOO_I(inode) instead of &inode->u.foo_inod    
 36                                                   
 37 Add foo_alloc_inode() and foo_destroy_inode()     
 38 foo_inode_info and return the address of ->vfs    
 39 FOO_I(inode) (see in-tree filesystems for exam    
 40                                                   
 41 Make them ->alloc_inode and ->destroy_inode in    
 42                                                   
 43 Keep in mind that now you need explicit initia    
 44 typically between calling iget_locked() and un    
 45                                                   
 46 At some point that will become mandatory.         
 47                                                   
 48 **mandatory**                                     
 49                                                   
 50 The foo_inode_info should always be allocated     
 51 than kmem_cache_alloc() or kmalloc() related t    
 52 correctly.                                        
 53                                                   
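The embedding pattern above can be tried out in userspace. The following is only a sketch under stated assumptions: ``struct inode`` here is a minimal stand-in rather than the kernel structure, ``container_of()`` is defined locally, and ``calloc()``/``free()`` replace the kernel allocator that a real ->alloc_inode() must use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct inode (hypothetical, minimal). */
struct inode {
	unsigned long i_ino;
};

/* Filesystem-private inode with the VFS inode embedded in it. */
struct foo_inode_info {
	int foo_private;		/* fs-private stuff */
	struct inode vfs_inode;
};

/* Local equivalent of the kernel's container_of()/list_entry(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Map an embedded VFS inode back to its containing foo_inode_info. */
static inline struct foo_inode_info *FOO_I(struct inode *inode)
{
	return container_of(inode, struct foo_inode_info, vfs_inode);
}

/* Sketch of ->alloc_inode(): allocate the container, return ->vfs_inode. */
static struct inode *foo_alloc_inode(void)
{
	struct foo_inode_info *fi = calloc(1, sizeof(*fi));

	if (!fi)
		return NULL;
	return &fi->vfs_inode;
}

/* Sketch of ->destroy_inode(): free the container found via FOO_I(). */
static void foo_destroy_inode(struct inode *inode)
{
	free(FOO_I(inode));
}
```

Because the VFS only ever sees ``&fi->vfs_inode``, the private data needs no pointer of its own; FOO_I() recovers it by subtracting the member offset.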
 54 ---                                               
 55                                                   
 56 **mandatory**                                     
 57                                                   
 58 Change of file_system_type method (->read_supe    
 59                                                   
 60 ->read_super() is no more.  Ditto for DECLARE_    
 61                                                   
 62 Turn your foo_read_super() into a function tha    
 63 success and negative number in case of error (    
 64 informative error value to report).  Call it f    
 65                                                   
 66   int foo_get_sb(struct file_system_type *fs_t    
 67         int flags, const char *dev_name, void     
 68   {                                               
 69         return get_sb_bdev(fs_type, flags, dev    
 70                            mnt);                  
 71   }                                               
 72                                                   
 73 (or similar with s/bdev/nodev/ or s/bdev/singl    
 74 filesystem).                                      
 75                                                   
 76 Replace DECLARE_FSTYPE... with explicit initia    
 77 foo_get_sb.                                       
 78                                                   
 79 ---                                               
 80                                                   
 81 **mandatory**                                     
 82                                                   
 83 Locking change: ->s_vfs_rename_sem is taken on    
 84 Most likely there is no need to change anythin    
 85 global exclusion between renames for some inte    
 86 change your internal locking.  Otherwise exclu    
 87 same (i.e. parents and victim are locked, etc.    
 88                                                   
 89 ---                                               
 90                                                   
 91 **informational**                                 
 92                                                   
 93 Now we have the exclusion between ->lookup() a    
 94 ->rmdir() and ->rename()).  If you used to nee    
 95 it by internal locking (most of filesystems co    
 96 can relax your locking.                           
 97                                                   
 98 ---                                               
 99                                                   
100 **mandatory**                                     
101                                                   
102 ->lookup(), ->truncate(), ->create(), ->unlink    
103 ->rmdir(), ->link(), ->lseek(), ->symlink(), -    
104 and ->readdir() are called without BKL now.  G    
105 - that will guarantee the same locking you use    
106 parts do not need BKL - better yet, now you ca    
107 unlock_kernel() so that they would protect exa    
108 protected.                                        
109                                                   
110 ---                                               
111                                                   
112 **mandatory**                                     
113                                                   
114 BKL is also moved from around sb operations. B    
115 individual fs sb_op functions.  If you don't n    
116                                                   
117 ---                                               
118                                                   
119 **informational**                                 
120                                                   
121 check for ->link() target not being a director    
122 free to drop it...                                
123                                                   
124 ---                                               
125                                                   
126 **informational**                                 
127                                                   
128 ->link() callers hold ->i_mutex on the object     
129 problems might be over...                         
130                                                   
131 ---                                               
132                                                   
133 **mandatory**                                     
134                                                   
135 new file_system_type method - kill_sb(superblo    
136 an existing filesystem, set it according to ->    
137                                                   
138         FS_REQUIRES_DEV         -       kill_b    
139         FS_LITTER               -       kill_l    
140         neither                 -       kill_a    
141                                                   
142 FS_LITTER is gone - just remove it from fs_fla    
143                                                   
144 ---                                               
145                                                   
146 **mandatory**                                     
147                                                   
148 FS_SINGLE is gone (actually, that had happened    
149 went in - and hadn't been documented ;-/).  Ju    
150 (and see ->get_sb() entry for other actions).     
151                                                   
152 ---                                               
153                                                   
154 **mandatory**                                     
155                                                   
156 ->setattr() is called without BKL now.  Caller    
157 watch for ->i_mutex-grabbing code that might b    
158 Callers of notify_change() need ->i_mutex now.    
159                                                   
160 ---                                               
161                                                   
162 **recommended**                                   
163                                                   
164 New super_block field ``struct export_operatio    
165 explicit support for exporting, e.g. via NFS.     
166 documented at its declaration in include/linux    
167 Documentation/filesystems/nfs/exporting.rst.      
168                                                   
169 Briefly it allows for the definition of decode    
170 to encode and decode filehandles, and allows t    
171 a standard helper function for decode_fh, and     
172 support for this helper, particularly get_pare    
173                                                   
174 It is planned that this will be required for e    
175 settles down a bit.                               
176                                                   
177 **mandatory**                                     
178                                                   
179 s_export_op is now required for exporting a fi    
180 isofs, ext2, ext3, reiserfs, fat                  
181 can be used as examples of very different file    
182                                                   
183 ---                                               
184                                                   
185 **mandatory**                                     
186                                                   
187 iget4() and the read_inode2 callback have been    
188 which has the following prototype::               
189                                                   
190     struct inode *iget5_locked(struct super_bl    
191                                 int (*test)(st    
192                                 int (*set)(str    
193                                 void *data);      
194                                                   
195 'test' is an additional function that can be u    
196 number is not sufficient to identify the actua    
197 should be a non-blocking function that initial    
198 newly created inode to allow the test function    
199 passed as an opaque value to both test and set    
200                                                   
201 When the inode has been created by iget5_locke    
202 I_NEW flag set and will still be locked.  The     
203 the initialization. Once the inode is initiali    
204 calling unlock_new_inode().                       
205                                                   
206 The filesystem is responsible for setting (and    
207 when appropriate. There is also a simpler iget    
208 just takes the superblock and inode number as     
209 test and set for you.                             
210                                                   
211 e.g.::                                            
212                                                   
213         inode = iget_locked(sb, ino);             
214         if (inode->i_state & I_NEW) {             
215                 err = read_inode_from_disk(ino    
216                 if (err < 0) {                    
217                         iget_failed(inode);       
218                         return err;               
219                 }                                 
220                 unlock_new_inode(inode);          
221         }                                         
222                                                   
223 Note that if the process of setting up a new i    
224 should be called on the inode to render it dea    
225 should be passed back to the caller.              
226                                                   
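The I_NEW handshake described above can be modelled in userspace. This is a mock, not kernel code: every ``mock_`` function, the I_NEW value, the fixed-size cache and the rule that odd inode numbers fail to read are assumptions made purely for illustration.

```c
#include <errno.h>
#include <stdlib.h>

#define I_NEW 0x1			/* illustrative value, not the kernel's */
#define MAX_INODES 16

struct inode {
	unsigned long i_ino;
	unsigned long i_state;
};

static struct inode *cache[MAX_INODES];	/* stand-in for the inode hash */

/* Mock lookup: return the cached inode, or a fresh one with I_NEW set. */
static struct inode *mock_iget_locked(unsigned long ino)
{
	struct inode *inode;

	if (ino >= MAX_INODES)
		return NULL;
	if (cache[ino])
		return cache[ino];
	inode = calloc(1, sizeof(*inode));
	if (!inode)
		return NULL;
	inode->i_ino = ino;
	inode->i_state = I_NEW;
	cache[ino] = inode;
	return inode;
}

/* Mock unlock: clearing I_NEW marks the inode as fully initialized. */
static void mock_unlock_new_inode(struct inode *inode)
{
	inode->i_state &= ~(unsigned long)I_NEW;
}

/* Mock failure path: the half-built inode must not stay visible. */
static void mock_iget_failed(struct inode *inode)
{
	cache[inode->i_ino] = NULL;
	free(inode);
}

/* Hypothetical disk reader: odd inode numbers fail with -EIO. */
static int mock_read_inode_from_disk(struct inode *inode)
{
	return (inode->i_ino & 1) ? -EIO : 0;
}

/* The calling pattern itself: initialize only when I_NEW is set. */
static int foo_iget(unsigned long ino, struct inode **out)
{
	struct inode *inode = mock_iget_locked(ino);

	if (!inode)
		return -ENOMEM;
	if (inode->i_state & I_NEW) {
		int err = mock_read_inode_from_disk(inode);

		if (err < 0) {
			mock_iget_failed(inode);
			return err;
		}
		mock_unlock_new_inode(inode);
	}
	*out = inode;
	return 0;
}
```

A second foo_iget() for the same number finds the cached inode without I_NEW and skips initialization; a failed read leaves nothing behind in the cache.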
227 ---                                               
228                                                   
229 **recommended**                                   
230                                                   
231 ->getattr() finally getting used.  See instanc    
232                                                   
233 ---                                               
234                                                   
235 **mandatory**                                     
236                                                   
237 ->revalidate() is gone.  If your filesystem ha    
238 and let it call whatever you had as ->revalidat
239 had ->revalidate()) add calls in ->follow_link    
240                                                   
241 ---                                               
242                                                   
243 **mandatory**                                     
244                                                   
245 ->d_parent changes are not protected by BKL an    
246 if at least one of the following is true:         
247                                                   
248         * filesystem has no cross-directory re    
249         * we know that parent had been locked     
250           ->d_parent of ->lookup() argument).     
251         * we are called from ->rename().          
252         * the child's ->d_lock is held            
253                                                   
254 Audit your code and add locking if needed.  No    
255 not protected by the conditions above is risky    
256 had been relying on BKL and that's prone to sc    
257 a few holes of that kind - unprotected access     
258 anything from oops to silent memory corruption    
259                                                   
260 ---                                               
261                                                   
262 **mandatory**                                     
263                                                   
264 FS_NOMOUNT is gone.  If you use it - just set     
265 (see rootfs for one kind of solution and bdev/    
266                                                   
267 ---                                               
268                                                   
269 **recommended**                                   
270                                                   
271 Use bdev_read_only(bdev) instead of is_read_on    
272 is still alive, but only because of the mess i    
273 As soon as it gets fixed is_read_only() will d    
274                                                   
275 ---                                               
276                                                   
277 **mandatory**                                     
278                                                   
279 ->permission() is called without BKL now. Grab    
280 return - that will guarantee the same locking     
281 your method or its parts do not need BKL - bet    
282 shift lock_kernel() and unlock_kernel() so tha    
283 exactly what needs to be protected.               
284                                                   
285 ---                                               
286                                                   
287 **mandatory**                                     
288                                                   
289 ->statfs() is now called without BKL held.  BK    
290 shifted into individual fs sb_op functions whe    
291 it's safe to remove it.  If you don't need it,    
292                                                   
293 ---                                               
294                                                   
295 **mandatory**                                     
296                                                   
297 is_read_only() is gone; use bdev_read_only() i    
298                                                   
299 ---                                               
300                                                   
301 **mandatory**                                     
302                                                   
303 destroy_buffers() is gone; use invalidate_bdev    
304                                                   
305 ---                                               
306                                                   
307 **mandatory**                                     
308                                                   
309 fsync_dev() is gone; use fsync_bdev().  NOTE:     
310 deliberate; as soon as struct block_device * i    
311 way by that code fixing will become trivial; u    
312 done.                                             
313                                                   
314 **mandatory**                                     
315                                                   
316 block truncation on error exit from ->write_
317 moved from generic methods (block_write_begin,    
318 nobh_write_begin, blockdev_direct_IO*) to call    
319 ext2_write_failed and callers for an example.     
320                                                   
321 **mandatory**                                     
322                                                   
323 ->truncate is gone.  The whole truncate sequen    
324 implemented in ->setattr, which is now mandato    
325 implementing on-disk size changes.  Start with    
326 and vmtruncate, and then reorder the vmtruncate
327 be in order of zeroing blocks using block_trun
328 size update and finally on-disk truncation
329 setattr_prepare (which used to be inode_change    
330 for ATTR_SIZE and must be called in the beginn    
331                                                   
332 **mandatory**                                     
333                                                   
334 ->clear_inode() and ->delete_inode() are gone;    
335 be used instead.  It gets called whenever the     
336 remaining links or not.  Caller does *not* evi    
337 metadata buffers; the method has to use trunca    
338 of those. Caller makes sure async writeback ca    
339 (or after) ->evict_inode() is called.             
340                                                   
341 ->drop_inode() returns int now; it's called on    
342 inode->i_lock held and it returns true if file    
343 dropped.  As before, generic_drop_inode() is s    
344 updated appropriately.  generic_delete_inode()    
345 simply of return 1.  Note that all actual evic    
346 ->drop_inode() returns.                           
347                                                   
348 As before, clear_inode() must be called exactl    
349 ->evict_inode() (as it used to be for each cal    
350 before, if you are using inode-associated meta    
351 mark_buffer_dirty_inode()), it's your responsi    
352 invalidate_inode_buffers() before clear_inode(    
353                                                   
354 NOTE: checking i_nlink in the beginning of ->w    
355 if it's zero is not *and* *never* *had* *been*    
356 may happen while the inode is in the middle of    
357 free the on-disk inode, you may end up doing t    
358 to it.                                            
359                                                   
360 ---                                               
361                                                   
362 **mandatory**                                     
363                                                   
364 .d_delete() now only advises the dcache as to     
365 unreferenced dentries, and is now only called     
366 0. Even on 0 refcount transition, it must be a    
367 1, or more times (e.g. constant, idempotent).
368                                                   
369 ---                                               
370                                                   
371 **mandatory**                                     
372                                                   
373 .d_compare() calling convention and locking ru    
374 changed. Read updated documentation in Documen    
375 look at examples of other filesystems) for gui    
376                                                   
377 ---                                               
378                                                   
379 **mandatory**                                     
380                                                   
381 .d_hash() calling convention and locking rules    
382 changed. Read updated documentation in Documen    
383 look at examples of other filesystems) for gui    
384                                                   
385 ---                                               
386                                                   
387 **mandatory**                                     
388                                                   
389 dcache_lock is gone, replaced by fine grained     
390 for details of what locks to replace dcache_lo    
391 particular things. Most of the time, a filesys    
392 protects *all* the dcache state of a given den    
393                                                   
394 ---                                               
395                                                   
396 **mandatory**                                     
397                                                   
398 Filesystems must RCU-free their inodes, if the    
399 via rcu-walk path walk (basically, if the file    
400 vfs namespace).                                   
401                                                   
402 Even though i_dentry and i_rcu share storage i    
403 initialize the former in inode_init_always(),     
404 the callback.  It used to be necessary to clea    
405 (starting at 3.2).                                
406                                                   
407 ---                                               
408                                                   
409 **recommended**                                   
410                                                   
411 vfs now tries to do path walking in "rcu-walk     
412 atomic operations and scalability hazards on d    
413 Documentation/filesystems/path-lookup.txt). d_    
414 (above) are examples of the changes required t    
415 filesystem callbacks, the vfs drops out of rcu    
416 no changes are required to the filesystem. How    
417 the benefits of rcu-walk mode. We will begin t    
418 are rcu-walk aware, shown below. Filesystems s    
419 where possible.                                   
420                                                   
421 ---                                               
422                                                   
423 **mandatory**                                     
424                                                   
425 d_revalidate is a callback that is made on eve    
426 the filesystem provides it), which requires dr    
427 may now be called in rcu-walk mode (nd->flags     
428 returned if the filesystem cannot handle rcu-w    
429 Documentation/filesystems/vfs.rst for more det    
430                                                   
431 permission is an inode permission check that i    
432 directory inodes on the way down a path walk (    
433 must now be rcu-walk aware (mask & MAY_NOT_BLO    
434 Documentation/filesystems/vfs.rst for more det    
435                                                   
436 ---                                               
437                                                   
438 **mandatory**                                     
439                                                   
440 In ->fallocate() you must check the mode optio    
441 filesystem does not support hole punching (dea    
442 file) you must return -EOPNOTSUPP if FALLOC_FL    
443 Currently you can only have FALLOC_FL_PUNCH_HO    
444 so the i_size should not change when hole punc    
445 a file off.                                       
446                                                   
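A mode check along these lines could sit at the top of a ->fallocate() instance. It is a sketch only: foo_check_fallocate_mode() and its ``supports_punch_hole`` parameter are hypothetical names, the two flag values are copied locally from the uapi falloc.h header so the fragment builds outside the kernel, and rejecting every other flag bit is an added assumption about a filesystem that implements nothing else.

```c
#include <errno.h>

/* Values as in include/uapi/linux/falloc.h, repeated here for a standalone build. */
#define FALLOC_FL_KEEP_SIZE	0x01
#define FALLOC_FL_PUNCH_HOLE	0x02

/*
 * Validate the mode argument before doing any work.
 * Returns 0 if the request may proceed, -EOPNOTSUPP otherwise.
 */
static int foo_check_fallocate_mode(int mode, int supports_punch_hole)
{
	/* Assumption: this filesystem knows no other flags. */
	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
		return -EOPNOTSUPP;

	if (mode & FALLOC_FL_PUNCH_HOLE) {
		/* No hole punching support: refuse rather than ignore the flag. */
		if (!supports_punch_hole)
			return -EOPNOTSUPP;
		/* Hole punching is only valid together with KEEP_SIZE. */
		if (!(mode & FALLOC_FL_KEEP_SIZE))
			return -EOPNOTSUPP;
	}
	return 0;
}
```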
447 ---                                               
448                                                   
449 **mandatory**                                     
450                                                   
451 ->get_sb() is gone.  Switch to use of ->mount(    
452 a matter of switching from calling ``get_sb_``    
453 the function type.  If you were doing it manua    
454 ->mnt_root to some pointer to returning that p    
455 ERR_PTR(...).                                     
456                                                   
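The "return a pointer or ERR_PTR(...)" convention mentioned above can be illustrated in userspace. This is a sketch under assumptions: the three helpers are local re-creations of the kernel idiom, and ``struct dentry``, ``foo_root`` and foo_mount_sketch() are hypothetical stand-ins with a simplified signature, not the real ->mount() prototype.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Minimal stand-in for the kernel's struct dentry. */
struct dentry {
	const char *name;
};

static struct dentry foo_root = { "/" };

/* Hypothetical mount routine: the root dentry on success, ERR_PTR on failure. */
static struct dentry *foo_mount_sketch(int simulate_failure)
{
	if (simulate_failure)
		return ERR_PTR(-ENOMEM);
	return &foo_root;
}
```

The caller needs one return value for both outcomes: test it with IS_ERR() and, on failure, extract the code with PTR_ERR().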
457 ---                                               
458                                                   
459 **mandatory**                                     
460                                                   
461 ->permission() and generic_permission() have lo
462 argument; instead of passing IPERM_FLAG_RCU we    
463                                                   
464 generic_permission() has also lost the check_a    
465 has been taken to VFS and filesystems need to     
466 ->i_op->get_inode_acl to read an ACL from disk    
467                                                   
468 ---                                               
469                                                   
470 **mandatory**                                     
471                                                   
472 If you implement your own ->llseek() you must     
473 SEEK_DATA.  You can handle this by returning -    
474 support it in some way.  The generic handler a    
475 data and there is a virtual hole at the end of    
476 offset is less than i_size and SEEK_DATA is sp    
477 If the above is true for the offset and you ar    
478 of the file.  If the offset is i_size or great    
479                                                   
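The generic behaviour described above (the whole file is data, with a virtual hole at the end) reduces to a few lines. The sketch below is a userspace model with hypothetical names; it defines its own whence constants instead of using SEEK_DATA/SEEK_HOLE, so nothing here is a real kernel or libc interface.

```c
#include <errno.h>

/* Local stand-ins for SEEK_DATA and SEEK_HOLE (hypothetical names). */
enum seek_kind {
	SKETCH_SEEK_DATA,
	SKETCH_SEEK_HOLE
};

/*
 * Model of the generic handling: everything below i_size is data and
 * a single virtual hole starts at i_size.
 * Returns the resulting offset, or -ENXIO if offset is not inside the file.
 */
static long long seek_data_hole_sketch(long long offset, long long i_size,
				       enum seek_kind whence)
{
	if (offset < 0 || offset >= i_size)
		return -ENXIO;

	/* Inside the file: data starts right here, the hole is at EOF. */
	return whence == SKETCH_SEEK_DATA ? offset : i_size;
}
```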
480 **mandatory**                                     
481                                                   
482 If you have your own ->fsync() you must make s    
483 filemap_write_and_wait_range() so that all dir    
484 You must also keep in mind that ->fsync() is n    
485 anymore, so if you require i_mutex locking you    
486 release it yourself.                              
487                                                   
488 ---                                               
489                                                   
490 **mandatory**                                     
491                                                   
492 d_alloc_root() is gone, along with a lot of bu    
493 misusing it.  Replacement: d_make_root(inode).    
494 allocates and returns a new dentry instantiate    
495 On failure NULL is returned and the passed in     
496 to inode is consumed in all cases and failure     
497 for the inode.  If d_make_root(inode) is passe    
498 and also requires no further error handling. T    
499                                                   
500         inode = foofs_new_inode(....);            
501         s->s_root = d_make_root(inode);           
502         if (!s->s_root)                           
503                 /* Nothing needed for the inod    
504                 return -ENOMEM;                   
505         ...                                       
506                                                   
507 ---                                               
508                                                   
509 **mandatory**                                     
510                                                   
511 The witch is dead!  Well, 2/3 of it, anyway.      
512 ->lookup() do *not* take struct nameidata anym    
513                                                   
514 ---                                               
515                                                   
516 **mandatory**                                     
517                                                   
518 ->create() doesn't take ``struct nameidata *``    
519 two, it gets "is it an O_EXCL or equivalent?"     
520 local filesystems can ignore this argument - t    
521 object doesn't exist.  It's remote/distributed    
522                                                   
523 ---                                               
524                                                   
525 **mandatory**                                     
526                                                   
527 FS_REVAL_DOT is gone; if you used to have it,     
528 in your dentry operations instead.                
529                                                   
530 ---                                               
531                                                   
532 **mandatory**                                     
533                                                   
534 vfs_readdir() is gone; switch to iterate_dir()    
535                                                   
536 ---                                               
537                                                   
538 **mandatory**                                     
539                                                   
540 ->readdir() is gone now; switch to ->iterate_s    
541                                                   
542 **mandatory**                                     
543                                                   
544 vfs_follow_link has been removed.  Filesystems    
545 from ->follow_link for normal symlinks, or nd_    
546 /proc/<pid> style links.                          
547                                                   
548 ---                                               
549                                                   
550 **mandatory**                                     
551                                                   
552 iget5_locked()/ilookup5()/ilookup5_nowait() te    
553 called with both ->i_lock and inode_hash_lock     
554 taken anymore, so verify that your callbacks d    
555 of the in-tree instances did).  inode_hash_loc    
556 of course, so they are still serialized wrt re    
557 as well as wrt set() callback of iget5_locked(    
558                                                   
559 ---                                               
560                                                   
561 **mandatory**                                     
562                                                   
563 d_materialise_unique() is gone; d_splice_alias    
564 need now.  Remember that they have opposite or    
565                                                   
566 ---                                               
567                                                   
568 **mandatory**                                     
569                                                   
570 f_dentry is gone; use f_path.dentry, or, bette    
571 it entirely.                                      
572                                                   
573 ---                                               
574                                                   
575 **mandatory**                                     
576                                                   
577 never call ->read() and ->write() directly; us    
578 wrappers; instead of checking for ->write or -    
579 FMODE_CAN_{WRITE,READ} in file->f_mode.           
580                                                   
581 ---                                               
582                                                   
583 **mandatory**                                     
584                                                   
585 do _not_ use new_sync_{read,write} for ->read/    
586 instead.                                          
587                                                   
588 ---                                               
589                                                   
590 **mandatory**

591 ->aio_read/->aio_write are gone.  Use
592                                                   
593 ---                                               
594                                                   
595 **recommended**                                   
596                                                   
597 for embedded ("fast") symlinks just set inode-    
598 symlink body is and use simple_follow_link() a    
599                                                   
600 ---                                               
601                                                   
602 **mandatory**                                     
603                                                   
604 calling conventions for ->follow_link() have c    
605 cookie and using nd_set_link() to store the bo    
606 the body to traverse and store the cookie usin    
607 nameidata isn't passed at all - nd_jump_link()    
608 nd_[gs]et_link() is gone.                         
609                                                   
610 ---                                               
611                                                   
612 **mandatory**                                     
613                                                   
614 calling conventions for ->put_link() have chan    
615 dentry,  it does not get nameidata at all and     
616 is non-NULL.  Note that link body isn't availa    
617 store it as cookie.                               
618                                                   
619 ---                                               
620                                                   
621 **mandatory**                                     
622                                                   
623 any symlink that might use page_follow_link_li    
624 have inode_nohighmem(inode) called before anyt    
625 its pagecache.  No highmem pages should end up    
626 symlinks.  That includes any preseeding that m    
627 creation.  page_symlink() will honour the mapp    
628 you've done inode_nohighmem() it's safe to use    
629 insert the page manually, make sure to use the    
630                                                   
631 ---                                               
632                                                   
633 **mandatory**                                     
634                                                   
635 ->follow_link() is replaced with ->get_link();    
636                                                   
637         * ->get_link() gets inode as a separat    
638         * ->get_link() may be called in RCU mo    
639           dentry is passed                        
640                                                   
641 ---                                               
642                                                   
643 **mandatory**                                     
644                                                   
645 ->get_link() gets struct delayed_call ``*done`    
646 set_delayed_call() where it used to set ``*coo    
647                                                   
648 ->put_link() is gone - just give the destructo    
649 in ->get_link().                                  
650                                                   
651 ---                                               
652                                                   
653 **mandatory**                                     
654                                                   
655 ->getxattr() and xattr_handler.get() get dentr    
656 dentry might be yet to be attached to inode, s    
657 in the instances.  Rationale: !@#!@# security_    
658 called before we attach dentry to inode.          
659                                                   
660 ---                                               
661                                                   
662 **mandatory**                                     
663                                                   
664 symlinks are no longer the only inodes that do    
665 i_pipe/i_link union zeroed out at inode evicti    
666 assume that non-NULL value in ->i_nlink at ->d    
667 it's a symlink.  Checking ->i_mode is really n    
668 to fix shmem_destroy_callback() that used to t    
669 watch out, since that shortcut is no longer va    
670                                                   
671 ---                                               
672                                                   
673 **mandatory**                                     
674                                                   
675 ->i_mutex is replaced with ->i_rwsem now.  ino    
676 they used to - they just take it exclusive.  H    
677 called with parent locked shared.  Its instanc    
678                                                   
679         * use d_instantiate() and d_rehash() se
680           d_splice_alias() instead.               
681         * use d_rehash() alone - call d_add(ne    
682         * in the unlikely case when (read-only    
683           data structures needs exclusion for     
684           yourself.  None of the in-tree files    
685         * rely on ->d_parent and ->d_name not     
686           been fed to d_add() or d_splice_alia    
687           in-tree instances relied upon that.     
688                                                   
689 We are guaranteed that lookups of the same nam    
690 will not happen in parallel ("same" in the sen    
691 Lookups on different names in the same directo    
692 parallel now.                                     
693                                                   
694 ---                                               
695                                                   
696 **mandatory**                                     
697                                                   
698 ->iterate_shared() is added.                      
699 Exclusion on struct file level is still provid    
700 between it and lseek on the same struct file),    
701 has been opened several times, you can get the    
702 Exclusion between that method and all director    
703 still provided, of course.                        
704                                                   
705 If you have any per-inode or per-dentry in-cor    
706 by ->iterate_shared(), you might need somethin    
707 to them.  If you do dcache pre-seeding, you'll    
708 d_alloc_parallel() for that; look for in-tree     
709                                                   
710 ---                                               
711                                                   
712 **mandatory**                                     
713                                                   
714 ->atomic_open() calls without O_CREAT may happ    
715                                                   
716 ---                                               
717                                                   
718 **mandatory**                                     
719                                                   
720 ->setxattr() and xattr_handler.set() get dentr    
721 The xattr_handler.set() gets passed the user n    
722 is seen from so filesystems can idmap the i_ui    
723 dentry might be yet to be attached to inode, s    
724 in the instances.  Rationale: !@#!@# security_    
725 called before we attach dentry to inode and !@    
726 ->d_instantiate() uses not just ->getxattr() b    
727                                                   
728 ---                                               
729                                                   
730 **mandatory**                                     
731                                                   
732 ->d_compare() doesn't get parent as a separate    
733 used it for finding the struct super_block inv    
734 work just as well; if it's something more comp    
735 Just be careful not to assume that fetching it    
736 the same value - in RCU mode it could change u    
737                                                   
738 ---                                               
739                                                   
740 **mandatory**                                     
741                                                   
742 ->rename() has an added flags argument.  Any f    
743 filesystem should result in EINVAL being retur    
744                                                   
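A minimal userspace sketch of the flag check described above. Everything named ``mock_*`` or ``MOCK_*`` is hypothetical and exists only for illustration; the flag values are not the kernel's, and a real ->rename() instance takes more arguments than shown. The sketch assumes a filesystem that handles a single flag and rejects everything else.

```c
#include <errno.h>

/* Illustrative flag bits; the values are chosen for this sketch only. */
#define MOCK_RENAME_NOREPLACE (1u << 0)
#define MOCK_RENAME_EXCHANGE  (1u << 1)

/*
 * Hypothetical rename entry point for a filesystem that only
 * understands MOCK_RENAME_NOREPLACE.  Any other bit set in flags
 * is refused with -EINVAL before any work is done.
 */
static int mock_rename(unsigned int flags)
{
	if (flags & ~MOCK_RENAME_NOREPLACE)
		return -EINVAL;

	/* ... the actual rename work would go here ... */
	return 0;
}
```

Checking against a mask of supported bits, rather than testing for specific unsupported ones, means a flag added later is rejected by default.
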
745 ---                                               
746                                                   
747                                                   
748 **recommended**                                   
749                                                   
750 ->readlink is optional for symlinks.  Don't se    
751 to fake something for readlink(2).                
752                                                   
753 ---                                               
754                                                   
755 **mandatory**                                     
756                                                   
757 ->getattr() is now passed a struct path rather    
758 dentry separately, and it now has request_mask    
759 to specify the fields and sync type requested     
760 supporting any statx-specific features may ign    
761                                                   
762 ---                                               
763                                                   
764 **mandatory**                                     
765                                                   
766 ->atomic_open() calling conventions have chang    
767 along with FILE_OPENED/FILE_CREATED.  In place    
768 FMODE_OPENED/FMODE_CREATED, set in file->f_mod    
769 value for 'called finish_no_open(), open it yo    
770 0, not 1.  Since finish_no_open() itself is re    
771 does not need any changes in ->atomic_open() i    
772                                                   
773 ---                                               
774                                                   
775 **mandatory**                                     
776                                                   
777 alloc_file() has become static now; two wrappe    
778 alloc_file_pseudo(inode, vfsmount, name, flags    
779 when dentry needs to be created; that's the ma    
780 users.  Calling conventions: on success a refe    
781 is returned and callers reference to inode is     
782 failure, ERR_PTR() is returned and no caller's    
783 so the caller needs to drop the inode referenc    
784 alloc_file_clone(file, flags, ops) does not af    
785 On success you get a new struct file sharing t    
786 original, on failure - ERR_PTR().                 
787                                                   
788 ---                                               
789                                                   
790 **mandatory**                                     
791                                                   
792 ->clone_file_range() and ->dedupe_file_range h    
793 ->remap_file_range().  See Documentation/files    
794 information.                                      
795                                                   
796 ---                                               
797                                                   
798 **recommended**                                   
799                                                   
800 ->lookup() instances doing an equivalent of::     
801                                                   
802         if (IS_ERR(inode))                        
803                 return ERR_CAST(inode);           
804         return d_splice_alias(inode, dentry);     
805                                                   
806 don't need to bother with the check - d_splice    
807 right thing when given ERR_PTR(...) as inode.     
808 inode to d_splice_alias() will also do the rig    
809 d_add(dentry, NULL); return NULL;), so that ki    
810 also doesn't need a separate treatment.           
811                                                   
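The following is a userspace model of why the check is redundant, not kernel code. All ``mock_*`` names are hypothetical; the error-pointer helpers imitate the usual encoding of small negative errno values in the top of the address range, and ``mock_splice_alias()`` models only the two cases mentioned above (error passed through, NULL inode producing a negative dentry). It deliberately ignores the case where an existing alias is returned.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *mock_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the range reserved for error codes. */
static inline int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

/* Decode the errno value stored in an error pointer. */
static inline long mock_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct mock_inode { int ino; };
struct mock_dentry { struct mock_inode *inode; };

/* Simplified model: error in, same error out; otherwise attach and return NULL. */
static struct mock_dentry *mock_splice_alias(struct mock_inode *inode,
					     struct mock_dentry *dentry)
{
	if (mock_is_err(inode))
		return mock_err_ptr(mock_ptr_err(inode));
	dentry->inode = inode;	/* a NULL inode leaves the dentry negative */
	return NULL;
}

/* A lookup can hand its result straight over, with no check of its own. */
static struct mock_dentry *mock_lookup(struct mock_inode *inode,
				       struct mock_dentry *dentry)
{
	return mock_splice_alias(inode, dentry);
}
```
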
812 ---                                               
813                                                   
814 **strongly recommended**                          
815                                                   
816 take the RCU-delayed parts of ->destroy_inode(    
817 ->free_inode().  If ->destroy_inode() becomes     
818 just get rid of it.  Synchronous work (e.g. th    
819 be done from an RCU callback, or any WARN_ON()    
820 stack trace) *might* be movable to ->evict_ino    
821 that goes only for the things that are not nee    
822 done by ->alloc_inode().  IOW, if it's cleanin    
823 might have accumulated over the life of in-cor    
824 might be a fit.                                   
825                                                   
826 Rules for inode destruction:                      
827                                                   
828         * if ->destroy_inode() is non-NULL, it    
829         * if ->free_inode() is non-NULL, it ge    
830         * combination of NULL ->destroy_inode     
831           treated as NULL/free_inode_nonrcu, t    
832                                                   
833 Note that the callback (be it via ->free_inode    
834 in ->destroy_inode()) is *NOT* ordered wrt sup    
835 as the matter of fact, the superblock and all     
836 might be already gone.  The filesystem driver     
837 there, but that's it.  Freeing memory in the c    
838 more than that is possible, but requires a lot    
839 avoided.                                          
840                                                   
841 ---                                               
842                                                   
843 **mandatory**                                     
844                                                   
845 DCACHE_RCUACCESS is gone; having an RCU delay     
846 default.  DCACHE_NORCU opts out, and only d_al    
847 business doing so.                                
848                                                   
849 ---                                               
850                                                   
851 **mandatory**                                     
852                                                   
853 d_alloc_pseudo() is internal-only; uses outsid    
854 very suspect (and won't work in modules).  Suc    
855 be misspelled d_alloc_anon().                     
856                                                   
857 ---                                               
858                                                   
859 **mandatory**                                     
860                                                   
861 [should've been added in 2016] stale comment i    
862 failure exits in ->atomic_open() instances sho    
863 no matter what.  Everything is handled by the     
864                                                   
865 ---                                               
866                                                   
867 **mandatory**                                     
868                                                   
869 clone_private_mount() returns a longterm mount    
870 its result is kern_unmount() or kern_unmount_a    
871                                                   
872 ---                                               
873                                                   
874 **mandatory**                                     
875                                                   
876 zero-length bvec segments are disallowed, they    
877 passed on to an iterator.                         
878                                                   
879 ---                                               
880                                                   
881 **mandatory**                                     
882                                                   
883 For bvec based iterators bio_iov_iter_get_pa
884 uses the one provided. Anyone issuing kiocb-I/    
885 page references stay until I/O has completed,     
886 been called or returned with non -EIOCBQUEUED     
887                                                   
888 ---                                               
889                                                   
890 **mandatory**                                     
891                                                   
892 mnt_want_write_file() can now only be paired w    
893 whereas previously it could be paired with mnt    
894                                                   
895 ---                                               
896                                                   
897 **mandatory**                                     
898                                                   
899 iov_iter_copy_from_user_atomic() is gone; use     
900 The difference is copy_page_from_iter_atomic()    
901 you don't need iov_iter_advance() after it.  H    
902 only a part of obtained data, you should do io    
903                                                   
904 ---                                               
905                                                   
906 **mandatory**                                     
907                                                   
908 Calling conventions for file_open_root() chang    
909 instead of passing mount and dentry separately    
910 pass <mnt, mnt->mnt_root> pair (i.e. the root     
911 is provided - file_open_root_mnt().  In-tree u    
912                                                   
913 ---                                               
914                                                   
915 **mandatory**                                     
916                                                   
917 no_llseek is gone; don't set .llseek to that -    
918 Checks for "does that file have llseek(2), or     
919 should be done by looking at FMODE_LSEEK in fi    
920                                                   
921 ---                                               
922                                                   
923 **mandatory**
924                                                   
925 filldir_t (readdir callbacks) calling conventi    
926 returning 0 or -E... it returns bool now.  fal    
927 to) and true - "keep going" (as 0 in old calli    
928 callers never looked at specific -E... values     
929 instances require no changes at all, all filld    
930 converted.                                        
931                                                   
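The bool convention can be sketched in userspace as follows. This is an illustration only: ``mock_ctx``, ``mock_actor()`` and ``mock_iterate()`` are hypothetical stand-ins, not the kernel's structures or signatures. The actor here is assumed to stop once a caller-supplied limit is reached.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-call context; not the real struct. */
struct mock_ctx {
	size_t seen;	/* entries accepted so far */
	size_t limit;	/* how many entries the caller can take */
};

/* New-style callback: true means "keep going", false means "stop". */
static bool mock_actor(struct mock_ctx *ctx, const char *name)
{
	(void)name;
	if (ctx->seen >= ctx->limit)
		return false;	/* old convention: return -E... */
	ctx->seen++;
	return true;		/* old convention: return 0 */
}

/*
 * Hypothetical iterator: feeds names to the callback until it asks to
 * stop.  Returns the number of entries the callback accepted.
 */
static size_t mock_iterate(struct mock_ctx *ctx,
			   const char *const *names, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!mock_actor(ctx, names[i]))
			break;
	return i;
}
```

The caller only ever tests the result for truth, which matches the observation above that specific error values were never inspected.
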
932 ---                                               
933                                                   
934 **mandatory**                                     
935                                                   
936 Calling conventions for ->tmpfile() have chang    
937 file pointer instead of struct dentry pointer.    
938 changed to simplify callers.  The passed file     
939 success must be opened before returning (e.g.     
940 finish_open_simple()).                            
941                                                   
942 ---                                               
943                                                   
944 **mandatory**                                     
945                                                   
946 Calling convention for ->huge_fault has change    
947 order instead of an enum page_entry_size, and     
948 mmap_lock held.  All in-tree users have been a    
949 depend on the mmap_lock being held, but out of    
950 for themselves.  If they do need it, they can     
951 be called with the mmap_lock held.                
952                                                   
953 ---                                               
954                                                   
955 **mandatory**                                     
956                                                   
957 The order of opening block devices and matchin    
958 changed.                                          
959                                                   
960 The old logic opened block devices first and t    
961 suitable superblock to reuse based on the bloc    
962                                                   
963 The new logic tries to find a suitable superbl    
964 number, and opening the block device afterward    
965                                                   
966 Since opening block devices cannot happen unde    
967 ordering requirements s_umount is now dropped     
968 reacquired before calling fill_super().           
969                                                   
970 In the old logic concurrent mounters would fin    
971 superblocks for the filesystem type. Since the    
972 would hold s_umount they would wait until the     
973 was discarded due to initialization failure.      
974                                                   
975 Since the new logic drops s_umount concurrent     
976 would spin. Instead they are now made to wait     
977 mechanism without having to hold s_umount.        
978                                                   
979 ---                                               
980                                                   
981 **mandatory**                                     
982                                                   
983 The holder of a block device is now the superb    
984                                                   
985 The holder of a block device used to be the fi    
986 particularly useful. It wasn't possible to go     
987 superblock without matching on the device poin    
988 This mechanism would only work for a single de    
989 find the owning superblock of any additional d    
990                                                   
991 In the old mechanism reusing or creating a sup    
992 umount(2) relied on the file_system_type as th    
993 underdocumented however:                          
994                                                   
995 (1) Any concurrent mounter that managed to gra    
996     existing superblock was made to wait until    
997     ready or until the superblock was removed     
998     the filesystem type. If the superblock is     
999     reuse it.                                     
1000                                                  
1001 (2) If the mounter came after deactivate_lock    
1002     the superblock had been removed from the     
1003     filesystem type the mounter would wait un    
1004     reuse the block device and allocate a new    
1005                                                  
1006 (3) If the mounter came after deactivate_lock    
1007     the superblock had been removed from the     
1008     filesystem type the mounter would reuse t    
1009     superblock (the bd_holder point may still    
1010                                                  
1011 Because the holder of the block device was th    
1012 mounter could open the block devices of any s    
1013 file_system_type without risking seeing EBUSY    
1014 still in use by another superblock.              
1015                                                  
1016 Making the superblock the owner of the block     
1017 is now a unique superblock and thus block dev    
1018 reused by concurrent mounters. So a concurren    
1019 see EBUSY when trying to open a block device     
1020 superblock.                                      
1021                                                  
1022 The new logic thus waits until the superblock    
1023 ->kill_sb(). Removal of the superblock from t    
1024 filesystem type is now moved to a later point    
1025                                                  
1026 (1) Any concurrent mounter managing to grab a    
1027     superblock is made to wait until the supe    
1028     the superblock and all devices are shutdo    
1029     superblock is ready the caller will simpl    
1030                                                  
1031 (2) If the mounter comes after deactivate_loc    
1032     the superblock has been removed from the     
1033     filesystem type the mounter is made to wa    
1034     devices are shut down in ->kill_sb() and     
1035     list of superblocks of the filesystem typ    
1036     superblock and grab ownership of the bloc    
1037     the block device will be set to the newly    
1038                                                  
1039 (3) This case is now collapsed into (2) as th    
1040     of superblocks of the filesystem type unt    
1041     ->kill_sb(). In other words, if the super    
1042     superblock of the filesystem type anymore    
1043     all associated block devices (the bd_hold    
1044                                                  
1045 As this is a VFS level change it has no pract    
1046 other than that all of them must use one of t    
1047 kill_anon_super(), or kill_block_super() help    
1048                                                  
1049 ---                                              
1050                                                  
1051 **mandatory**                                    
1052                                                  
1053 Lock ordering has been changed so that s_umou    
1054 All places where s_umount was taken under ope    
1055                                                  
1056 ---                                              
1057                                                  
1058 **mandatory**                                    
1059                                                  
1060 export_operations ->encode_fh() no longer has    
1061 encode FILEID_INO32_GEN* file handles.           
1062 Filesystems that used the default implementat    
1063 generic_encode_ino32_fh() explicitly.            
1064                                                  
1065 ---                                              
1066                                                  
1067 **mandatory**                                    
1068                                                  
1069 If ->rename() update of .. on cross-directory    
1070 directory modifications, do *not* lock the su    
1071 ->rename() - it's done by the caller now [tha    
1072 28eceeda130f "fs: Lock moved directories"].      
1073                                                  
1074 ---                                              
1075                                                  
1076 **mandatory**                                    
1077                                                  
1078 On same-directory ->rename() the (tautologica    
1079 by any locks; just don't do it if the old par    
1080 We really can't lock two subdirectories in sa    
1081 deadlocks.                                       
1082                                                  
1083 ---                                              
1084                                                  
1085 **mandatory**                                    
1086                                                  
1087 lock_rename() and lock_rename_child() may fai    
1088 their arguments do not have a common ancestor    
1089 is returned, with no locks taken.  In-tree us    
1090 would need to do so.                             
1091                                                  
1092 ---                                              
1093                                                  
1094 **mandatory**                                    
1095                                                  
1096 The list of children anchored in parent dentr    
1097 Field names got changed (->d_children/->d_sib    
1098 for anchor/entries resp.), so any affected pl    
1099 by compiler.                                     
1100                                                  
1101 ---                                              
1102                                                  
1103 **mandatory**                                    
1104                                                  
1105 ->d_delete() instances are now called for den    
1106 and refcount equal to 0.  They are not permit    
1107 None of in-tree instances did anything of tha    
1108                                                  
1109 ---                                              
1110                                                  
1111 **mandatory**                                    
1112                                                  
1113 ->d_prune() instances are now called without     
1114 ->d_lock on dentry itself is still held; if y    
1115 of the in-tree instances did), use your own s    
1116                                                  
1117 ->d_iput() and ->d_release() are called with     
1118 list of parent's children.  It is still unhas    
1119 removed from parent's ->d_children yet.          
1120                                                  
1121 Anyone iterating through the list of children    
1122 half-killed dentries that might be seen there    
1123 see them negative, unhashed and with negative    
1124 of the in-kernel users would've done the righ    
1125                                                  
1126 ---                                              
1127                                                  
1128 **recommended**                                  
1129                                                  
1130 Block device freezing and thawing have been m    
1131                                                  
1132 Before this change, get_active_super() would     
1133 superblock of the main block device, i.e., th    
1134 device freezing now works for any block devic    
1135 just the main block device. The get_active_su    
1136 pointer are gone.                                
1137                                                  
1138 ---                                              
1139                                                  
1140 **mandatory**                                    
1141                                                  
1142 set_blocksize() takes opened struct file inst    
1143 and it *must* be opened exclusive.               
                                                      
