TOMOYO Linux Cross Reference
Linux/Documentation/filesystems/locking.rst

Differences between /Documentation/filesystems/locking.rst (Version linux-6.12-rc7) and /Documentation/filesystems/locking.rst (Version linux-4.17.19)


  1 =======                                           
  2 Locking                                           
  3 =======                                           
  4                                                   
  5 The text below describes the locking rules for    
  6 It is (believed to be) up-to-date. *Please*, i    
  7 prototypes or locking protocols - update this     
  8 instances in the tree, don't leave that to mai    
  9 etc. At the very least, put the list of dubiou    
 10 Don't turn it into log - maintainers of out-of    
 11 be able to use diff(1).                           
 12                                                   
 13 Things currently missing here: socket operation
 14                                                   
 15 dentry_operations                                 
 16 =================                                 
 17                                                   
 18 prototypes::                                      
 19                                                   
 20         int (*d_revalidate)(struct dentry *, u    
 21         int (*d_weak_revalidate)(struct dentry    
 22         int (*d_hash)(const struct dentry *, s    
 23         int (*d_compare)(const struct dentry *    
 24                         unsigned int, const ch    
 25         int (*d_delete)(struct dentry *);         
 26         int (*d_init)(struct dentry *);           
 27         void (*d_release)(struct dentry *);       
 28         void (*d_iput)(struct dentry *, struct    
 29         char *(*d_dname)(struct dentry *dentr
 30         struct vfsmount *(*d_automount)(struct    
 31         int (*d_manage)(const struct path *, b    
 32         struct dentry *(*d_real)(struct dentry    
 33                                                   
 34 locking rules:                                    
 35                                                   
 36 ================== ===========  ========          
 37 ops                rename_lock  ->d_lock          
 38 ================== ===========  ========          
 39 d_revalidate:      no           no                
 40 d_weak_revalidate: no           no                
 41 d_hash:            no           no
 42 d_compare:         yes          no                
 43 d_delete:          no           yes               
 44 d_init:            no           no                
 45 d_release:         no           no                
 46 d_prune:           no           yes               
 47 d_iput:            no           no                
 48 d_dname:           no           no                
 49 d_automount:       no           no                
 50 d_manage:          no           no                
 51 d_real:            no           no
 52 ================== ===========  ========          
 53                                                   
 54 inode_operations                                  
 55 ================                                  
 56                                                   
 57 prototypes::                                      
 58                                                   
 59         int (*create) (struct mnt_idmap *, str    
 60         struct dentry * (*lookup) (struct inod    
 61         int (*link) (struct dentry *,struct in    
 62         int (*unlink) (struct inode *,struct d    
 63         int (*symlink) (struct mnt_idmap *, st    
 64         int (*mkdir) (struct mnt_idmap *, stru    
 65         int (*rmdir) (struct inode *,struct de    
 66         int (*mknod) (struct mnt_idmap *, stru    
 67         int (*rename) (struct mnt_idmap *, str    
 68                         struct inode *, struct    
 69         int (*readlink) (struct dentry *, char    
 70         const char *(*get_link) (struct dentry    
 71         void (*truncate) (struct inode *);        
 72         int (*permission) (struct mnt_idmap *,    
 73         struct posix_acl * (*get_inode_acl)(st    
 74         int (*setattr) (struct mnt_idmap *, st    
 75         int (*getattr) (struct mnt_idmap *, co    
 76         ssize_t (*listxattr) (struct dentry *,    
 77         int (*fiemap)(struct inode *, struct f    
 78         void (*update_time)(struct inode *, st    
 79         int (*atomic_open)(struct inode *, str    
 80                                 struct file *,    
 81                                 umode_t create    
 82         int (*tmpfile) (struct mnt_idmap *, st    
 83                         struct file *, umode_t    
 84         int (*fileattr_set)(struct mnt_idmap *    
 85                             struct dentry *den    
 86         int (*fileattr_get)(struct dentry *den    
 87         struct posix_acl * (*get_acl)(struct m    
 88         struct offset_ctx *(*get_offset_ctx)(s    
 89                                                   
 90 locking rules:                                    
 91         all may block                             
 92                                                   
 93 ==============  ==============================    
 94 ops             i_rwsem(inode)                    
 95 ==============  ==============================    
 96 lookup:         shared                            
 97 create:         exclusive                         
 98 link:           exclusive (both)                  
 99 mknod:          exclusive                         
100 symlink:        exclusive                         
101 mkdir:          exclusive                         
102 unlink:         exclusive (both)                  
103 rmdir:          exclusive (both)(see below)       
104 rename:         exclusive (both parents, some     
105 readlink:       no                                
106 get_link:       no                                
107 setattr:        exclusive                         
108 permission:     no (may not block if called in    
109 get_inode_acl:  no                                
110 get_acl:        no                                
111 getattr:        no                                
112 listxattr:      no                                
113 fiemap:         no                                
114 update_time:    no                                
115 atomic_open:    shared (exclusive if O_CREAT i    
116 tmpfile:        no                                
117 fileattr_get:   no or exclusive                   
118 fileattr_set:   exclusive                         
119 get_offset_ctx  no                                
120 ==============  ==============================    
121                                                   
122                                                   
123         Additionally, ->rmdir(), ->unlink() an    
124         exclusive on victim.                      
125         cross-directory ->rename() has (per-su    
126         ->unlink() and ->rename() have ->i_rws    
127         involved.                                 
128         ->rename() has ->i_rwsem exclusive on     
129                                                   
130 See Documentation/filesystems/directory-lockin    
131 of the locking scheme for directory operations    
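
As a rough illustration of the table above, the following userspace sketch
models the contract in which the caller (the VFS) takes ->i_rwsem and the
filesystem method merely relies on it being held.  Nothing here is kernel
code: ``struct model_dir``, the ``model_*`` functions and the three-state
lock are hypothetical stand-ins, and the model is single-threaded, so it only
checks which mode is held, not exclusion between tasks.  In the kernel the
corresponding helpers are inode_lock_shared() and inode_lock().

```c
#include <assert.h>

/* Hypothetical stand-in for the state of a directory's i_rwsem. */
enum model_lock { MODEL_UNLOCKED, MODEL_SHARED, MODEL_EXCLUSIVE };

struct model_dir {
	enum model_lock i_rwsem;	/* lock state of the directory inode */
	int nentries;			/* number of entries in the directory */
};

/* "Filesystem" ->lookup(): takes no lock itself, relies on the caller. */
int model_fs_lookup(const struct model_dir *dir)
{
	assert(dir->i_rwsem != MODEL_UNLOCKED);	/* shared is sufficient */
	return dir->nentries;
}

/* "Filesystem" ->create(): modifies the directory, needs exclusive. */
int model_fs_create(struct model_dir *dir)
{
	assert(dir->i_rwsem == MODEL_EXCLUSIVE);
	return ++dir->nentries;
}

/* "VFS" side: take the lock in the mode the table asks for, then call. */
int model_vfs_lookup(struct model_dir *dir)
{
	int n;

	dir->i_rwsem = MODEL_SHARED;	/* inode_lock_shared() in the kernel */
	n = model_fs_lookup(dir);
	dir->i_rwsem = MODEL_UNLOCKED;
	return n;
}

int model_vfs_create(struct model_dir *dir)
{
	int n;

	dir->i_rwsem = MODEL_EXCLUSIVE;	/* inode_lock() in the kernel */
	n = model_fs_create(dir);
	dir->i_rwsem = MODEL_UNLOCKED;
	return n;
}
```

Calling model_fs_create() directly on an unlocked directory trips the
assertion, which is the point of the sketch: the method documents what it
expects rather than taking the lock itself.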
132                                                   
133 xattr_handler operations                          
134 ========================                          
135                                                   
136 prototypes::                                      
137                                                   
138         bool (*list)(struct dentry *dentry);      
139         int (*get)(const struct xattr_handler     
140                    struct inode *inode, const     
141                    size_t size);                  
142         int (*set)(const struct xattr_handler     
143                    struct mnt_idmap *idmap,       
144                    struct dentry *dentry, stru    
145                    const void *buffer, size_t     
146                                                   
147 locking rules:                                    
148         all may block                             
149                                                   
150 =====           ==============                    
151 ops             i_rwsem(inode)                    
152 =====           ==============                    
153 list:           no                                
154 get:            no                                
155 set:            exclusive                         
156 =====           ==============                    
157                                                   
158 super_operations                                  
159 ================                                  
160                                                   
161 prototypes::                                      
162                                                   
163         struct inode *(*alloc_inode)(struct su    
164         void (*free_inode)(struct inode *);       
165         void (*destroy_inode)(struct inode *);    
166         void (*dirty_inode) (struct inode *, i    
167         int (*write_inode) (struct inode *, st    
168         int (*drop_inode) (struct inode *);       
169         void (*evict_inode) (struct inode *);     
170         void (*put_super) (struct super_block     
171         int (*sync_fs)(struct super_block *sb,    
172         int (*freeze_fs) (struct super_block *    
173         int (*unfreeze_fs) (struct super_block    
174         int (*statfs) (struct dentry *, struct    
175         int (*remount_fs) (struct super_block     
176         void (*umount_begin) (struct super_blo    
177         int (*show_options)(struct seq_file *,    
178         ssize_t (*quota_read)(struct super_blo    
179         ssize_t (*quota_write)(struct super_bl    
180                                                   
181 locking rules:                                    
182         All may block [not true, see below]       
183                                                   
184 ======================  ============    ======    
185 ops                     s_umount        note      
186 ======================  ============    ======    
187 alloc_inode:                                      
188 free_inode:                             called    
189 destroy_inode:                                    
190 dirty_inode:                                      
191 write_inode:                                      
192 drop_inode:                             !!!ino    
193 evict_inode:                                      
194 put_super:              write                     
195 sync_fs:                read                      
196 freeze_fs:              write                     
197 unfreeze_fs:            write                     
198 statfs:                 maybe(read)     (see b    
199 remount_fs:             write                     
200 umount_begin:           no                        
201 show_options:           no              (names    
202 quota_read:             no              (see b    
203 quota_write:            no              (see b    
204 ======================  ============    ======    
205                                                   
206 ->statfs() has s_umount (shared) when called b    
207 compat), but that's an accident of bad API; s_    
208 the superblock down when we only have dev_t gi    
209 identify the superblock.  Everything else (sta    
210 doesn't hold it when calling ->statfs() - supe    
211 by resolving the pathname passed to syscall.      
212                                                   
213 ->quota_read() and ->quota_write() functions a    
214 be the only ones operating on the quota file b    
215 dqio_sem) (unless an admin really wants to scr    
216 writes to quota files with quotas on). For oth    
217 see also dquot_operations section.                
218                                                   
219 file_system_type                                  
220 ================                                  
221                                                   
222 prototypes::                                      
223                                                   
224         struct dentry *(*mount) (struct file_s    
225                        const char *, void *);     
226         void (*kill_sb) (struct super_block *)    
227                                                   
228 locking rules:                                    
229                                                   
230 =======         =========                         
231 ops             may block                         
232 =======         =========                         
233 mount           yes                               
234 kill_sb         yes                               
235 =======         =========                         
236                                                   
237 ->mount() returns ERR_PTR or the root dentry;     
238 on return.                                        
239                                                   
240 ->kill_sb() takes a write-locked superblock, d    
241 unlocks and drops the reference.                  
242                                                   
243 address_space_operations                          
244 ========================                          
245 prototypes::                                      
246                                                   
247         int (*writepage)(struct page *page, st    
248         int (*read_folio)(struct file *, struc    
249         int (*writepages)(struct address_space    
250         bool (*dirty_folio)(struct address_spa    
251         void (*readahead)(struct readahead_con    
252         int (*write_begin)(struct file *, stru    
253                                 loff_t pos, un    
254                                 struct folio *    
255         int (*write_end)(struct file *, struct    
256                                 loff_t pos, un    
257                                 struct folio *    
258         sector_t (*bmap)(struct address_space     
259         void (*invalidate_folio) (struct folio    
260         bool (*release_folio)(struct folio *,     
261         void (*free_folio)(struct folio *);       
262         int (*direct_IO)(struct kiocb *, struc    
263         int (*migrate_folio)(struct address_sp    
264                         struct folio *src, enu    
265         int (*launder_folio)(struct folio *);     
266         bool (*is_partially_uptodate)(struct f    
267         int (*error_remove_folio)(struct addre    
268         int (*swap_activate)(struct swap_info_    
269         int (*swap_deactivate)(struct file *);    
270         int (*swap_rw)(struct kiocb *iocb, str    
271                                                   
272 locking rules:                                    
273         All except dirty_folio and free_folio     
274                                                   
275 ======================  ======================    
276 ops                     folio locked              
277 ======================  ======================    
278 writepage:              yes, unlocks (see belo    
279 read_folio:             yes, unlocks              
280 writepages:                                       
281 dirty_folio:            maybe                     
282 readahead:              yes, unlocks              
283 write_begin:            locks the folio           
284 write_end:              yes, unlocks              
285 bmap:                                             
286 invalidate_folio:       yes                       
287 release_folio:          yes                       
288 free_folio:             yes                       
289 direct_IO:                                        
290 migrate_folio:          yes (both)                
291 launder_folio:          yes                       
292 is_partially_uptodate:  yes                       
293 error_remove_folio:     yes                       
294 swap_activate:          no                        
295 swap_deactivate:        no                        
296 swap_rw:                yes, unlocks              
297 ======================  ======================    
298                                                   
299 ->write_begin(), ->write_end() and ->read_foli    
300 the request handler (/dev/loop).                  
301                                                   
302 ->read_folio() unlocks the folio, either synch    
303 completion.                                       
304                                                   
305 ->readahead() unlocks the folios that I/O is a    
306                                                   
307 ->writepage() is used for two purposes: for "m    
308 "sync".  These are quite different operations     
309 depending upon the mode.                          
310                                                   
311 If writepage is called for sync (wbc->sync_mod    
312 it *must* start I/O against the page, even if     
313 blocking on in-progress I/O.                      
314                                                   
315 If writepage is called for memory cleansing (s    
316 WB_SYNC_NONE) then its role is to get as much
317 possible.  So writepage should try to avoid bl    
318 currently-in-progress I/O.                        
319                                                   
320 If the filesystem is not called for "sync" and    
321 would need to block against in-progress I/O to    
322 against the page the filesystem should redirty    
323 redirty_page_for_writepage(), then unlock the     
324 This may also be done to avoid internal deadlo    
325                                                   
326 If the filesystem is called for sync then it m    
327 in-progress I/O and then start new I/O.           
328                                                   
329 The filesystem should unlock the page synchron    
330 caller, unless ->writepage() returns special W    
331 value. WRITEPAGE_ACTIVATE means that page cann    
332 currently, and VM should stop calling ->writep    
333 time. VM does this by moving page to the head     
334 name.                                             
335                                                   
336 Unless the filesystem is going to redirty_page    
337 and return zero, writepage *must* run set_page    
338 followed by unlocking it.  Once set_page_write    
339 page, write I/O can be submitted and the write    
340 end_page_writeback() once the I/O is complete.    
341 filesystem must run end_page_writeback() again    
342 writepage.                                        
343                                                   
344 That is: after 2.5.12, pages which are under w    
345 if the filesystem needs the page to be locked     
346 the page is allowed to be unlocked at any poin    
347 set_page_writeback() and end_page_writeback().    
348                                                   
349 Note, failure to run either redirty_page_for_w    
350 set_page_writeback()/end_page_writeback() on a    
351 will leave the page itself marked clean but it    
352 radix tree.  This incoherency can lead to all     
353 in the filesystem like having dirty inodes at     
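
The ->writepage() rules above can be summarised as a small state machine.
The sketch below is a userspace model, not kernel code: ``struct model_page``,
``model_writepage()`` and the ``would_block`` flag are hypothetical, and it
assumes the caller has already cleared the dirty bit before calling in.  The
sync case that would first wait for in-progress I/O is not modelled beyond a
comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the page flags discussed above. */
struct model_page {
	bool locked;
	bool dirty;
	bool writeback;
};

enum model_sync_mode { MODEL_SYNC_NONE, MODEL_SYNC_ALL };

/* Stand-in for redirty_page_for_writepage(): keep the page dirty. */
void model_redirty(struct model_page *page)
{
	page->dirty = true;
}

/* Stand-in for end_page_writeback(), run when the I/O completes. */
void model_end_writeback(struct model_page *page)
{
	assert(page->writeback);
	page->writeback = false;
}

/*
 * Model of ->writepage().  The page comes in locked and leaves unlocked.
 * "would_block" says whether starting I/O would mean waiting on
 * in-progress I/O.  Always returns 0.
 */
int model_writepage(struct model_page *page,
		    enum model_sync_mode mode, bool would_block)
{
	assert(page->locked);

	if (mode == MODEL_SYNC_NONE && would_block) {
		/* Memory cleansing: do not wait, hand the page back dirty. */
		model_redirty(page);
		page->locked = false;
		return 0;
	}

	/* A real filesystem called for sync would wait for old I/O here. */
	page->writeback = true;		/* set_page_writeback() */
	page->locked = false;		/* unlock_page() */
	/* I/O would be submitted here; its completion ends writeback. */
	return 0;
}
```

Either way the page ends up in one of two consistent states: dirty and not
under writeback, or under writeback until model_end_writeback() runs.  The
incoherent case described above is a page that is neither.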
354                                                   
355 ->writepages() is used for periodic writeback     
356 sync operations.  The address_space should sta    
357 ``*nr_to_write`` pages.  ``*nr_to_write`` must    
358 which is written.  The address_space implement    
359 pages than ``*nr_to_write`` asks for, but it s    
360 If nr_to_write is NULL, all dirty pages must b    
361                                                   
362 writepages should _only_ write pages which are    
363 mapping->io_pages.                                
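
A minimal sketch of the ``*nr_to_write`` accounting, as read from the
paragraph above: the counter is decremented once per page written, and a NULL
pointer means every dirty page is written.  The function
``model_writepages()`` and the array-of-flags representation of the mapping
are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of ->writepages().  dirty[] has one flag per page.
 * Writes dirty pages in index order until the budget is used up and
 * returns the number of pages written.
 */
size_t model_writepages(bool *dirty, size_t npages, long *nr_to_write)
{
	size_t written = 0;

	for (size_t i = 0; i < npages; i++) {
		/* Stop once the requested number of pages has been written. */
		if (nr_to_write && *nr_to_write <= 0)
			break;
		if (!dirty[i])
			continue;

		dirty[i] = false;	/* pretend I/O was started */
		written++;
		if (nr_to_write)
			(*nr_to_write)--;	/* one decrement per page */
	}
	return written;
}
```

This model stops exactly at the budget; the text allows an implementation to
write somewhat more or fewer pages as long as it stays reasonably close.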
364                                                   
365 ->dirty_folio() is called from various places     
366 the target folio is marked as needing writebac    
367 truncated because either the caller holds the     
368 has found the folio while holding the page tab    
369 truncation.                                       
370                                                   
371 ->bmap() is currently used by legacy ioctl() (    
372 filesystems and by the swapper. The latter wil    
373 keep it that way and don't breed new callers.     
374                                                   
375 ->invalidate_folio() is called when the filesy    
376 some or all of the buffers from the page when     
377 returns zero on success.  The filesystem must     
378 invalidate_lock before invalidating page cache    
379 path (and thus calling into ->invalidate_folio    
380 cache invalidation and page cache filling func    
381                                                   
382 ->release_folio() is called when the MM wants     
383 folio that would invalidate the filesystem's p    
384 it may be about to be removed from the address    
385 is locked and not under writeback.  It may be     
386 is not usually used for allocation, but rather    
387 filesystem may do to attempt to free the priva    
388 return false to indicate that the folio's priv    
389 If it returns true, it should have already rem    
390 the folio.  If a filesystem does not provide a    
391 the pagecache will assume that private data is    
392 try_to_free_buffers().                            
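
The return-value convention of ->release_folio() can be sketched as follows.
This is a userspace model under stated assumptions: ``struct model_folio``,
its ``private_busy`` flag and ``model_release_folio()`` are hypothetical, and
the private data is a plain heap allocation rather than buffer heads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_folio {
	bool locked;
	bool writeback;
	void *private_data;	/* filesystem-private state */
	bool private_busy;	/* hypothetical: fs still needs the data */
};

/*
 * Model of ->release_folio(): called on a locked folio that is not under
 * writeback.  Returns false if the private data cannot be freed, true
 * after it has been removed from the folio.
 */
bool model_release_folio(struct model_folio *folio)
{
	assert(folio->locked && !folio->writeback);

	if (folio->private_busy)
		return false;		/* caller must leave the folio alone */

	free(folio->private_data);
	folio->private_data = NULL;	/* removed before returning true */
	return true;
}
```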
393                                                   
394 ->free_folio() is called when the kernel has d    
395 from the page cache.                              
396                                                   
397 ->launder_folio() may be called prior to relea    
398 it is still found to be dirty. It returns zero    
399 cleaned, or an error value if not. Note that i    
400 getting mapped back in and redirtied, it needs    
401 across the entire operation.                      
402                                                   
403 ->swap_activate() will be called to prepare th    
404 should perform any validation and preparation     
405 writes can be performed with minimal memory al    
406 add_swap_extent(), or the helper iomap_swapfil    
407 the number of extents added.  If IO should be     
408 ->swap_rw(), it should set SWP_FS_OPS, otherwi    
409 directly to the block device ``sis->bdev``.       
410                                                   
411 ->swap_deactivate() will be called in the sys_    
412 path after ->swap_activate() returned success.    
413                                                   
414 ->swap_rw will be called for swap IO if SWP_FS    
415                                                   
416 file_lock_operations                              
417 ====================                              
418                                                   
419 prototypes::                                      
420                                                   
421         void (*fl_copy_lock)(struct file_lock     
422         void (*fl_release_private)(struct file    
423                                                   
424                                                   
425 locking rules:                                    
426                                                   
427 ===================     =============   ======    
428 ops                     inode->i_lock   may bl    
429 ===================     =============   ======    
430 fl_copy_lock:           yes             no        
431 fl_release_private:     maybe           maybe[    
432 ===================     =============   ======    
433                                                   
434 .. [1]:                                           
435    ->fl_release_private for flock or POSIX loc    
436    to block. Leases however can still be freed    
437    so fl_release_private called on a lease sho    
438                                                   
439 lock_manager_operations                           
440 =======================                           
441                                                   
442 prototypes::                                      
443                                                   
444         void (*lm_notify)(struct file_lock *);    
445         int (*lm_grant)(struct file_lock *, st    
446         void (*lm_break)(struct file_lock *);     
447         int (*lm_change)(struct file_lock **,     
448         bool (*lm_breaker_owns_lease)(struct f    
449         bool (*lm_lock_expirable)(struct file_    
450         void (*lm_expire_lock)(void);             
451                                                   
452 locking rules:                                    
453                                                   
454 ======================  =============   ======    
455 ops                        flc_lock     blocke    
456 ======================  =============   ======    
457 lm_notify:              no              yes       
458 lm_grant:               no              no        
459 lm_break:               yes             no        
460 lm_change:              yes             no
461 lm_breaker_owns_lease:  yes             no
462 lm_lock_expirable:      yes             no
463 lm_expire_lock:         no              no
464 ======================  =============   ======    
465                                                   
466 buffer_head                                       
467 ===========                                       
468                                                   
469 prototypes::                                      
470                                                   
471         void (*b_end_io)(struct buffer_head *b    
472                                                   
473 locking rules:                                    
474                                                   
475 called from interrupts. In other words, extrem    
476 bh is locked, but that's all warranties we hav    
477 highmem, fs/buffer.c, and fs/ntfs/aops.c are p    
478 call this method upon the IO completion.          
479                                                   
480 block_device_operations                           
481 =======================                           
482 prototypes::                                      
483                                                   
484         int (*open) (struct block_device *, fm    
485         int (*release) (struct gendisk *, fmod    
486         int (*ioctl) (struct block_device *, f    
487         int (*compat_ioctl) (struct block_devi    
488         int (*direct_access) (struct block_dev    
489                                 unsigned long     
490         void (*unlock_native_capacity) (struct    
491         int (*getgeo)(struct block_device *, s    
492         void (*swap_slot_free_notify) (struct     
493                                                   
494 locking rules:                                    
495                                                   
496 ======================= ===================       
497 ops                     open_mutex                
498 ======================= ===================       
499 open:                   yes                       
500 release:                yes                       
501 ioctl:                  no                        
502 compat_ioctl:           no                        
503 direct_access:          no                        
504 unlock_native_capacity: no                        
505 getgeo:                 no                        
506 swap_slot_free_notify:  no      (see below)       
507 ======================= ===================       
508                                                   
509 swap_slot_free_notify is called with swap_lock    
510 held.                                             
511                                                   
512                                                   
513 file_operations                                   
514 ===============                                   
515                                                   
516 prototypes::                                      
517                                                   
518         loff_t (*llseek) (struct file *, loff_    
519         ssize_t (*read) (struct file *, char _    
520         ssize_t (*write) (struct file *, const    
521         ssize_t (*read_iter) (struct kiocb *,     
522         ssize_t (*write_iter) (struct kiocb *,    
523         int (*iopoll) (struct kiocb *kiocb, bo    
524         int (*iterate_shared) (struct file *,     
525         __poll_t (*poll) (struct file *, struc    
526         long (*unlocked_ioctl) (struct file *,    
527         long (*compat_ioctl) (struct file *, u    
528         int (*mmap) (struct file *, struct vm_    
529         int (*open) (struct inode *, struct fi    
530         int (*flush) (struct file *);             
531         int (*release) (struct inode *, struct    
532         int (*fsync) (struct file *, loff_t st    
533         int (*fasync) (int, struct file *, int    
534         int (*lock) (struct file *, int, struc    
535         unsigned long (*get_unmapped_area)(str    
536                         unsigned long, unsigne    
537         int (*check_flags)(int);                  
538         int (*flock) (struct file *, int, stru    
539         ssize_t (*splice_write)(struct pipe_in    
540                         size_t, unsigned int);    
541         ssize_t (*splice_read)(struct file *,     
542                         size_t, unsigned int);    
543         int (*setlease)(struct file *, long, s    
544         long (*fallocate)(struct file *, int,     
545         void (*show_fdinfo)(struct seq_file *m    
546         unsigned (*mmap_capabilities)(struct f    
547         ssize_t (*copy_file_range)(struct file    
548                         loff_t, size_t, unsign    
549         loff_t (*remap_file_range)(struct file    
550                         struct file *file_out,    
551                         loff_t len, unsigned i    
552         int (*fadvise)(struct file *, loff_t,     
553                                                   
554 locking rules:                                    
555         All may block.                            
556                                                   
557 ->llseek() locking has moved from llseek to th    
558 implementations.  If your fs is not using gene    
559 need to acquire and release the appropriate lo    
560 For many filesystems, it is probably safe to a    
561 mutex or just to use i_size_read() instead.       
562 Note: this does not protect the file->f_pos ag    
563 since this is something the userspace has to t    
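
The llseek rule above can be illustrated with a small userspace model. This is only a sketch, not kernel code: a pthread mutex stands in for the inode lock, and ``struct model_inode``, ``struct model_file`` and ``model_llseek()`` are hypothetical names introduced for illustration. It shows a seek implementation that takes the lock itself so that the size it reads for SEEK_END is stable.

```c
/*
 * Userspace model only. pthread_mutex_t stands in for the inode lock;
 * every model_* name here is hypothetical and not a kernel API.
 */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

struct model_inode {
	pthread_mutex_t lock;	/* stands in for the inode lock */
	long long size;		/* stands in for i_size */
};

struct model_file {
	struct model_inode *inode;
	long long pos;		/* stands in for file->f_pos */
};

/* The seek method acquires and releases the lock on its own. */
long long model_llseek(struct model_file *file, long long offset, int whence)
{
	struct model_inode *inode = file->inode;
	long long newpos;

	pthread_mutex_lock(&inode->lock);
	switch (whence) {
	case SEEK_SET:
		newpos = offset;
		break;
	case SEEK_CUR:
		newpos = file->pos + offset;
		break;
	case SEEK_END:
		/* size is read under the lock, so it cannot change here */
		newpos = inode->size + offset;
		break;
	default:
		newpos = -1;
		break;
	}
	if (newpos < 0)
		newpos = -EINVAL;	/* bad whence or negative result */
	else
		file->pos = newpos;
	pthread_mutex_unlock(&inode->lock);
	return newpos;
}
```

In this model the lock serialises seeks only against size changes made under the same lock; as the note above says, concurrent users of the same file position still have to coordinate among themselves.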
564                                                   
565 ->iterate_shared() is called with i_rwsem held    
566 file f_pos_lock held exclusively.
567                                                   
568 ->fasync() is responsible for maintaining the     
569 Most instances call fasync_helper(), which doe    
570 not normally something one needs to worry abou    
571 mapped to zero in the VFS layer.                  
572                                                   
573 ->readdir() and ->ioctl() on directories must     
574 move ->readdir() to inode_operations and use a    
575 ->ioctl() or kill the latter completely. One o    
576 anything that resembles union-mount we won't h    
577 components. And there are other reasons why th    
578                                                   
579 ->read on directories probably must go away -     
580 in sys_read() and friends.                        
581                                                   
582 ->setlease operations should call generic_setl    
583 the lease within the individual filesystem to     
584 operation.
585                                                   
586 ->fallocate implementation must be really care    
587 consistency when punching holes or performing     
588 page cache contents. Usually the filesystem ne    
589 truncate_inode_pages_range() to invalidate rel    
590 However the filesystem usually also needs to u    
591 view of file offset -> disk block mapping. Unt    
592 filesystem needs to block page faults and read    
593 cache contents from the disk. Since VFS acquir    
594 shared mode when loading pages from disk (file    
595 readahead paths), the fallocate implementation    
596 prevent reloading.                                
597                                                   
598 ->copy_file_range and ->remap_file_range imple    
599 against modifications of file data while the o    
600 blocking changes through write(2) and similar     
601 used. To block changes to file contents via a     
602 operation, the filesystem must take mapping->i    
603 with ->page_mkwrite.                              
604                                                   
605 dquot_operations                                  
606 ================                                  
607                                                   
608 prototypes::                                      
609                                                   
610         int (*write_dquot) (struct dquot *);      
611         int (*acquire_dquot) (struct dquot *);    
612         int (*release_dquot) (struct dquot *);    
613         int (*mark_dirty) (struct dquot *);       
614         int (*write_info) (struct super_block     
615                                                   
616 These operations are intended to be more or le    
617 a proper locking wrt the filesystem and call t    
618                                                   
619 What filesystem should expect from the generic    
620                                                   
621 ==============  ============    ==============    
622 ops             FS recursion    Held locks whe    
623 ==============  ============    ==============    
624 write_dquot:    yes             dqonoff_sem or    
625 acquire_dquot:  yes             dqonoff_sem or    
626 release_dquot:  yes             dqonoff_sem or    
627 mark_dirty:     no              -                 
628 write_info:     yes             dqonoff_sem       
629 ==============  ============    ==============    
630                                                   
631 FS recursion means calling ->quota_read() and     
632 operations.                                       
633                                                   
634 More details about quota locking can be found     
635                                                   
636 vm_operations_struct                              
637 ====================                              
638                                                   
639 prototypes::                                      
640                                                   
641         void (*open)(struct vm_area_struct *);    
642         void (*close)(struct vm_area_struct *)    
643         vm_fault_t (*fault)(struct vm_fault *)    
644         vm_fault_t (*huge_fault)(struct vm_fau    
645         vm_fault_t (*map_pages)(struct vm_faul    
646         vm_fault_t (*page_mkwrite)(struct vm_a    
647         vm_fault_t (*pfn_mkwrite)(struct vm_ar    
648         int (*access)(struct vm_area_struct *,    
649                                                   
650 locking rules:                                    
651                                                   
652 =============   ==========      ==============    
653 ops             mmap_lock       PageLocked(pag    
654 =============   ==========      ==============    
655 open:           write                             
656 close:          read/write                        
657 fault:          read            can return wit    
658 huge_fault:     maybe-read                        
659 map_pages:      maybe-read                        
660 page_mkwrite:   read            can return wit    
661 pfn_mkwrite:    read                              
662 access:         read                              
663 =============   ==========      ==============    
664                                                   
665 ->fault() is called when a previously not pres    
666 in. The filesystem must find and return the pa    
667 "pgoff" in the vm_fault structure. If it is po    
668 truncated and/or invalidated, then the filesys    
669 then ensure the page is not already truncated     
670 subsequent truncate), and then return with VM_    
671 locked. The VM will unlock the page.              
672                                                   
673 ->huge_fault() is called when there is no PUD     
674 gives the filesystem the opportunity to instal    
675 Filesystems can also use the ->fault method to    
676 so implementing this function may not be neces    
677 filesystems should not call filemap_fault() fr    
678 The mmap_lock may not be held when this method    
679                                                   
680 ->map_pages() is called when VM asks to map ea    
681 Filesystem should find and map pages associate    
682 till "end_pgoff". ->map_pages() is called with    
683 not block.  If it's not possible to reach a pa    
684 filesystem should skip it. Filesystem should u    
685 page table entry. Pointer to entry associated     
686 "pte" field in vm_fault structure. Pointers to    
687 should be calculated relative to "pte".           
688                                                   
689 ->page_mkwrite() is called when a previously r    
690 writeable. The filesystem again must ensure th    
691 truncate/invalidate races or races with operat    
692 or ->copy_file_range, and then return with the    
693 mapping->invalidate_lock is suitable for prope    
694 been truncated, the filesystem should not look    
695 handler, but simply return with VM_FAULT_NOPAG    
696 retry the fault.                                  
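
The truncation check described above can be sketched as a userspace model. It is not kernel code: a pthread mutex stands in for the page lock, ``MODEL_FAULT_LOCKED`` and ``MODEL_FAULT_NOPAGE`` are hypothetical stand-ins for the VM_FAULT_* codes, and the sketch assumes, by analogy with ->fault() above, that success returns with the page still locked and the caller unlocks it.

```c
/*
 * Userspace model only. pthread_mutex_t stands in for the page lock;
 * every model_* / MODEL_* name is hypothetical.
 */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum model_fault {
	MODEL_FAULT_LOCKED,	/* success, page returned locked */
	MODEL_FAULT_NOPAGE,	/* page went away, caller retries the fault */
};

struct model_page {
	pthread_mutex_t lock;	/* models the page lock */
	const void *mapping;	/* NULL once truncated or invalidated */
};

enum model_fault model_page_mkwrite(struct model_page *page,
				    const void *mapping)
{
	pthread_mutex_lock(&page->lock);
	if (page->mapping != mapping) {
		/* Truncated under us: do not look the page up again here. */
		pthread_mutex_unlock(&page->lock);
		return MODEL_FAULT_NOPAGE;
	}
	/* Still attached to the file: hand it back locked. */
	return MODEL_FAULT_LOCKED;
}
```

The check is done only after the page lock is held, so in this model a truncate that also takes the page lock cannot slip in between the check and the return.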
697                                                   
698 ->pfn_mkwrite() is the same as page_mkwrite bu    
699 VM_PFNMAP or VM_MIXEDMAP with a page-less entr    
700 VM_FAULT_NOPAGE. Or one of the VM_FAULT_ERROR     
701 after this call is to make the pte read-write,    
702 an error.                                         
703                                                   
704 ->access() is called when get_user_pages() fai    
705 access_process_vm(), typically used to debug a    
706 /proc/pid/mem or ptrace.  This function is nee    
707 VM_IO | VM_PFNMAP VMAs.                           
708                                                   
709 ----------------------------------------------    
710                                                   
711                         Dubious stuff             
712                                                   
713 (if you break something or notice that it is b    
714 - at least put it here)                           
                                                      
