TOMOYO Linux Cross Reference
Linux/Documentation/filesystems/fuse.rst


Differences between /Documentation/filesystems/fuse.rst (Version linux-6.12-rc7) and /Documentation/filesystems/fuse.rst (Version policy-sample)


  1 .. SPDX-License-Identifier: GPL-2.0               
  2                                                   
  3 ====                                              
  4 FUSE                                              
  5 ====                                              
  6                                                   
  7 Definitions                                       
  8 ===========                                       
  9                                                   
 10 Userspace filesystem:                             
 11   A filesystem in which data and metadata are     
 12   userspace process.  The filesystem can be ac    
 13   the kernel interface.                           
 14                                                   
 15 Filesystem daemon:                                
 16   The process(es) providing the data and metad    
 17                                                   
 18 Non-privileged mount (or user mount):             
 19   A userspace filesystem mounted by a non-priv    
 20   The filesystem daemon is running with the pr    
 21   user.  NOTE: this is not the same as mounts     
 22   option in /etc/fstab, which is not discussed    
 23                                                   
 24 Filesystem connection:                            
 25   A connection between the filesystem daemon a    
 26   connection exists until either the daemon di    
 27   umounted.  Note that detaching (or lazy umou    
 28   does *not* break the connection, in this cas    
 29   the last reference to the filesystem is rele    
 30                                                   
 31 Mount owner:                                      
 32   The user who does the mounting.                 
 33                                                   
 34 User:                                             
 35   The user who is performing filesystem operat    
 36                                                   
 37 What is FUSE?                                     
 38 =============                                     
 39                                                   
 40 FUSE is a userspace filesystem framework.  It     
 41 module (fuse.ko), a userspace library (libfuse    
 42 (fusermount).                                     
 43                                                   
 44 One of the most important features of FUSE is     
 45 non-privileged mounts.  This opens up new poss    
 46 filesystems.  A good example is sshfs: a secur    
 47 using the sftp protocol.                          
 48                                                   
 49 The userspace library and utilities are availa    
 50 `FUSE homepage: <https://github.com/libfuse/>`    
 51                                                   
 52 Filesystem type                                   
 53 ===============                                   
 54                                                   
 55 The filesystem type given to mount(2) can be o    
 56                                                   
 57     fuse                                          
 58       This is the usual way to mount a FUSE fi    
 59       argument of the mount system call may co    
 60       which is not interpreted by the kernel.     
 61                                                   
 62     fuseblk                                       
 63       The filesystem is block device based.  T    
 64       mount system call is interpreted as the     
 65                                                   
 66 Mount options                                     
 67 =============                                     
 68                                                   
 69 fd=N                                              
 70   The file descriptor to use for communication    
 71   filesystem and the kernel.  The file descrip    
 72   obtained by opening the FUSE device ('/dev/f    
 73                                                   
 74 rootmode=M                                        
 75   The file mode of the filesystem's root in oc    
 76                                                   
 77 user_id=N                                         
 78   The numeric user id of the mount owner.         
 79                                                   
 80 group_id=N                                        
 81   The numeric group id of the mount owner.        
 82                                                   
 83 default_permissions                               
 84   By default FUSE doesn't check file access pe    
 85   filesystem is free to implement its access p    
 86   the underlying file access mechanism (e.g. i    
 87   filesystems).  This option enables permissio    
 88   access based on file mode.  It is usually us    
 89   'allow_other' mount option.                     
 90                                                   
 91 allow_other                                       
 92   This option overrides the security measure r    
 93   to the user mounting the filesystem.  This o    
 94   allowed to root, but this restriction can be    
 95   (userspace) configuration option.               
 96                                                   
 97 max_read=N                                        
 98   With this option the maximum size of read op    
 99   The default is infinite.  Note that the size    
100   limited anyway to 32 pages (which is 128kbyt    
101                                                   
102 blksize=N                                         
103   Set the block size for the filesystem.  The     
104   option is only valid for 'fuseblk' type moun    
105                                                   
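The options listed above reach the kernel as one comma-separated string in the data argument of mount(2). A minimal sketch of how a mounting helper might assemble that string follows. The function name `build_fuse_mount_options` and its keyword arguments are hypothetical; only option names from the list above are used, and `rootmode` is formatted in octal as its description requires.

```python
import stat


def build_fuse_mount_options(fd, rootmode, user_id, group_id,
                             default_permissions=False, allow_other=False,
                             max_read=None, blksize=None):
    """Assemble a comma-separated FUSE option string (illustrative helper)."""
    opts = [
        f"fd={fd}",                # descriptor obtained by opening the FUSE device
        f"rootmode={rootmode:o}",  # root file mode, octal representation
        f"user_id={user_id}",      # numeric user id of the mount owner
        f"group_id={group_id}",    # numeric group id of the mount owner
    ]
    if default_permissions:
        opts.append("default_permissions")
    if allow_other:
        opts.append("allow_other")
    if max_read is not None:
        opts.append(f"max_read={max_read}")
    if blksize is not None:
        # Only meaningful for 'fuseblk' type mounts.
        opts.append(f"blksize={blksize}")
    return ",".join(opts)


if __name__ == "__main__":
    # Example values are arbitrary; fd 3 and uid/gid 1000 are assumptions.
    print(build_fuse_mount_options(fd=3, rootmode=stat.S_IFDIR,
                                   user_id=1000, group_id=1000))
```

Keeping the string construction in one place makes it easy to see which options a helper passes on to the kernel and which it adds on its own.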
106 Control filesystem                                
107 ==================                                
108                                                   
109 There's a control filesystem for FUSE, which c    
110                                                   
111   mount -t fusectl none /sys/fs/fuse/connectio    
112                                                   
113 Mounting it under the '/sys/fs/fuse/connection    
114 backwards compatible with earlier versions.       
115                                                   
116 Under the fuse control filesystem each connect    
117 named by a unique number.                         
118                                                   
119 For each connection the following files exist     
120                                                   
121         waiting                                   
122           The number of requests which are wai    
123           userspace or being processed by the     
124           no filesystem activity and 'waiting'    
125           filesystem is hung or deadlocked.       
126                                                   
127         abort                                     
128           Writing anything into this file will    
129           connection.  This means that all wai    
130           error returned for all aborted and n    
131                                                   
132 Only the owner of the mount may read or write     
133                                                   
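The two per-connection files described above can be used from a small script. The sketch below is illustrative only: the helper names are hypothetical, and the connection directory (the numbered directory under the control filesystem mount point) is passed in by the caller instead of being hard-coded. The caller must be the owner of the mount for these operations to be permitted.

```python
import os


def read_waiting(conn_dir):
    """Return the request count stored in the connection's 'waiting' file."""
    with open(os.path.join(conn_dir, "waiting")) as f:
        return int(f.read().strip())


def abort_connection(conn_dir):
    """Abort the connection; the content written does not matter."""
    with open(os.path.join(conn_dir, "abort"), "w") as f:
        f.write("1\n")


def abort_if_stuck(conn_dir, idle):
    """Abort when the caller saw no filesystem activity but requests still wait.

    'idle' is the caller's own judgement that the filesystem shows no
    activity; this sketch has no way to measure that by itself.
    """
    if idle and read_waiting(conn_dir) > 0:
        abort_connection(conn_dir)
        return True
    return False
```

A non-zero `waiting` value alone does not indicate a hang, which is why the decision about inactivity is left to the caller.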
134 Interrupting filesystem operations                
135 ==================================
136                                                   
137 If a process issuing a FUSE filesystem request    
138 following will happen:                            
139                                                   
140   -  If the request is not yet sent to userspa    
141      fatal (SIGKILL or unhandled fatal signal)    
142      dequeued and returns immediately.            
143                                                   
144   -  If the request is not yet sent to userspa    
145      fatal, then an interrupted flag is set fo    
146      the request has been successfully transfe    
147      this flag is set, an INTERRUPT request is    
148                                                   
149   -  If the request is already sent to userspa    
150      request is queued.                           
151                                                   
152 INTERRUPT requests take precedence over other     
153 userspace filesystem will receive queued INTER    
154                                                   
155 The userspace filesystem may ignore the INTERR    
156 or may honor them by sending a reply to the *o    
157 the error set to EINTR.                           
158                                                   
159 It is also possible that there's a race betwee    
160 original request and its INTERRUPT request.  T    
161                                                   
162   1. The INTERRUPT request is processed before    
163      processed                                    
164                                                   
165   2. The INTERRUPT request is processed after     
166      been answered                                
167                                                   
168 If the filesystem cannot find the original req    
169 some timeout and/or a number of new requests t    
170 should reply to the INTERRUPT request with an     
171 1) the INTERRUPT request will be requeued.  In    
172 reply will be ignored.                            
173                                                   
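The daemon-side bookkeeping for INTERRUPT requests can be modelled without touching the kernel interface. The sketch below is a toy model, not libfuse code: the class and method names are hypothetical, requests are identified by an opaque `unique` value, and the waiting period before giving up on a missing original request is left out. The error code for the "original not found" case defaults to EAGAIN as an assumption and is configurable.

```python
import errno


class InterruptTracker:
    """Toy model of how a filesystem daemon might track interruptible requests."""

    def __init__(self, retry_errno=errno.EAGAIN):
        # retry_errno is an assumption of this sketch, see the lead-in.
        self.in_flight = set()
        self.retry_errno = retry_errno

    def start(self, unique):
        """Record that processing of a request has begun."""
        self.in_flight.add(unique)

    def finish(self, unique):
        """Record that a request has been answered normally."""
        self.in_flight.discard(unique)

    def handle_interrupt(self, target_unique):
        """Decide which request to answer and with which error.

        Returns ("original", EINTR) when the interrupted request is still
        being processed, otherwise ("interrupt", retry_errno) so that the
        INTERRUPT itself is answered.
        """
        if target_unique in self.in_flight:
            self.in_flight.discard(target_unique)
            return ("original", errno.EINTR)
        return ("interrupt", self.retry_errno)
```

In the second branch the model cannot tell the two race cases apart: either the original request has not been seen yet, or it has already been answered. Replying to the INTERRUPT request is correct in both, since the kernel either requeues the INTERRUPT or ignores the reply.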
174 Aborting a filesystem connection                  
175 ================================                  
176                                                   
177 It is possible to get into certain situations     
178 not responding.  Reasons for this may be:         
179                                                   
180   a) Broken userspace filesystem implementatio    
181                                                   
182   b) Network connection down                      
183                                                   
184   c) Accidental deadlock                          
185                                                   
186   d) Malicious deadlock                           
187                                                   
188 (For more on c) and d) see later sections)        
189                                                   
190 In any of these cases it may be useful to a
191 the filesystem.  There are several ways to do     
192                                                   
193   - Kill the filesystem daemon.  Works in case    
194                                                   
195   - Kill the filesystem daemon and all users o    
196     in all cases except some malicious deadloc    
197                                                   
198   - Use forced umount (umount -f).  Works in a    
199     filesystem is still attached (it hasn't be    
200                                                   
201   - Abort filesystem through the FUSE control     
202     powerful method, always works.                
203                                                   
204 How do non-privileged mounts work?                
205 ==================================                
206                                                   
207 Since the mount() system call is a privileged     
208 program (fusermount) is needed, which is insta    
209                                                   
210 The implication of providing non-privileged mo    
211 owner must not be able to use this capability     
212 system.  Obvious requirements arising from thi    
213                                                   
214  A) mount owner should not be able to get elev    
215     help of the mounted filesystem                
216                                                   
217  B) mount owner should not get illegitimate ac    
218     other users' and the super user's processe    
219                                                   
220  C) mount owner should not be able to induce u    
221     other users' or the super user's processes    
222                                                   
223 How are requirements fulfilled?                   
224 ===============================                   
225                                                   
226  A) The mount owner could gain elevated privil    
227                                                   
228     1. creating a filesystem containing a devi    
229                                                   
230     2. creating a filesystem containing a suid    
231                                                   
232     The solution is not to allow opening devic    
233     setuid and setgid bits when executing prog    
234     fusermount always adds "nosuid" and "nodev    
235     for non-privileged mounts.                    
236                                                   
237  B) If another user is accessing files or dire    
238     filesystem, the filesystem daemon serving     
239     exact sequence and timing of operations pe    
240     information is otherwise inaccessible to t    
241     counts as an information leak.                
242                                                   
243     The solution to this problem will be prese    
244                                                   
245  C) There are several ways in which the mount     
246     undesired behavior in other users' process    
247                                                   
248      1) mounting a filesystem over a file or d    
249         owner could otherwise not be able to m    
250         make limited modifications).              
251                                                   
252         This is solved in fusermount, by check    
253         permissions on the mountpoint and only    
254         the mount owner can do unlimited modif    
255         access to the mountpoint, and mountpoi    
256         directory)                                
257                                                   
258      2) Even if 1) is solved the mount owner c    
259         of other users' processes.                
260                                                   
261          i) It can slow down or indefinitely d    
262             filesystem operation creating a Do    
263             whole system.  For example a suid     
264             system file, and then accessing a     
265             filesystem could be stopped, and t    
266             file to be locked forever.            
267                                                   
268          ii) It can present files or directori    
269              directory structures of unlimited    
270              system process to eat up diskspac    
271              resources, again causing *DoS*.      
272                                                   
273         The solution to this as well as B) is     
274         to access the filesystem, which could     
275         monitored or manipulated by the mount     
276         mount owner can ptrace a process, it c    
277         without using a FUSE mount, the same c    
278         ptrace can be used to check if a proce    
279         the filesystem or not.                    
280                                                   
281         Note that the *ptrace* check is not st    
282         prevent C/2/i, it is enough to check i    
283         privilege to send signal to the proces    
284         filesystem, since *SIGSTOP* can be use    
285                                                   
286 I think these limitations are unacceptable?       
287 ===========================================       
288                                                   
289 If a sysadmin trusts the users enough, or can     
290 measures, that system processes will never ent    
291 mounts, it can relax the last limitation in se    
292                                                   
293   - With the 'user_allow_other' config option.    
294     set, the mounting user can add the 'allow_    
295     disables the check for other users' proces    
296                                                   
297     User namespaces have an unintuitive intera    
298     an unprivileged user - normally restricted    
299     'allow_other' - could do so in a user name    
300     privileged. If any process could access su    
301     this would give the mounting user the abil    
302     processes in user namespaces where they're    
303     reason 'allow_other' restricts access to u    
304     or a descendant.                              
305                                                   
306   - With the 'allow_sys_admin_access' module o    
307     set, super user's processes have unrestric    
308     irrespective of allow_other setting or use    
309     mounting user.                                
310                                                   
311 Note that both of these relaxations expose the    
312 information leak or *DoS* as described in poin    
313 preceding section.                                
314                                                   
315 Kernel - userspace interface                      
316 ============================                      
317                                                   
318 The following diagram shows how a filesystem o    
319 example unlink) is performed in FUSE. ::          
320                                                   
321                                                   
322  |  "rm /mnt/fuse/file"               |  FUSE     
323  |                                    |           
324  |                                    |  >sys_    
325  |                                    |    >fu    
326  |                                    |      >    
327  |                                    |           
328  |                                    |           
329  |  >sys_unlink()                     |           
330  |    >fuse_unlink()                  |           
331  |      [get request from             |           
332  |       fc->unused_list]             |           
333  |      >request_send()               |           
334  |        [queue req on fc->pending]  |           
335  |        [wake up fc->waitq]         |           
336  |        >request_wait_answer()      |           
337  |          [sleep on req->waitq]     |           
338  |                                    |      <    
339  |                                    |      [    
340  |                                    |      [    
341  |                                    |      [    
342  |                                    |    <fu    
343  |                                    |  <sys_    
344  |                                    |           
345  |                                    |  [perf    
346  |                                    |           
347  |                                    |  >sys_    
348  |                                    |    >fu    
349  |                                    |      [    
350  |                                    |      [    
351  |                                    |      [    
352  |          [woken up]                |      [    
353  |                                    |    <fu    
354  |                                    |  <sys_    
355  |        <request_wait_answer()      |           
356  |      <request_send()               |           
357  |      [add request to               |           
358  |       fc->unused_list]             |           
359  |    <fuse_unlink()                  |           
360  |  <sys_unlink()                     |           
361                                                   
362 .. note:: Everything in the description above     
363                                                   
364 There are a couple of ways in which to deadloc    
365 Since we are talking about unprivileged usersp    
366 something must be done about these.               
367                                                   
368 **Scenario 1 -  Simple deadlock**::               
369                                                   
370  |  "rm /mnt/fuse/file"               |  FUSE     
371  |                                    |           
372  |  >sys_unlink("/mnt/fuse/file")     |           
373  |    [acquire inode semaphore        |           
374  |     for "file"]                    |           
375  |    >fuse_unlink()                  |           
376  |      [sleep on req->waitq]         |           
377  |                                    |  <sys_    
378  |                                    |  >sys_    
379  |                                    |    [ac    
380  |                                    |     fo    
381  |                                    |    *DE    
382                                                   
383 The solution for this is to allow the filesyst    
384                                                   
385 **Scenario 2 - Tricky deadlock**                  
386                                                   
387                                                   
388 This one needs a carefully crafted filesystem.    
389 the above, only the call back to the filesyste    
390 but is caused by a pagefault. ::                  
391                                                   
392  |  Kamikaze filesystem thread 1      |  Kamik    
393  |                                    |           
394  |  [fd = open("/mnt/fuse/file")]     |  [requ    
395  |  [mmap fd to 'addr']               |           
396  |  [close fd]                        |  [FLUS    
397  |  [read a byte from addr]           |           
398  |    >do_page_fault()                |           
399  |      [find or create page]         |           
400  |      [lock page]                   |           
401  |      >fuse_readpage()              |           
402  |         [queue READ request]       |           
403  |         [sleep on req->waitq]      |           
404  |                                    |  [read    
405  |                                    |  [crea    
406  |                                    |  >sys_    
407  |                                    |    >fu    
408  |                                    |      [    
409  |                                    |      [    
410  |                                    |      [    
411  |                                    |           
412  |                                    |           
413  |                                    |           
414  |                                    |           
415                                                   
416 The solution is basically the same as above.      
417                                                   
418 An additional problem is that while the write     
419 to the request, the request must not be interr    
420 because the destination address of the copy ma    
421 request has returned.                             
422                                                   
423 This is solved by doing the copy atomically,
424 while the page(s) belonging to the write buffe    
425 get_user_pages().  The 'req->locked' flag indi    
426 taking place, and abort is delayed until this     
                                                      
