  1 .. SPDX-License-Identifier: GPL-2.0               
  2                                                   
  3 ===========================================       
  4 Userspace block device driver (ublk driver)       
  5 ===========================================       
  6                                                   
  7 Overview                                          
  8 ========                                          
  9                                                   
 10 ublk is a generic framework for implementing b    
 11 The motivation behind it is that moving virtua    
 12 such as loop, nbd and similar can be very help    
 13 new virtual block device such as ublk-qcow2 (t    
 14 implementing qcow2 driver in kernel).             
 15                                                   
 16 Userspace block devices are attractive because    
 17                                                   
 18 - They can be written in many programming languag
 19 - They can use libraries that are not availabl    
 20 - They can be debugged with tools familiar to     
 21 - Crashes do not cause a kernel panic on the machine.
 22 - Bugs are likely to have a lower security imp    
 23   code.                                           
 24 - They can be installed and updated independen    
 25 - They can be used to simulate block device ea    
 26   parameters/settings for test/debug purposes.
 27                                                   
 28 ublk block device (``/dev/ublkb*``) is added b    
 29 on the device will be forwarded to ublk usersp    
 30 in this document, ``ublk server`` refers to ge    
 31 program. ``ublksrv`` [#userspace]_ is one of s    
 32 provides ``libublksrv`` [#userspace_lib]_ libr    
 33 user block device conveniently, while also gen    
 34 included, such as loop and null. Richard W.M.     
 35 ``nbdublk`` [#userspace_nbdublk]_  based on ``    
 36                                                   
 37 After the IO is handled by userspace, the resu    
 38 driver, thus completing the request cycle. Thi    
 39 logic is totally done by userspace, such as lo    
 40 communication, or qcow2's IO mapping.             
 41                                                   
 42 ``/dev/ublkb*`` is driven by blk-mq request-ba    
 43 assigned one queue-wide unique tag. ublk se
 44 IO too, which is 1:1 mapped with IO of ``/dev/    
 45                                                   
 46 Both the IO request forward and IO handling re    
 47 ``io_uring`` passthrough command; that is why     
 48 block driver. It has been observed that using     
 49 give better IOPS than block IO; which is why u    
 50 implementation of userspace block device: not     
 51 done by io_uring, but also the preferred IO ha    
 52 based approach too.                               
 53                                                   
 54 ublk provides a control interface to set/get ubl
 55 The interface is extensible and kABI compatibl
 56 queue's parameter or ublk generic feature para    
 57 interface. Thus, ublk is a generic userspace blo
 58 For example, it is easy to set up a ublk device
 59 parameters from userspace.                        
 60                                                   
 61 Using ublk                                        
 62 ==========                                        
 63                                                   
 64 ublk requires a userspace ublk server to handle
 65
 66 Below is an example of using ``ublksrv`` to provi
 67                                                   
 68 - add a device::                                  
 69                                                   
 70      ublk add -t loop -f ublk-loop.img            
 71                                                   
 72 - format with xfs, then use it::                  
 73                                                   
 74      mkfs.xfs /dev/ublkb0                         
 75      mount /dev/ublkb0 /mnt                       
 76      # do anything. all IOs are handled by io_    
 77      ...                                          
 78      umount /mnt                                  
 79                                                   
 80 - list the devices with their info::              
 81                                                   
 82      ublk list                                    
 83                                                   
 84 - delete the device::                             
 85                                                   
 86      ublk del -a                                  
 87      ublk del -n $ublk_dev_id                     
 88                                                   
 89 See usage details in README of ``ublksrv`` [#u    
 90                                                   
 91 Design                                            
 92 ======                                            
 93                                                   
 94 Control plane                                     
 95 -------------                                     
 96                                                   
 97 The ublk driver provides a global misc device node (
 98 managing and controlling ublk devices with hel    
 99                                                   
100 - ``UBLK_CMD_ADD_DEV``                            
101                                                   
102   Add a ublk char device (``/dev/ublkc*``) whi    
103   WRT IO command communication. Basic device i    
104   command. It sets UAPI structure of ``ublksrv    
105   such as ``nr_hw_queues``, ``queue_depth``, a    
106   for which the info is negotiated with the dr    
107   When this command is completed, the basic de    
108                                                   
109 - ``UBLK_CMD_SET_PARAMS`` / ``UBLK_CMD_GET_PAR    
110                                                   
111   Set or get parameters of the device, which c    
112   related, or request queue limit related, but    
113   because the driver does not handle any IO lo    
114   sent before sending ``UBLK_CMD_START_DEV``.     
115                                                   
116 - ``UBLK_CMD_START_DEV``                          
117                                                   
118   After the server prepares userspace resource    
119   pthread & io_uring for handling ublk IO), th    
120   driver for allocating & exposing ``/dev/ublk    
121   ``UBLK_CMD_SET_PARAMS`` are applied for crea    
122                                                   
123 - ``UBLK_CMD_STOP_DEV``                           
124                                                   
125   Halt IO on ``/dev/ublkb*`` and remove the de    
126   ublk server will release resources (such as     
127   io_uring).                                      
128                                                   
129 - ``UBLK_CMD_DEL_DEV``                            
130                                                   
131   Remove ``/dev/ublkc*``. When this command re    
132   number can be reused.                           
133                                                   
134 - ``UBLK_CMD_GET_QUEUE_AFFINITY``                 
135                                                   
136   When ``/dev/ublkc*`` is added, the driver cre
137   that each queue's affinity info is available    
138   ``UBLK_CMD_GET_QUEUE_AFFINITY`` to retrieve     
139   set up the per-queue context efficiently, su    
140   pthread and try to allocate buffers in IO th    
141                                                   
142 - ``UBLK_CMD_GET_DEV_INFO``                       
143                                                   
144   For retrieving device info via ``ublksrv_ctr    
145   responsibility to save IO target specific in    
146                                                   
147 - ``UBLK_CMD_GET_DEV_INFO2``

148   Same purpose as ``UBLK_CMD_GET_DEV_INFO``,
149   provide path of the char device of ``/dev/ub    
150   permission check, and this command is added     
151   ublk device, and introduced with ``UBLK_F_UN    
152   Only the user owning the requested device ca    
153                                                   
154   How to deal with userspace/kernel compatibil    
155                                                   
156   1) if kernel is capable of handling ``UBLK_F    
157                                                   
158     If ublk server supports ``UBLK_F_UNPRIVILE    
159                                                   
160     ublk server should send ``UBLK_CMD_GET_DEV    
161     unprivileged application needs to query de    
162     when the application has no idea if ``UBLK    
163     given the capability info is stateless, an    
164     retrieve it via ``UBLK_CMD_GET_DEV_INFO2``    
165                                                   
166     If ublk server doesn't support ``UBLK_F_UN    
167                                                   
168     ``UBLK_CMD_GET_DEV_INFO`` is always sent t    
169     ``UBLK_F_UNPRIVILEGED_DEV`` isn't available fo
170                                                   
171   2) if kernel isn't capable of handling ``UBL    
172                                                   
173     If ublk server supports ``UBLK_F_UNPRIVILE    
174                                                   
175     ``UBLK_CMD_GET_DEV_INFO2`` is tried first,    
176     ``UBLK_CMD_GET_DEV_INFO`` needs to be retr    
177     ``UBLK_F_UNPRIVILEGED_DEV`` can't be set      
178                                                   
179     If ublk server doesn't support ``UBLK_F_UN    
180                                                   
181     ``UBLK_CMD_GET_DEV_INFO`` is always sent t    
182     ``UBLK_F_UNPRIVILEGED_DEV`` isn't availabl    
183                                                   
184 - ``UBLK_CMD_START_USER_RECOVERY``                
185                                                   
186   This command is valid if ``UBLK_F_USER_RECOV    
187   command is accepted after the old process ha    
188   and ``/dev/ublkc*`` is released. User should    
189   a new process which re-opens ``/dev/ublkc*``    
190   ublk device is ready for the new process.       
191                                                   
192 - ``UBLK_CMD_END_USER_RECOVERY``                  
193                                                   
194   This command is valid if ``UBLK_F_USER_RECOV    
195   command is accepted after ublk device is qui    
196   opened ``/dev/ublkc*`` and get all ublk queu    
197   returns, ublk device is unquiesced and new I    
198   new process.                                    
199                                                   
200 - user recovery feature description               
201                                                   
202   Two new features are added for user recovery    
203   ``UBLK_F_USER_RECOVERY_REISSUE``.               
204                                                   
205   With ``UBLK_F_USER_RECOVERY`` set, after one    
206   handler) is dying, ublk does not delete ``/d    
207   recovery stage and ublk device ID is kept. I    
208   responsibility to recover the device context    
209   Requests which have not been issued to users    
210   which have been issued to userspace are abor    
211                                                   
212   With ``UBLK_F_USER_RECOVERY_REISSUE`` set, a    
213   server's io handler) is dying, contrary to `    
214   requests which have been issued to userspace    
215   re-issued to the new process after handling     
216   ``UBLK_F_USER_RECOVERY_REISSUE`` is designed    
217   double-write since the driver may issue the     
218   might be useful to a read-only FS or a VM ba    
219                                                   
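The ordering constraints among the control commands above can be summarised
with a small state model. The sketch below is only an illustration in Python:
it never talks to the control device node, and the state names (``absent``,
``added``, ``live``), the class ``ToyUblkDevice`` and the example payload key
``logical_bs_shift`` are assumptions made for this example. The real driver
may accept commands in more states than this simplified table does.

```python
# Toy model of the control-command ordering described above.
# Hypothetical names throughout; nothing here reaches the kernel.

ALLOWED = {
    # command: (states in which the model accepts it, resulting state)
    "UBLK_CMD_ADD_DEV": ({"absent"}, "added"),
    "UBLK_CMD_SET_PARAMS": ({"added"}, "added"),   # before START_DEV
    "UBLK_CMD_START_DEV": ({"added"}, "live"),     # block device is exposed
    "UBLK_CMD_STOP_DEV": ({"live"}, "added"),      # block device is removed
    "UBLK_CMD_DEL_DEV": ({"added"}, "absent"),     # char device is removed
}


class ToyUblkDevice:
    """Tracks which control commands the simplified model would accept."""

    def __init__(self):
        self.state = "absent"
        self.params = {}

    def send(self, cmd, **payload):
        accepted_in, next_state = ALLOWED[cmd]
        if self.state not in accepted_in:
            raise RuntimeError(f"{cmd} not accepted in state {self.state!r}")
        if cmd == "UBLK_CMD_SET_PARAMS":
            self.params.update(payload)
        self.state = next_state
        return self.state


def run_lifecycle():
    """Walk one device through add, configure, start, stop and delete."""
    dev = ToyUblkDevice()
    return [
        dev.send("UBLK_CMD_ADD_DEV"),
        dev.send("UBLK_CMD_SET_PARAMS", logical_bs_shift=9),
        dev.send("UBLK_CMD_START_DEV"),
        dev.send("UBLK_CMD_STOP_DEV"),
        dev.send("UBLK_CMD_DEL_DEV"),
    ]
```

Calling ``run_lifecycle()`` returns the state reached after each command, and
sending ``UBLK_CMD_START_DEV`` to a fresh ``ToyUblkDevice`` raises
``RuntimeError`` because no device has been added yet.
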
220 Unprivileged ublk device is supported by passi    
221 Once the flag is set, all control commands can    
222 user. Except for the command ``UBLK_CMD_ADD_DEV
223 the specified char device (``/dev/ublkc*``) is
224 commands by ublk driver, for doing that, path     
225 be provided in these commands' payload from ub    
226 ublk device becomes container-aware, and device
227 can be controlled/accessed just inside this co    
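
The compatibility rules given earlier for ``UBLK_CMD_GET_DEV_INFO2`` amount to
a try-then-fall-back pattern on the server side. The following Python sketch
illustrates that pattern only. The callable ``send_cmd``, the helper
``make_fake_kernel``, the returned dictionary and the use of ``EOPNOTSUPP`` as
the failure code are all assumptions made for the example, not the driver's
documented behaviour.

```python
import errno


def query_dev_info(send_cmd):
    """Try GET_DEV_INFO2 first and retry with GET_DEV_INFO on failure.

    `send_cmd` is a hypothetical callable that takes a command name and
    either returns a dict of device info or raises OSError.
    """
    try:
        return "UBLK_CMD_GET_DEV_INFO2", send_cmd("UBLK_CMD_GET_DEV_INFO2")
    except OSError:
        # Older kernel: unprivileged devices cannot be used, fall back.
        return "UBLK_CMD_GET_DEV_INFO", send_cmd("UBLK_CMD_GET_DEV_INFO")


def make_fake_kernel(supports_info2):
    """Build a stand-in for the control device (illustration only)."""

    def send_cmd(cmd):
        if cmd == "UBLK_CMD_GET_DEV_INFO2" and not supports_info2:
            # Assumed error code; the real driver may report another one.
            raise OSError(errno.EOPNOTSUPP, "unknown control command")
        return {"dev_id": 0, "unprivileged_capable": supports_info2}

    return send_cmd
```

With ``make_fake_kernel(True)`` the first command succeeds, while with
``make_fake_kernel(False)`` the helper reports that it had to retry with
``UBLK_CMD_GET_DEV_INFO``.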
228                                                   
229 Data plane                                        
230 ----------                                        
231                                                   
232 ublk server needs to create per-queue IO pthre    
233 commands via io_uring passthrough. The per-que    
234 focuses on IO handling and shouldn't handle an    
235 tasks.                                            
236                                                   
237 Each IO is assigned a unique tag, which is
238 request of ``/dev/ublkb*``.                       
239                                                   
240 UAPI structure of ``ublksrv_io_desc`` is defin    
241 the driver. A fixed mmapped area (array) on ``    
242 exporting IO info to the server; such as IO of    
243 buffer address. Each ``ublksrv_io_desc`` insta    
244 and IO tag directly.                              
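
Direct indexing into a fixed descriptor array can be pictured with a few lines
of Python. This is a sketch under stated assumptions: the 24-byte record
layout (``op_flags``, ``nr_sectors``, ``start_sector``, ``addr``) and the
``slots_per_queue`` parameter are chosen for illustration and should be
checked against the UAPI header, and a plain ``bytearray`` stands in for the
mmapped area.

```python
import struct

# Assumed record layout, for illustration only:
# u32 op_flags, u32 nr_sectors, u64 start_sector, u64 addr (little-endian).
IO_DESC = struct.Struct("<IIQQ")


def desc_offset(q_id, tag, slots_per_queue):
    """Byte offset of the descriptor for (q_id, tag) in a flat array."""
    if not 0 <= tag < slots_per_queue:
        raise IndexError("tag out of range for this queue")
    return (q_id * slots_per_queue + tag) * IO_DESC.size


def write_desc(area, q_id, tag, slots_per_queue, desc):
    """Store one descriptor; plays the role of the driver filling the area."""
    IO_DESC.pack_into(
        area,
        desc_offset(q_id, tag, slots_per_queue),
        desc["op_flags"],
        desc["nr_sectors"],
        desc["start_sector"],
        desc["addr"],
    )


def read_desc(area, q_id, tag, slots_per_queue):
    """Load one descriptor; plays the role of the server reading the area."""
    fields = IO_DESC.unpack_from(area, desc_offset(q_id, tag, slots_per_queue))
    return dict(zip(("op_flags", "nr_sectors", "start_sector", "addr"), fields))
```

Because the offset depends only on the queue id and the tag, the server needs
no lookup table: it can read the descriptor for a completed command straight
from the tag carried in the notification.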
245                                                   
246 The following IO commands are communicated via    
247 and each command is only for forwarding the IO    
248 with specified IO tag in the command data:        
249                                                   
250 - ``UBLK_IO_FETCH_REQ``                           
251                                                   
252   Sent from the server IO pthread for fetching    
253   destined to ``/dev/ublkb*``. This command is    
254   IO pthread for ublk driver to setup IO forwa    
255                                                   
256 - ``UBLK_IO_COMMIT_AND_FETCH_REQ``                
257                                                   
258   When an IO request is destined to ``/dev/ubl    
259   the IO's ``ublksrv_io_desc`` to the specifie    
260   previously received IO command of this IO tag
261   or ``UBLK_IO_COMMIT_AND_FETCH_REQ``) is comp
262   the IO notification via io_uring.               
263                                                   
264   After the server handles the IO, its result     
265   driver by sending ``UBLK_IO_COMMIT_AND_FETCH    
266   received this command, it parses the result     
267   ``/dev/ublkb*``. In the meantime, it sets up the envir
268   requests with the same IO tag. That is, ``UB    
269   is reused for both fetching request and comm    
270                                                   
271 - ``UBLK_IO_NEED_GET_DATA``                       
272                                                   
273   With ``UBLK_F_NEED_GET_DATA`` enabled, the W    
274   issued to ublk server without data copy. The    
275   receives the request and it can allocate dat    
276   inside this new io command. After the kernel    
277   data copy is done from request pages to this    
278   backend receives the request again with data    
279   truly handle the request.                       
280                                                   
281   ``UBLK_IO_NEED_GET_DATA`` adds one additiona    
282   ``io_uring_enter()`` syscall. Any user who thinks th
283   should not enable ``UBLK_F_NEED_GET_DATA``. ublk
284   buffer for each IO by default. Any new proje    
285   buffer to communicate with ublk driver. Howe    
286   break or be unable to consume the new buffer
287   command is added for backwards compatibility    
288   can still consume existing buffers.             
289                                                   
290 - data copy between ublk server IO buffer and     
291                                                   
292   The driver needs to copy the block IO reques    
293   (pages) first for WRITE before notifying the    
294   that the server can handle WRITE request.       
295                                                   
296   When the server handles READ request and sen    
297   ``UBLK_IO_COMMIT_AND_FETCH_REQ`` to the serv    
298   the server buffer (pages) read to the IO req    
299                                                   
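The per-tag cycle formed by ``UBLK_IO_FETCH_REQ`` and
``UBLK_IO_COMMIT_AND_FETCH_REQ``, together with the copy direction for WRITE
and READ, can be followed in a toy model. The Python sketch below is not
driver code: the class ``ToyQueue``, its attributes and the ``dispatch``
method are hypothetical names, and real commands travel through io_uring
rather than plain method calls.

```python
class ToyQueue:
    """Simplified model of one ublk queue and its per-tag IO slots."""

    def __init__(self, depth):
        self.armed = [False] * depth     # a fetch command is pending in the driver
        self.inflight = [None] * depth   # block request currently owned by the server
        self.server_buf = [b""] * depth  # per-tag server IO buffer
        self.completed = []              # (tag, result, data) of finished requests

    # Server side -----------------------------------------------------
    def fetch_req(self, tag):
        """UBLK_IO_FETCH_REQ: sent once per tag to arm forwarding."""
        if self.armed[tag] or self.inflight[tag] is not None:
            raise RuntimeError("tag already armed or busy")
        self.armed[tag] = True

    def commit_and_fetch_req(self, tag, result):
        """UBLK_IO_COMMIT_AND_FETCH_REQ: commit a result and re-arm the tag."""
        req = self.inflight[tag]
        if req is None:
            raise RuntimeError("nothing to commit for this tag")
        if req["op"] == "READ":
            # READ: copy from the server buffer into the request pages.
            req["data"] = self.server_buf[tag]
        self.completed.append((tag, result, req["data"]))
        self.inflight[tag] = None
        self.armed[tag] = True

    # Driver side -----------------------------------------------------
    def dispatch(self, tag, op, data=b""):
        """A block request arrives and completes the pending command."""
        if not self.armed[tag]:
            raise RuntimeError("no pending fetch command for this tag")
        if op == "WRITE":
            # WRITE: copy the request pages into the server buffer first.
            self.server_buf[tag] = data
        self.armed[tag] = False
        self.inflight[tag] = {"op": op, "data": data}
        return {"tag": tag, "op": op}


def demo():
    q = ToyQueue(depth=2)
    for tag in range(2):
        q.fetch_req(tag)
    q.dispatch(0, "WRITE", b"abc")
    q.commit_and_fetch_req(0, result=len(q.server_buf[0]))
    q.dispatch(0, "READ")
    q.server_buf[0] = b"xyz"  # the server fills its buffer for the READ
    q.commit_and_fetch_req(0, result=3)
    return q.completed
```

In ``demo()`` the second ``dispatch`` on tag 0 succeeds without another
``fetch_req`` call, which mirrors how the commit command is reused to fetch
the next request for the same tag.
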
300 Future development                                
301 ==================                                
302                                                   
303 Zero copy                                         
304 ---------                                         
305                                                   
306 Zero copy is a generic requirement for nbd, fu    
307 problem [#xiaoguang]_ Xiaoguang mentioned is t    
308 can't be remapped any more in kernel with exis    
309 occurs when destining direct IO to ``/dev/ublk    
310 big requests (IO size >= 256 KB) may benefit a    
311                                                   
312                                                   
313 References                                        
314 ==========                                        
315                                                   
316 .. [#userspace] https://github.com/ming1/ubdsr    
317                                                   
318 .. [#userspace_lib] https://github.com/ming1/u    
319                                                   
320 .. [#userspace_nbdublk] https://gitlab.com/rwm    
321                                                   
322 .. [#userspace_readme] https://github.com/ming    
323                                                   
324 .. [#stefan] https://lore.kernel.org/linux-blo    
325                                                   
326 .. [#xiaoguang] https://lore.kernel.org/linux-    
                                                      
