TOMOYO Linux Cross Reference
Linux/Documentation/admin-guide/device-mapper/thin-provisioning.rst

Differences between /Documentation/admin-guide/device-mapper/thin-provisioning.rst (Version linux-6.12-rc7) and /Documentation/admin-guide/device-mapper/thin-provisioning.rst (Version linux-4.12.14)


  1 =================                                 
  2 Thin provisioning                                 
  3 =================                                 
  4                                                   
  5 Introduction                                      
  6 ============                                      
  7                                                   
  8 This document describes a collection of device    
  9 between them implement thin-provisioning and s    
 10                                                   
 11 The main highlight of this implementation, com    
 12 implementation of snapshots, is that it allows    
 13 be stored on the same data volume.  This simpl    
 14 allows the sharing of data between volumes, th    
 15                                                   
 16 Another significant feature is support for an     
 17 recursive snapshots (snapshots of snapshots of    
 18 previous implementation of snapshots did this     
 19 lookup tables, and so performance was O(depth)    
 20 implementation uses a single data structure to    
 21 with depth.  Fragmentation may still be an iss    
 22 scenarios.                                        
 23                                                   
 24 Metadata is stored on a separate device from d    
 25 administrator some freedom, for example to:       
 26                                                   
 27 - Improve metadata resilience by storing metad    
 28   but data on a non-mirrored one.                 
 29                                                   
 30 - Improve performance by storing the metadata     
 31                                                   
 32 Status                                            
 33 ======                                            
 34                                                   
 35 These targets are considered safe for producti    
 36 cases will have different performance characte    
 37 to fragmentation of the data volume.              
 38                                                   
 39 If you find this software is not performing as    
 40 dm-devel@redhat.com with details and we'll try    
 41 things for you.                                   
 42                                                   
 43 Userspace tools for checking and repairing the    
 44 developed and are available as 'thin_check' an    
 45 of the package that provides these utilities v    
 46 a Red Hat distribution it is named 'device-map    
 47                                                   
 48 Cookbook                                          
 49 ========                                          
 50                                                   
 51 This section describes some quick recipes for     
 52 They use the dmsetup program to control the de    
 53 directly.  End users will be advised to use a     
 54 manager such as LVM2 once support has been add    
 55                                                   
 56 Pool device                                       
 57 -----------                                       
 58                                                   
 59 The pool device ties together the metadata vol    
 60 It maps I/O linearly to the data volume and up    
 61 two mechanisms:                                   
 62                                                   
 63 - Function calls from the thin targets            
 64                                                   
 65 - Device-mapper 'messages' from userspace whic    
 66   virtual devices amongst other things.           
 67                                                   
 68 Setting up a fresh pool device                    
 69 ------------------------------                    
 70                                                   
 71 Setting up a pool device requires a valid meta    
 72 data device.  If you do not have an existing m    
 73 make one by zeroing the first 4k to indicate e    
 74                                                   
 75     dd if=/dev/zero of=$metadata_dev bs=4096 c    
 76                                                   
 77 The amount of metadata you need will vary acco    
 78 are shared between thin devices (i.e. through     
 79 less sharing than average you'll need a larger    
 80                                                   
 81 As a guide, we suggest you calculate the numbe    
 82 metadata device as 48 * $data_dev_size / $data    
 83 to 2MB if the answer is smaller.  If you're cr    
 84 snapshots which are recording large amounts of    
 85 need to increase this.                            
 86                                                   
 87 The largest size supported is 16GB: If the dev    
 88 a warning will be issued and the excess space     
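The sizing guidance above can be sketched as a small calculation. This is only an illustration: the helper name is hypothetical, the divisor is assumed to be the data block size (the formula is cut off in this view), the result is assumed to be in bytes with both inputs in the same unit, and "2MB" and "16GB" are taken as binary units.

```python
# Hypothetical helper; not part of any device-mapper tool.
MIN_METADATA_BYTES = 2 * 1024 * 1024   # "round it up to 2MB" (assumed 2 MiB)
MAX_METADATA_BYTES = 16 * 1024 ** 3    # largest supported size (assumed 16 GiB)


def suggested_metadata_bytes(data_dev_size: int, data_block_size: int) -> int:
    """Return a suggested metadata device size in bytes.

    Both arguments must use the same unit (for example 512-byte sectors).
    """
    if data_dev_size <= 0 or data_block_size <= 0:
        raise ValueError("sizes must be positive")
    estimate = 48 * data_dev_size // data_block_size
    # Clamp to the suggested minimum and the supported maximum.
    return min(max(estimate, MIN_METADATA_BYTES), MAX_METADATA_BYTES)
```

Treat the result as a starting point; pools with many snapshots that record large amounts of change may need more.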
 89                                                   
 90 Reloading a pool table                            
 91 ----------------------                            
 92                                                   
 93 You may reload a pool's table, indeed this is     
 94 if it runs out of space.  (N.B. While specifyi    
 95 device when reloading is not forbidden at the     
 96 wrong if it does not route I/O to exactly the     
 97 previously.)                                      
 98                                                   
 99 Using an existing pool device                     
100 -----------------------------                     
101                                                   
102 ::                                                
103                                                   
104     dmsetup create pool \                         
105         --table "0 20971520 thin-pool $metadat    
106                  $data_block_size $low_water_m    
107                                                   
108 $data_block_size gives the smallest unit of di    
109 allocated at a time expressed in units of 512-    
110 $data_block_size must be between 128 (64KB) an    
111 multiple of 128 (64KB).  $data_block_size cann    
112 thin-pool is created.  People primarily intere    
113 may want to use a value such as 1024 (512KB).     
114 snapshotting may want a smaller value such as     
115 not zeroing newly-allocated data, a larger $da    
116 region of 256000 (128MB) is suggested.            
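As a rough sketch, the constraints on $data_block_size can be checked before a table is loaded. The function names are hypothetical; the upper bound of 2097152 sectors is taken from the Reference section of this document.

```python
SECTOR_BYTES = 512
MIN_BLOCK_SECTORS = 128        # 64KB
MAX_BLOCK_SECTORS = 2097152    # upper bound given in the Reference section


def is_valid_data_block_size(sectors: int) -> bool:
    """Check a data block size given in 512-byte sectors."""
    return (MIN_BLOCK_SECTORS <= sectors <= MAX_BLOCK_SECTORS
            and sectors % MIN_BLOCK_SECTORS == 0)


def block_size_bytes(sectors: int) -> int:
    """Convert a block size in sectors to bytes."""
    return sectors * SECTOR_BYTES
```

For example, 1024 sectors passes the check and corresponds to 512KB, while 129 sectors is rejected because it is not a multiple of 128.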
117                                                   
118 $low_water_mark is expressed in blocks of size    
119 free space on the data device drops below this    
120 will be triggered which a userspace daemon sho    
121 extend the pool device.  Only one such event w    
122                                                   
123 No special event is triggered if a just resume    
124 the low water mark. However, resuming a device    
125 event; a userspace daemon should verify that f    
126 water mark when handling this event.              
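The check a userspace daemon is expected to make when it handles an event might look like the following. This is a minimal sketch with a hypothetical function name; all three values are counts of data blocks, as they would be read from the pool's status line and table.

```python
def data_below_low_water(used_blocks: int, total_blocks: int,
                         low_water_mark: int) -> bool:
    """Return True if free data space has dropped below the low water mark."""
    if not 0 <= used_blocks <= total_blocks:
        raise ValueError("used_blocks must be between 0 and total_blocks")
    free_blocks = total_blocks - used_blocks
    return free_blocks < low_water_mark
```

Because a resume also generates an event, a daemon would call a check like this on every event rather than assuming that each event means the pool is nearly full.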
127                                                   
128 A low water mark for the metadata device is ma    
129 will trigger a dm event if free space on the m    
130 it.                                               
131                                                   
132 Updating on-disk metadata                         
133 -------------------------                         
134                                                   
135 On-disk metadata is committed every time a FLU    
136 If no such requests are made then commits will    
137 means the thin-provisioning target behaves lik    
138 a volatile write cache.  If power is lost you     
139 writes.  The metadata should always be consist    
140                                                   
141 If data space is exhausted the pool will eithe    
142 according to the configuration (see: error_if_    
143 space is exhausted or a metadata operation fai    
144 until the pool is taken offline and repair is     
145 potential inconsistencies and 2) clear the fla    
146 Once the pool's metadata device is repaired it    
147 will allow the pool to return to normal operat    
148 is flagged as needing repair, the pool's data     
149 cannot be resized until repair is performed.      
150 that when the pool's metadata space is exhaust    
151 transaction is aborted.  Given that the pool w    
152 completion may have already been acknowledged     
153 (e.g. filesystem) it is strongly suggested tha    
154 (e.g. fsck) be performed on those layers when     
155 required.                                         
156                                                   
157 Thin provisioning                                 
158 -----------------                                 
159                                                   
160 i) Creating a new thinly-provisioned volume.      
161                                                   
 162   To create a new thinly-provisioned volume y
163   active pool device, /dev/mapper/pool in this    
164                                                   
165     dmsetup message /dev/mapper/pool 0 "create    
166                                                   
167   Here '0' is an identifier for the volume, a     
168   to the caller to allocate and manage these i    
169   identifier is already in use, the message wi    
170                                                   
171 ii) Using a thinly-provisioned volume.            
172                                                   
173   Thinly-provisioned volumes are activated usi    
174                                                   
175     dmsetup create thin --table "0 2097152 thi    
176                                                   
177   The last parameter is the identifier for the    
178                                                   
179 Internal snapshots                                
180 ------------------                                
181                                                   
182 i) Creating an internal snapshot.                 
183                                                   
184   Snapshots are created with another message t    
185                                                   
186   N.B.  If the origin device that you wish to     
187   must suspend it before creating the snapshot    
188   This is NOT enforced at the moment, so pleas    
189                                                   
190   ::                                              
191                                                   
192     dmsetup suspend /dev/mapper/thin              
193     dmsetup message /dev/mapper/pool 0 "create    
194     dmsetup resume /dev/mapper/thin               
195                                                   
196   Here '1' is the identifier for the volume, a    
197   identifier for the origin device.               
198                                                   
199 ii) Using an internal snapshot.                   
200                                                   
201   Once created, the user doesn't have to worry    
202   between the origin and the snapshot.  Indeed    
203   different from any other thinly-provisioned     
204   snapshotted itself via the same method.  It'    
205   have only one of them active, and there's no    
206   activating or removing them both.  (This dif    
207   device-mapper snapshots.)                       
208                                                   
209   Activate it exactly the same way as any othe    
210                                                   
211     dmsetup create snap --table "0 2097152 thi    
212                                                   
213 External snapshots                                
214 ------------------                                
215                                                   
216 You can use an external **read only** device a    
217 thinly-provisioned volume.  Any read to an unp    
218 thin device will be passed through to the orig    
219 the allocation of new blocks as usual.            
220                                                   
221 One use case for this is VM hosts that want to    
222 thinly-provisioned volumes but have the base i    
223 (possibly shared between many VMs).               
224                                                   
225 You must not write to the origin device if you    
226 Of course, you may write to the thin device an    
227 of the thin volume.                               
228                                                   
229 i) Creating a snapshot of an external device      
230                                                   
231   This is the same as creating a thin device.     
232   You don't mention the origin at this stage.     
233                                                   
234   ::                                              
235                                                   
236     dmsetup message /dev/mapper/pool 0 "create    
237                                                   
238 ii) Using a snapshot of an external device.       
239                                                   
240   Append an extra parameter to the thin target    
241                                                   
242     dmsetup create snap --table "0 2097152 thi    
243                                                   
244   N.B. All descendants (internal snapshots) of    
245   same extra origin parameter.                    
246                                                   
247 Deactivation                                      
248 ------------                                      
249                                                   
250 All devices using a pool must be deactivated b    
251 can be.                                           
252                                                   
253 ::                                                
254                                                   
255     dmsetup remove thin                           
256     dmsetup remove snap                           
257     dmsetup remove pool                           
258                                                   
259 Reference                                         
260 =========                                         
261                                                   
262 'thin-pool' target                                
263 ------------------                                
264                                                   
265 i) Constructor                                    
266                                                   
267     ::                                            
268                                                   
269       thin-pool <metadata dev> <data dev> <dat    
270                 <low water mark (blocks)> [<nu    
271                                                   
272     Optional feature arguments:                   
273                                                   
274       skip_block_zeroing:                         
275         Skip the zeroing of newly-provisioned     
276                                                   
277       ignore_discard:                             
278         Disable discard support.                  
279                                                   
280       no_discard_passdown:                        
281         Don't pass discards down to the underl    
282         data device, but just remove the mappi    
283                                                   
284       read_only:                                  
285                  Don't allow any changes to be    
286                  metadata.  This mode is only     
287                  thin-pool has been created an    
288                  read/write mode.  It cannot b    
289                  thin-pool creation.              
290                                                   
291       error_if_no_space:                          
292         Error IOs, instead of queueing, if no     
293                                                   
294     Data block size must be between 64KB (128     
295     (2097152 sectors) inclusive.                  
296                                                   
297                                                   
298 ii) Status                                        
299                                                   
300     ::                                            
301                                                   
302       <transaction id> <used metadata blocks>/    
303       <used data blocks>/<total data blocks> <    
304       ro|rw|out_of_data_space [no_]discard_pas    
305       needs_check|- metadata_low_watermark        
306                                                   
307     transaction id:                               
308         A 64-bit number used by userspace to h    
309         from volume managers.                     
310                                                   
311     used data blocks / total data blocks          
312         If the number of free blocks drops bel    
313         dm event will be sent to userspace.  T    
314         it will occur only once after each res    
315         should register for the event and then    
316                                                   
317     held metadata root:                           
318         The location, in blocks, of the metada    
319         'held' for userspace read access.  '-'    
320         held root.                                
321                                                   
322     discard_passdown|no_discard_passdown          
323         Whether or not discards are actually b    
324         underlying device.  When this is enabl    
325         it can get disabled if the underlying     
326                                                   
327     ro|rw|out_of_data_space                       
328         If the pool encounters certain types o    
329         drop into a read-only metadata mode in    
330         the pool metadata (like allocating new    
331                                                   
332         In serious cases where even a read-onl    
333         no further I/O will be permitted and t    
334         contain the string 'Fail'.  The usersp    
335         should then be used.                      
336                                                   
337     error_if_no_space|queue_if_no_space           
338         If the pool runs out of data or metada    
339         either queue or error the IO destined     
340         default is to queue the IO until more     
341         'no_space_timeout' expires.  The 'no_s    
342         module parameter can be used to change    
343         defaults to 60 seconds but may be disa    
344                                                   
345     needs_check                                   
346         A metadata operation has failed, resul    
347         flag being set in the metadata's super    
348         device must be deactivated and checked    
349         thin-pool can be made fully operationa    
350         needs_check is not set.                   
351                                                   
352     metadata_low_watermark:                       
353         Value of metadata low watermark in blo    
354         value internally but userspace needs t    
355         determine if an event was caused by cr    
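The status fields described above could be parsed along these lines. This is a sketch under stated assumptions: the input is only the target-specific part of the status line, the field order follows the layout shown above (with the error/queue_if_no_space field assumed to sit between the discard field and needs_check, since that part of the line is cut off in this view), and the class and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class PoolStatus(NamedTuple):
    transaction_id: int
    used_metadata_blocks: int
    total_metadata_blocks: int
    used_data_blocks: int
    total_data_blocks: int
    held_metadata_root: Optional[int]   # None when the field is '-'
    mode: str                           # ro, rw or out_of_data_space
    discard_passdown: bool
    error_if_no_space: bool
    needs_check: bool
    metadata_low_watermark: int


def parse_pool_status(params: str) -> Optional[PoolStatus]:
    """Parse thin-pool status parameters; return None for a failed pool."""
    fields = params.split()
    if fields == ["Fail"]:
        return None
    if len(fields) != 9:
        raise ValueError("expected 9 fields, got %d" % len(fields))
    txid, meta, data, held, mode, discard, no_space, check, low = fields
    used_meta, total_meta = (int(x) for x in meta.split("/"))
    used_data, total_data = (int(x) for x in data.split("/"))
    return PoolStatus(
        transaction_id=int(txid),
        used_metadata_blocks=used_meta,
        total_metadata_blocks=total_meta,
        used_data_blocks=used_data,
        total_data_blocks=total_data,
        held_metadata_root=None if held == "-" else int(held),
        mode=mode,
        discard_passdown=(discard == "discard_passdown"),
        error_if_no_space=(no_space == "error_if_no_space"),
        needs_check=(check == "needs_check"),
        metadata_low_watermark=int(low),
    )
```

A made-up line such as ``1 141/4096 2048/163840 - rw discard_passdown queue_if_no_space - 1024`` would parse into a read-write pool with no held metadata root and needs_check unset.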
356                                                   
357 iii) Messages                                     
358                                                   
359     create_thin <dev id>                          
360         Create a new thinly-provisioned device    
361         <dev id> is an arbitrary unique 24-bit    
362         the caller.                               
363                                                   
364     create_snap <dev id> <origin id>              
365         Create a new snapshot of another thinl    
366         <dev id> is an arbitrary unique 24-bit    
367         the caller.                               
368         <origin id> is the identifier of the t    
369         of which the new device will be a snap    
370                                                   
371     delete <dev id>                               
372         Deletes a thin device.  Irreversible.     
373                                                   
374     set_transaction_id <current id> <new id>      
375         Userland volume managers, such as LVM,    
376         synchronise their external metadata wi    
377         pool target.  The thin-pool target off    
378         arbitrary 64-bit transaction id and re    
379         status line.  To avoid races you must     
380         the current transaction id is when you    
381         compare-and-swap message.                 
382                                                   
383     reserve_metadata_snap                         
384         Reserve a copy of the data mapping btr    
385         This allows userland to inspect the ma    
386         this message was executed.  Use the po    
387         get the root block associated with the    
388                                                   
389     release_metadata_snap                         
390         Release a previously reserved copy of     
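Since allocating device identifiers is left to the caller, a volume manager might validate them before building a message. The sketch below only constructs argument lists for ``dmsetup message`` and runs nothing; the helper names are hypothetical, and the pool path and sector value 0 follow the Cookbook examples.

```python
from typing import List

MAX_DEV_ID = (1 << 24) - 1   # device identifiers are 24-bit numbers


def _check_dev_id(dev_id: int) -> int:
    if not 0 <= dev_id <= MAX_DEV_ID:
        raise ValueError("dev id %r does not fit in 24 bits" % (dev_id,))
    return dev_id


def pool_message(pool: str, message: str) -> List[str]:
    """Build the argv for sending a message to a pool device."""
    return ["dmsetup", "message", pool, "0", message]


def create_thin(pool: str, dev_id: int) -> List[str]:
    return pool_message(pool, "create_thin %d" % _check_dev_id(dev_id))


def create_snap(pool: str, dev_id: int, origin_id: int) -> List[str]:
    return pool_message(pool, "create_snap %d %d" % (
        _check_dev_id(dev_id), _check_dev_id(origin_id)))


def delete(pool: str, dev_id: int) -> List[str]:
    return pool_message(pool, "delete %d" % _check_dev_id(dev_id))
```

The returned list could be handed to ``subprocess.run`` by a caller with the necessary privileges. The origin of an internal snapshot should still be suspended first if it is active, as the Cookbook describes.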
391                                                   
392 'thin' target                                     
393 -------------                                     
394                                                   
395 i) Constructor                                    
396                                                   
397     ::                                            
398                                                   
399         thin <pool dev> <dev id> [<external or    
400                                                   
401     pool dev:                                     
402         the thin-pool device, e.g. /dev/mapper    
403                                                   
404     dev id:                                       
405         the internal device identifier of the     
406         activated.                                
407                                                   
408     external origin dev:                          
409         an optional block device outside the p    
410         read-only snapshot origin: reads to un    
411         thin target will be mapped to this dev    
412                                                   
413 The pool doesn't store any size against the th    
414 load a thin target that is smaller than you've    
415 then you'll have no access to blocks mapped be    
416 load a target that is bigger than before, then    
417 provisioned as and when needed.                   
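Because the pool stores no size for a thin device, the length is whatever the loaded table says. A table line for the constructor above could be assembled as in this sketch; the function name is hypothetical, and the ``<start> <length> <target> <args>`` layout is the usual dmsetup table format used in the Cookbook.

```python
from typing import Optional


def thin_table(length_sectors: int, pool_dev: str, dev_id: int,
               external_origin: Optional[str] = None) -> str:
    """Build a one-line table for a thin target starting at sector 0."""
    if length_sectors <= 0:
        raise ValueError("length must be a positive number of sectors")
    parts = ["0", str(length_sectors), "thin", pool_dev, str(dev_id)]
    if external_origin is not None:
        # Optional read-only origin device outside the pool.
        parts.append(external_origin)
    return " ".join(parts)
```

Loading the same device id with a larger length later simply exposes more virtual space, which is provisioned as and when it is written.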
418                                                   
419 ii) Status                                        
420                                                   
421     <nr mapped sectors> <highest mapped sector    
422         If the pool has encountered device err    
423         will just contain the string 'Fail'.      
424         tools should then be used.                
425                                                   
426     In the case where <nr mapped sectors> is 0    
427     mapped sector and the value of <highest ma    
                                                      
