  1 .. SPDX-License-Identifier: GPL-2.0-only
  2 
  3 dm-vdo
  4 ======
  5 
  6 The dm-vdo (virtual data optimizer) device mapper target provides
  7 block-level deduplication, compression, and thin provisioning. As a device
  8 mapper target, it can add these features to the storage stack, compatible
  9 with any file system. The vdo target does not protect against data
 10 corruption, relying instead on integrity protection of the storage below
 11 it. It is strongly recommended that lvm be used to manage vdo volumes. See
 12 lvmvdo(7).
 13 
 14 Userspace component
 15 ===================
 16 
 17 Formatting a vdo volume requires the use of the 'vdoformat' tool, available
 18 at:
 19 
 20 https://github.com/dm-vdo/vdo/
 21 
 22 In most cases, a vdo target will recover from a crash automatically the
 23 next time it is started. In cases where it encountered an unrecoverable
 24 error (either during normal operation or crash recovery) the target will
 25 enter or come up in read-only mode. Because read-only mode is indicative of
 26 data loss, a positive action must be taken to bring vdo out of read-only
 27 mode. The 'vdoforcerebuild' tool, available from the same repo, is used to
 28 prepare a read-only vdo to exit read-only mode. After running this tool,
 29 the vdo target will rebuild its metadata the next time it is
 30 started. Although some data may be lost, the rebuilt vdo's metadata will be
 31 internally consistent and the target will be writable again.
 32 
 33 The repo also contains additional userspace tools which can be used to
 34 inspect a vdo target's on-disk metadata. Fortunately, these tools are
 35 rarely needed except by dm-vdo developers.
 36 
 37 Metadata requirements
 38 =====================
 39 
 40 Each vdo volume reserves 3GB of space for metadata, or more depending on
 41 its configuration. It is helpful to check that the space saved by
 42 deduplication and compression is not cancelled out by the metadata
 43 requirements. An estimation of the space saved for a specific dataset can
 44 be computed with the vdo estimator tool, which is available at:
 45 
 46 https://github.com/dm-vdo/vdoestimator/
 47 
 48 Target interface
 49 ================
 50 
 51 Table line
 52 ----------
 53 
 54 ::
 55 
 56         <offset> <logical device size> vdo V4 <storage device>
 57         <storage device size> <minimum I/O size> <block map cache size>
 58         <block map era length> [optional arguments]
 59 
 60 
 61 Required parameters:
 62 
 63         offset:
 64                 The offset, in sectors, at which the vdo volume's logical
 65                 space begins.
 66 
 67         logical device size:
 68                 The size of the device which the vdo volume will service,
 69                 in sectors. Must match the current logical size of the vdo
 70                 volume.
 71 
 72         storage device:
 73                 The device holding the vdo volume's data and metadata.
 74 
 75         storage device size:
 76                 The size of the device holding the vdo volume, as a number
 77                 of 4096-byte blocks. Must match the current size of the vdo
 78                 volume.
 79 
 80         minimum I/O size:
 81                 The minimum I/O size for this vdo volume to accept, in
 82                 bytes. Valid values are 512 or 4096. The recommended value
 83                 is 4096.
 84 
 85         block map cache size:
 86                 The size of the block map cache, as a number of 4096-byte
 87                 blocks. The minimum and recommended value is 32768 blocks.
 88                 If the logical thread count is non-zero, the cache size
 89                 must be at least 4096 blocks per logical thread.
 90 
 91         block map era length:
 92                 The speed with which the block map cache writes out
 93                 modified block map pages. A smaller era length is likely to
 94                 reduce the amount of time spent rebuilding, at the cost of
 95                 increased block map writes during normal operation. The
 96                 maximum and recommended value is 16380; the minimum value
 97                 is 1.
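
As a minimal sketch of how the required parameters fit together, the
following Python helper assembles a table line from sizes given in bytes
and checks the limits listed above. The function name and its defaults are
illustrative assumptions, not part of any vdo tool; it only builds a
string and does not talk to device-mapper.

```python
SECTOR_BYTES = 512      # device-mapper sector size
VDO_BLOCK_BYTES = 4096  # block size used for the storage and cache sizes


def build_vdo_table(logical_bytes, storage_dev, storage_bytes,
                    min_io=4096, cache_blocks=32768, era_length=16380,
                    offset=0):
    """Return a vdo table line (hypothetical helper, string building only)."""
    if logical_bytes % SECTOR_BYTES or storage_bytes % VDO_BLOCK_BYTES:
        raise ValueError("sizes must be whole sectors / 4096-byte blocks")
    if min_io not in (512, 4096):
        raise ValueError("minimum I/O size must be 512 or 4096")
    if cache_blocks < 32768:
        raise ValueError("block map cache must be at least 32768 blocks")
    if not 1 <= era_length <= 16380:
        raise ValueError("block map era length must be in 1..16380")
    return "{} {} vdo V4 {} {} {} {} {}".format(
        offset, logical_bytes // SECTOR_BYTES, storage_dev,
        storage_bytes // VDO_BLOCK_BYTES, min_io, cache_blocks, era_length)


GIB = 1024 ** 3
print(build_vdo_table(1 * GIB, "/dev/dm-1", 1 * GIB))
```

With 1 GB of logical and physical space this yields the same table line as
the first example in the Examples section below.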
 98 
 99 Optional parameters:
100 --------------------
101 Some or all of these parameters may be specified as <key> <value> pairs.
102 
103 Thread related parameters:
104 
105 Different categories of work are assigned to separate thread groups, and
106 the number of threads in each group can be configured separately.
107 
108 If <hash>, <logical>, and <physical> are all set to 0, the work handled by
109 all three thread types will be handled by a single thread. If any of these
110 values are non-zero, all of them must be non-zero.
111 
112         ack:
113                 The number of threads used to complete bios. Since
114                 completing a bio calls an arbitrary completion function
115                 outside the vdo volume, threads of this type allow the vdo
116                 volume to continue processing requests even when bio
117                 completion is slow. The default is 1.
118 
119         bio:
120                 The number of threads used to issue bios to the underlying
121                 storage. Threads of this type allow the vdo volume to
122                 continue processing requests even when bio submission is
123                 slow. The default is 4.
124 
125         bioRotationInterval:
126                 The number of bios to enqueue on each bio thread before
127                 switching to the next thread. The value must be greater
128                 than 0 and not more than 1024; the default is 64.
129 
130         cpu:
131                 The number of threads used to do CPU-intensive work, such
132                 as hashing and compression. The default is 1.
133 
134         hash:
135                 The number of threads used to manage data comparisons for
136                 deduplication based on the hash value of data blocks. The
137                 default is 0.
138 
139         logical:
140                 The number of threads used to manage caching and locking
141                 based on the logical address of incoming bios. The default
142                 is 0; the maximum is 60.
143 
144         physical:
145                 The number of threads used to manage administration of the
146                 underlying storage device. At format time, a slab size for
147                 the vdo is chosen; the vdo storage device must be large
148                 enough to have at least 1 slab per physical thread. The
149                 default is 0; the maximum is 16.
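
The constraints on the hash, logical, and physical thread counts can be
checked before a table is loaded. The sketch below is an assumption-laden
illustration (the function name is hypothetical); it encodes only the
rules stated in this document: all three counts are zero or all are
non-zero, at most 60 logical and 16 physical threads, and at least 4096
cache blocks per logical thread.

```python
def check_thread_counts(hash_threads=0, logical=0, physical=0,
                        cache_blocks=32768):
    """Return a list of problems with the thread settings (empty if none)."""
    problems = []
    counts = (hash_threads, logical, physical)
    if any(c < 0 for c in counts):
        problems.append("thread counts may not be negative")
    if any(counts) and not all(counts):
        problems.append("hash, logical, and physical must be all zero "
                        "or all non-zero")
    if logical > 60:
        problems.append("at most 60 logical threads are allowed")
    if physical > 16:
        problems.append("at most 16 physical threads are allowed")
    if logical > 0 and cache_blocks < 4096 * logical:
        problems.append("block map cache needs at least 4096 blocks "
                        "per logical thread")
    return problems


# Settings taken from the last 'dmsetup create' example in this document.
print(check_thread_counts(hash_threads=1, logical=3, physical=2,
                          cache_blocks=65550))
```

The slab requirement for physical threads is not checked here because the
slab size is chosen at format time and is not visible in the table line.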
150 
151 Miscellaneous parameters:
152 
153         maxDiscard:
154                 The maximum size of discard bio accepted, in 4096-byte
155                 blocks. I/O requests to a vdo volume are normally split
156                 into 4096-byte blocks, and processed up to 2048 at a time.
157                 However, discard requests to a vdo volume can be
158                 automatically split to a larger size, up to <maxDiscard>
159                 4096-byte blocks in a single bio, and are limited to 1500
160                 at a time. Increasing this value may provide better overall
161                 performance, at the cost of increased latency for the
162                 individual discard requests. The default and minimum is 1;
163                 the maximum is UINT_MAX / 4096.
164 
165         deduplication:
166                 Whether deduplication is enabled. The default is 'on'; the
167                 acceptable values are 'on' and 'off'.
168 
169         compression:
170                 Whether compression is enabled. The default is 'off'; the
171                 acceptable values are 'on' and 'off'.
172 
173 Device modification
174 -------------------
175 
176 A modified table may be loaded into a running, non-suspended vdo volume.
177 The modifications will take effect when the device is next resumed. The
178 modifiable parameters are <logical device size>, <physical device size>,
179 <maxDiscard>, <compression>, and <deduplication>.
180 
181 If the logical device size or physical device size is changed, upon
182 successful resume vdo will store the new values and require them on future
183 startups. These two parameters may not be decreased. The logical device
184 size may not exceed 4 PB. The physical device size must increase by at
185 least 32832 4096-byte blocks if at all, and must not exceed the size of the
186 underlying storage device. Additionally, when formatting the vdo device, a
187 slab size is chosen: the physical device size may never increase above the
188 size which provides 8192 slabs, and each increase must be large enough to
189 add at least one new slab.
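
The resize rules above can be sketched as a pre-flight check. This is a
rough illustration only: the function names are hypothetical, the slab
size must be supplied by the caller because it is fixed at format time,
"4 PB" is interpreted as 4 PiB, and the slab checks ignore any metadata
that lives outside slabs, so they are approximations rather than the
driver's exact logic.

```python
MIN_PHYSICAL_GROWTH_BLOCKS = 32832   # from the text above
MAX_SLABS = 8192                     # from the text above
MAX_LOGICAL_SECTORS = 4 * 1024 ** 5 // 512  # "4 PB" read as 4 PiB (assumption)


def check_logical_resize(old_sectors, new_sectors):
    """Return problems with a logical resize, sizes in 512-byte sectors."""
    problems = []
    if new_sectors < old_sectors:
        problems.append("logical size may not be decreased")
    if new_sectors > MAX_LOGICAL_SECTORS:
        problems.append("logical size may not exceed 4 PB")
    return problems


def check_physical_resize(old_blocks, new_blocks, storage_blocks, slab_blocks):
    """Return problems with a physical resize, sizes in 4096-byte blocks."""
    if new_blocks == old_blocks:
        return []
    if new_blocks < old_blocks:
        return ["physical size may not be decreased"]
    problems = []
    growth = new_blocks - old_blocks
    if growth < MIN_PHYSICAL_GROWTH_BLOCKS:
        problems.append("growth is below 32832 blocks")
    if growth < slab_blocks:
        problems.append("growth is likely too small to add a new slab")
    if new_blocks > storage_blocks:
        problems.append("new size exceeds the underlying storage device")
    if new_blocks // slab_blocks > MAX_SLABS:
        problems.append("new size would need more than 8192 slabs")
    return problems


# Growing from 1 GB to 2 GB on a 4 GB device, with a hypothetical
# slab size of 32768 blocks (128 MB).
print(check_physical_resize(262144, 524288, 1048576, 32768))
```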
190 
191 Examples:
192 
193 Start a previously-formatted vdo volume with 1 GB logical space and 1 GB
194 physical space, storing to /dev/dm-1 which has more than 1 GB of space.
195 
196 ::
197 
198         dmsetup create vdo0 --table \
199         "0 2097152 vdo V4 /dev/dm-1 262144 4096 32768 16380"
200 
201 Grow the logical size to 4 GB.
202 
203 ::
204 
205         dmsetup reload vdo0 --table \
206         "0 8388608 vdo V4 /dev/dm-1 262144 4096 32768 16380"
207         dmsetup resume vdo0
208 
209 Grow the physical size to 2 GB.
210 
211 ::
212 
213         dmsetup reload vdo0 --table \
214         "0 8388608 vdo V4 /dev/dm-1 524288 4096 32768 16380"
215         dmsetup resume vdo0
216 
217 Grow the logical size to 5 GB, the physical size by 1 GB more, and raise maxDiscard.
218 
219 ::
220 
221         dmsetup reload vdo0 --table \
222         "0 10485760 vdo V4 /dev/dm-1 786432 4096 32768 16380 maxDiscard 8"
223         dmsetup resume vdo0
224 
225 Stop the vdo volume.
226 
227 ::
228 
229         dmsetup remove vdo0
230 
231 Start the vdo volume again. Note that the logical and physical device sizes
232 must still match, but other parameters can change.
233 
234 ::
235 
236         dmsetup create vdo1 --table \
237         "0 10485760 vdo V4 /dev/dm-1 786432 512 65550 5000 hash 1 logical 3 physical 2"
238 
239 Messages
240 --------
241 All vdo devices accept messages in the form:
242 
243 ::
244 
245         dmsetup message <target-name> 0 <message-name> <message-parameters>
246 
247 The messages are:
248 
249         stats:
250                 Outputs the current view of the vdo statistics. Mostly used
251                 by the vdostats userspace program to interpret the output
252                 buffer.
253 
254         config:
255                 Outputs useful vdo configuration information. Mostly used
256                 by users who want to recreate a similar VDO volume and
257                 want to know the creation configuration used.
258 
259         dump:
260                 Dumps many internal structures to the system log. This is
261                 not always safe to run, so it should only be used to debug
262                 a hung vdo. Optional parameters to specify structures to
263                 dump are:
264 
265                         viopool: The pool of I/O requests for incoming bios
266                         pools: A synonym of 'viopool'
267                         vdo: Most of the structures managing on-disk data
268                         queues: Basic information about each vdo thread
269                         threads: A synonym of 'queues'
270                         default: Equivalent to 'queues vdo'
271                         all: All of the above.
272 
273         dump-on-shutdown:
274                 Perform a default dump next time vdo shuts down.
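
For scripting, the message names and dump parameters listed above can be
validated before a command is issued. The sketch below only builds the
argument list for the 'dmsetup message' form shown earlier and never
executes it; the helper name is a hypothetical choice for illustration.

```python
MESSAGES = {"stats", "config", "dump", "dump-on-shutdown"}
DUMP_PARAMS = {"viopool", "pools", "vdo", "queues", "threads",
               "default", "all"}


def vdo_message_argv(target_name, message, *params):
    """Build the argv for 'dmsetup message' (does not run anything)."""
    if message not in MESSAGES:
        raise ValueError("unknown vdo message: " + message)
    if message == "dump":
        unknown = [p for p in params if p not in DUMP_PARAMS]
        if unknown:
            raise ValueError("unknown dump parameter(s): " + " ".join(unknown))
    elif params:
        raise ValueError(message + " takes no parameters in this sketch")
    # The literal "0" is the sector argument from the message form above.
    return ["dmsetup", "message", target_name, "0", message, *params]


print(vdo_message_argv("vdo0", "dump", "queues", "vdo"))
```

The resulting list could be handed to subprocess.run() by a caller with
sufficient privileges; that step is deliberately left out here.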
275 
276 
277 Status
278 ------
279 
280 ::
281 
282     <device> <operating mode> <in recovery> <index state>
283     <compression state> <used physical blocks> <total physical blocks>
284 
285         device:
286                 The name of the vdo volume.
287 
288         operating mode:
289                 The current operating mode of the vdo volume; values may be
290                 'normal', 'recovering' (the volume has detected an issue
291                 with its metadata and is attempting to repair itself), and
292                 'read-only' (an error has occurred that forces the vdo
293                 volume to only support read operations and not writes).
294 
295         in recovery:
296                 Whether the vdo volume is currently in recovery mode;
297                 values may be 'recovering' or '-' which indicates not
298                 recovering.
299 
300         index state:
301                 The current state of the deduplication index in the vdo
302                 volume; values may be 'closed', 'closing', 'error',
303                 'offline', 'online', 'opening', and 'unknown'.
304 
305         compression state:
306                 The current state of compression in the vdo volume; values
307                 may be 'offline' and 'online'.
308 
309         used physical blocks:
310                 The number of physical blocks in use by the vdo volume.
311 
312         total physical blocks:
313                 The total number of physical blocks the vdo volume may use;
314                 the difference between this value and the
315                 <used physical blocks> is the number of blocks the vdo
316                 volume has left before being full.
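
A minimal sketch of parsing these seven fields follows. It assumes the
caller has already removed whatever 'dmsetup status' prints ahead of the
target-specific part (typically the start sector, length, and target
type). The sample line is made up for illustration and the class and
function names are hypothetical.

```python
from typing import NamedTuple


class VdoStatus(NamedTuple):
    device: str
    operating_mode: str
    in_recovery: bool
    index_state: str
    compression_state: str
    used_blocks: int
    total_blocks: int

    @property
    def free_blocks(self):
        """Blocks left before the volume is full."""
        return self.total_blocks - self.used_blocks


def parse_vdo_status(text):
    """Parse the seven vdo-specific status fields described above."""
    fields = text.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    if fields[1] not in ("normal", "recovering", "read-only"):
        raise ValueError("unexpected operating mode: " + fields[1])
    return VdoStatus(fields[0], fields[1], fields[2] == "recovering",
                     fields[3], fields[4], int(fields[5]), int(fields[6]))


# Hypothetical sample; real values depend on the volume.
status = parse_vdo_status("/dev/dm-1 normal - online offline 150000 786432")
print(status.operating_mode, status.free_blocks)
```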
317 
318 Memory Requirements
319 ===================
320 
321 A vdo target requires a fixed 38 MB of RAM along with the following amounts
322 that scale with the target:
323 
324 - 1.15 MB of RAM for each 1 MB of configured block map cache size. The
325   block map cache requires a minimum of 150 MB.
326 - 1.6 MB of RAM for each 1 TB of logical space.
327 - 268 MB of RAM for each 1 TB of physical storage managed by the volume.
328 
329 The deduplication index requires additional memory which scales with the
330 size of the deduplication window. For dense indexes, the index requires 1
331 GB of RAM per 1 TB of window. For sparse indexes, the index requires 1 GB
332 of RAM per 10 TB of window. The index configuration is set when the target
333 is formatted and may not be modified.
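
The figures above can be combined into a rough estimate. The helper below
is a sketch under stated assumptions: the function name is hypothetical,
1 GB is taken as 1024 MB, the 150 MB figure is treated as a floor on the
block map cache term, and the result is an approximation rather than a
guarantee of what the driver will allocate.

```python
def estimate_vdo_ram_mb(cache_mb, logical_tb, physical_tb,
                        window_tb=0.0, sparse=False):
    """Estimate RAM in MB from the figures in this section (approximate)."""
    fixed = 38.0
    # 1.15 MB per 1 MB of cache, with the documented 150 MB minimum.
    cache = max(1.15 * cache_mb, 150.0)
    logical = 1.6 * logical_tb
    physical = 268.0 * physical_tb
    # Index: 1 GB per 1 TB of window (dense) or per 10 TB (sparse).
    index_gb = window_tb / 10.0 if sparse else window_tb
    return fixed + cache + logical + physical + index_gb * 1024.0


# 128 MB cache, 4 TB logical, 1 TB physical, 1 TB dense index window.
print(round(estimate_vdo_ram_mb(128, 4, 1, window_tb=1.0), 1))
```

Passing window_tb=0 leaves out the index term, which is useful when only
the target's own memory use is of interest.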
334 
335 Module Parameters
336 =================
337 
338 The vdo driver has a numeric parameter 'log_level' which controls the
339 verbosity of logging from the driver. The default setting is 6
340 (LOGLEVEL_INFO and more severe messages).
341 
342 Run-time Usage
343 ==============
344 
345 When using dm-vdo, it is important to be aware of the ways in which its
346 behavior differs from other storage targets.
347 
348 - There is no guarantee that over-writes of existing blocks will succeed.
349   Because the underlying storage may be multiply referenced, over-writing
350   an existing block generally requires a vdo to have a free block
351   available.
352 
353 - When blocks are no longer in use, sending a discard request for those
354   blocks lets the vdo release references for those blocks. If the vdo is
355   thinly provisioned, discarding unused blocks is essential to prevent the
356   target from running out of space. However, due to the sharing of
357   duplicate blocks, no discard request for any given logical block is
358   guaranteed to reclaim space.
359 
360 - Assuming the underlying storage properly implements flush requests, vdo
361   is resilient against crashes; however, unflushed writes may or may not
362   persist after a crash.
363 
364 - Each write to a vdo target entails a significant amount of processing.
365   However, much of the work is parallelizable. Therefore, vdo targets
366   achieve better throughput at higher I/O depths, and can support up to 2048
367   requests in parallel.
368 
369 Tuning
370 ======
371 
372 The vdo device has many options, and it can be difficult to make optimal
373 choices without perfect knowledge of the workload. Additionally, most
374 configuration options must be set when a vdo target is started, and cannot
375 be changed while the target is active; changing them requires shutting the
376 target down completely. Ideally, tuning with simulated
377 workloads should be performed before deploying vdo in production
378 environments.
379 
380 The most important value to adjust is the block map cache size. In order to
381 service a request for any logical address, a vdo must load the portion of
382 the block map which holds the relevant mapping. These mappings are cached.
383 Performance will suffer when the working set does not fit in the cache. By
384 default, a vdo allocates 128 MB of metadata cache in RAM to support
385 efficient access to 100 GB of logical space at a time. It should be scaled
386 up proportionally for larger working sets.
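
The proportional scaling described above can be written down directly.
This sketch assumes a strictly linear rule (128 MB, or 32768 blocks, per
100 GB of working set) and never goes below the documented minimum; the
function name is hypothetical and the result is a starting point for
testing, not a tuned value.

```python
import math

DEFAULT_CACHE_BLOCKS = 32768   # 128 MB of 4096-byte blocks
DEFAULT_COVERAGE_GB = 100      # logical space served efficiently by default


def cache_blocks_for_working_set(working_set_gb):
    """Scale the block map cache size linearly with the working set."""
    scaled = math.ceil(DEFAULT_CACHE_BLOCKS * working_set_gb
                       / DEFAULT_COVERAGE_GB)
    return max(DEFAULT_CACHE_BLOCKS, scaled)


# A 500 GB working set needs roughly five times the default cache.
print(cache_blocks_for_working_set(500))
```

The returned value is the <block map cache size> argument of the table
line; the RAM it costs is covered under Memory Requirements.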
387 
388 The logical and physical thread counts should also be adjusted. A logical
389 thread controls a disjoint section of the block map, so additional logical
390 threads increase parallelism and can increase throughput. Physical threads
391 control a disjoint section of the data blocks, so additional physical
392 threads can also increase throughput. However, excess threads can waste
393 resources and increase contention.
394 
395 Bio submission threads control the parallelism involved in sending I/O to
396 the underlying storage; fewer threads mean there is more opportunity to
397 reorder I/O requests for performance benefit, but also that each I/O
398 request has to wait longer before being submitted.
399 
400 Bio acknowledgment threads are used for finishing I/O requests. This is
401 done on dedicated threads since the amount of work required to execute a
402 bio's callback cannot be controlled by the vdo itself. Usually one thread
403 is sufficient but additional threads may be beneficial, particularly when
404 bios have CPU-heavy callbacks.
405 
406 CPU threads are used for hashing and for compression; in workloads with
407 compression enabled, more threads may result in higher throughput.
408 
409 Hash threads are used to sort active requests by hash and determine whether
410 they should deduplicate; the most CPU-intensive work done by these
411 threads is the comparison of 4096-byte data blocks. In most cases, a single
412 hash thread is sufficient.
