Linux/Documentation/admin-guide/device-mapper/switch.rst

=========
dm-switch
=========

The device-mapper switch target creates a device that supports an
arbitrary mapping of fixed-size regions of I/O across a fixed set of
paths.  The path used for any specific region can be switched
dynamically by sending the target a message.

It maps I/O to underlying block devices efficiently when there is a large
number of fixed-sized address regions but there is no simple pattern
that would allow for a compact representation of the mapping such as
dm-stripe.

Background
----------

Dell EqualLogic and some other iSCSI storage arrays use a distributed
frameless architecture.  In this architecture, the storage group
consists of a number of distinct storage arrays ("members") each having
independent controllers, disk storage and network adapters.  When a LUN
is created it is spread across multiple members.  The details of the
spreading are hidden from initiators connected to this storage system.
The storage group exposes a single target discovery portal, no matter
how many members are being used.  When iSCSI sessions are created, each
session is connected to an eth port on a single member.  Data to a LUN
can be sent on any iSCSI session, and if the blocks being accessed are
stored on another member the I/O will be forwarded as required.  This
forwarding is invisible to the initiator.  The storage layout is also
dynamic, and the blocks stored on disk may be moved from member to
member as needed to balance the load.

This architecture simplifies the management and configuration of both
the storage group and initiators.  In a multipathing configuration, it
is possible to set up multiple iSCSI sessions to use multiple network
interfaces on both the host and target to take advantage of the
increased network bandwidth.  An initiator could use a simple round
robin algorithm to send I/O across all paths and let the storage array
members forward it as necessary, but there is a performance advantage to
sending data directly to the correct member.

A device-mapper table already lets you map different regions of a
device onto different targets.  However, in this architecture the LUN is
spread with an address region size on the order of tens of megabytes,
which means the resulting table could have more than a million entries
and consume far too much memory.

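To make the scale concrete, the following back-of-the-envelope sketch
counts how many entries a plain device-mapper table would need if every
region had its own table line.  The 16 TiB LUN size and 16 MiB region
size are illustrative assumptions, not figures taken from any particular
array.

```python
# Illustrative figures only: both sizes used below are assumptions.
TIB = 1024 ** 4
MIB = 1024 ** 2


def table_entries(lun_bytes: int, region_bytes: int) -> int:
    """Number of regions (one table entry each), rounding up."""
    return -(-lun_bytes // region_bytes)


entries = table_entries(16 * TIB, 16 * MIB)
print(entries)  # 1048576
```

With these assumed sizes the table would need just over a million
entries, which is the situation described above.
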
Using this device-mapper switch target we can now build a two-layer
device hierarchy:

    Upper Tier - Determine which array member the I/O should be sent to.
    Lower Tier - Load balance amongst paths to a particular member.

The lower tier consists of a single dm multipath device for each member.
Each of these multipath devices contains the set of paths directly to
the array member in one priority group, and leverages existing path
selectors to load balance amongst these paths.  We also build a
non-preferred priority group containing paths to other array members for
failover reasons.

The upper tier consists of a single dm-switch device.  This device uses
a bitmap to look up the location of the I/O and choose the appropriate
lower tier device to route the I/O.  By using a bitmap we are able to
use 4 bits for each address range in a 16 member group (which is very
large for us).  This is a much denser representation than the dm table
b-tree can achieve.

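The density argument can be illustrated with a small packed table.  This
is a minimal sketch of the idea of storing 4 bits per region, not the
kernel's actual data layout; the class name ``PackedRegionTable`` and
the region count used below are hypothetical.

```python
class PackedRegionTable:
    """Region -> path table storing a fixed number of bits per region.

    A sketch of the bitmap idea only; the in-kernel layout may differ.
    """

    def __init__(self, num_regions: int, bits_per_entry: int = 4):
        self.num_regions = num_regions
        self.bits = bits_per_entry
        self.max_path = (1 << bits_per_entry) - 1
        # Round the total bit count up to whole bytes.
        self.data = bytearray((num_regions * bits_per_entry + 7) // 8)

    def set(self, region: int, path_nr: int) -> None:
        if not 0 <= region < self.num_regions:
            raise IndexError("region out of range")
        if not 0 <= path_nr <= self.max_path:
            raise ValueError("path number does not fit in an entry")
        for i in range(self.bits):
            byte, shift = divmod(region * self.bits + i, 8)
            if (path_nr >> i) & 1:
                self.data[byte] |= 1 << shift
            else:
                self.data[byte] &= ~(1 << shift) & 0xFF

    def get(self, region: int) -> int:
        if not 0 <= region < self.num_regions:
            raise IndexError("region out of range")
        value = 0
        for i in range(self.bits):
            byte, shift = divmod(region * self.bits + i, 8)
            value |= ((self.data[byte] >> shift) & 1) << i
        return value


table = PackedRegionTable(1 << 20)   # assumed: about a million regions
table.set(5, 0xB)
print(table.get(5), len(table.data))  # 11 524288
```

Under these assumptions a million regions fit in 512 KiB, because two
4-bit entries share each byte.
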
Construction Parameters
=======================

    <num_paths> <region_size> <num_optional_args> [<optional_args>...] [<dev_path> <offset>]+
	<num_paths>
	    The number of paths across which to distribute the I/O.

	<region_size>
	    The number of 512-byte sectors in a region. Each region can be redirected
	    to any of the available paths.

	<num_optional_args>
	    The number of optional arguments. Currently, no optional arguments
	    are supported and so this must be zero.

	<dev_path>
	    The block device that represents a specific path to the device.

	<offset>
	    The offset of the start of data on the specific <dev_path> (in units
	    of 512-byte sectors). This number is added to the sector number when
	    forwarding the request to the specific path. Typically it is zero.

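The parameters above can be modelled in a few lines.  The sketch below
assembles a table line in the documented field order and shows how a
sector would be routed: the region index is the sector divided by
``<region_size>``, and ``<offset>`` is added to the sector number.  The
helper names ``build_switch_table`` and ``route``, the device size and
the region table contents are hypothetical, chosen only for
illustration.

```python
def build_switch_table(num_sectors, region_size, paths):
    """Return a dm table line for a switch target.

    paths is a list of (dev_path, offset) tuples; <num_optional_args>
    is always 0 because no optional arguments are supported.
    """
    if not paths:
        raise ValueError("at least one path is required")
    fields = [0, num_sectors, "switch", len(paths), region_size, 0]
    for dev_path, offset in paths:
        fields += [dev_path, offset]
    return " ".join(str(f) for f in fields)


def route(sector, region_size, region_table, paths):
    """Model of the lookup: pick the path for the sector's region."""
    region = sector // region_size
    dev_path, offset = paths[region_table[region]]
    return dev_path, sector + offset


paths = [("/dev/vg1/switch0", 0), ("/dev/vg1/switch1", 0), ("/dev/vg1/switch2", 0)]
# 2097152 sectors (1 GiB) is an assumed device size.
print(build_switch_table(2097152, 128, paths))
# Assumed region table: regions 0, 1, 2 map to paths 0, 1, 2.
print(route(300, 128, [0, 1, 2], paths))  # ('/dev/vg1/switch2', 300)
```

Sector 300 with a 128-sector region size falls in region 2, so the
request goes to whichever path the region table holds for region 2.
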
Messages
========

set_region_mappings <index>:<path_nr> [<index>]:<path_nr> [<index>]:<path_nr>...

Modify the region table by specifying which regions are redirected to
which paths.

<index>
    The region number (region size was specified in constructor parameters).
    If index is omitted, the next region (previous index + 1) is used.
    Expressed in hexadecimal (WITHOUT any prefix like 0x).

<path_nr>
    The path number in the range 0 ... (<num_paths> - 1).
    Expressed in hexadecimal (WITHOUT any prefix like 0x).

R<n>,<m>
    This parameter allows repetitive patterns to be loaded quickly. <n> and <m>
    are hexadecimal numbers. The last <n> mappings are repeated in the next <m>
    slots.

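The message syntax can be checked against a small user-space model.
This is a sketch of the documented semantics only, not the kernel
parser: it assumes that ``R<n>,<m>`` fills each of the next ``<m>``
regions by copying the entry ``<n>`` regions earlier, and that the first
mapping carries an explicit index.  The function name
``apply_region_mappings`` is hypothetical.

```python
def apply_region_mappings(args, table=None):
    """Apply set_region_mappings arguments to a dict {region: path_nr}.

    Note: int(text, 16) would also accept a 0x prefix, which the
    documented message format does not allow.
    """
    table = {} if table is None else table
    region = None  # index of the most recently written region
    for arg in args:
        if arg.startswith("R"):
            n_text, m_text = arg[1:].split(",")
            n, m = int(n_text, 16), int(m_text, 16)
            if region is None or n == 0:
                raise ValueError("nothing to repeat: " + arg)
            for _ in range(m):
                region += 1
                source = region - n
                if source not in table:
                    raise ValueError("pattern reaches an unset region: " + arg)
                table[region] = table[source]
        else:
            index_text, path_text = arg.split(":")
            if index_text:
                region = int(index_text, 16)
            elif region is None:
                # Assumption of this sketch: require an explicit first index.
                raise ValueError("first mapping needs an explicit index")
            else:
                region += 1
            table[region] = int(path_text, 16)
    return table


short = apply_region_mappings(["1000:1", ":2", "R2,10"])
print(len(short), short[0x1000], short[0x1011])  # 18 1 2
```

Because ``<m>`` is hexadecimal, ``R2,10`` writes 16 further regions, so
the model ends up with 18 entries starting at region 0x1000.
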
Status
======

No status line is reported.

Example
=======

Assume that you have volumes vg1/switch0 vg1/switch1 vg1/switch2 with
the same size.

Create a switch device with 64kB region size::

    dmsetup create switch --table "0 `blockdev --getsz /dev/vg1/switch0`
	switch 3 128 0 /dev/vg1/switch0 0 /dev/vg1/switch1 0 /dev/vg1/switch2 0"

Set mappings for the first 7 entries to point to devices switch0, switch1,
switch2, switch0, switch1, switch2, switch1::

    dmsetup message switch 0 set_region_mappings 0:0 :1 :2 :0 :1 :2 :1

Set repetitive mapping. This command::

    dmsetup message switch 0 set_region_mappings 1000:1 :2 R2,10

is equivalent to::

    dmsetup message switch 0 set_region_mappings 1000:1 :2 :1 :2 :1 :2 :1 :2 \
	:1 :2 :1 :2 :1 :2 :1 :2 :1 :2
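
When mappings come from a script rather than being typed by hand, the
message arguments can be generated.  The helper below is a hypothetical
sketch (the name ``mapping_args`` is not part of any tool) that formats
consecutive path numbers in the hexadecimal, prefix-free form described
under Messages; it only builds the command line and does not run it.

```python
def mapping_args(start_region, path_numbers):
    """Format consecutive path numbers as set_region_mappings arguments."""
    if not path_numbers:
        raise ValueError("at least one path number is required")
    # The first argument carries the explicit index; the rest omit it.
    args = [f"{start_region:x}:{path_numbers[0]:x}"]
    args += [f":{path_nr:x}" for path_nr in path_numbers[1:]]
    return args


command = ["dmsetup", "message", "switch", "0", "set_region_mappings"]
command += mapping_args(0, [0, 1, 2, 0, 1, 2, 1])
print(" ".join(command))
# dmsetup message switch 0 set_region_mappings 0:0 :1 :2 :0 :1 :2 :1
```

The printed line matches the first message in the example above.
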
                                                      
