Linux/Documentation/block/data-integrity.rst


Differences between /Documentation/block/data-integrity.rst (Version linux-6.11.5) and /Documentation/block/data-integrity.rst (Version linux-4.16.18)


==============
Data Integrity
==============

1. Introduction
===============

Modern filesystems feature checksumming of dat
protect against data corruption.  However, the
corruption is done at read time which could po
after the data was written.  At that point the
application tried to write is most likely lost

The solution is to ensure that the disk is act
application meant it to.  Recent additions to
protocols (SBC Data Integrity Field, SCC prote
as SATA/T13 (External Path Protection) try to
support for appending integrity metadata to an
metadata (or protection information in SCSI te
checksum for each sector as well as an increme
ensures the individual sectors are written in
for some protection schemes also that the I/O
place on disk.

Current storage controllers and devices implem
measures, for instance checksumming and scrubb
technologies are working in their own isolated
between adjacent nodes in the I/O path.  The i
DIF and the other integrity extensions is that
is well defined and every node in the I/O path
integrity of the I/O and reject it if corrupti
allows not only corruption prevention but also
of failure.

2. The Data Integrity Extensions
================================

As written, the protocol extensions only prote
controller and storage device.  However, many
allow the operating system to interact with th
(IMD).  We have been working with several FC/S
the protection information to be transferred t
controllers.

The SCSI Data Integrity Field works by appendi
information to each sector.  The data + integr
in 520 byte sectors on disk.  Data + IMD are i
transferred between the controller and target.
similar.
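
The sector sizes quoted in this section (520 and 4104 bytes) follow from adding 8 bytes of protection information to 512-byte and 4096-byte sectors. The sketch below illustrates that arithmetic. The struct name, field names, and field widths are assumptions based on the usual T10 layout of a 16-bit guard checksum, a 16-bit application tag, and a 32-bit reference tag; they are not taken from this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed layout of one protection information tuple.  Names are
 * illustrative only; on the wire the fields are big-endian.
 */
struct pi_tuple {
    uint16_t guard_tag; /* checksum over the sector's data */
    uint16_t app_tag;   /* opaque tag owned by the submitter */
    uint32_t ref_tag;   /* incrementing counter tied to the sector */
};

/* The tuple must occupy exactly 8 bytes for the sizes below to hold. */
static_assert(sizeof(struct pi_tuple) == 8, "tuple must be 8 bytes");

/* Size of one sector once the tuple has been appended to the data. */
static size_t protected_sector_size(size_t data_bytes)
{
    return data_bytes + sizeof(struct pi_tuple);
}
```

Calling protected_sector_size(512) gives the 520-byte sector mentioned above, and protected_sector_size(4096) gives 4104.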

Because it is highly inconvenient for operatin
520 (and 4104) byte sectors, we approached sev
encouraged them to allow separation of the dat
scatter-gather lists.

The controller will interleave the buffers on
read.  This means that Linux can DMA the data
host memory without changes to the page cache.
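
The interleaving that the controller performs can be modelled in user space as a pair of copy loops. This is only an illustration of the data movement, assuming 512-byte sectors with 8 bytes of metadata each; the function names are hypothetical and real controllers do this in hardware during DMA.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    DATA_BYTES = 512,                   /* user data per sector */
    PI_BYTES   = 8,                     /* integrity metadata per sector */
    WIRE_BYTES = DATA_BYTES + PI_BYTES  /* 520-byte sector as sent to the target */
};

/* Write path: merge separate data and metadata buffers into 520-byte sectors. */
static void interleave_sectors(const uint8_t *data, const uint8_t *pi,
                               size_t nr_sectors, uint8_t *wire)
{
    for (size_t i = 0; i < nr_sectors; i++) {
        memcpy(wire + i * WIRE_BYTES, data + i * DATA_BYTES, DATA_BYTES);
        memcpy(wire + i * WIRE_BYTES + DATA_BYTES, pi + i * PI_BYTES, PI_BYTES);
    }
}

/* Read path: split 520-byte sectors back into separate data and metadata buffers. */
static void split_sectors(const uint8_t *wire, size_t nr_sectors,
                          uint8_t *data, uint8_t *pi)
{
    for (size_t i = 0; i < nr_sectors; i++) {
        memcpy(data + i * DATA_BYTES, wire + i * WIRE_BYTES, DATA_BYTES);
        memcpy(pi + i * PI_BYTES, wire + i * WIRE_BYTES + DATA_BYTES, PI_BYTES);
    }
}
```

Because the data buffer stays a plain run of 512-byte sectors on the host side, it can be backed by ordinary page cache pages while the metadata lives in its own buffer.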

Also, the 16-bit CRC checksum mandated by both
is somewhat heavy to compute in software.  Ben
calculating this checksum had a significant im
performance for a number of workloads.  Some c
lighter-weight checksum to be used when interf
system.  Emulex, for instance, supports the TC
The IP checksum received from the OS is conver
when writing and vice versa.  This allows the
generated by Linux or the application at very
software RAID5).

The IP checksum is weaker than the CRC in term
errors.  However, the strength is really in th
buffers and the integrity metadata.  These two
match up for an I/O to complete.
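
To make the cost difference between the two checksums concrete, here is a minimal sketch of both. It assumes the 16-bit CRC is the T10 DIF CRC with polynomial 0x8BB7 and a zero initial value, and that the IP checksum is the 16-bit one's complement sum described in RFC 1071. The function names are illustrative, and the bit-at-a-time CRC loop is written for clarity rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection. */
static uint16_t crc16_t10dif(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x8BB7);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/*
 * 16-bit one's complement checksum over big-endian 16-bit words.
 * The 32-bit accumulator is sufficient for sector-sized buffers.
 */
static uint16_t ip_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((buf[i] << 8) | buf[i + 1]);
    if (len & 1)
        sum += (uint32_t)(buf[len - 1] << 8); /* pad odd tail with zero */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);   /* fold carries back in */
    return (uint16_t)~sum;
}
```

The CRC needs eight shift-and-xor steps per byte (or a lookup table), while the IP checksum is a single addition per 16-bit word, which is why it is the cheaper of the two to generate in software.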

The separation of the data and integrity metad
the choice in checksums is referred to as the
Extensions.  As these extensions are outside t
bodies (T10, T13), Oracle and its partners are
them within the Storage Networking Industry As

3. Kernel Changes
=================

The data integrity framework in Linux enables
to be pinned to I/Os and sent to/received from
support it.

The advantage to the integrity extensions in S
they enable us to protect the entire path from
device.  However, at the same time this is als
disadvantage. It means that the protection inf
format that can be understood by the disk.

Generally Linux/POSIX applications are agnosti
the storage devices they are accessing.  The v
and the block layer make things like hardware
transport protocols completely transparent to

However, this level of detail is required when
protection information to send to a disk.  Con
concept of an end-to-end protection scheme is
It is completely unreasonable for an applicati
it is accessing a SCSI or SATA disk.

The data integrity support implemented in Linu
from the application.  As far as the applicati
the kernel) is concerned, the integrity metada
that's attached to the I/O.

The current implementation allows the block la
generate the protection information for any I/
intent is to move the integrity metadata calcu
user data.  Metadata and other I/O that origin
will still use the automatic generation interf

Some storage devices allow each hardware secto
16-bit value.  The owner of this tag space is
device.  I.e. the filesystem in most cases.  T
this extra space to tag sectors as they see fi
space is limited, the block interface allows t
way of interleaving.  This way, 8*16 bits of i
attached to a typical 4KB filesystem block.
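
The "8*16 bits" figure follows from a 4KB block spanning eight 512-byte sectors, each carrying one 16-bit tag. The sketch below works through that arithmetic and shows one way a larger tag could be spread over the per-sector fields. The function names and the big-endian packing are assumptions made for illustration; the block layer's actual interleaving interface is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

enum { SECTOR_BYTES = 512, APP_TAG_BYTES = 2 };

/* Total tag bits available to a filesystem block of the given size. */
static size_t tag_bits_per_block(size_t block_bytes)
{
    return (block_bytes / SECTOR_BYTES) * APP_TAG_BYTES * 8;
}

/*
 * Spread a multi-byte tag over the per-sector 16-bit tag fields,
 * two bytes per sector.  Returns the number of sectors filled.
 */
static size_t spread_tag(const uint8_t *tag, size_t tag_bytes,
                         uint16_t *sector_tags, size_t nr_sectors)
{
    size_t n = tag_bytes / APP_TAG_BYTES;

    if (n > nr_sectors)
        n = nr_sectors;
    for (size_t i = 0; i < n; i++)
        sector_tags[i] = (uint16_t)((tag[2 * i] << 8) | tag[2 * i + 1]);
    return n;
}
```

For a 4096-byte block, tag_bits_per_block() returns 128, so a 16-byte tag fits exactly across the block's eight sectors.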

This also means that applications such as fsck
access to manipulate the tags from user space.
interface for this is being worked on.


4. Block Layer Implementation Details
=====================================

4.1 Bio
-------

The data integrity patches add a new field to
CONFIG_BLK_DEV_INTEGRITY is enabled.  bio_inte
pointer to a struct bip which contains the bio
Essentially a bip is a trimmed down struct bio
containing the integrity metadata and the requ
information (bvec pool, vector count, etc.)

A kernel subsystem can enable data integrity p
calling bio_integrity_alloc(bio).  This will a
bip to the bio.

Individual pages containing integrity metadata
attached using bio_integrity_add_page().

bio_free() will automatically free the bip.


4.2 Block Device
----------------

Block devices can set up the integrity informa
sub-structure of the queue_limits structure.

Layered block devices will need to pick a prof
for all subdevices.  queue_limits_stack_integr
and MD linear, RAID0 and RAID1 are currently s
will require extra work due to the application


5.0 Block Layer Integrity API
=============================

5.1 Normal Filesystem
---------------------

    The normal filesystem is unaware that the
    is capable of sending/receiving integrity
    be automatically generated by the block la
    in case of a WRITE.  A READ request will c
    to be verified upon completion.

    IMD generation and verification can be tog

      /sys/block/<bdev>/integrity/write_genera

    and::

      /sys/block/<bdev>/integrity/read_verify

    flags.
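
As a rough illustration of inspecting one of these flags from user space, the sketch below builds the sysfs path for a device and reads the attribute as an integer. The helper names are hypothetical, and it assumes the attribute reads back as 0 or 1. Only the read_verify attribute named above is used in the usage note.

```c
#include <stddef.h>
#include <stdio.h>

/* Build "/sys/block/<bdev>/integrity/<attr>".  Returns 0 on success, -1 if it does not fit. */
static int integrity_attr_path(char *out, size_t out_len,
                               const char *bdev, const char *attr)
{
    int n = snprintf(out, out_len, "/sys/block/%s/integrity/%s", bdev, attr);

    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Read an integrity flag.  Returns its value, or -1 if it cannot be read. */
static int read_integrity_flag(const char *bdev, const char *attr)
{
    char path[256];
    int value = -1;
    FILE *f;

    if (integrity_attr_path(path, sizeof path, bdev, attr) != 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1; /* no such device, or no integrity support */
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

For example, read_integrity_flag("sda", "read_verify") would report whether verification on READ is enabled for a device named sda, which is a placeholder name here.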


5.2 Integrity-Aware Filesystem
------------------------------

    A filesystem that is integrity-aware can p
    attached.  It can also use the application
    supported by the block device.


    `bool bio_integrity_prep(bio);`

      To generate IMD for WRITE and to set up
      filesystem must call bio_integrity_prep(

      Prior to calling this function, the bio
      sector must be set, and the bio should h
      added.  It is up to the caller to ensure
      change while I/O is in progress.
      Complete bio with error if prepare faile


5.3 Passing Existing Integrity Metadata
---------------------------------------

    Filesystems that either generate their own
    are capable of transferring IMD from user
    following calls:


    `struct bip * bio_integrity_alloc(bio, gfp

      Allocates the bio integrity payload and
      nr_pages indicate how many pages of prot
      stored in the integrity bio_vec list (si

      The integrity payload will be freed at b


    `int bio_integrity_add_page(bio, page, len

      Attaches a page containing integrity met
      bio.  The bio must have an existing bip,
      i.e. bio_integrity_alloc() must have bee
      the integrity metadata in the pages must
      understood by the target device with the
      the sector numbers will be remapped as t
      I/O stack.  This implies that the pages
      will be modified during I/O!  The first
      integrity metadata must have a value of

      Pages can be added using bio_integrity_a
      there is room in the bip bio_vec array (

      Upon completion of a READ operation, the
      contain the integrity metadata received
      It is up to the receiver to process them
      integrity upon completion.


----------------------------------------------

2007-12-24 Martin K. Petersen <martin.petersen@
