Locking
=======

Locking is well-known and the common use cases are straightforward: Any
CPU holding a given lock sees any changes previously seen or made by any
CPU before it previously released that same lock.  This last sentence
is the only part of this document that most developers will need to read.

However, developers who would like to also access lock-protected shared
variables outside of their corresponding locks should continue reading.


Locking and Prior Accesses
--------------------------

The basic rule of locking is worth repeating:

        Any CPU holding a given lock sees any changes previously seen
        or made by any CPU before it previously released that same lock.

Note that this statement is a bit stronger than "Any CPU holding a
given lock sees all changes made by any CPU during the time that CPU was
previously holding this same lock".  For example, consider the following
pair of code fragments:

        /* See MP+polocks.litmus. */
        void CPU0(void)
        {
                WRITE_ONCE(x, 1);
                spin_lock(&mylock);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                spin_unlock(&mylock);
                r1 = READ_ONCE(x);
        }

The basic rule guarantees that if CPU0() acquires mylock before CPU1(),
then both r0 and r1 must be set to the value 1.  This also has the
consequence that if the final value of r0 is equal to 1, then the final
value of r1 must also be equal to 1.  In contrast, the weaker rule would
say nothing about the final value of r1.
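
In litmus-test form, the forbidden outcome can be expressed as an
"exists" clause along the following lines (a sketch; see
MP+polocks.litmus for the actual test):

        exists (1:r0=1 /\ 1:r1=0)

If the basic rule holds, herd7 should report that no execution
satisfies this clause.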


Locking and Subsequent Accesses
-------------------------------

The converse to the basic rule also holds:  Any CPU holding a given
lock will not see any changes that will be made by any CPU after it
subsequently acquires this same lock.  This converse statement is
illustrated by the following litmus test:

        /* See MP+porevlocks.litmus. */
        void CPU0(void)
        {
                r0 = READ_ONCE(y);
                spin_lock(&mylock);
                r1 = READ_ONCE(x);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                spin_unlock(&mylock);
                WRITE_ONCE(y, 1);
        }

This converse to the basic rule guarantees that if CPU0() acquires
mylock before CPU1(), then both r0 and r1 must be set to the value 0.
This also has the consequence that if the final value of r1 is equal
to 0, then the final value of r0 must also be equal to 0.  In contrast,
the weaker rule would say nothing about the final value of r0.
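
The corresponding "exists" clause for this test might read as follows
(again a sketch; see MP+porevlocks.litmus for the actual test):

        exists (0:r0=1 /\ 0:r1=0)

If the converse rule holds, herd7 should report that no execution
satisfies this clause.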

These examples show only a single pair of CPUs, but the effects of the
locking basic rule extend across multiple acquisitions of a given lock
across multiple CPUs.
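
For example (this three-CPU fragment is a sketch, not one of the litmus
tests named in this document), the basic rule chains across consecutive
critical sections:

        void CPU0(void)
        {
                WRITE_ONCE(x, 1);
                spin_lock(&mylock);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                WRITE_ONCE(z, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                spin_lock(&mylock);
                r1 = READ_ONCE(z);
                spin_unlock(&mylock);
                r2 = READ_ONCE(x);
        }

If r0 and r1 both end up equal to 1, then CPU1()'s critical section
followed CPU0()'s and CPU2()'s followed CPU1()'s, so the basic rule
requires that r2 also equal 1.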


Double-Checked Locking
----------------------

It is well known that more than just a lock is required to make
double-checked locking work correctly.  This litmus test illustrates
one incorrect approach:

        /* See Documentation/litmus-tests/locking/DCL-broken.litmus. */
        void CPU0(void)
        {
                r0 = READ_ONCE(flag);
                if (r0 == 0) {
                        spin_lock(&lck);
                        r1 = READ_ONCE(flag);
                        if (r1 == 0) {
                                WRITE_ONCE(data, 1);
                                WRITE_ONCE(flag, 1);
                        }
                        spin_unlock(&lck);
                }
                r2 = READ_ONCE(data);
        }
        /* CPU1() is exactly the same as CPU0(). */

There are two problems.  First, there is no ordering between the first
READ_ONCE() of "flag" and the READ_ONCE() of "data".  Second, there is
no ordering between the two WRITE_ONCE() calls.  It should therefore be
no surprise that "r2" can be zero, and a quick herd7 run confirms this.
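
The failure mode can be expressed as an "exists" clause along these
lines (a sketch; the clause in DCL-broken.litmus may be written
differently):

        exists (0:r0=1 /\ 0:r2=0)

That is, CPU0() can observe "flag" already set and nevertheless read a
stale zero from "data".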

One way to fix this is to use smp_load_acquire() and smp_store_release()
as shown in this corrected version:

        /* See Documentation/litmus-tests/locking/DCL-fixed.litmus. */
        void CPU0(void)
        {
                r0 = smp_load_acquire(&flag);
                if (r0 == 0) {
                        spin_lock(&lck);
                        r1 = READ_ONCE(flag);
                        if (r1 == 0) {
                                WRITE_ONCE(data, 1);
                                smp_store_release(&flag, 1);
                        }
                        spin_unlock(&lck);
                }
                r2 = READ_ONCE(data);
        }
        /* CPU1() is exactly the same as CPU0(). */

The smp_load_acquire() guarantees that its load from "flag" will
be ordered before the READ_ONCE() from "data", thus solving the first
problem.  The smp_store_release() guarantees that its store will be
ordered after the WRITE_ONCE() to "data", solving the second problem.
The smp_store_release() pairs with the smp_load_acquire(), thus ensuring
that the ordering provided by each actually takes effect.  Again, a
quick herd7 run confirms this.

In short, if you access a lock-protected variable without holding the
corresponding lock, you will need to provide additional ordering, in
this case, via the smp_load_acquire() and the smp_store_release().
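
For reference, this is roughly how the corrected pattern might look in
real kernel code.  This is a sketch only: struct foo, get_foo(), and
the init_foo() initializer are hypothetical, and error handling is
minimal.

        static struct foo *foo_p;
        static DEFINE_MUTEX(foo_mutex);

        struct foo *get_foo(void)
        {
                struct foo *p;

                /* Fast path: pairs with the smp_store_release() below. */
                p = smp_load_acquire(&foo_p);
                if (p)
                        return p;

                mutex_lock(&foo_mutex);
                p = foo_p;      /* Re-check under the lock. */
                if (!p) {
                        p = kzalloc(sizeof(*p), GFP_KERNEL);
                        if (p) {
                                init_foo(p);    /* Hypothetical initializer. */
                                /* Publish only after initialization. */
                                smp_store_release(&foo_p, p);
                        }
                }
                mutex_unlock(&foo_mutex);
                return p;
        }

A mutex is used here rather than a spinlock because GFP_KERNEL
allocations may sleep.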


Ordering Provided by a Lock to CPUs Not Holding That Lock
---------------------------------------------------------

It is not necessarily the case that accesses ordered by locking will be
seen as ordered by CPUs not holding that lock.  Consider this example:

        /* See Z6.0+pooncelock+pooncelock+pombonce.litmus. */
        void CPU0(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                r0 = READ_ONCE(y);
                WRITE_ONCE(z, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                WRITE_ONCE(z, 2);
                smp_mb();
                r1 = READ_ONCE(x);
        }

Counter-intuitive though it might be, it is quite possible to have
the final value of r0 be 1, the final value of z be 2, and the final
value of r1 be 0.  The reason for this surprising outcome is that CPU2()
never acquired the lock, and thus did not fully benefit from the lock's
ordering properties.
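
In litmus-test form, the surprising outcome corresponds to an "exists"
clause along these lines (a sketch; see
Z6.0+pooncelock+pooncelock+pombonce.litmus for the actual test):

        exists (1:r0=1 /\ z=2 /\ 2:r1=0)

Without additional ordering, herd7 finds executions satisfying this
clause.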

Ordering can be extended to CPUs not holding the lock by careful use
of smp_mb__after_spinlock():

        /* See Z6.0+pooncelock+poonceLock+pombonce.litmus. */
        void CPU0(void)
        {
                spin_lock(&mylock);
                WRITE_ONCE(x, 1);
                WRITE_ONCE(y, 1);
                spin_unlock(&mylock);
        }

        void CPU1(void)
        {
                spin_lock(&mylock);
                smp_mb__after_spinlock();
                r0 = READ_ONCE(y);
                WRITE_ONCE(z, 1);
                spin_unlock(&mylock);
        }

        void CPU2(void)
        {
                WRITE_ONCE(z, 2);
                smp_mb();
                r1 = READ_ONCE(x);
        }

This addition of smp_mb__after_spinlock() strengthens the lock
acquisition sufficiently to rule out the counter-intuitive outcome.
In other words, the addition of the smp_mb__after_spinlock() prohibits
the counter-intuitive result where the final value of r0 is 1, the final
value of z is 2, and the final value of r1 is 0.


No Roach-Motel Locking!
-----------------------

This example requires familiarity with the herd7 "filter" clause, so
please read up on that topic in litmus-tests.txt.

It is tempting to allow memory-reference instructions to be pulled
into a critical section, but this cannot be allowed in the general case.
For example, consider a spin loop preceding a lock-based critical section.
Now, herd7 does not model spin loops, but we can emulate one with two
loads, with a "filter" clause to constrain the first to return the
initial value and the second to return the updated value, as shown below:

        /* See Documentation/litmus-tests/locking/RM-fixed.litmus. */
        void CPU0(void)
        {
                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                WRITE_ONCE(x, 1);
                spin_unlock(&lck);
        }

        void CPU1(void)
        {
                r0 = READ_ONCE(x);
                r1 = READ_ONCE(x);
                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                spin_unlock(&lck);
        }

        filter (1:r0=0 /\ 1:r1=1)
        exists (1:r2=1)

The variable "x" is the control variable for the emulated spin loop.
CPU0() sets it to "1" while holding the lock, and CPU1() emulates the
spin loop by reading it twice, first into "1:r0" (which should get the
initial value "0") and then into "1:r1" (which should get the updated
value "1").
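
In real code, the two loads stand in for a spin loop along the
following lines (a sketch, not part of the litmus test):

        /* What CPU1()'s two READ_ONCE() calls emulate. */
        while (!READ_ONCE(x))
                cpu_relax();
        spin_lock(&lck);
        r2 = atomic_inc_return(&y);
        spin_unlock(&lck);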

The "filter" clause takes this into account, constraining "1:r0" to
equal "0" and "1:r1" to equal "1".

Then the "exists" clause checks to see if CPU1() acquired its lock first,
which should not happen given the filter clause because CPU0() updates
"x" while holding the lock.  And herd7 confirms this.

But suppose that the compiler were permitted to reorder the spin loop
into CPU1()'s critical section, like this:

        /* See Documentation/litmus-tests/locking/RM-broken.litmus. */
        void CPU0(void)
        {
                int r2;

                spin_lock(&lck);
                r2 = atomic_inc_return(&y);
                WRITE_ONCE(x, 1);
                spin_unlock(&lck);
        }

        void CPU1(void)
        {
                spin_lock(&lck);
                r0 = READ_ONCE(x);
                r1 = READ_ONCE(x);
                r2 = atomic_inc_return(&y);
                spin_unlock(&lck);
        }

        filter (1:r0=0 /\ 1:r1=1)
        exists (1:r2=1)

If "1:r0" is equal to "0", "1:r1" can never equal "1" because CPU0()
cannot update "x" while CPU1() holds the lock.  And herd7 confirms this,
showing zero executions matching the "filter" criteria.  In real code,
this reordering would be a deadlock: having acquired the lock first,
CPU1() would spin forever waiting for an update to "x" that CPU0()
cannot make until CPU1() releases the lock.

And this is why Linux-kernel lock and unlock primitives must prevent
code from entering critical sections.  It is not sufficient to only
prevent code from leaving them.