1 Linux-Kernel Memory Model Litmus Tests 2 ====================================== 3 4 This file describes the LKMM litmus-test forma 5 some tricks and traps, and finally outlines LK 6 versions of this material appeared in a number 7 8 https://lwn.net/Articles/720550/ 9 A formal kernel memory-ordering model 10 https://lwn.net/Articles/608550/ 11 Axiomatic validation of memory barrier 12 https://lwn.net/Articles/470681/ 13 Validating Memory Barriers and Atomic 14 15 This document presents information in decreasi 16 so that, where possible, the information that 17 useful is shown near the beginning. 18 19 For information on installing LKMM, including 20 tool, please see tools/memory-model/README. 21 22 23 Copy-Pasta 24 ========== 25 26 As with other software, it is often better (if 27 existing litmus test than it is to create one 28 of litmus tests may be found in the kernel sou 29 30 tools/memory-model/litmus-tests/ 31 Documentation/litmus-tests/ 32 33 Several thousand more example litmus tests are 34 and kernel.org: 35 36 https://github.com/paulmckrcu/litmus 37 https://git.kernel.org/pub/scm/linux/k 38 https://git.kernel.org/pub/scm/linux/k 39 40 The -l and -L arguments to "git grep" can be q 41 existing litmus tests that are similar to the 42 you start with an existing litmus test, it is 43 good understanding of the litmus-test format. 44 45 46 Examples and Format 47 =================== 48 49 This section describes the overall format of l 50 with a small example of the message-passing pa 51 more complex examples that illustrate explicit 52 minimalistic set of flow-control statements. 53 54 55 Message-Passing Example 56 ----------------------- 57 58 This section gives an overview of the format o 59 example based on the common message-passing us 60 appears often in the Linux kernel. For exampl 61 below) indicates that a buffer (modeled by "x" 62 filled in and ready for use. It would be very 63 flag set, but, due to memory misordering, saw 64 65 This example asks whether smp_store_release() 66 suffices to avoid this bad outcome: 67 68 1 C MP+pooncerelease+poacquireonce 69 2 70 3 {} 71 4 72 5 P0(int *x, int *y) 73 6 { 74 7 WRITE_ONCE(*x, 1); 75 8 smp_store_release(y, 1); 76 9 } 77 10 78 11 P1(int *x, int *y) 79 12 { 80 13 int r0; 81 14 int r1; 82 15 83 16 r0 = smp_load_acquire(y); 84 17 r1 = READ_ONCE(*x); 85 18 } 86 19 87 20 exists (1:r0=1 /\ 1:r1=0) 88 89 Line 1 starts with "C", which identifies this 90 LKMM C-language format (which, as we will see, 91 of the full C language). The remainder of lin 92 the test, which by convention is the filename 93 suffix stripped. In this case, the actual tes 94 tools/memory-model/litmus-tests/MP+pooncerelea 95 in the Linux-kernel source tree. 96 97 Mechanically generated litmus tests will often 98 double-quoted comment string on the second lin 99 when running the test. Yes, you can add your 100 tests, but this is a bit involved due to the u 101 For now, you can use C-language comments in th 102 may be in either the "/* */" or the "//" style 103 cover the full litmus-test commenting story. 104 105 Line 3 is the initialization section. Because 106 to zero suffices for this test, the "{}" synta 107 initialization section is empty. Litmus tests 108 initialization must have non-empty initializat 109 example that will be presented later in this d 110 111 Lines 5-9 show the first process and lines 11- 112 process corresponds to a Linux-kernel task (or 113 and so on; LKMM discussions often use these te 114 The name of the first process is "P0" and that 115 You can name your processes anything you like 116 of a single "P" followed by a number, and as l 117 consecutive starting with zero. This can actu 118 for example, a .litmus file matching "^P1(" bu 119 must contain a two-process litmus test. 120 121 The argument list for each function are pointe 122 used by that function. Unlike normal C-langua 123 names are significant. The fact that both P0( 124 parameter named "x" means that these two proce 125 same global variable, also named "x". So the 126 and P1() mean that both processes are working 127 variables, "x" and "y". Global variables are 128 by reference, hence "P0(int *x, int *y)", but 129 130 P0() has no local variables, but P1() has two 131 These names may be freely chosen, but for hist 132 other litmus-test formats, it is conventional 133 "r" followed by a number as shown here. A com 134 is forgetting to add a global variable to a pr 135 This will sometimes result in an error message 136 intended global to instead be silently treated 137 variable. 138 139 Each process's code is similar to Linux-kernel 140 7-8 and 13-17. This code may use many of the 141 operations, some of its exclusive-lock functio 142 and SRCU functions. An approximate list of th 143 functions may be found in the linux-kernel.def 144 145 The P0() process does "WRITE_ONCE(*x, 1)" on l 146 pointer in P0()'s parameter list, this does an 147 variable "x". Line 8 does "smp_store_release( 148 is also in P0()'s parameter list, this does a 149 variable "y". 150 151 The P1() process declares two local variables 152 Line 16 does "r0 = smp_load_acquire(y)" which 153 from global variable "y" into local variable " 154 "r1 = READ_ONCE(*x)", which does an unordered 155 variable "r1". Both "x" and "y" are in P1()'s 156 reference the same global variables that are u 157 158 Line 20 is the "exists" assertion expression t 159 This final state is evaluated after the dust h 160 have completed and all of their memory referen 161 have propagated to all parts of the system. T 162 variables "r0" and "r1" in line 24 must be pre 163 which process they are local to. 164 165 Note that the assertion expression is written 166 language rather than in C. For example, singl 167 operator rather than an assignment. The "/\" 168 "and". Similarly, "\/" stands for "or". Both 169 representations of the corresponding mathemati 170 "~" stands for "logical not", which is "!" in 171 with the C-language "~" operator which instead 172 Parentheses may be used to override precedence 173 174 The "exists" assertion on line 20 is satisfied 175 flag ("y") set but the buffer ("x") as not yet 176 loaded a value from "x" that was equal to 1 bu 177 that was still equal to zero. 178 179 This example can be checked by running the fol 180 absolutely must be run from the tools/memory-m 181 this directory only: 182 183 herd7 -conf linux-kernel.cfg litmus-tests/MP+p 184 185 The output is the result of something similar 186 search, and is as follows: 187 188 1 Test MP+pooncerelease+poacquireonce Allowed 189 2 States 3 190 3 1:r0=0; 1:r1=0; 191 4 1:r0=0; 1:r1=1; 192 5 1:r0=1; 1:r1=1; 193 6 No 194 7 Witnesses 195 8 Positive: 0 Negative: 3 196 9 Condition exists (1:r0=1 /\ 1:r1=0) 197 10 Observation MP+pooncerelease+poacquireonce 198 11 Time MP+pooncerelease+poacquireonce 0.00 199 12 Hash=579aaa14d8c35a39429b02e698241d09 200 201 The most pertinent line is line 10, which cont 202 indicates that the bad result flagged by the " 203 happens. This line might instead say "Sometim 204 bad result happened in some but not all execut 205 "Always" to indicate that the bad result happe 206 (The herd7 tool doesn't judge, so it is only a 207 "exists" clause indicates a bad result. To se 208 clause's condition and run the test.) The num 209 of this line indicate the number of end states 210 clause (0) and the number not not satisfying t 211 212 Another important part of this output is shown 213 214 2 States 3 215 3 1:r0=0; 1:r1=0; 216 4 1:r0=0; 1:r1=1; 217 5 1:r0=1; 1:r1=1; 218 219 Line 2 gives the total number of end states, a 220 one of these states, with the first ("1:r0=0; 221 both of P1()'s loads returned the value "0". 222 "Never" on line 10, the state flagged by the " 223 listed. This full list of states can be helpf 224 litmus test. 225 226 The rest of the output is not normally needed, 227 or due to being redundant with the lines discu 228 following paragraph lists them for the benefit 229 an insatiable curiosity. Other readers should 230 231 Line 1 echos the test name, along with the "Te 232 "No" says that the "exists" clause was not sat 233 and as such it has the same meaning as line 10 234 lead-in to line 8's "Positive: 0 Negative: 3", 235 of end states satisfying and not satisfying th 236 like the two numbers at the end of line 10. L 237 clause so that you don't have to look it up in 238 The number at the end of line 11 (which begins 239 time in seconds required to analyze the litmus 240 as this one complete in a few milliseconds, so 241 Line 12 gives a hash of the contents for the l 242 by tooling that manages litmus tests and their 243 used by people modifying LKMM itself, and amon 244 people know which of the several thousand rele 245 affected by a given change to LKMM. 246 247 248 Initialization 249 -------------- 250 251 The previous example relied on the default zer 252 "x" and "y", but a similar litmus test could i 253 to some other value: 254 255 1 C MP+pooncerelease+poacquireonce 256 2 257 3 { 258 4 x=42; 259 5 y=42; 260 6 } 261 7 262 8 P0(int *x, int *y) 263 9 { 264 10 WRITE_ONCE(*x, 1); 265 11 smp_store_release(y, 1); 266 12 } 267 13 268 14 P1(int *x, int *y) 269 15 { 270 16 int r0; 271 17 int r1; 272 18 273 19 r0 = smp_load_acquire(y); 274 20 r1 = READ_ONCE(*x); 275 21 } 276 22 277 23 exists (1:r0=1 /\ 1:r1=42) 278 279 Lines 3-6 now initialize both "x" and "y" to t 280 means that the "exists" clause on line 23 must 281 "1:r1=42". 282 283 Running the test gives the same overall result 284 value 42 appearing in place of the value zero: 285 286 1 Test MP+pooncerelease+poacquireonce Allowed 287 2 States 3 288 3 1:r0=1; 1:r1=1; 289 4 1:r0=42; 1:r1=1; 290 5 1:r0=42; 1:r1=42; 291 6 No 292 7 Witnesses 293 8 Positive: 0 Negative: 3 294 9 Condition exists (1:r0=1 /\ 1:r1=42) 295 10 Observation MP+pooncerelease+poacquireonce 296 11 Time MP+pooncerelease+poacquireonce 0.02 297 12 Hash=ab9a9b7940a75a792266be279a980156 298 299 It is tempting to avoid the open-coded repetit 300 by defining another global variable "initval=4 301 occurrences of "42" with "initval". This will 302 initialize "x" and "y" to 42, but instead to t 303 (try it!). See the section below on linked li 304 why this approach to initialization can be use 305 306 307 Control Structures 308 ------------------ 309 310 LKMM supports the C-language "if" statement, w 311 conditional branches. In LKMM, conditional br 312 but only if you are *very* careful (compilers 313 to optimize away conditional branches). The f 314 the "load buffering" (LB) use case that is use 315 synchronize between ring-buffer producers and 316 below, P0() is one side checking to see if an 317 P1() is the other side completing its update. 318 319 1 C LB+fencembonceonce+ctrlonceonce 320 2 321 3 {} 322 4 323 5 P0(int *x, int *y) 324 6 { 325 7 int r0; 326 8 327 9 r0 = READ_ONCE(*x); 328 10 if (r0) 329 11 WRITE_ONCE(*y, 1); 330 12 } 331 13 332 14 P1(int *x, int *y) 333 15 { 334 16 int r0; 335 17 336 18 r0 = READ_ONCE(*y); 337 19 smp_mb(); 338 20 WRITE_ONCE(*x, 1); 339 21 } 340 22 341 23 exists (0:r0=1 /\ 1:r0=1) 342 343 P1()'s "if" statement on line 10 works as expe 344 executed only if line 9 loads a non-zero value 345 write of "1" to "x" happens only after P1()'s 346 hope that the "exists" clause cannot be satisf 347 348 1 Test LB+fencembonceonce+ctrlonceonce Allowe 349 2 States 2 350 3 0:r0=0; 1:r0=0; 351 4 0:r0=1; 1:r0=0; 352 5 No 353 6 Witnesses 354 7 Positive: 0 Negative: 2 355 8 Condition exists (0:r0=1 /\ 1:r0=1) 356 9 Observation LB+fencembonceonce+ctrlonceonce 357 10 Time LB+fencembonceonce+ctrlonceonce 0.00 358 11 Hash=e5260556f6de495fd39b556d1b831c3b 359 360 However, there is no "while" statement due to 361 state-space search has some difficulty with it 362 are tricks that may be used to handle some spe 363 discussed below. In addition, loop-unrolling 364 albeit sparingly. 365 366 367 Tricks and Traps 368 ================ 369 370 This section covers extracting debug output fr 371 spin loops, handling trivial linked lists, add 372 emulating call_rcu(), and finally tricks to im 373 in order to better handle large litmus tests. 374 375 376 Debug Output 377 ------------ 378 379 By default, the herd7 state output includes al 380 in the "exists" clause. But sometimes debuggi 381 aided by the values of other variables. Consi 382 (tools/memory-order/litmus-tests/SB+rfionceonc 383 slightly modified), which probes an obscure co 384 ordering: 385 386 1 C SB+rfionceonce-poonceonces 387 2 388 3 {} 389 4 390 5 P0(int *x, int *y) 391 6 { 392 7 int r1; 393 8 int r2; 394 9 395 10 WRITE_ONCE(*x, 1); 396 11 r1 = READ_ONCE(*x); 397 12 r2 = READ_ONCE(*y); 398 13 } 399 14 400 15 P1(int *x, int *y) 401 16 { 402 17 int r3; 403 18 int r4; 404 19 405 20 WRITE_ONCE(*y, 1); 406 21 r3 = READ_ONCE(*y); 407 22 r4 = READ_ONCE(*x); 408 23 } 409 24 410 25 exists (0:r2=0 /\ 1:r4=0) 411 412 The herd7 output is as follows: 413 414 1 Test SB+rfionceonce-poonceonces Allowed 415 2 States 4 416 3 0:r2=0; 1:r4=0; 417 4 0:r2=0; 1:r4=1; 418 5 0:r2=1; 1:r4=0; 419 6 0:r2=1; 1:r4=1; 420 7 Ok 421 8 Witnesses 422 9 Positive: 1 Negative: 3 423 10 Condition exists (0:r2=0 /\ 1:r4=0) 424 11 Observation SB+rfionceonce-poonceonces Some 425 12 Time SB+rfionceonce-poonceonces 0.01 426 13 Hash=c7f30fe0faebb7d565405d55b7318ada 427 428 (This output indicates that CPUs are permitted 429 store buffers", which all of Linux's CPU famil 430 happily do. Such snooping results in disagree 431 order of stores from different CPUs, which is 432 433 But the herd7 output shows only the two variab 434 "exists" clause. Someone modifying this test 435 values of "x", "y", "0:r1", and "0:r3" as well 436 statement on line 25 shows how to cause herd7 437 variables: 438 439 1 C SB+rfionceonce-poonceonces 440 2 441 3 {} 442 4 443 5 P0(int *x, int *y) 444 6 { 445 7 int r1; 446 8 int r2; 447 9 448 10 WRITE_ONCE(*x, 1); 449 11 r1 = READ_ONCE(*x); 450 12 r2 = READ_ONCE(*y); 451 13 } 452 14 453 15 P1(int *x, int *y) 454 16 { 455 17 int r3; 456 18 int r4; 457 19 458 20 WRITE_ONCE(*y, 1); 459 21 r3 = READ_ONCE(*y); 460 22 r4 = READ_ONCE(*x); 461 23 } 462 24 463 25 locations [0:r1; 1:r3; x; y] 464 26 exists (0:r2=0 /\ 1:r4=0) 465 466 The herd7 output then displays the values of a 467 468 1 Test SB+rfionceonce-poonceonces Allowed 469 2 States 4 470 3 0:r1=1; 0:r2=0; 1:r3=1; 1:r4=0; x=1; y=1; 471 4 0:r1=1; 0:r2=0; 1:r3=1; 1:r4=1; x=1; y=1; 472 5 0:r1=1; 0:r2=1; 1:r3=1; 1:r4=0; x=1; y=1; 473 6 0:r1=1; 0:r2=1; 1:r3=1; 1:r4=1; x=1; y=1; 474 7 Ok 475 8 Witnesses 476 9 Positive: 1 Negative: 3 477 10 Condition exists (0:r2=0 /\ 1:r4=0) 478 11 Observation SB+rfionceonce-poonceonces Some 479 12 Time SB+rfionceonce-poonceonces 0.01 480 13 Hash=40de8418c4b395388f6501cafd1ed38d 481 482 What if you would like to know the value of a 483 at some particular point in a given process's 484 is to use a READ_ONCE() to load that global va 485 variable, then add that local variable to the 486 But be careful: In some litmus tests, adding 487 the outcome! For one example, please see the 488 C-READ_ONCE-omitted.litmus tests located here: 489 490 https://github.com/paulmckrcu/litmus/b 491 492 493 Spin Loops 494 ---------- 495 496 The analysis carried out by herd7 explores ful 497 at best of exponential time complexity. Addin 498 the amount of code in a give process can great 499 Potentially infinite loops, such as those used 500 become available, are clearly problematic. 501 502 Fortunately, it is possible to avoid state-spa 503 modeling such loops. For example, the followi 504 locking using xchg_acquire(), but instead of e 505 in a spin loop, it instead excludes executions 506 lock using a herd7 "filter" clause. Note that 507 are better off using the spin_lock() and spin_ 508 models, if for no other reason that these are 509 techniques illustrated in this section can be 510 such as emulating reader-writer locking, which 511 512 1 C C-SB+l-o-o-u+l-o-o-u-X 513 2 514 3 { 515 4 } 516 5 517 6 P0(int *sl, int *x0, int *x1) 518 7 { 519 8 int r2; 520 9 int r1; 521 10 522 11 r2 = xchg_acquire(sl, 1); 523 12 WRITE_ONCE(*x0, 1); 524 13 r1 = READ_ONCE(*x1); 525 14 smp_store_release(sl, 0); 526 15 } 527 16 528 17 P1(int *sl, int *x0, int *x1) 529 18 { 530 19 int r2; 531 20 int r1; 532 21 533 22 r2 = xchg_acquire(sl, 1); 534 23 WRITE_ONCE(*x1, 1); 535 24 r1 = READ_ONCE(*x0); 536 25 smp_store_release(sl, 0); 537 26 } 538 27 539 28 filter (0:r2=0 /\ 1:r2=0) 540 29 exists (0:r1=0 /\ 1:r1=0) 541 542 This litmus test may be found here: 543 544 https://git.kernel.org/pub/scm/linux/kernel/gi 545 546 This test uses two global variables, "x1" and 547 single global spinlock named "sl". This spinl 548 process changes the value of "sl" from "0" to 549 that process sets "sl" back to "0". P0()'s lo 550 on line 11 using xchg_acquire(), which uncondi 551 "1" to "sl" and stores either "0" or "1" to "r 552 the lock acquisition was successful or unsucce 553 having the value "1"), respectively. P1() ope 554 555 Rather unconventionally, execution appears to 556 section on lines 12 and 13 in either case. Li 557 smp_store_release() to store zero to "sl", thu 558 559 The case where xchg_acquire() fails to acquire 560 the "filter" clause on line 28, which tells he 561 executions in which both "0:r2" and "1:r2" are 562 attention only to those executions in which bo 563 acquired. Thus, the bogus executions that wou 564 sections are discarded and any effects that th 565 ignored. Note well that the "filter" clause k 566 for which its expression is satisfied, that is 567 evaluates to true. In other words, the "filte 568 keep, not what to discard. 569 570 The result of running this test is as follows: 571 572 1 Test C-SB+l-o-o-u+l-o-o-u-X Allowed 573 2 States 2 574 3 0:r1=0; 1:r1=1; 575 4 0:r1=1; 1:r1=0; 576 5 No 577 6 Witnesses 578 7 Positive: 0 Negative: 2 579 8 Condition exists (0:r1=0 /\ 1:r1=0) 580 9 Observation C-SB+l-o-o-u+l-o-o-u-X Never 0 581 10 Time C-SB+l-o-o-u+l-o-o-u-X 0.03 582 583 The "Never" on line 9 indicates that this use 584 smp_store_release() really does correctly emul 585 586 Why doesn't the litmus test take the simpler a 587 to handle failed spinlock acquisitions, like t 588 insight behind this litmus test is that spin l 589 possible "exists"-clause outcomes of program e 590 of deadlock. In other words, given a high-qua 591 primitive in a deadlock-free program running o 592 each lock acquisition will eventually succeed. 593 explores the full state space, the length of t 594 acquire the lock does not matter. After all, 595 possible durations of the xchg_acquire() state 596 597 Why not just add the "filter" clause to the "e 598 avoiding the "filter" clause entirely? This d 599 The reason that the "filter" clause is faster 600 herd7 knows to abandon an execution as soon as 601 fails to be satisfied. In contrast, the "exis 602 only at the end of time, thus requiring herd7 603 executions in which both critical sections pro 604 addition, some LKMM users like the separation 605 using the both the "filter" and "exists" claus 606 607 Readers lacking a pathological interest in odd 608 free to skip the remainder of this section. 609 610 But what if the litmus test were to temporaril 611 value? Wouldn't that cause herd7 to abandon t 612 due to an early mismatch of the "filter" claus 613 614 Why not just try it? Line 4 of the following 615 introduces a new global variable "x2" that is 616 of P1() reads that variable into "1:r2" to for 617 the "filter" clause. Line 24 does a known-tru 618 and static analysis that herd7 might do. Fina 619 on line 32 is updated to a condition that is a 620 of the test. 621 622 1 C C-SB+l-o-o-u+l-o-o-u-X 623 2 624 3 { 625 4 x2=1; 626 5 } 627 6 628 7 P0(int *sl, int *x0, int *x1) 629 8 { 630 9 int r2; 631 10 int r1; 632 11 633 12 r2 = xchg_acquire(sl, 1); 634 13 WRITE_ONCE(*x0, 1); 635 14 r1 = READ_ONCE(*x1); 636 15 smp_store_release(sl, 0); 637 16 } 638 17 639 18 P1(int *sl, int *x0, int *x1, int *x2) 640 19 { 641 20 int r2; 642 21 int r1; 643 22 644 23 r2 = READ_ONCE(*x2); 645 24 if (r2) 646 25 r2 = xchg_acquire(sl, 1); 647 26 WRITE_ONCE(*x1, 1); 648 27 r1 = READ_ONCE(*x0); 649 28 smp_store_release(sl, 0); 650 29 } 651 30 652 31 filter (0:r2=0 /\ 1:r2=0) 653 32 exists (x1=1) 654 655 If the "filter" clause were to check each vari 656 execution, running this litmus test would disp 657 all executions would be filtered out at line 2 658 is instead as follows: 659 660 1 Test C-SB+l-o-o-u+l-o-o-u-X Allowed 661 2 States 1 662 3 x1=1; 663 4 Ok 664 5 Witnesses 665 6 Positive: 2 Negative: 0 666 7 Condition exists (x1=1) 667 8 Observation C-SB+l-o-o-u+l-o-o-u-X Always 2 668 9 Time C-SB+l-o-o-u+l-o-o-u-X 0.04 669 10 Hash=080bc508da7f291e122c6de76c0088e3 670 671 Line 3 shows that there is one execution that 672 so the "filter" clause is evaluated only on th 673 the variables that it checks. In this case, t 674 disjunction, so it might be evaluated twice, o 675 assignment to "0:r2" and once at the final ass 676 677 678 Linked Lists 679 ------------ 680 681 LKMM can handle linked lists, but only linked 682 contains nothing except a pointer to the next 683 of course quite restrictive, but there is neve 684 can be done within these confines, as can be s 685 at tools/memory-model/litmus-tests/MP+onceassi 686 687 1 C MP+onceassign+derefonce 688 2 689 3 { 690 4 y=z; 691 5 z=0; 692 6 } 693 7 694 8 P0(int *x, int **y) 695 9 { 696 10 WRITE_ONCE(*x, 1); 697 11 rcu_assign_pointer(*y, x); 698 12 } 699 13 700 14 P1(int *x, int **y) 701 15 { 702 16 int *r0; 703 17 int r1; 704 18 705 19 rcu_read_lock(); 706 20 r0 = rcu_dereference(*y); 707 21 r1 = READ_ONCE(*r0); 708 22 rcu_read_unlock(); 709 23 } 710 24 711 25 exists (1:r0=x /\ 1:r1=0) 712 713 Line 4's "y=z" may seem odd, given that "z" ha 714 But "y=z" does not set the value of "y" to tha 715 sets the value of "y" to the *address* of "z". 716 create a simple linked list, with "y" pointing 717 NULL pointer. A much longer linked list could 718 and circular singly linked lists can also be c 719 720 The "exists" clause works the same way, with t 721 "r0" not to the value of "x", but again to its 722 "exists" clause therefore tests whether line 2 723 value stored by line 11, which is in fact what 724 725 P0()'s line 10 initializes "x" to the value 1 726 from "y", replacing "z". 727 728 P1()'s line 20 loads a pointer from "y", and l 729 pointer. The RCU read-side critical section s 730 for show in this example. Note that the addre 731 depends on (in this case, "is exactly the same 732 line 20. This is an example of what is called 733 This particular address dependency extends fro 734 load on line 21. Address dependencies provide 735 736 Running this test results in the following: 737 738 1 Test MP+onceassign+derefonce Allowed 739 2 States 2 740 3 1:r0=x; 1:r1=1; 741 4 1:r0=z; 1:r1=0; 742 5 No 743 6 Witnesses 744 7 Positive: 0 Negative: 2 745 8 Condition exists (1:r0=x /\ 1:r1=0) 746 9 Observation MP+onceassign+derefonce Never 0 747 10 Time MP+onceassign+derefonce 0.00 748 11 Hash=49ef7a741563570102448a256a0c8568 749 750 The only possible outcomes feature P1() loadin 751 (which contains zero) on the one hand and P1() 752 (which contains the value one) on the other. 753 because it says that RCU readers cannot see th 754 values when accessing a newly inserted list no 755 scenario is flagged by the "exists" clause, an 756 loaded a pointer to "x", but obtained the pre- 757 zero after dereferencing that pointer. 758 759 760 Comments 761 -------- 762 763 Different portions of a litmus test are proces 764 which has the charming effect of requiring dif 765 different portions of the litmus test. The C- 766 C-language comments (either "/* */" or "//"), 767 use Ocaml comments "(* *)". 768 769 The following litmus test illustrates the comm 770 to each syntactic unit of the test: 771 772 1 C MP+onceassign+derefonce (* A *) 773 2 774 3 (* B *) 775 4 776 5 { 777 6 y=z; (* C *) 778 7 z=0; 779 8 } // D 780 9 781 10 // E 782 11 783 12 P0(int *x, int **y) // F 784 13 { 785 14 WRITE_ONCE(*x, 1); // G 786 15 rcu_assign_pointer(*y, x); 787 16 } 788 17 789 18 // H 790 19 791 20 P1(int *x, int **y) 792 21 { 793 22 int *r0; 794 23 int r1; 795 24 796 25 rcu_read_lock(); 797 26 r0 = rcu_dereference(*y); 798 27 r1 = READ_ONCE(*r0); 799 28 rcu_read_unlock(); 800 29 } 801 30 802 31 // I 803 32 804 33 exists (* J *) (1:r0=x /\ (* K *) 1:r1=0) ( 805 806 In short, use C-language comments in the C cod 807 the rest of the litmus test. 808 809 On the other hand, if you prefer C-style comme 810 C preprocessor is your friend. 811 812 813 Asynchronous RCU Grace Periods 814 ------------------------------ 815 816 The following litmus test is derived from the 817 Documentation/litmus-tests/rcu/RCU+sync+free.l 818 emulate call_rcu(): 819 820 1 C RCU+sync+free 821 2 822 3 { 823 4 int x = 1; 824 5 int *y = &x; 825 6 int z = 1; 826 7 } 827 8 828 9 P0(int *x, int *z, int **y) 829 10 { 830 11 int *r0; 831 12 int r1; 832 13 833 14 rcu_read_lock(); 834 15 r0 = rcu_dereference(*y); 835 16 r1 = READ_ONCE(*r0); 836 17 rcu_read_unlock(); 837 18 } 838 19 839 20 P1(int *z, int **y, int *c) 840 21 { 841 22 rcu_assign_pointer(*y, z); 842 23 smp_store_release(*c, 1); // Emulate call 843 24 } 844 25 845 26 P2(int *x, int *z, int **y, int *c) 846 27 { 847 28 int r0; 848 29 849 30 r0 = smp_load_acquire(*c); // Note call_r 850 31 synchronize_rcu(); // Wait one grace peri 851 32 WRITE_ONCE(*x, 0); // Emulate the RCU cal 852 33 } 853 34 854 35 filter (2:r0=1) (* Reject too-early starts. 855 36 exists (0:r0=x /\ 0:r1=0) 856 857 Lines 4-6 initialize a linked list headed by " 858 "x". In addition, "z" is pre-initialized to p 859 will replace "x" with "z" in this list. 860 861 P0() on lines 9-18 enters an RCU read-side cri 862 list header "y" and dereferences it, leaving t 863 the node's value in "0:r1". 864 865 P1() on lines 20-24 updates the list header to 866 then emulates call_rcu() by doing a release st 867 868 P2() on lines 27-33 emulates the behind-the-sc 869 call_rcu(). Line 30 first does an acquire loa 870 waits for an RCU grace period to elapse, and f 871 the RCU callback, which in turn emulates a cal 872 873 Of course, it is possible for P2() to start to 874 value of "2:r0" is zero rather than the requir 875 The "filter" clause on line 35 handles this po 876 all executions in which "2:r0" is not equal to 877 878 879 Performance 880 ----------- 881 882 LKMM's exploration of the full state-space can 883 but it does not come for free. The price is e 884 complexity in terms of the number of processes 885 of statements in each process, and the total n 886 litmus test. 887 888 So it is best to start small and then work up. 889 your code down into small pieces each represen 890 requirement. 891 892 That said, herd7 is quite fast. On an unprepo 893 was able to analyze the following 10-process R 894 six seconds. 895 896 https://github.com/paulmckrcu/litmus/blob/mast 897 898 One way to make herd7 run faster is to use the 899 This option prevents herd7 from generating all 900 instead causing it to focus solely on whether 901 clause can be satisfied. With this option, he 902 litmus test in about 300 milliseconds, for mor 903 improvement in performance. 904 905 Larger 16-process litmus tests that would norm 906 of time complete in about 40 seconds with this 907 you do get an extra 65,535 states when you lea 908 true" option. 909 910 https://github.com/paulmckrcu/litmus/blob/mast 911 912 Nevertheless, litmus-test analysis really is o 913 whether with or without "-speedcheck true". I 914 processes to a 19-process litmus test requires 915 without, and about 8 minutes with "-speedcheck 916 results represent roughly an order of magnitud 917 16-process litmus test. Again, to be fair, th 918 no fewer than 524,287 additional states compar 919 920 https://github.com/paulmckrcu/litmus/blob/mast 921 922 If you don't like command-line arguments, you 923 by adding a "filter" clause with exactly the s 924 "exists" clause. 925 926 However, please note that seeing the full set 927 helpful when developing and debugging litmus t 928 929 930 LIMITATIONS 931 =========== 932 933 Limitations of the Linux-kernel memory model ( 934 935 1. Compiler optimizations are not accurat 936 the use of READ_ONCE() and WRITE_ONCE( 937 ability to optimize, but under some ci 938 for the compiler to undermine the memo 939 information, see Documentation/explana 940 the "THE PROGRAM ORDER RELATION: po AN 941 sections). 942 943 Note that this limitation in turn limi 944 accurately model address, control, and 945 For example, if the compiler can deduc 946 carrying a dependency, then the compil 947 by substituting a constant of that val 948 949 Conversely, LKMM will sometimes overes 950 reordering compilers and CPUs can carr 951 some pretty obvious cases of ordering. 952 953 r1 = READ_ONCE(x); 954 if (r1 == 0) 955 smp_mb(); 956 WRITE_ONCE(y, 1); 957 958 The WRITE_ONCE() does not depend on th 959 result, LKMM does not claim ordering. 960 dependency is present, the WRITE_ONCE( 961 the READ_ONCE(). There are two reason 962 963 The presence of the smp_mb() i 964 prevents the compiler from mov 965 up before the "if" statement, 966 to assume that r1 will sometim 967 comment below); 968 969 CPUs do not execute stores bef 970 branches, even in cases where 971 two arms of the branch have re 972 973 It is clear that it is not dangerous i 974 make weaker guarantees than architectu 975 desirable, as it gives compilers room 976 For instance, suppose that a 0 value i 977 behavior elsewhere. Then a clever com 978 can never be 0 in the if condition. A 979 compiler might deem it safe to optimiz 980 eliminating the branch and any orderin 981 guarantee otherwise. 982 983 2. Multiple access sizes for a single var 984 and neither are misaligned or partiall 985 986 3. Exceptions and interrupts are not mode 987 this limitation can be overcome by mod 988 exception with an additional process. 989 990 4. I/O such as MMIO or DMA is not support 991 992 5. Self-modifying code (such as that foun 993 alternatives mechanism, function trace 994 JIT compiler, and module loader) is no 995 996 6. Complete modeling of all variants of a 997 operations, locking primitives, and RC 998 For example, call_rcu() and rcu_barrie 999 However, a substantial amount of suppo 1000 operations, as shown in the linux-ker 1001 1002 Here are specific limitations: 1003 1004 a. When rcu_assign_pointer() is 1005 kernel provides no ordering, 1006 case as a store release. 1007 1008 b. The "unless" RMW operations a 1009 atomic_long_add_unless(), ato 1010 and atomic_dec_unless_positiv 1011 in litmus tests, for example, 1012 1013 One exception of this limitat 1014 which is provided directly by 1015 definition in linux-kernel.de 1016 modeled by herd7 therefore it 1017 1018 c. The call_rcu() function is no 1019 it can be emulated in litmus 1020 process that invokes synchron 1021 callback function, with (for 1022 from the site of the emulated 1023 of the additional process. 1024 1025 d. The rcu_barrier() function is 1026 emulated in litmus tests emul 1027 (for example) a release-acqui 1028 additional call_rcu() process 1029 emulated rcu-barrier(). 1030 1031 e. Reader-writer locking is not 1032 emulated in litmus tests usin 1033 operations. 1034 1035 The fragment of the C language supported by t 1036 limited and in some ways non-standard: 1037 1038 1. There is no automatic C-preprocessor 1039 run it manually, if you choose. 1040 1041 2. There is no way to create functions o 1042 that model the concurrent processes. 1043 1044 3. The Pn() functions' formal parameters 1045 global shared variables. Nothing can 1046 these functions. 1047 1048 4. The only functions that can be invoke 1049 into herd7 or that are defined in the 1050 1051 5. The "switch", "do", "for", "while", a 1052 not supported. The "switch" statemen 1053 "if" statement. The "do", "for", and 1054 often be emulated by manually unrolli 1055 enlisting the aid of the C preprocess 1056 code duplication. Some uses of "goto 1057 and some others by unrolling. 1058 1059 6. Although you can use a wide variety o 1060 variable declarations, and especially 1061 declarations, the "herd7" tool unders 1062 pointer types. There is no support f 1063 enumerations, characters, strings, ar 1064 1065 7. Parsing of variable declarations is v 1066 type checking. 1067 1068 8. Initializers differ from their C-lang 1069 For example, when an initializer cont 1070 variable, that name denotes a pointer 1071 the current value of that variable. 1072 is interpreted the way "int x = &y" w 1073 1074 9. Dynamic memory allocation is not supp 1075 be worked around in some cases by sup 1076 allocated variables. 1077 1078 Some of these limitations may be overcome in 1079 more likely to be addressed by incorporating 1080 into other tools. 1081 1082 Finally, please note that LKMM is subject to 1083 and compilers evolve.
Linux® is a registered trademark of Linus Torvalds in the United States and other countries.
TOMOYO® is a registered trademark of NTT DATA CORPORATION.