===========================================================================
Proper Locking Under a Preemptible Kernel: Keeping Kernel Code Preempt-Safe
===========================================================================

:Author: Robert Love <rml@tech9.net>


Introduction
============


A preemptible kernel creates new locking issues.  The issues are the same as
those under SMP: concurrency and reentrancy.  Thankfully, the Linux preemptible
kernel model leverages existing SMP locking mechanisms.  Thus, the kernel
requires explicit additional locking for very few additional situations.

This document is for all kernel hackers.  Developing code in the kernel
requires protecting these situations.


RULE #1: Per-CPU data structures need explicit protection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Two similar problems arise. An example code snippet::

	struct this_needs_locking tux[NR_CPUS];
	tux[smp_processor_id()] = some_value;
	/* task is preempted here... */
	something = tux[smp_processor_id()];

First, since the data is per-CPU, it may not have explicit SMP locking, but
may require it otherwise.  Second, when a preempted task is finally rescheduled,
the previous value of smp_processor_id() may not equal the current one.  You
must protect these situations by disabling preemption around them.

You can also use get_cpu() and put_cpu(): get_cpu() disables preemption and
returns the current CPU number, and put_cpu() re-enables preemption.


RULE #2: CPU state must be protected.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Under preemption, the state of the CPU must be protected.  This is
arch-dependent, but includes CPU structures and state not preserved over a
context switch.  For example, on x86, entering and exiting FPU mode is now a
critical section that must occur while preemption is disabled.  Think what
would happen if the kernel is executing a floating-point instruction and is
then preempted.  Remember, the kernel does not save FPU state except for user
tasks.  Therefore, upon preemption, the FPU registers will be sold to the
lowest bidder.  Thus, preemption must be disabled around such regions.

Note, some FPU functions are already explicitly preempt safe.  For example,
kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.


RULE #3: Lock acquire and release must be performed by same task
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


A lock acquired in one task must be released by the same task.  This
means you can't do oddball things like acquire a lock and go off to
play while another task releases it.  If you want to do something
like this, acquire and release the lock in the same code path and
have the caller wait on an event by the other task.


Solution
========


Data protection under preemption is achieved by disabling preemption for the
duration of the critical region.

::

  preempt_enable()              decrement the preempt counter
  preempt_disable()             increment the preempt counter
  preempt_enable_no_resched()   decrement, but do not immediately preempt
  preempt_check_resched()       if needed, reschedule
  preempt_count()               return the preempt counter

The functions are nestable.  In other words, you can call preempt_disable
n-times in a code path, and preemption will not be reenabled until the n-th
call to preempt_enable.  The preempt statements define to nothing if
preemption is not enabled.

Note that you do not need to explicitly prevent preemption if you are holding
any locks or if interrupts are disabled, since preemption is implicitly
disabled in those cases.

But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
disabling preemption - any cond_resched() or cond_resched_lock() might trigger
a reschedule if the preempt count is 0. A simple printk() might trigger a
reschedule. So use this implicit preemption-disabling property only if you
know that the affected codepath does not do any of this. Best policy is to use
this only for small, atomic code that you wrote and which calls no complex
functions.
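
To make the nesting of the preempt counter concrete, here is a small userspace
model.  It is only a sketch: every ``model_*`` name is hypothetical and merely
stands in for the kernel function of similar name, and "rescheduling" is
reduced to bumping a counter.  It assumes a reschedule request arrives while
two levels of preempt_disable are active.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: every model_* name is hypothetical, not kernel API. */
static int model_count;          /* stands in for the preempt counter */
static bool model_need_resched;  /* true while a reschedule is pending */
static int model_reschedules;    /* how many times we "rescheduled" */

static void model_preempt_disable(void)
{
	model_count++;
}

static void model_preempt_check_resched(void)
{
	/* Reschedule only if one is pending and the counter is back to 0. */
	if (model_need_resched && model_count == 0) {
		model_need_resched = false;
		model_reschedules++;
	}
}

static void model_preempt_enable(void)
{
	assert(model_count > 0);  /* every enable must pair with a disable */
	model_count--;
	model_preempt_check_resched();
}

/*
 * Nest two disables, let a reschedule request arrive, then unwind.
 * Reports the reschedule count seen after each of the two enables.
 */
static void model_nested_demo(int *after_first, int *after_second)
{
	model_count = 0;
	model_need_resched = false;
	model_reschedules = 0;

	model_preempt_disable();
	model_preempt_disable();
	model_need_resched = true;  /* e.g. a higher-priority task woke up */

	model_preempt_enable();     /* counter is 1: still not preemptible */
	*after_first = model_reschedules;

	model_preempt_enable();     /* counter is 0: the pending request runs */
	*after_second = model_reschedules;
}
```

In this model the first model_preempt_enable leaves the request pending, and
only the second, matching one acts on it, which is the n-th call behaviour
described above.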

Example::

	cpucache_t *cc; /* this is per-CPU */
	preempt_disable();
	cc = cc_data(searchp);
	if (cc && cc->avail) {
		__free_block(searchp, cc_entry(cc), cc->avail);
		cc->avail = 0;
	}
	preempt_enable();
	return 0;

Notice how the preemption statements must encompass every reference of the
critical variables.  Another example::

	int buf[NR_CPUS];
	set_cpu_val(buf);
	if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
	spin_lock(&buf_lock);
	/* ... */

This code is not preempt-safe, but see how easily we can fix it by simply
moving the spin_lock up two lines.


Preventing preemption using interrupt disabling
===============================================


It is possible to prevent a preemption event using local_irq_disable and
local_irq_save.  Note, when doing so, you must be very careful to not cause
an event that would set need_resched and result in a preemption check.  When
in doubt, rely on locking or explicit preemption disabling.

Note in 2.5 interrupt disabling is now only per-CPU (i.e. local).

An additional concern is proper usage of local_irq_disable and local_irq_save.
These may be used to protect from preemption, however, on exit, if preemption
may be enabled, a test to see if preemption is required should be done.  If
these are called from the spin_lock and read/write lock macros, the right thing
is done.  They may also be called within a spin-lock protected region, however,
if they are ever called outside of this context, a test for preemption should
be made. Do note that calls from interrupt context or bottom halves/tasklets
are also protected by preemption locks and so may use the versions which do
not check preemption.
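
The exit test described above can be illustrated with another userspace
model.  Again this is only a sketch under stated assumptions: the ``sim_*``
names are hypothetical stand-ins, not kernel API, and the model ignores every
later preemption point at which a real kernel would normally notice a pending
request.  It shows only that re-enabling interrupts does not, by itself, act
on a reschedule request raised while they were off.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: every sim_* name is hypothetical, not kernel API. */
static bool sim_irqs_off;      /* true while "interrupts" are disabled */
static int sim_preempt_count;  /* stands in for the preempt counter */
static bool sim_need_resched;  /* true while a reschedule is pending */
static int sim_reschedules;    /* how many times we "rescheduled" */

static void sim_local_irq_disable(void)
{
	sim_irqs_off = true;
}

static void sim_local_irq_enable(void)
{
	assert(sim_irqs_off);  /* enable must pair with a disable */
	sim_irqs_off = false;  /* note: no preemption test happens here */
}

/* A wakeup only marks a reschedule as pending; it does not preempt. */
static void sim_wakeup(void)
{
	sim_need_resched = true;
}

/* The explicit test: act on a pending request if preemption is allowed. */
static void sim_preempt_check_resched(void)
{
	if (sim_need_resched && sim_preempt_count == 0 && !sim_irqs_off) {
		sim_need_resched = false;
		sim_reschedules++;
	}
}

/*
 * Run an irq-disabled region during which a wakeup arrives.
 * Returns true if the reschedule is still pending on exit.
 */
static bool sim_irq_region(bool check_on_exit)
{
	sim_irqs_off = false;
	sim_preempt_count = 0;
	sim_need_resched = false;

	sim_local_irq_disable();
	sim_wakeup();
	sim_local_irq_enable();

	if (check_on_exit)
		sim_preempt_check_resched();
	return sim_need_resched;
}
```

Calling sim_irq_region(false) leaves the request pending, while
sim_irq_region(true) services it immediately on exit.  In the model, that is
the difference between the versions that check preemption and those that do
not.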