1 ============== 1 ============== 2 OSNOISE Tracer 2 OSNOISE Tracer 3 ============== 3 ============== 4 4 5 In the context of high-performance computing ( 5 In the context of high-performance computing (HPC), the Operating System 6 Noise (*osnoise*) refers to the interference e 6 Noise (*osnoise*) refers to the interference experienced by an application 7 due to activities inside the operating system. 7 due to activities inside the operating system. In the context of Linux, 8 NMIs, IRQs, SoftIRQs, and any other system thr 8 NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the 9 system. Moreover, hardware-related jobs can al 9 system. Moreover, hardware-related jobs can also cause noise, for example, 10 via SMIs. 10 via SMIs. 11 11 12 hwlat_detector is one of the tools used to ide 12 hwlat_detector is one of the tools used to identify the most complex 13 source of noise: *hardware noise*. 13 source of noise: *hardware noise*. 14 14 15 In a nutshell, the hwlat_detector creates a th 15 In a nutshell, the hwlat_detector creates a thread that runs 16 periodically for a given period. At the beginn 16 periodically for a given period. At the beginning of a period, the thread 17 disables interrupt and starts sampling. While 17 disables interrupt and starts sampling. While running, the hwlatd 18 thread reads the time in a loop. As interrupts 18 thread reads the time in a loop. As interrupts are disabled, threads, 19 IRQs, and SoftIRQs cannot interfere with the h 19 IRQs, and SoftIRQs cannot interfere with the hwlatd thread. Hence, the 20 cause of any gap between two different reads o 20 cause of any gap between two different reads of the time roots either on 21 NMI or in the hardware itself. At the end of t 21 NMI or in the hardware itself. At the end of the period, hwlatd enables 22 interrupts and reports the max observed gap be 22 interrupts and reports the max observed gap between the reads. It also 23 prints a NMI occurrence counter. If the output 23 prints a NMI occurrence counter. If the output does not report NMI 24 executions, the user can conclude that the har 24 executions, the user can conclude that the hardware is the culprit for 25 the latency. The hwlat detects the NMI executi 25 the latency. The hwlat detects the NMI execution by observing 26 the entry and exit of a NMI. 26 the entry and exit of a NMI. 27 27 28 The osnoise tracer leverages the hwlat_detecto 28 The osnoise tracer leverages the hwlat_detector by running a 29 similar loop with preemption, SoftIRQs and IRQ 29 similar loop with preemption, SoftIRQs and IRQs enabled, thus allowing 30 all the sources of *osnoise* during its execut 30 all the sources of *osnoise* during its execution. Using the same approach 31 of hwlat, osnoise takes note of the entry and 31 of hwlat, osnoise takes note of the entry and exit point of any 32 source of interferences, increasing a per-cpu 32 source of interferences, increasing a per-cpu interference counter. The 33 osnoise tracer also saves an interference coun 33 osnoise tracer also saves an interference counter for each source of 34 interference. The interference counter for NMI 34 interference. The interference counter for NMI, IRQs, SoftIRQs, and 35 threads is increased anytime the tool observes 35 threads is increased anytime the tool observes these interferences' entry 36 events. When a noise happens without any inter 36 events. When a noise happens without any interference from the operating 37 system level, the hardware noise counter incre 37 system level, the hardware noise counter increases, pointing to a 38 hardware-related noise. In this way, osnoise c 38 hardware-related noise. In this way, osnoise can account for any 39 source of interference. At the end of the peri 39 source of interference. At the end of the period, the osnoise tracer 40 prints the sum of all noise, the max single no 40 prints the sum of all noise, the max single noise, the percentage of CPU 41 available for the thread, and the counters for 41 available for the thread, and the counters for the noise sources. 42 42 43 Usage 43 Usage 44 ----- 44 ----- 45 45 46 Write the ASCII text "osnoise" into the curren 46 Write the ASCII text "osnoise" into the current_tracer file of the 47 tracing system (generally mounted at /sys/kern 47 tracing system (generally mounted at /sys/kernel/tracing). 48 48 49 For example:: 49 For example:: 50 50 51 [root@f32 ~]# cd /sys/kernel/tracing/ 51 [root@f32 ~]# cd /sys/kernel/tracing/ 52 [root@f32 tracing]# echo osnoise > cur 52 [root@f32 tracing]# echo osnoise > current_tracer 53 53 54 It is possible to follow the trace by reading 54 It is possible to follow the trace by reading the trace file:: 55 55 56 [root@f32 tracing]# cat trace 56 [root@f32 tracing]# cat trace 57 # tracer: osnoise 57 # tracer: osnoise 58 # 58 # 59 # _---- 59 # _-----=> irqs-off 60 # / _--- 60 # / _----=> need-resched 61 # | / _-- 61 # | / _---=> hardirq/softirq 62 # || / _- 62 # || / _--=> preempt-depth MAX 63 # || / 63 # || / SINGLE Interference counters: 64 # |||| 64 # |||| RUNTIME NOISE % OF CPU NOISE +-----------------------------+ 65 # TASK-PID CPU# |||| 65 # TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD 66 # | | | |||| 66 # | | | |||| | | | | | | | | | | 67 <...>-859 [000] .... 67 <...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1 68 <...>-860 [001] .... 68 <...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3 69 <...>-861 [002] .... 69 <...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21 70 <...>-862 [003] .... 70 <...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0 71 <...>-863 [004] .... 71 <...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41 72 <...>-864 [005] .... 72 <...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2 73 <...>-865 [006] .... 73 <...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1 74 <...>-866 [007] .... 74 <...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19 75 75 76 In addition to the regular trace fields (from 76 In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the 77 tracer prints a message at the end of each per 77 tracer prints a message at the end of each period for each CPU that is 78 running an osnoise/ thread. The osnoise specif 78 running an osnoise/ thread. The osnoise specific fields report: 79 79 80 - The RUNTIME IN US reports the amount of tim 80 - The RUNTIME IN US reports the amount of time in microseconds that 81 the osnoise thread kept looping reading the 81 the osnoise thread kept looping reading the time. 82 - The NOISE IN US reports the sum of noise in 82 - The NOISE IN US reports the sum of noise in microseconds observed 83 by the osnoise tracer during the associated 83 by the osnoise tracer during the associated runtime. 84 - The % OF CPU AVAILABLE reports the percenta 84 - The % OF CPU AVAILABLE reports the percentage of CPU available for 85 the osnoise thread during the runtime windo 85 the osnoise thread during the runtime window. 86 - The MAX SINGLE NOISE IN US reports the maxi 86 - The MAX SINGLE NOISE IN US reports the maximum single noise observed 87 during the runtime window. 87 during the runtime window. 88 - The Interference counters display how many 88 - The Interference counters display how many each of the respective 89 interference happened during the runtime wi 89 interference happened during the runtime window. 90 90 91 Note that the example above shows a high numbe 91 Note that the example above shows a high number of HW noise samples. 92 The reason being is that this sample was taken 92 The reason being is that this sample was taken on a virtual machine, 93 and the host interference is detected as a har 93 and the host interference is detected as a hardware interference. 94 94 95 Tracer Configuration 95 Tracer Configuration 96 -------------------- 96 -------------------- 97 97 98 The tracer has a set of options inside the osn 98 The tracer has a set of options inside the osnoise directory, they are: 99 99 100 - osnoise/cpus: CPUs at which a osnoise threa 100 - osnoise/cpus: CPUs at which a osnoise thread will execute. 101 - osnoise/period_us: the period of the osnois 101 - osnoise/period_us: the period of the osnoise thread. 102 - osnoise/runtime_us: how long an osnoise thr 102 - osnoise/runtime_us: how long an osnoise thread will look for noise. 103 - osnoise/stop_tracing_us: stop the system tr 103 - osnoise/stop_tracing_us: stop the system tracing if a single noise 104 higher than the configured value happens. W 104 higher than the configured value happens. Writing 0 disables this 105 option. 105 option. 106 - osnoise/stop_tracing_total_us: stop the sys 106 - osnoise/stop_tracing_total_us: stop the system tracing if total noise 107 higher than the configured value happens. W 107 higher than the configured value happens. Writing 0 disables this 108 option. 108 option. 109 - tracing_threshold: the minimum delta betwee 109 - tracing_threshold: the minimum delta between two time() reads to be 110 considered as noise, in us. When set to 0, 110 considered as noise, in us. When set to 0, the default value will 111 be used, which is currently 1 us. !! 111 be used, which is currently 5 us. 112 - osnoise/options: a set of on/off options th 112 - osnoise/options: a set of on/off options that can be enabled by 113 writing the option name to the file or disa 113 writing the option name to the file or disabled by writing the option 114 name preceded with the 'NO\_' prefix. For e 114 name preceded with the 'NO\_' prefix. For example, writing 115 NO_OSNOISE_WORKLOAD disables the OSNOISE_WO 115 NO_OSNOISE_WORKLOAD disables the OSNOISE_WORKLOAD option. The 116 special DEAFAULTS option resets all options 116 special DEAFAULTS option resets all options to the default value. 117 117 118 Tracer Options 118 Tracer Options 119 -------------- 119 -------------- 120 120 121 The osnoise/options file exposes a set of on/o 121 The osnoise/options file exposes a set of on/off configuration options for 122 the osnoise tracer. These options are: 122 the osnoise tracer. These options are: 123 123 124 - DEFAULTS: reset the options to the default 124 - DEFAULTS: reset the options to the default value. 125 - OSNOISE_WORKLOAD: do not dispatch osnoise w 125 - OSNOISE_WORKLOAD: do not dispatch osnoise workload (see dedicated 126 section below). 126 section below). 127 - PANIC_ON_STOP: call panic() if the tracer s 127 - PANIC_ON_STOP: call panic() if the tracer stops. This option serves to 128 capture a vmcore. 128 capture a vmcore. 129 - OSNOISE_PREEMPT_DISABLE: disable preemption 129 - OSNOISE_PREEMPT_DISABLE: disable preemption while running the osnoise 130 workload, allowing only IRQ and hardware-re 130 workload, allowing only IRQ and hardware-related noise. 131 - OSNOISE_IRQ_DISABLE: disable IRQs while run 131 - OSNOISE_IRQ_DISABLE: disable IRQs while running the osnoise workload, 132 allowing only NMIs and hardware-related noi 132 allowing only NMIs and hardware-related noise, like hwlat tracer. 133 133 134 Additional Tracing 134 Additional Tracing 135 ------------------ 135 ------------------ 136 136 137 In addition to the tracer, a set of tracepoint 137 In addition to the tracer, a set of tracepoints were added to 138 facilitate the identification of the osnoise s 138 facilitate the identification of the osnoise source. 139 139 140 - osnoise:sample_threshold: printed anytime a 140 - osnoise:sample_threshold: printed anytime a noise is higher than 141 the configurable tolerance_ns. 141 the configurable tolerance_ns. 142 - osnoise:nmi_noise: noise from NMI, includin 142 - osnoise:nmi_noise: noise from NMI, including the duration. 143 - osnoise:irq_noise: noise from an IRQ, inclu 143 - osnoise:irq_noise: noise from an IRQ, including the duration. 144 - osnoise:softirq_noise: noise from a SoftIRQ 144 - osnoise:softirq_noise: noise from a SoftIRQ, including the 145 duration. 145 duration. 146 - osnoise:thread_noise: noise from a thread, 146 - osnoise:thread_noise: noise from a thread, including the duration. 147 147 148 Note that all the values are *net values*. For 148 Note that all the values are *net values*. For example, if while osnoise 149 is running, another thread preempts the osnois 149 is running, another thread preempts the osnoise thread, it will start a 150 thread_noise duration at the start. Then, an I 150 thread_noise duration at the start. Then, an IRQ takes place, preempting 151 the thread_noise, starting a irq_noise. When t 151 the thread_noise, starting a irq_noise. When the IRQ ends its execution, 152 it will compute its duration, and this duratio 152 it will compute its duration, and this duration will be subtracted from 153 the thread_noise, in such a way as to avoid th 153 the thread_noise, in such a way as to avoid the double accounting of the 154 IRQ execution. This logic is valid for all sou 154 IRQ execution. This logic is valid for all sources of noise. 155 155 156 Here is one example of the usage of these trac 156 Here is one example of the usage of these tracepoints:: 157 157 158 osnoise/8-961 [008] d.h. 5789.8575 158 osnoise/8-961 [008] d.h. 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns 159 osnoise/8-961 [008] dNh. 5789.8584 159 osnoise/8-961 [008] dNh. 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns 160 migration/8-54 [008] d... 5789.8584 160 migration/8-54 [008] d... 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns 161 osnoise/8-961 [008] .... 5789.8584 161 osnoise/8-961 [008] .... 5789.858413: sample_threshold: start 5789.858404555 duration 8812 ns interferences 2 162 162 163 In this example, a noise sample of 8 microseco 163 In this example, a noise sample of 8 microseconds was reported in the last 164 line, pointing to two interferences. Looking b 164 line, pointing to two interferences. Looking backward in the trace, the 165 two previous entries were about the migration 165 two previous entries were about the migration thread running after a 166 timer IRQ execution. The first event is not pa 166 timer IRQ execution. The first event is not part of the noise because 167 it took place one millisecond before. 167 it took place one millisecond before. 168 168 169 It is worth noticing that the sum of the durat 169 It is worth noticing that the sum of the duration reported in the 170 tracepoints is smaller than eight us reported 170 tracepoints is smaller than eight us reported in the sample_threshold. 171 The reason roots in the overhead of the entry 171 The reason roots in the overhead of the entry and exit code that happens 172 before and after any interference execution. T 172 before and after any interference execution. This justifies the dual 173 approach: measuring thread and tracing. 173 approach: measuring thread and tracing. 174 174 175 Running osnoise tracer without workload 175 Running osnoise tracer without workload 176 --------------------------------------- 176 --------------------------------------- 177 177 178 By enabling the osnoise tracer with the NO_OSN 178 By enabling the osnoise tracer with the NO_OSNOISE_WORKLOAD option set, 179 the osnoise: tracepoints serve to measure the 179 the osnoise: tracepoints serve to measure the execution time of 180 any type of Linux task, free from the interfer 180 any type of Linux task, free from the interference of other tasks.
Linux® is a registered trademark of Linus Torvalds in the United States and other countries.
TOMOYO® is a registered trademark of NTT DATA CORPORATION.