TOMOYO Linux Cross Reference
Linux/Documentation/trace/timerlat-tracer.rst

Differences between /Documentation/trace/timerlat-tracer.rst (Version linux-6.12-rc7) and /Documentation/trace/timerlat-tracer.rst (Version linux-6.8.12)


  1 ###############                                     1 ###############
  2 Timerlat tracer                                     2 Timerlat tracer
  3 ###############                                     3 ###############
  4                                                     4 
  5 The timerlat tracer aims to help the preemptiv      5 The timerlat tracer aims to help the preemptive kernel developers to
  6 find sources of wakeup latencies of real-time       6 find sources of wakeup latencies of real-time threads. Like cyclictest,
  7 the tracer sets a periodic timer that wakes up      7 the tracer sets a periodic timer that wakes up a thread. The thread then
  8 computes a *wakeup latency* value as the diffe      8 computes a *wakeup latency* value as the difference between the *current
  9 time* and the *absolute time* that the timer w      9 time* and the *absolute time* that the timer was set to expire. The main
 10 goal of timerlat is tracing in such a way to h     10 goal of timerlat is tracing in such a way to help kernel developers.
 11                                                    11 
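The latency computation described above can be sketched in user-space C. This is only an illustration of the arithmetic, not the kernel implementation: the helper name `wakeup_latency_ns` and the use of `struct timespec` are assumptions made for the example.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: nanoseconds elapsed between the absolute time the
 * timer was set to expire and the current time. A positive result is the
 * wakeup latency; a negative one means "now" is before the expiration.
 */
static int64_t wakeup_latency_ns(struct timespec expire, struct timespec now)
{
        int64_t sec = (int64_t)now.tv_sec - (int64_t)expire.tv_sec;
        int64_t nsec = (int64_t)now.tv_nsec - (int64_t)expire.tv_nsec;

        return sec * 1000000000LL + nsec;
}
```

A caller would typically obtain `now` with `clock_gettime(CLOCK_MONOTONIC, ...)` right after waking up and pass the expiration time it programmed earlier.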
 12 Usage                                              12 Usage
 13 -----                                              13 -----
 14                                                    14 
 15 Write the ASCII text "timerlat" into the curre     15 Write the ASCII text "timerlat" into the current_tracer file of the
 16 tracing system (generally mounted at /sys/kern     16 tracing system (generally mounted at /sys/kernel/tracing).
 17                                                    17 
 18 For example::                                      18 For example::
 19                                                    19 
 20         [root@f32 ~]# cd /sys/kernel/tracing/      20         [root@f32 ~]# cd /sys/kernel/tracing/
 21         [root@f32 tracing]# echo timerlat > cu     21         [root@f32 tracing]# echo timerlat > current_tracer
 22                                                    22 
 23 It is possible to follow the trace by reading      23 It is possible to follow the trace by reading the trace file::
 24                                                    24 
 25   [root@f32 tracing]# cat trace                    25   [root@f32 tracing]# cat trace
 26   # tracer: timerlat                               26   # tracer: timerlat
 27   #                                                27   #
 28   #                              _-----=> irqs     28   #                              _-----=> irqs-off
 29   #                             / _----=> need     29   #                             / _----=> need-resched
 30   #                            | / _---=> hard     30   #                            | / _---=> hardirq/softirq
 31   #                            || / _--=> pree     31   #                            || / _--=> preempt-depth
 32   #                            || /                32   #                            || /
 33   #                            ||||                33   #                            ||||             ACTIVATION
 34   #         TASK-PID      CPU# ||||   TIMESTAM     34   #         TASK-PID      CPU# ||||   TIMESTAMP    ID            CONTEXT                LATENCY
 35   #            | |         |   ||||      |         35   #            | |         |   ||||      |         |                  |                       |
 36           <idle>-0       [000] d.h1    54.0293     36           <idle>-0       [000] d.h1    54.029328: #1     context    irq timer_latency       932 ns
 37            <...>-867     [000] ....    54.0293     37            <...>-867     [000] ....    54.029339: #1     context thread timer_latency     11700 ns
 38           <idle>-0       [001] dNh1    54.0293     38           <idle>-0       [001] dNh1    54.029346: #1     context    irq timer_latency      2833 ns
 39            <...>-868     [001] ....    54.0293     39            <...>-868     [001] ....    54.029353: #1     context thread timer_latency      9820 ns
 40           <idle>-0       [000] d.h1    54.0303     40           <idle>-0       [000] d.h1    54.030328: #2     context    irq timer_latency       769 ns
 41            <...>-867     [000] ....    54.0303     41            <...>-867     [000] ....    54.030330: #2     context thread timer_latency      3070 ns
 42           <idle>-0       [001] d.h1    54.0303     42           <idle>-0       [001] d.h1    54.030344: #2     context    irq timer_latency       935 ns
 43            <...>-868     [001] ....    54.0303     43            <...>-868     [001] ....    54.030347: #2     context thread timer_latency      4351 ns
 44                                                    44 
 45                                                    45 
 46 The tracer creates a per-cpu kernel thread wit     46 The tracer creates a per-cpu kernel thread with real-time priority that
 47 prints two lines at every activation. The firs     47 prints two lines at every activation. The first is the *timer latency*
 48 observed at the *hardirq* context before the a     48 observed at the *hardirq* context before the activation of the thread.
 49 The second is the *timer latency* observed by      49 The second is the *timer latency* observed by the thread. The ACTIVATION
 50 ID field serves to relate the *irq* execution      50 ID field serves to relate the *irq* execution to its respective *thread*
 51 execution.                                         51 execution.
 52                                                    52 
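To relate the *irq* and *thread* lines of one activation programmatically, a tool can parse the ACTIVATION ID, CONTEXT and LATENCY fields out of each trace line. The sketch below assumes the text layout shown in the example output above; the struct and function names are hypothetical and not part of any kernel or library API.

```c
#include <stdio.h>
#include <string.h>

struct timerlat_event {
        unsigned long id;           /* ACTIVATION ID */
        char context[16];           /* "irq" or "thread" */
        unsigned long long lat_ns;  /* LATENCY column, in ns */
};

/*
 * Parse one line of the trace file. Returns 0 on success and -1 if the
 * line is not a timer_latency event (header lines, other events).
 */
static int parse_timerlat_line(const char *line, struct timerlat_event *ev)
{
        /* The event text follows the "TIMESTAMP: " field and starts with '#'. */
        const char *p = strstr(line, ": #");

        if (!p)
                return -1;
        if (sscanf(p + 2, "#%lu context %15s timer_latency %llu ns",
                   &ev->id, ev->context, &ev->lat_ns) != 3)
                return -1;
        return 0;
}
```

Since both latencies of an activation are measured against the same expiration time, subtracting the *irq* value from the *thread* value with the same ID on the same CPU gives the delay added after the IRQ handler ran. For activation #1 on CPU 0 in the output above, that is 11700 - 932 = 10768 ns.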
 53 The *irq*/*thread* split is important to clari     53 The *irq*/*thread* split is important to clarify in which context
 54 an unexpectedly high value originates. The *ir     54 an unexpectedly high value originates. The *irq* context can be
 55 delayed by hardware-related actions, such as S     55 delayed by hardware-related actions, such as SMIs, NMIs, and IRQs,
 56 or by threads masking interrupts. Once the tim     56 or by threads masking interrupts. Once the timer fires, the delay
 57 can also be influenced by blocking caused by t     57 can also be influenced by blocking caused by threads, for example by
 58 postponing the scheduler execution via preempt     58 postponing the scheduler execution via preempt_disable(), scheduler
 59 execution, or masking interrupts. Threads can      59 execution, or masking interrupts. Threads can also be delayed by the
 60 interference from other threads and IRQs.          60 interference from other threads and IRQs.
 61                                                    61 
 62 Tracer options                                     62 Tracer options
 63 ---------------------                              63 ---------------------
 64                                                    64 
 65 The timerlat tracer is built on top of the osn     65 The timerlat tracer is built on top of the osnoise tracer,
 66 so its configuration is also done in the osnoi     66 so its configuration is also done in the osnoise/ config
 67 directory. The timerlat configs are:               67 directory. The timerlat configs are:
 68                                                    68 
 69  - cpus: CPUs at which a timerlat thread will      69  - cpus: CPUs at which a timerlat thread will execute.
 70  - timerlat_period_us: the period of the timer     70  - timerlat_period_us: the period of the timerlat thread.
 71  - stop_tracing_us: stop the system tracing if     71  - stop_tracing_us: stop the system tracing if a
 72    timer latency at the *irq* context is highe     72    timer latency at the *irq* context is higher than the configured
 73    value. Writing 0 disables this option.          73    value. Writing 0 disables this option.
 74  - stop_tracing_total_us: stop the system trac     74  - stop_tracing_total_us: stop the system tracing if a
 75    timer latency at the *thread* context is hi     75    timer latency at the *thread* context is higher than the configured
 76    value. Writing 0 disables this option.          76    value. Writing 0 disables this option.
 77  - print_stack: save the stack of the IRQ occu     77  - print_stack: save the stack of the IRQ occurrence. The stack is printed
 78    after the *thread context* event, or at the     78    after the *thread context* event, or at the IRQ handler if *stop_tracing_us*
 79    is hit.                                         79    is hit.
 80                                                    80 
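The options listed above are plain files, so a program can set them the same way a shell `echo` would. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the tracing directory is passed in by the caller (it is generally /sys/kernel/tracing), and error handling is reduced to a single return code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build "<tracing_dir>/osnoise/<option>".
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int osnoise_option_path(char *buf, size_t len,
                               const char *tracing_dir, const char *option)
{
        int n = snprintf(buf, len, "%s/osnoise/%s", tracing_dir, option);

        return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write a decimal value to an osnoise/ option file,
 * as "echo <value> > osnoise/<option>" would. Returns 0 or -1.
 */
static int osnoise_write_option(const char *tracing_dir, const char *option,
                                unsigned long long value)
{
        char path[512];
        FILE *f;
        int ret = 0;

        if (osnoise_option_path(path, sizeof(path), tracing_dir, option))
                return -1;
        f = fopen(path, "w");
        if (!f)
                return -1;
        if (fprintf(f, "%llu\n", value) < 0)
                ret = -1;
        /* Buffered write errors surface at close time. */
        if (fclose(f) != 0)
                ret = -1;
        return ret;
}
```

For instance, `osnoise_write_option("/sys/kernel/tracing", "stop_tracing_total_us", 25)` would correspond to the `echo 25 > osnoise/stop_tracing_total_us` command used in this document, and it needs the same privileges.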
 81 timerlat and osnoise                               81 timerlat and osnoise
 82 ----------------------------                       82 ----------------------------
 83                                                    83 
 84 The timerlat tracer can also take advantage of     84 The timerlat tracer can also take advantage of the osnoise: trace events.
 85 For example::                                      85 For example::
 86                                                    86 
 87         [root@f32 ~]# cd /sys/kernel/tracing/      87         [root@f32 ~]# cd /sys/kernel/tracing/
 88         [root@f32 tracing]# echo timerlat > cu     88         [root@f32 tracing]# echo timerlat > current_tracer
 89         [root@f32 tracing]# echo 1 > events/os     89         [root@f32 tracing]# echo 1 > events/osnoise/enable
 90         [root@f32 tracing]# echo 25 > osnoise/     90         [root@f32 tracing]# echo 25 > osnoise/stop_tracing_total_us
 91         [root@f32 tracing]# tail -10 trace         91         [root@f32 tracing]# tail -10 trace
 92              cc1-87882   [005] d..h...   548.7     92              cc1-87882   [005] d..h...   548.771078: #402268 context    irq timer_latency     13585 ns
 93              cc1-87882   [005] dNLh1..   548.7     93              cc1-87882   [005] dNLh1..   548.771082: irq_noise: local_timer:236 start 548.771077442 duration 7597 ns
 94              cc1-87882   [005] dNLh2..   548.7     94              cc1-87882   [005] dNLh2..   548.771099: irq_noise: qxl:21 start 548.771085017 duration 7139 ns
 95              cc1-87882   [005] d...3..   548.7     95              cc1-87882   [005] d...3..   548.771102: thread_noise:      cc1:87882 start 548.771078243 duration 9909 ns
 96       timerlat/5-1035    [005] .......   548.7     96       timerlat/5-1035    [005] .......   548.771104: #402268 context thread timer_latency     39960 ns
 97                                                    97 
 98 In this case, the timer latency does not have      98 In this case, the timer latency does not have a single root cause
 99 but multiple ones. First, the timer IRQ was de     99 but multiple ones. First, the timer IRQ was delayed
100 for 13 us, which may point to a long IRQ-disab    100 for 13 us, which may point to a long IRQ-disabled section (see IRQ
101 stacktrace section). Then the timer interrupt     101 stacktrace section). Then the timer interrupt that wakes up the timerlat
102 thread took 7597 ns, and the qxl:21 device IRQ    102 thread took 7597 ns, and the qxl:21 device IRQ took 7139 ns. Finally,
103 the cc1 thread noise took 9909 ns of time befo    103 the cc1 thread noise took 9909 ns of time before the context switch.
104 Such pieces of evidence are useful for the dev    104 Such pieces of evidence are useful for the developer to use other
105 tracing methods to figure out how to debug and    105 tracing methods to figure out how to debug and optimize the system.
106                                                   106 
107 It is worth mentioning that the *duration* val    107 It is worth mentioning that the *duration* values reported
108 by the osnoise: events are *net* values. For e    108 by the osnoise: events are *net* values. For example, the
109 thread_noise does not include the duration of     109 thread_noise does not include the duration of the overhead caused
110 by the IRQ execution (which indeed accounted f    110 by the IRQ execution (which indeed accounted for 14736 ns). But
111 the values reported by the timerlat tracer (ti    111 the values reported by the timerlat tracer (timer_latency)
112 are *gross* values.                               112 are *gross* values.
113                                                   113 
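The net/gross distinction can be checked with simple arithmetic on the trace above. The sketch below assumes that the gross time the cc1 thread occupied the CPU is its net thread_noise plus the irq_noise events nested in it; the constants are copied from the example trace, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Net durations from the example trace, in ns. */
static const uint64_t irq_noise_ns[] = { 7597, 7139 };  /* local_timer:236, qxl:21 */
static const uint64_t thread_noise_net_ns = 9909;       /* cc1:87882 */

/* Sum a list of net durations, in ns. */
static uint64_t sum_ns(const uint64_t *durations, size_t count)
{
        uint64_t total = 0;

        for (size_t i = 0; i < count; i++)
                total += durations[i];
        return total;
}
```

Adding `sum_ns(irq_noise_ns, 2)` to `thread_noise_net_ns` gives the reconstructed gross value. Under the stated assumption it is in line with the roughly 24 us between the thread_noise start (548.771078243) and the moment the event was logged (548.771102).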
114 The art below illustrates a CPU timeline and h    114 The art below illustrates a CPU timeline and how the timerlat tracer
115 observes it at the top and the osnoise: events    115 observes it at the top and the osnoise: events at the bottom. Each "-"
116 in the timelines means circa 1 us, and the tim    116 in the timelines means circa 1 us, and the time moves ==>::
117                                                   117 
118       External     timer irq                      118       External     timer irq                   thread
119        clock        latency                       119        clock        latency                    latency
120        event        13585 ns                      120        event        13585 ns                   39960 ns
121          |             ^                          121          |             ^                         ^
122          v             |                          122          v             |                         |
123          |-------------|                          123          |-------------|                         |
124          |-------------+----------------------    124          |-------------+-------------------------|
125                        ^                          125                        ^                         ^
126   ============================================    126   ========================================================================
127                     [tmr irq]  [dev irq]          127                     [tmr irq]  [dev irq]
128   [another thread...^       v..^       v......    128   [another thread...^       v..^       v.......][timerlat/ thread]  <-- CPU timeline
129   ============================================    129   =========================================================================
130                     |-------|  |-------|          130                     |-------|  |-------|
131                             |--^       v------    131                             |--^       v-------|
132                             |          |          132                             |          |       |
133                             |          |          133                             |          |       + thread_noise: 9909 ns
134                             |          +-> irq    134                             |          +-> irq_noise: 7139 ns
135                             +-> irq_noise: 759    135                             +-> irq_noise: 7597 ns
136                                                   136 
137 IRQ stacktrace                                    137 IRQ stacktrace
138 ---------------------------                       138 ---------------------------
139                                                   139 
140 The osnoise/print_stack option is helpful for     140 The osnoise/print_stack option is helpful for the cases in which thread
141 noise is the major factor for the timer latenc    141 noise is the major factor for the timer latency, because of preemption or
142 IRQs being disabled. For example::                142 IRQs being disabled. For example::
143                                                   143 
144         [root@f32 tracing]# echo 500 > osnoise    144         [root@f32 tracing]# echo 500 > osnoise/stop_tracing_total_us
145         [root@f32 tracing]# echo 500 > osnoise    145         [root@f32 tracing]# echo 500 > osnoise/print_stack
146         [root@f32 tracing]# echo timerlat > cu    146         [root@f32 tracing]# echo timerlat > current_tracer
147         [root@f32 tracing]# tail -21 per_cpu/c    147         [root@f32 tracing]# tail -21 per_cpu/cpu7/trace
148           insmod-1026    [007] dN.h1..   200.2    148           insmod-1026    [007] dN.h1..   200.201948: irq_noise: local_timer:236 start 200.201939376 duration 7872 ns
149           insmod-1026    [007] d..h1..   200.2    149           insmod-1026    [007] d..h1..   200.202587: #29800 context    irq timer_latency      1616 ns
150           insmod-1026    [007] dN.h2..   200.2    150           insmod-1026    [007] dN.h2..   200.202598: irq_noise: local_timer:236 start 200.202586162 duration 11855 ns
151           insmod-1026    [007] dN.h3..   200.2    151           insmod-1026    [007] dN.h3..   200.202947: irq_noise: local_timer:236 start 200.202939174 duration 7318 ns
152           insmod-1026    [007] d...3..   200.2    152           insmod-1026    [007] d...3..   200.203444: thread_noise:   insmod:1026 start 200.202586933 duration 838681 ns
153       timerlat/7-1001    [007] .......   200.2    153       timerlat/7-1001    [007] .......   200.203445: #29800 context thread timer_latency    859978 ns
154       timerlat/7-1001    [007] ....1..   200.2    154       timerlat/7-1001    [007] ....1..   200.203446: <stack trace>
155   => timerlat_irq                                 155   => timerlat_irq
156   => __hrtimer_run_queues                         156   => __hrtimer_run_queues
157   => hrtimer_interrupt                            157   => hrtimer_interrupt
158   => __sysvec_apic_timer_interrupt                158   => __sysvec_apic_timer_interrupt
159   => asm_call_irq_on_stack                        159   => asm_call_irq_on_stack
160   => sysvec_apic_timer_interrupt                  160   => sysvec_apic_timer_interrupt
161   => asm_sysvec_apic_timer_interrupt              161   => asm_sysvec_apic_timer_interrupt
162   => delay_tsc                                    162   => delay_tsc
163   => dummy_load_1ms_pd_init                       163   => dummy_load_1ms_pd_init
164   => do_one_initcall                              164   => do_one_initcall
165   => do_init_module                               165   => do_init_module
166   => __do_sys_finit_module                        166   => __do_sys_finit_module
167   => do_syscall_64                                167   => do_syscall_64
168   => entry_SYSCALL_64_after_hwframe               168   => entry_SYSCALL_64_after_hwframe
169                                                   169 
170 In this case, it is possible to see that the t    170 In this case, it is possible to see that the thread added the highest
171 contribution to the *timer latency* and the st    171 contribution to the *timer latency* and the stack trace, saved during
172 the timerlat IRQ handler, points to a function    172 the timerlat IRQ handler, points to a function named
173 dummy_load_1ms_pd_init, which had the followin    173 dummy_load_1ms_pd_init, which had the following code (on purpose)::
174                                                   174 
175         static int __init dummy_load_1ms_pd_in    175         static int __init dummy_load_1ms_pd_init(void)
176         {                                         176         {
177                 preempt_disable();                177                 preempt_disable();
178                 mdelay(1);                        178                 mdelay(1);
179                 preempt_enable();                 179                 preempt_enable();
180                 return 0;                         180                 return 0;
181                                                   181 
182         }                                         182         }
183                                                   183 
184 User-space interface                              184 User-space interface
185 ---------------------------                       185 ---------------------------
186                                                   186 
187 Timerlat allows user-space threads to use the     187 Timerlat allows user-space threads to use the timerlat infrastructure to
188 measure scheduling latency. This interface is     188 measure scheduling latency. This interface is accessible via a per-CPU
189 file descriptor inside $tracing_dir/osnoise/pe    189 file descriptor inside $tracing_dir/osnoise/per_cpu/cpu$ID/timerlat_fd.
190                                                   190 
191 This interface is accessible under the followi    191 This interface is accessible under the following conditions:
192                                                   192 
193  - timerlat tracer is enabled                     193  - timerlat tracer is enabled
194  - osnoise workload option is set to NO_OSNOIS    194  - osnoise workload option is set to NO_OSNOISE_WORKLOAD
195  - The user-space thread is affined to a singl    195  - The user-space thread is affined to a single processor
196  - The thread opens the file associated with i    196  - The thread opens the file associated with its single processor
197  - Only one thread can access the file at a ti    197  - Only one thread can access the file at a time
198                                                   198 
199 The open() syscall will fail if any of these c    199 The open() syscall will fail if any of these conditions are not met.
200 After opening the file descriptor, the user-sp    200 After opening the file descriptor, the user-space thread can read from it.
201                                                   201 
202 The read() system call runs timerlat code that    202 The read() system call runs timerlat code that arms the
203 timer in the future and waits for it, as the r    203 timer in the future and waits for it, as the regular kernel thread does.
204                                                   204 
205 When the timer IRQ fires, the timerlat IRQ wil    205 When the timer IRQ fires, the timerlat IRQ will execute, report the
206 IRQ latency and wake up the thread waiting in     206 IRQ latency and wake up the thread waiting in the read. The thread will be
207 scheduled and report the thread latency via th    207 scheduled and report the thread latency via the tracer, as the kernel
208 thread does.                                      208 thread does.
209                                                   209 
210 The difference from the in-kernel timerlat is     210 The difference from the in-kernel timerlat is that, instead of re-arming
211 the timer, timerlat will return to the read()     211 the timer, timerlat will return to the read() system call. At this point,
212 the user can run any code.                        212 the user can run any code.
213                                                   213 
214 If the application reads the timerlat file des    214 If the application reads the timerlat file descriptor again, the tracer
215 will report the return from user-space latency    215 will report the return from user-space latency, which is the total
216 latency. If this is the end of the work, it ca    216 latency. If this is the end of the work, it can be interpreted as the
217 response time for the request.                    217 response time for the request.
218                                                   218 
219 After reporting the total latency, timerlat wi    219 After reporting the total latency, timerlat will restart the cycle, arm
220 a timer, and go to sleep for the following act    220 a timer, and go to sleep for the following activation.
221                                                   221 
222 If at any time one of the conditions is broken    222 If at any time one of the conditions is broken, e.g., the thread migrates
223 while in user space, or the timerlat tracer is    223 while in user space, or the timerlat tracer is disabled, the SIGKILL
224 signal will be sent to the user-space thread.     224 signal will be sent to the user-space thread.
225                                                   225 
226 Here is a basic example of user-space code for    226 Here is a basic example of user-space code for timerlat::
227                                                   227 
228  #define _GNU_SOURCE     /* for gettid() and C    228  #define _GNU_SOURCE     /* for gettid() and CPU_SET() */
229  #include <errno.h>                               229  #include <errno.h>
230  #include <fcntl.h>                               230  #include <fcntl.h>
231  #include <sched.h>                               231  #include <sched.h>
232  #include <stdio.h>                               232  #include <stdio.h>
233  #include <stdlib.h>                              233  #include <stdlib.h>
234  #include <string.h>                              234  #include <string.h>
235  #include <unistd.h>                              235  #include <unistd.h>
236                                                   236 
237  int main(void)                                   237  int main(void)
238  {                                                238  {
239         char buffer[1024];                        239         char buffer[1024];
240         int timerlat_fd;                          240         int timerlat_fd;
241         int retval;                               241         int retval;
242         long cpu = 0;   /* place in CPU 0 */      242         long cpu = 0;   /* place in CPU 0 */
243         cpu_set_t set;                            243         cpu_set_t set;
244                                                   244 
245         CPU_ZERO(&set);                           245         CPU_ZERO(&set);
246         CPU_SET(cpu, &set);                       246         CPU_SET(cpu, &set);
247                                                   247 
248         if (sched_setaffinity(gettid(), sizeof    248         if (sched_setaffinity(gettid(), sizeof(set), &set) == -1)
249                 return 1;                         249                 return 1;
250                                                   250 
251         snprintf(buffer, sizeof(buffer),          251         snprintf(buffer, sizeof(buffer),
252                 "/sys/kernel/tracing/osnoise/p    252                 "/sys/kernel/tracing/osnoise/per_cpu/cpu%ld/timerlat_fd",
253                 cpu);                             253                 cpu);
254                                                   254 
255         timerlat_fd = open(buffer, O_RDONLY);     255         timerlat_fd = open(buffer, O_RDONLY);
256         if (timerlat_fd < 0) {                    256         if (timerlat_fd < 0) {
257                 printf("error opening %s: %s\n    257                 printf("error opening %s: %s\n", buffer, strerror(errno));
258                 exit(1);                          258                 exit(1);
259         }                                         259         }
260                                                   260 
261         for (;;) {                                261         for (;;) {
262                 retval = read(timerlat_fd, buf    262                 retval = read(timerlat_fd, buffer, sizeof(buffer));
263                 if (retval < 0)                   263                 if (retval < 0)
264                         break;                    264                         break;
265         }                                         265         }
266                                                   266 
267         close(timerlat_fd);                       267         close(timerlat_fd);
268         exit(0);                                  268         exit(0);
269  }                                                269  }
                                                      
