TOMOYO Linux Cross Reference
Linux/Documentation/trace/events.rst

Differences between /Documentation/trace/events.rst (Version linux-6.11.5) and /Documentation/trace/events.rst (Version linux-2.6.0)


  1 =============                                     
  2 Event Tracing                                     
  3 =============                                     
  4                                                   
  5 :Author: Theodore Ts'o                            
  6 :Updated: Li Zefan and Tom Zanussi                
  7                                                   
  8 1. Introduction                                   
  9 ===============                                   
 10                                                   
 11 Tracepoints (see Documentation/trace/tracepoin    
 12 without creating custom kernel modules to regi    
 13 using the event tracing infrastructure.           
 14                                                   
 15 Not all tracepoints can be traced using the ev    
 16 the kernel developer must provide code snippet    
 17 tracing information is saved into the tracing     
 18 tracing information should be printed.            
 19                                                   
 20 2. Using Event Tracing                            
 21 ======================                            
 22                                                   
 23 2.1 Via the 'set_event' interface                 
 24 ---------------------------------                 
 25                                                   
 26 The events which are available for tracing can    
 27 /sys/kernel/tracing/available_events.             
 28                                                   
 29 To enable a particular event, such as 'sched_w    
 30 to /sys/kernel/tracing/set_event. For example:    
 31                                                   
 32         # echo sched_wakeup >> /sys/kernel/tra    
 33                                                   
 34 .. Note:: '>>' is necessary, otherwise it will    
 35                                                   
 36 To disable an event, echo the event name to th    
 37 with an exclamation point::                       
 38                                                   
 39         # echo '!sched_wakeup' >> /sys/kernel/    
 40                                                   
 41 To disable all events, echo an empty line to t    
 42                                                   
 43         # echo > /sys/kernel/tracing/set_event    
 44                                                   
 45 To enable all events, echo ``*:*`` or ``*:`` t    
 46                                                   
 47         # echo *:* > /sys/kernel/tracing/set_e    
 48                                                   
 49 The events are organized into subsystems, such    
 50 etc., and a full event name looks like this: <    
 51 subsystem name is optional, but it is displaye    
 52 file.  All of the events in a subsystem can be    
 53 ``<subsystem>:*``; for example, to enable all     
 54 command::                                         
 55                                                   
 56         # echo 'irq:*' > /sys/kernel/tracing/s    
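
The ``set_event`` operations described above can be wrapped in small shell helpers. The following is a minimal sketch, not something shipped with the kernel: the function names and the ``TRACEFS`` variable are assumptions introduced here for illustration, and ``TRACEFS`` simply defaults to the /sys/kernel/tracing path used in this section.

```shell
# Sketch only: TRACEFS and the helper names are illustrative assumptions.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"

# Enable one event (or a pattern such as 'irq:*') without clearing
# the events that are already enabled: note the appending '>>'.
event_enable() {
    printf '%s\n' "$1" >> "$TRACEFS/set_event"
}

# Disable one event by prefixing its name with an exclamation point.
event_disable() {
    printf '!%s\n' "$1" >> "$TRACEFS/set_event"
}

# Disable all events by writing an empty line with a truncating '>'.
event_disable_all() {
    echo > "$TRACEFS/set_event"
}
```

For example, ``event_enable 'irq:*'`` followed by ``event_disable sched_wakeup`` should leave any other enabled events alone, because both helpers append rather than truncate.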
 57                                                   
 58 2.2 Via the 'enable' toggle                       
 59 ---------------------------                       
 60                                                   
 61 The events available are also listed in /sys/k    
 62 of directories.                                   
 63                                                   
 64 To enable event 'sched_wakeup'::                  
 65                                                   
 66         # echo 1 > /sys/kernel/tracing/events/    
 67                                                   
 68 To disable it::                                   
 69                                                   
 70         # echo 0 > /sys/kernel/tracing/events/    
 71                                                   
 72 To enable all events in the sched subsystem::
 73                                                   
 74         # echo 1 > /sys/kernel/tracing/events/    
 75                                                   
 76 To enable all events::                            
 77                                                   
 78         # echo 1 > /sys/kernel/tracing/events/    
 79                                                   
 80 When reading one of these enable files, there     
 81                                                   
 82  - 0 - all events this file affects are disabl    
 83  - 1 - all events this file affects are enable    
 84  - X - there is a mixture of events enabled an    
 85  - ? - this file does not affect any event        
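
The four values listed above can be turned into a readable description with a small helper. This is only a sketch: ``enable_state`` is a hypothetical name, and the path passed to it is whichever 'enable' file (event, subsystem, or top level) is of interest.

```shell
# Sketch only: enable_state is an illustrative helper, not a kernel tool.
# $1: path to an 'enable' file
enable_state() {
    state=$(cat "$1") || return 1
    case "$state" in
        0)   echo "all disabled" ;;
        1)   echo "all enabled" ;;
        X)   echo "mixed" ;;
        '?') echo "no events affected" ;;
        *)   echo "unknown: $state" ;;
    esac
}
```

The ``'?'`` pattern is quoted so that the shell matches a literal question mark instead of treating it as a single-character wildcard.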
 86                                                   
 87 2.3 Boot option                                   
 88 ---------------                                   
 89                                                   
 90 In order to facilitate early boot debugging, u    
 91                                                   
 92         trace_event=[event-list]                  
 93                                                   
 94 event-list is a comma-separated list of events
 95 format.                                           
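
Building the comma-separated list by hand is error-prone when many events are involved. As a rough sketch, the hypothetical helper below (``trace_event_param`` is not an existing tool) joins its arguments into a single ``trace_event=`` option that could then be added to the kernel command line.

```shell
# Sketch only: trace_event_param is an illustrative helper name.
# Joins all arguments with commas into one trace_event= boot option.
trace_event_param() {
    # The subshell keeps the IFS change from leaking into the caller.
    ( IFS=,; printf 'trace_event=%s\n' "$*" )
}
```

Quote any argument containing ``*`` so that the shell does not expand it as a file name pattern.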
 96                                                   
 97 3. Defining an event-enabled tracepoint           
 98 =======================================           
 99                                                   
100 See the example provided in samples/trace_even
101                                                   
102 4. Event formats                                  
103 ================                                  
104                                                   
105 Each trace event has a 'format' file associate    
106 a description of each field in a logged event.    
107 be used to parse the binary trace stream, and     
108 find the field names that can be used in event    
109                                                   
110 It also displays the format string that will b    
111 event in text mode, along with the event name     
112 profiling.                                        
113                                                   
114 Every event has a set of ``common`` fields ass    
115 the fields prefixed with ``common_``.  The oth    
116 events and correspond to the fields defined in    
117 definition for that event.                        
118                                                   
119 Each field in the format has the form::           
120                                                   
121      field:field-type field-name; offset:N; si    
122                                                   
123 where offset is the offset of the field in the    
124 is the size of the data item, in bytes.           
125                                                   
126 For example, here's the information displayed     
127 event::                                           
128                                                   
129         # cat /sys/kernel/tracing/events/sched    
130                                                   
131         name: sched_wakeup                        
132         ID: 60                                    
133         format:                                   
134                 field:unsigned short common_ty    
135                 field:unsigned char common_fla    
136                 field:unsigned char common_pre    
137                 field:int common_pid;   offset    
138                 field:int common_tgid;  offset    
139                                                   
140                 field:char comm[TASK_COMM_LEN]    
141                 field:pid_t pid;        offset    
142                 field:int prio; offset:32;        
143                 field:int success;      offset    
144                 field:int cpu;  offset:40;        
145                                                   
146         print fmt: "task %s:%d [%d] success=%d    
147                    REC->prio, REC->success, RE    
148                                                   
149 This event contains 10 fields, the first 5 com    
150 event-specific.  All the fields for this event    
151 'comm' which is a string, a distinction import    
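
Because every field line follows the ``field:... ; offset:N; ...`` layout described above, the field names usable in filters can be pulled out of a 'format' file mechanically. The following is a minimal sketch using awk. ``format_fields`` is a hypothetical helper, and it assumes that the declaration and the offset are the first two semicolon-separated items on each field line.

```shell
# Sketch only: format_fields is an illustrative helper name.
# $1: path to a 'format' file; prints "<field-name> <offset>" per field.
format_fields() {
    awk -F';' '/^[ \t]*field:/ {
        decl = $1                        # e.g. "field:int prio"
        sub(/^[ \t]*field:/, "", decl)
        n = split(decl, words, " ")
        name = words[n]                  # last word is the field name
        off = $2                         # e.g. "offset:32"
        sub(/^[ \t]*offset:/, "", off)
        print name, off
    }' "$1"
}
```

Array fields keep their bracketed length in the printed name (for instance ``comm[TASK_COMM_LEN]``), which is a reminder that the field is a string and not a number.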
152                                                   
153 5. Event filtering                                
154 ==================                                
155                                                   
156 Trace events can be filtered in the kernel by     
157 'filter expressions' with them.  As soon as an    
158 the trace buffer, its fields are checked again    
159 associated with that event type.  An event wit    
160 'match' the filter will appear in the trace ou    
161 values don't match will be discarded.  An even    
162 associated with it matches everything, and is     
163 filter has been set for an event.                 
164                                                   
165 5.1 Expression syntax                             
166 ---------------------                             
167                                                   
168 A filter expression consists of one or more 'p    
169 combined using the logical operators '&&' and     
170 simply a clause that compares the value of a f    
171 logged event with a constant value and returns    
172 on whether the field value matched (1) or didn    
173                                                   
174           field-name relational-operator value    
175                                                   
176 Parentheses can be used to provide arbitrary l    
177 double-quotes can be used to prevent the shell    
178 operators as shell metacharacters.                
179                                                   
180 The field-names available for use in filters c    
181 'format' files for trace events (see section 4    
182                                                   
183 The relational-operators depend on the type of    
184                                                   
185 The operators available for numeric fields are    
186                                                   
187 ==, !=, <, <=, >, >=, &                           
188                                                   
189 And for string fields they are:                   
190                                                   
191 ==, !=, ~                                         
192                                                   
193 The glob (~) accepts a wild card character (\*    
194 ([). For example::                                
195                                                   
196   prev_comm ~ "*sh"                               
197   prev_comm ~ "sh*"                               
198   prev_comm ~ "*sh*"                              
199   prev_comm ~ "ba*sh"                             
200                                                   
201 If the field is a pointer that points into use    
202 "filename" from sys_enter_openat), then you ha    
203 field name::                                      
204                                                   
205   filename.ustring ~ "password"                   
206                                                   
207 As the kernel will have to know how to retriev    
208 is at from user space.                            
209                                                   
210 You can convert any long type to a function ad    
211                                                   
212   call_site.function == security_prepare_creds    
213                                                   
214 The above will filter when the field "call_sit    
215 "security_prepare_creds". That is, it will com    
216 the filter will return true if it is greater t    
217 the function "security_prepare_creds" and less    
218                                                   
219 The ".function" postfix can only be attached t    
220 be compared with "==" or "!=".                    
221                                                   
222 Cpumask fields or scalar fields that encode a     
223 a user-provided cpumask in cpulist format. The    
224                                                   
225   CPUS{$cpulist}                                  
226                                                   
227 Operators available to cpumask filtering are:     
228                                                   
229 & (intersection), ==, !=                          
230                                                   
231 For example, this will filter events that have    
232 in the given cpumask::                            
233                                                   
234   target_cpu & CPUS{17-42}                        
235                                                   
236 5.2 Setting filters                               
237 -------------------                               
238                                                   
239 A filter for an individual event is set by wri    
240 to the 'filter' file for the given event.         
241                                                   
242 For example::                                     
243                                                   
244         # cd /sys/kernel/tracing/events/sched/    
245         # echo "common_preempt_count > 4" > fi    
246                                                   
247 A slightly more involved example::                
248                                                   
249         # cd /sys/kernel/tracing/events/signal    
250         # echo "((sig >= 10 && sig < 15) || si    
251                                                   
252 If there is an error in the expression, you'll    
253 argument' error when setting it, and the erron    
254 an error message can be seen by looking at the    
255                                                   
256         # cd /sys/kernel/tracing/events/signal    
257         # echo "((sig >= 10 && sig < 15) || ds    
258         -bash: echo: write error: Invalid argu    
259         # cat filter                              
260         ((sig >= 10 && sig < 15) || dsig == 17    
261         ^                                         
262         parse_error: Field not found              
263                                                   
264 Currently the caret ('^') for an error always     
265 the filter string; the error message should st    
266 even without more accurate position info.         
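
The set-then-inspect sequence above can be folded into one helper. This is a sketch under stated assumptions: ``set_filter`` is a hypothetical name, and it relies on the behaviour described in this section, namely that a rejected expression makes the write fail and leaves the error readable from the 'filter' file.

```shell
# Sketch only: set_filter is an illustrative helper name.
# $1: event directory, $2: filter expression
set_filter() {
    if ! printf '%s\n' "$2" > "$1/filter"; then
        # On failure, show what the 'filter' file now reports.
        echo "filter rejected for $1:" >&2
        cat "$1/filter" >&2
        return 1
    fi
}
```

Passing the expression as a single quoted argument keeps the shell from interpreting operators such as ``>`` or ``&&``.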
267                                                   
268 5.2.1 Filter limitations                          
269 ------------------------                          
270                                                   
271 If a filter is placed on a string pointer ``(c    
272 to a string on the ring buffer, but instead po    
273 memory, then, for safety reasons, at most 1024    
274 copied onto a temporary buffer to do the compa    
275 faults (the pointer points to memory that shou    
276 string compare will be treated as not matching    
277                                                   
278 5.3 Clearing filters                              
279 --------------------                              
280                                                   
281 To clear the filter for an event, write a '0'     
282 file.                                             
283                                                   
284 To clear the filters for all events in a subsy    
285 subsystem's filter file.                          
286                                                   
287 5.4 Subsystem filters                             
288 ---------------------                             
289                                                   
290 For convenience, filters for every event in a     
291 cleared as a group by writing a filter express    
292 at the root of the subsystem.  Note however, t    
293 event within the subsystem lacks a field speci    
294 filter, or if the filter can't be applied for     
295 filter for that event will retain its previous    
296 result in an unintended mixture of filters whi    
297 confusing (to the user who might think differe    
298 effect) trace output.  Only filters that refer    
299 fields can be guaranteed to propagate successf    
300                                                   
301 Here are a few subsystem filter examples that     
302 above points:                                     
303                                                   
304 Clear the filters on all events in the sched s    
305                                                   
306         # cd /sys/kernel/tracing/events/sched     
307         # echo 0 > filter                         
308         # cat sched_switch/filter                 
309         none                                      
310         # cat sched_wakeup/filter                 
311         none                                      
312                                                   
313 Set a filter using only common fields for all     
314 subsystem (all events end up with the same fil    
315                                                   
316         # cd /sys/kernel/tracing/events/sched     
317         # echo common_pid == 0 > filter           
318         # cat sched_switch/filter                 
319         common_pid == 0                           
320         # cat sched_wakeup/filter                 
321         common_pid == 0                           
322                                                   
323 Attempt to set a filter using a non-common fie    
324 sched subsystem (all events but those that hav    
325 their old filters)::                              
326                                                   
327         # cd /sys/kernel/tracing/events/sched     
328         # echo prev_pid == 0 > filter             
329         # cat sched_switch/filter                 
330         prev_pid == 0                             
331         # cat sched_wakeup/filter                 
332         common_pid == 0                           
333                                                   
334 5.5 PID filtering                                 
335 -----------------                                 
336                                                   
337 The set_event_pid file in the same directory a    
338 exists, will filter all events from tracing an    
339 PID listed in the set_event_pid file.             
340 ::                                                
341                                                   
342         # cd /sys/kernel/tracing                  
343         # echo $$ > set_event_pid                 
344         # echo 1 > events/enable                  
345                                                   
346 The commands above will trace events only for the current task.
347                                                   
348 To add more PIDs without losing the PIDs alrea    
349 ::                                                
350                                                   
351         # echo 123 244 1 >> set_event_pid         
352                                                   
353                                                   
354 6. Event triggers                                 
355 =================                                 
356                                                   
357 Trace events can be made to conditionally invo    
358 which can take various forms and are described    
359 examples would be enabling or disabling other     
360 a stack trace whenever the trace event is hit.    
361 with attached triggers is invoked, the set of     
362 associated with that event is invoked.  Any gi    
363 additionally have an event filter of the same     
364 section 5 (Event filtering) associated with it    
365 be invoked if the event being invoked passes t    
366 If no filter is associated with the trigger, i    
367                                                   
368 Triggers are added to and removed from a parti    
369 trigger expressions to the 'trigger' file for     
370                                                   
371 A given event can have any number of triggers     
372 subject to any restrictions that individual co    
373 regard.                                           
374                                                   
375 Event triggers are implemented on top of "soft    
376 whenever a trace event has one or more trigger    
377 the event is activated even if it isn't actual    
378 disabled in a "soft" mode.  That is, the trace    
379 but just will not be traced, unless of course     
380 This scheme allows triggers to be invoked even    
381 enabled, and also allows the current event fil    
382 used for conditionally invoking triggers.         
383                                                   
384 The syntax for event triggers is roughly based    
385 set_ftrace_filter 'ftrace filter commands' (se    
386 section of Documentation/trace/ftrace.rst), bu    
387 differences and the implementation isn't curre    
388 way, so beware about making generalizations be    
389                                                   
390 .. Note::                                         
391      Writing into trace_marker (see Documentat
392      can also enable triggers that are written    
393      /sys/kernel/tracing/events/ftrace/print/t    
394                                                   
395 6.1 Expression syntax                             
396 ---------------------                             
397                                                   
398 Triggers are added by echoing the command to t    
399                                                   
400   # echo 'command[:count] [if filter]' > trigg    
401                                                   
402 Triggers are removed by echoing the same comma    
403 to the 'trigger' file::                           
404                                                   
405   # echo '!command[:count] [if filter]' > trig    
406                                                   
407 The [if filter] part isn't used in matching co    
408 leaving that off in a '!' command will accompl    
409 having it in.                                     
410                                                   
411 The filter syntax is the same as that describe    
412 filtering' section above.                         
413                                                   
414 For ease of use, writing to the trigger file u    
415 adds or removes a single trigger and there's n    
416 ('>' actually behaves like '>>') or truncation    
417 triggers (you have to use '!' for each one add    
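
Since the ``[if filter]`` part is not used when matching a trigger for removal, the removal string can be derived from the string that added the trigger. A minimal sketch, in which ``trigger_removal`` is a hypothetical helper name:

```shell
# Sketch only: trigger_removal is an illustrative helper name.
# $1: trigger as it was added, e.g. 'snapshot:1 if nr_rq > 1'
# Prints the matching removal command: any ' if <filter>' suffix is
# dropped (it is not needed for matching) and '!' is prefixed.
trigger_removal() {
    printf '!%s\n' "${1%% if *}"
}
```

The output is meant to be written to the same 'trigger' file the trigger was added to. For ``'snapshot:1 if nr_rq > 1'`` the helper prints ``!snapshot:1``.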
418                                                   
419 6.2 Supported trigger commands                    
420 ------------------------------                    
421                                                   
422 The following commands are supported:             
423                                                   
424 - enable_event/disable_event                      
425                                                   
426   These commands can enable or disable another    
427   the triggering event is hit.  When these com    
428   the other trace event is activated, but disa    
429   That is, the tracepoint will be called, but     
430   The event tracepoint stays in this mode as l    
431   in effect that can trigger it.                  
432                                                   
433   For example, the following trigger causes km    
434   traced when a read system call is entered, a    
435   specifies that this enablement happens only     
436                                                   
437           # echo 'enable_event:kmem:kmalloc:1'    
438               /sys/kernel/tracing/events/sysca    
439                                                   
440   The following trigger causes kmalloc events     
441   when a read system call exits.  This disable    
442   read system call exit::                         
443                                                   
444           # echo 'disable_event:kmem:kmalloc'     
445               /sys/kernel/tracing/events/sysca    
446                                                   
447   The format is::                                 
448                                                   
449       enable_event:<system>:<event>[:count]       
450       disable_event:<system>:<event>[:count]      
451                                                   
452   To remove the above commands::                  
453                                                   
454           # echo '!enable_event:kmem:kmalloc:1    
455               /sys/kernel/tracing/events/sysca    
456                                                   
457           # echo '!disable_event:kmem:kmalloc'    
458               /sys/kernel/tracing/events/sysca    
459                                                   
460   Note that there can be any number of enable/    
461   per triggering event, but there can only be     
462   triggered event. e.g. sys_enter_read can hav    
463   kmem:kmalloc and sched:sched_switch, but can    
464   versions such as kmem:kmalloc and kmem:kmall    
465   bytes_req == 256' and 'kmem:kmalloc if bytes    
466   could be combined into a single filter on km    
467                                                   
468 - stacktrace                                      
469                                                   
470   This command dumps a stacktrace in the trace    
471   triggering event occurs.                        
472                                                   
473   For example, the following trigger dumps a s    
474   kmalloc tracepoint is hit::                     
475                                                   
476           # echo 'stacktrace' > \                 
477                 /sys/kernel/tracing/events/kme    
478                                                   
479   The following trigger dumps a stacktrace the    
480   request happens with a size >= 64K::            
481                                                   
482           # echo 'stacktrace:5 if bytes_req >=    
483                 /sys/kernel/tracing/events/kme    
484                                                   
485   The format is::                                 
486                                                   
487       stacktrace[:count]                          
488                                                   
489   To remove the above commands::                  
490                                                   
491           # echo '!stacktrace' > \                
492                 /sys/kernel/tracing/events/kme    
493                                                   
494           # echo '!stacktrace:5 if bytes_req >    
495                 /sys/kernel/tracing/events/kme    
496                                                   
497   The latter can also be removed more simply b    
498   the filter)::                                   
499                                                   
500           # echo '!stacktrace:5' > \              
501                 /sys/kernel/tracing/events/kme    
502                                                   
503   Note that there can be only one stacktrace t    
504   event.                                          
505                                                   
506 - snapshot                                        
507                                                   
508   This command causes a snapshot to be trigger    
509   triggering event occurs.                        
510                                                   
511   The following command creates a snapshot eve    
512   queue is unplugged with a depth > 1.  If you    
513   events or functions at the time, the snapsho    
514   capture those events when the trigger event     
515                                                   
516           # echo 'snapshot if nr_rq > 1' > \      
517                 /sys/kernel/tracing/events/blo    
518                                                   
519   To only snapshot once::                         
520                                                   
521           # echo 'snapshot:1 if nr_rq > 1' > \    
522                 /sys/kernel/tracing/events/blo    
523                                                   
524   To remove the above commands::                  
525                                                   
526           # echo '!snapshot if nr_rq > 1' > \     
527                 /sys/kernel/tracing/events/blo    
528                                                   
529           # echo '!snapshot:1 if nr_rq > 1' >     
530                 /sys/kernel/tracing/events/blo    
531                                                   
532   Note that there can be only one snapshot tri    
533   event.                                          
534                                                   
535 - traceon/traceoff                                
536                                                   
537   These commands turn tracing on and off when     
538   hit. The parameter determines how many times    
539   turned on and off. If unspecified, there is     
540                                                   
541   The following command turns tracing off the     
542   request queue is unplugged with a depth > 1.    
543   set of events or functions at the time, you     
544   trace buffer to see the sequence of events t    
545   trigger event::                                 
546                                                   
547           # echo 'traceoff:1 if nr_rq > 1' > \    
548                 /sys/kernel/tracing/events/blo    
549                                                   
550   To always disable tracing when nr_rq > 1::
551                                                   
552           # echo 'traceoff if nr_rq > 1' > \      
553                 /sys/kernel/tracing/events/blo    
554                                                   
555   To remove the above commands::                  
556                                                   
557           # echo '!traceoff:1 if nr_rq > 1' >     
558                 /sys/kernel/tracing/events/blo    
559                                                   
560           # echo '!traceoff if nr_rq > 1' > \     
561                 /sys/kernel/tracing/events/blo    
562                                                   
563   Note that there can be only one traceon or t    
564   triggering event.                               
565                                                   
566 - hist                                            
567                                                   
568   This command aggregates event hits into a ha    
569   more trace event format fields (or stacktrac    
570   totals derived from one or more trace event     
571   event counts (hitcount).                        
572                                                   
573   See Documentation/trace/histogram.rst for de    
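
The count parameter used by the snapshot and traceon/traceoff triggers above (for example 'traceoff:1 if nr_rq > 1') can be pictured with a small user-space model. This is only a sketch and not kernel code: struct model_trigger, model_trigger_hit() and MODEL_TRIGGER_UNLIMITED are hypothetical names, and the model assumes that the filter is evaluated before the count is consumed and that a trigger without a count never runs out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker for a trigger written without ':N'. */
#define MODEL_TRIGGER_UNLIMITED (-1L)

/* User-space model of a counted trigger with an 'nr_rq > threshold' filter. */
struct model_trigger {
	long count;             /* remaining firings, or MODEL_TRIGGER_UNLIMITED */
	unsigned int threshold; /* the trigger applies only if nr_rq > threshold */
};

/* Returns true if the trigger action would run for this event hit. */
bool model_trigger_hit(struct model_trigger *t, unsigned int nr_rq)
{
	assert(t != NULL);

	/* Assumption: a hit rejected by the filter does not use up a count. */
	if (!(nr_rq > t->threshold))
		return false;

	if (t->count == MODEL_TRIGGER_UNLIMITED)
		return true;

	if (t->count == 0)
		return false;   /* the count is exhausted */

	t->count--;
	return true;
}
```

In this model, 'traceoff:1 if nr_rq > 1' corresponds to { .count = 1, .threshold = 1 } and 'traceoff if nr_rq > 1' to { .count = MODEL_TRIGGER_UNLIMITED, .threshold = 1 }.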
574                                                   
575 7. In-kernel trace event API                      
576 ============================                      
577                                                   
578 In most cases, the command-line interface to t    
579 sufficient.  Sometimes, however, applications     
580 more complex relationships than can be express    
581 series of linked command-line expressions, or     
582 commands may be simply too cumbersome.  An exa    
583 application that needs to 'listen' to the trac    
584 maintain an in-kernel state machine detecting,    
585 illegal kernel state occurs in the scheduler.     
586                                                   
587 The trace event subsystem provides an in-kerne    
588 or other kernel code to generate user-defined     
589 will, which can be used to either augment the     
590 and/or signal that a particular important stat    
591                                                   
592 A similar in-kernel API is also available for     
593 kretprobe events.                                 
594                                                   
595 Both the synthetic event and k/ret/probe event    
596 of a lower-level "dynevent_cmd" event command     
597 available for more specialized applications, o    
598 higher-level trace event APIs.                    
599                                                   
600 The API provided for these purposes is describ    
601 following:                                        
602                                                   
603   - dynamically creating synthetic event defin    
604   - dynamically creating kprobe and kretprobe     
605   - tracing synthetic events from in-kernel co    
606   - the low-level "dynevent_cmd" API              
607                                                   
608 7.1 Dynamically creating synthetic event defini
609 -----------------------------------------------
610                                                   
611 There are a couple of ways to create a new synthe
612 module or other kernel code.                      
613                                                   
614 The first creates the event in one step, using    
615 In this method, the name of the event to creat    
616 the fields is supplied to synth_event_create()    
617 synthetic event with that name and fields will    
618 call.  For example, to create a new "schedtest    
619                                                   
620   ret = synth_event_create("schedtest", sched_    
621                            ARRAY_SIZE(sched_fi    
622                                                   
623 The sched_fields param in this example points     
624 synth_field_desc, each of which describes an e    
625 name::                                            
626                                                   
627   static struct synth_field_desc sched_fields[    
628         { .type = "pid_t",              .name     
629         { .type = "char[16]",           .name     
630         { .type = "u64",                .name     
631         { .type = "u64",                .name     
632         { .type = "unsigned int",       .name     
633         { .type = "char[64]",           .name     
634         { .type = "int",                .name     
635   };                                              
636                                                   
637 See synth_field_size() for available types.       
638                                                   
639 If field_name contains [n], the field is consi    
640                                                   
641 If field_name contains [] (no subscript), the
642 be a dynamic array, which will only take as mu    
643 is required to hold the array.                    
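
The bracket rule described above can be sketched as a small user-space classifier. This is not the kernel's parser: enum field_kind and classify_field_name() are hypothetical names, and the sketch assumes that a subscript must be a plain decimal number and that the brackets must end the name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a field name; not a kernel type. */
enum field_kind {
	FIELD_SCALAR,        /* no brackets, e.g. "cpu" */
	FIELD_FIXED_ARRAY,   /* "[n]" subscript, e.g. "comm[16]" */
	FIELD_DYNAMIC_ARRAY, /* empty "[]", e.g. "msg[]" */
	FIELD_INVALID,       /* malformed brackets */
};

enum field_kind classify_field_name(const char *name)
{
	const char *open, *close, *p;

	assert(name != NULL);

	open = strchr(name, '[');
	if (!open)
		return FIELD_SCALAR;

	/* Assumption: the brackets follow a non-empty name and end it. */
	close = strchr(open, ']');
	if (!close || close[1] != '\0' || open == name)
		return FIELD_INVALID;

	/* "[]" with nothing inside marks a dynamic array. */
	if (close == open + 1)
		return FIELD_DYNAMIC_ARRAY;

	/* Assumption: a fixed-size subscript consists of decimal digits only. */
	for (p = open + 1; p < close; p++)
		if (!isdigit((unsigned char)*p))
			return FIELD_INVALID;

	return FIELD_FIXED_ARRAY;
}
```

With this model, "comm[16]" is classified as a fixed-size array and "msg[]" as a dynamic array, while a name without brackets is a scalar field.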
644                                                   
645 Because space for an event is reserved before     
646 to the event, using dynamic arrays implies tha    
647 in-kernel API described below can't be used wi    
648 other non-piecewise in-kernel APIs can, howeve    
649 arrays.                                           
650                                                   
651 If the event is created from within a module,     
652 must be passed to synth_event_create().  This     
653 trace buffer won't contain unreadable events w    
654 removed.                                          
655                                                   
656 At this point, the event object is ready to be    
657 events.                                           
658                                                   
659 In the second method, the event is created in     
660 allows events to be created dynamically and wi    
661 and populate an array of fields beforehand.       
662                                                   
663 To use this method, an empty or partially empt    
664 first be created using synth_event_gen_cmd_sta    
665 synth_event_gen_cmd_array_start().  For synth_    
666 the name of the event along with one or more p    
667 representing a 'type field_name;' field specif    
668 supplied.  For synth_event_gen_cmd_array_start    
669 event along with an array of struct synth_fiel    
670 supplied. Before calling synth_event_gen_cmd_s    
671 synth_event_gen_cmd_array_start(), the user sh    
672 initialize a dynevent_cmd object using synth_e    
673                                                   
674 For example, to create a new "schedtest" synth    
675 fields::                                          
676                                                   
677   struct dynevent_cmd cmd;                        
678   char *buf;                                      
679                                                   
680   /* Create a buffer to hold the generated com    
681   buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERN    
682                                                   
683   /* Before generating the command, initialize    
684   synth_event_cmd_init(&cmd, buf, MAX_DYNEVENT    
685                                                   
686   ret = synth_event_gen_cmd_start(&cmd, "sched    
687                                   "pid_t", "ne    
688                                   "u64", "ts_n    
689                                                   
690 Alternatively, using an array of struct synth_    
691 containing the same information::                 
692                                                   
693   ret = synth_event_gen_cmd_array_start(&cmd,     
694                                         fields    
695                                                   
696 Once the synthetic event object has been creat    
697 populated with more fields.  Fields are added     
698 synth_event_add_field(), supplying the dyneven    
699 type, and a field name.  For example, to add a    
700 "intfield", the following call should be made:    
701                                                   
702   ret = synth_event_add_field(&cmd, "int", "in    
703                                                   
704 See synth_field_size() for available types. If    
705 the field is considered to be an array.           
706                                                   
707 A group of fields can also be added all at onc    
708 synth_field_desc with synth_event_add_fields().  For
709 just the first four sched_fields::                
710                                                   
711   ret = synth_event_add_fields(&cmd, sched_fie    
712                                                   
713 If you already have a string of the form 'type    
714 synth_event_add_field_str() can be used to add    
715 also automatically append a ';' to the string.    
716                                                   
717 Once all the fields have been added, the event    
718 registered by calling the synth_event_gen_cmd_    
719                                                   
720   ret = synth_event_gen_cmd_end(&cmd);            
721                                                   
722 At this point, the event object is ready to be    
723 events.                                           
724                                                   
725 7.2 Tracing synthetic events from in-kernel co    
726 ----------------------------------------------    
727                                                   
728 To trace a synthetic event, there are several     
729 option is to trace the event in one call, usin    
730 with a variable number of values, or synth_eve    
731 array of values to be set.  A second option ca    
732 need for a pre-formed array of values or list     
733 synth_event_trace_start() and synth_event_trac    
734 synth_event_add_next_val() or synth_event_add_    
735 piecewise.                                        
736                                                   
737 7.2.1 Tracing a synthetic event all at once       
738 -------------------------------------------       
739                                                   
740 To trace a synthetic event all at once, the sy    
741 synth_event_trace_array() functions can be use    
742                                                   
743 The synth_event_trace() function is passed the    
744 representing the synthetic event (which can be    
745 trace_get_event_file() using the synthetic eve    
746 the system name, and the trace instance name (    
747 trace array)), along with a variable number o
748 synthetic event field, and the number of value    
749                                                   
750 So, to trace an event corresponding to the syn    
751 above, code like the following could be used::    
752                                                   
753   ret = synth_event_trace(create_synth_test, 7    
754                           444,             /*     
755                           (u64)"clackers", /*     
756                           1000000,         /*     
757                           1000,            /*     
758                           smp_processor_id(),/    
759                           (u64)"Thneed",   /*     
760                           999);            /*     
761                                                   
762 All vals should be cast to u64, and string val    
763 strings, cast to u64.  Strings will be copied     
764 the event for the string, using these pointers    
765                                                   
766 Alternatively, the synth_event_trace_array() f    
767 accomplish the same thing.  It is passed the t    
768 representing the synthetic event (which can be    
769 trace_get_event_file() using the synthetic eve    
770 the system name, and the trace instance name (    
771 trace array)), along with an array of u64, one    
772 event field.                                      
773                                                   
774 To trace an event corresponding to the synthet    
775 above, code like the following could be used::    
776                                                   
777   u64 vals[7];                                    
778                                                   
779   vals[0] = 777;                  /* next_pid_    
780   vals[1] = (u64)"tiddlywinks";   /* next_comm    
781   vals[2] = 1000000;              /* ts_ns */     
782   vals[3] = 1000;                 /* ts_ms */     
783   vals[4] = smp_processor_id();   /* cpu */       
784   vals[5] = (u64)"thneed";        /* my_string    
785   vals[6] = 398;                  /* my_int_fi    
786                                                   
787 The 'vals' array is just an array of u64, the     
788 match the number of fields in the synthetic eve
789 the same order as the synthetic event fields.     
790                                                   
791 All vals should be cast to u64, and string val    
792 strings, cast to u64.  Strings will be copied     
793 the event for the string, using these pointers    
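
The "everything is a u64" convention can be tried out in user space. The following is a minimal sketch, assuming a platform where a pointer fits in 64 bits; str_to_val(), val_to_str(), fill_schedtest_vals() and NR_SCHEDTEST_VALS are hypothetical helper names and not part of the trace event API. The sketch goes through uintptr_t, which is the portable way to convert a pointer to an integer in user-space C, rather than casting the pointer directly as the kernel examples above do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SCHEDTEST_VALS 7  /* one slot per field of the example event */

/* Store a string pointer in a 64-bit value slot. */
uint64_t str_to_val(const char *s)
{
	return (uint64_t)(uintptr_t)s;
}

/* Recover the string pointer from a 64-bit value slot. */
const char *val_to_str(uint64_t v)
{
	return (const char *)(uintptr_t)v;
}

/* Fill the slots in the same order as the event's fields. */
void fill_schedtest_vals(uint64_t vals[NR_SCHEDTEST_VALS], int cpu)
{
	assert(vals != NULL);

	vals[0] = 777;                       /* next_pid_field */
	vals[1] = str_to_val("tiddlywinks"); /* next_comm_field */
	vals[2] = 1000000;                   /* ts_ns */
	vals[3] = 1000;                      /* ts_ms */
	vals[4] = (uint64_t)cpu;             /* cpu */
	vals[5] = str_to_val("thneed");      /* my_string_field */
	vals[6] = 398;                       /* my_int_field */
}
```

The slots only carry the pointer values. As described above, the string contents are copied later, using those pointers, so the strings must still be valid when the event is traced.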
794                                                   
795 In order to trace a synthetic event, a pointer    
796 is needed.  The trace_get_event_file() functio    
797 it - it will find the file in the given trace     
798 NULL since the top trace array is being used)     
799 preventing the instance containing it from goi    
800                                                   
801        schedtest_event_file = trace_get_event_    
802                                                   
803                                                   
804 Before tracing the event, it should be enabled    
805 the synthetic event won't actually show up in     
806                                                   
807 To enable a synthetic event from the kernel, t    
808 can be used (which is not specific to syntheti    
809 the "synthetic" system name to be specified ex    
810                                                   
811 To enable the event, pass 'true' to it::          
812                                                   
813        trace_array_set_clr_event(schedtest_eve    
814                                  "synthetic",     
815                                                   
816 To disable it, pass 'false'::
817                                                   
818        trace_array_set_clr_event(schedtest_eve    
819                                  "synthetic",     
820                                                   
821 Finally, synth_event_trace_array() can be used    
822 event, which should be visible in the trace bu    
823                                                   
824        ret = synth_event_trace_array(schedtest    
825                                      ARRAY_SIZ    
826                                                   
827 To remove the synthetic event, the event shoul    
828 trace instance should be 'put' back using trac    
829                                                   
830        trace_array_set_clr_event(schedtest_eve    
831                                  "synthetic",     
832        trace_put_event_file(schedtest_event_fi    
833                                                   
834 If those have been successful, synth_event_del    
835 remove the event::                                
836                                                   
837        ret = synth_event_delete("schedtest");     
838                                                   
839 7.2.2 Tracing a synthetic event piecewise         
840 -----------------------------------------         
841                                                   
842 To trace a synthetic event using the piecewise metho
843 synth_event_trace_start() function is used to     
844 event trace::                                     
845                                                   
846        struct synth_event_trace_state trace_st    
847                                                   
848        ret = synth_event_trace_start(schedtest    
849                                                   
850 It's passed the trace_event_file representing     
851 using the same methods as described above, alo    
852 struct synth_event_trace_state object, which w    
853 used to maintain state between this and follow    
854                                                   
855 Once the event has been opened, which means sp    
856 reserved in the trace buffer, the individual f    
857 are two ways to do that, either one after anot    
858 the event, which requires no lookups, or by na    
859 tradeoff is flexibility in doing the assignmen    
860 lookup per field.                                 
861                                                   
862 To assign the values one after the other witho    
863 synth_event_add_next_val() should be used.  Ea    
864 same synth_event_trace_state object used in th    
865 along with the value to set the next field in     
866 field is set, the 'cursor' points to the next     
867 by the subsequent call, continuing until all t    
868 in order.  The same sequence of calls as in th    
869 this method would be (without error-handling c    
870                                                   
871        /* next_pid_field */                       
872        ret = synth_event_add_next_val(777, &tr    
873                                                   
874        /* next_comm_field */                      
875        ret = synth_event_add_next_val((u64)"sl    
876                                                   
877        /* ts_ns */                                
878        ret = synth_event_add_next_val(1000000,    
879                                                   
880        /* ts_ms */                                
881        ret = synth_event_add_next_val(1000, &t    
882                                                   
883        /* cpu */                                  
884        ret = synth_event_add_next_val(smp_proc    
885                                                   
886        /* my_string_field */                      
887        ret = synth_event_add_next_val((u64)"th    
888                                                   
889        /* my_int_field */                         
890        ret = synth_event_add_next_val(395, &tr    
891                                                   
892 To assign the values in any order, synth_event    
893 used.  Each call is passed the same synth_even    
894 the synth_event_trace_start(), along with the     
895 to set and the value to set it to.  The same s    
896 the above examples using this method would be     
897 code)::                                           
898                                                   
899        ret = synth_event_add_val("next_pid_fie    
900        ret = synth_event_add_val("next_comm_fi    
901                                  &trace_state)    
902        ret = synth_event_add_val("ts_ns", 1000    
903        ret = synth_event_add_val("ts_ms", 1000    
904        ret = synth_event_add_val("cpu", smp_pr    
905        ret = synth_event_add_val("my_string_fi    
906                                  &trace_state)    
907        ret = synth_event_add_val("my_int_field    
908                                                   
909 Note that synth_event_add_next_val() and synth    
910 incompatible if used within the same trace of     
911 can be used but not both at the same time.        
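
The difference between cursor-based and name-based assignment, and the rule that the two cannot be mixed within one trace, can be illustrated with a user-space model. This is only a sketch under stated assumptions: struct model_trace_state and the model_*() functions are hypothetical stand-ins for the kernel objects, the argument order merely mirrors the calls shown above, and returning -EINVAL for a mixed or invalid call is a choice made for this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_FIELDS 8

/* Hypothetical stand-in for the state kept between piecewise calls. */
struct model_trace_state {
	const char *const *names;       /* field names, in event order */
	size_t n_fields;
	uint64_t vals[MODEL_MAX_FIELDS];
	size_t cursor;                  /* next slot for model_add_next_val() */
	bool used_next;                 /* cursor-based assignment was used */
	bool used_name;                 /* name-based assignment was used */
};

void model_trace_start(struct model_trace_state *st,
		       const char *const *names, size_t n_fields)
{
	assert(n_fields <= MODEL_MAX_FIELDS);
	memset(st, 0, sizeof(*st));
	st->names = names;
	st->n_fields = n_fields;
}

/* Set the field under the cursor and advance; no name lookup is needed. */
int model_add_next_val(uint64_t val, struct model_trace_state *st)
{
	if (st->used_name || st->cursor >= st->n_fields)
		return -EINVAL;
	st->used_next = true;
	st->vals[st->cursor++] = val;
	return 0;
}

/* Set a field by name, in any order, at the cost of one lookup per call. */
int model_add_val(const char *name, uint64_t val,
		  struct model_trace_state *st)
{
	size_t i;

	if (st->used_next)
		return -EINVAL;
	st->used_name = true;
	for (i = 0; i < st->n_fields; i++) {
		if (strcmp(st->names[i], name) == 0) {
			st->vals[i] = val;
			return 0;
		}
	}
	return -EINVAL; /* no field with that name */
}
```

The model makes the tradeoff visible: model_add_next_val() is a single array store, while model_add_val() walks the name table on every call.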
912                                                   
913 Finally, the event won't actually be traced un
914 which is done using synth_event_trace_end(), w    
915 struct synth_event_trace_state object used in     
916                                                   
917        ret = synth_event_trace_end(&trace_stat    
918                                                   
919 Note that synth_event_trace_end() must be call    
920 of whether any of the add calls failed (say du    
921 being passed in).                                 
922                                                   
923 7.3 Dynamically creating kprobe and kretprobe e
924 -----------------------------------------------
925                                                   
926 To create a kprobe or kretprobe trace event fr    
927 kprobe_event_gen_cmd_start() or kretprobe_even    
928 functions can be used.                            
929                                                   
930 To create a kprobe event, an empty or partiall    
931 should first be created using kprobe_event_gen    
932 of the event and the probe location should be     
933 or args each representing a probe field should    
934 function.  Before calling kprobe_event_gen_cmd    
935 should create and initialize a dynevent_cmd ob    
936 kprobe_event_cmd_init().                          
937                                                   
938 For example, to create a new "schedtest" kprob    
939                                                   
940   struct dynevent_cmd cmd;                        
941   char *buf;                                      
942                                                   
943   /* Create a buffer to hold the generated com    
944   buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERN    
945                                                   
946   /* Before generating the command, initialize    
947   kprobe_event_cmd_init(&cmd, buf, MAX_DYNEVEN    
948                                                   
949   /*                                              
950    * Define the gen_kprobe_test event with the    
951    * fields.                                      
952    */                                             
953   ret = kprobe_event_gen_cmd_start(&cmd, "gen_    
954                                    "dfd=%ax",     
955                                                   
956 Once the kprobe event object has been created,    
957 populated with more fields.  Fields can be add    
958 kprobe_event_add_fields(), supplying the dynev    
959 with a variable arg list of probe fields.  For    
960 couple additional fields, the following call c    
961                                                   
962   ret = kprobe_event_add_fields(&cmd, "flags=%    
963                                                   
964 Once all the fields have been added, the event    
965 registered by calling the kprobe_event_gen_cmd    
966 kretprobe_event_gen_cmd_end() functions, depen    
967 or kretprobe command was started::                
968                                                   
969   ret = kprobe_event_gen_cmd_end(&cmd);           
970                                                   
971 or::                                              
972                                                   
973   ret = kretprobe_event_gen_cmd_end(&cmd);        
974                                                   
975 At this point, the event object is ready to be    
976 events.                                           
977                                                   
978 Similarly, a kretprobe event can be created us    
979 kretprobe_event_gen_cmd_start() with a probe n    
980 additional params such as $retval::               
981                                                   
982   ret = kretprobe_event_gen_cmd_start(&cmd, "g    
983                                       "do_sys_    
984                                                   
985 Similar to the synthetic event case, code like    
986 used to enable the newly created kprobe event:    
987                                                   
988   gen_kprobe_test = trace_get_event_file(NULL,    
989                                                   
990   ret = trace_array_set_clr_event(gen_kprobe_t    
991                                   "kprobes", "    
992                                                   
993 Finally, also similar to synthetic events, the    
994 used to give the kprobe event file back and de    
995                                                   
996   trace_put_event_file(gen_kprobe_test);          
997                                                   
998   ret = kprobe_event_delete("gen_kprobe_test")    
999                                                   
1000 7.4 The "dynevent_cmd" low-level API             
1001 ------------------------------------             
1002                                                  
1003 Both the in-kernel synthetic event and kprobe    
1004 top of a lower-level "dynevent_cmd" interface    
1005 meant to provide the basis for higher-level i    
1006 synthetic and kprobe interfaces, which can be    
1007                                                  
1008 The basic idea is simple and amounts to provi    
1009 layer that can be used to generate trace even    
1010 generated command strings can then be passed     
1011 and event creation code that already exists i    
1012 subsystem for creating the corresponding trac    
1013                                                  
1014 In a nutshell, the way it works is that the h    
1015 code creates a struct dynevent_cmd object, th    
1016 functions, dynevent_arg_add() and dynevent_ar    
1017 a command string, which finally causes the co    
1018 using the dynevent_create() function.  The de    
1019 are described below.                             
1020                                                  
1021 The first step in building a new command stri    
1022 initialize an instance of a dynevent_cmd.  He    
1023 create a dynevent_cmd on the stack and initia    
1024                                                  
1025   struct dynevent_cmd cmd;                       
1026   char *buf;                                     
1027   int ret;                                       
1028                                                  
1029   buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KER    
1030                                                  
1031   dynevent_cmd_init(&cmd, buf, MAX_DYNEVENT_CMD_LEN, DYNEVEN
1032                     foo_event_run_command);      
1033                                                  
1034 The dynevent_cmd initialization needs to be g    
1035 buffer and the length of the buffer (MAX_DYNE    
1036 for this purpose - at 2k it's generally too b    
1037 on the stack, so is dynamically allocated), a    
1038 is meant to be used to check that further API    
1039 correct command type, and a pointer to an eve    
1040 callback that will be called to actually exec    
1041 command function.                                
1042                                                  
1043 Once that's done, the command string can be b
1044 calls to argument-adding functions.              
1045                                                  
1046 To add a single argument, define and initiali    
1047 or struct dynevent_arg_pair object.  Here's a    
1048 possible arg addition, which is simply to app    
1049 a whitespace-separated argument to the comman    
1050                                                  
1051   struct dynevent_arg arg;                       
1052                                                  
1053   dynevent_arg_init(&arg, NULL, 0);              
1054                                                  
1055   arg.str = name;                                
1056                                                  
1057   ret = dynevent_arg_add(&cmd, &arg);
1058                                                  
1059 The arg object is first initialized using dyn    
1060 this case the parameters are NULL or 0, which    
1061 optional sanity-checking function or separato    
1062 the arg.                                         
1063                                                  
1064 Here's another more complicated example using    
1065 used to create an argument that consists of a    
1066 together as a unit, for example, a 'type fiel    
1067 expression arg e.g. 'flags=%cx'::                
1068                                                  
1069   struct dynevent_arg_pair arg_pair;             
1070                                                  
1071   dynevent_arg_pair_init(&arg_pair, dynevent_    
1072                                                  
1073   arg_pair.lhs = type;                           
1074   arg_pair.rhs = name;                           
1075                                                  
1076   ret = dynevent_arg_pair_add(&cmd, &arg_pair)
1077                                                  
1078 Again, the arg_pair is first initialized, in     
1079 function used to check the sanity of the args    
1080 neither part of the pair is NULL), along with    
1081 to add an operator between the pair (here non    
1082 appended onto the end of the arg pair (here '    
1083                                                  
1084 There's also a dynevent_str_add() function th    
1085 add a string as-is, with no spaces, delimiter    
1086                                                  
1087 Any number of dynevent_*_add() calls can be m    
1088 (until its length surpasses cmd->maxlen).  Wh    
1089 been added and the command string is complete    
1090 do is run the command, which happens by simpl    
1091 dynevent_create()::                              
1092                                                  
1093   ret = dynevent_create(&cmd);                   
1094                                                  
1095 At that point, if the return value is 0, the     
1096 created and is ready to use.                     
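
The string-building idea behind this layer can be modeled in user space. This is a sketch of the general technique only (bounded appends into a caller-supplied buffer) and not the kernel implementation: struct model_cmd and the model_*() functions are hypothetical names, the operator and separator are passed as strings for simplicity, and the -E2BIG and -EINVAL return values are choices made for this model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a command under construction. */
struct model_cmd {
	char *buf;      /* caller-supplied buffer */
	size_t maxlen;  /* size of buf, including the terminating NUL */
	size_t len;     /* current string length */
};

void model_cmd_init(struct model_cmd *cmd, char *buf, size_t maxlen)
{
	assert(maxlen > 0);
	cmd->buf = buf;
	cmd->maxlen = maxlen;
	cmd->len = 0;
	buf[0] = '\0';
}

/* Append str as-is; on overflow the buffer is left unchanged. */
int model_str_add(struct model_cmd *cmd, const char *str)
{
	size_t n = strlen(str);

	if (n >= cmd->maxlen - cmd->len)
		return -E2BIG;
	memcpy(cmd->buf + cmd->len, str, n + 1);
	cmd->len += n;
	return 0;
}

/* Append " <str>": a single whitespace-separated argument. */
int model_arg_add(struct model_cmd *cmd, const char *str)
{
	size_t saved = cmd->len;

	if (model_str_add(cmd, " ") || model_str_add(cmd, str)) {
		cmd->len = saved;        /* roll back a partial append */
		cmd->buf[saved] = '\0';
		return -E2BIG;
	}
	return 0;
}

/* Append " <lhs><op><rhs><sep>", e.g. " u64 ts_ns;" or " flags=%cx". */
int model_arg_pair_add(struct model_cmd *cmd, const char *lhs,
		       const char *op, const char *rhs, const char *sep)
{
	size_t saved = cmd->len;

	if (!lhs || !rhs)
		return -EINVAL;          /* sanity check: both halves are required */
	if (model_str_add(cmd, " ") || model_str_add(cmd, lhs) ||
	    model_str_add(cmd, op) || model_str_add(cmd, rhs) ||
	    model_str_add(cmd, sep)) {
		cmd->len = saved;
		cmd->buf[saved] = '\0';
		return -E2BIG;
	}
	return 0;
}
```

In this model, adding the string "schedtest" and then the pairs ("pid_t", " ", "next_pid_field", ";") and ("u64", " ", "ts_ns", ";") leaves "schedtest pid_t next_pid_field; u64 ts_ns;" in the buffer. The kernel hands the finished string to the command's run callback; the model stops at building the string.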
1097                                                  
1098 See the dynevent_cmd function definitions the    
1099 of the API.                                      
                                                      
