
TOMOYO Linux Cross Reference
Linux/Documentation/trace/user_events.rst


Diff markup

Differences between /Documentation/trace/user_events.rst (Version linux-6.11.5) and /Documentation/trace/user_events.rst (Version linux-6.7.12)


  1 =========================================           1 =========================================
  2 user_events: User-based Event Tracing               2 user_events: User-based Event Tracing
  3 =========================================           3 =========================================
  4                                                     4 
  5 :Author: Beau Belgrave                              5 :Author: Beau Belgrave
  6                                                     6 
  7 Overview                                            7 Overview
  8 --------                                            8 --------
  9 User based trace events allow user processes t      9 User based trace events allow user processes to create events and trace data
 10 that can be viewed via existing tools, such as     10 that can be viewed via existing tools, such as ftrace and perf.
 11 To enable this feature, build your kernel with     11 To enable this feature, build your kernel with CONFIG_USER_EVENTS=y.
 12                                                    12 
 13 Programs can view status of the events via         13 Programs can view status of the events via
 14 /sys/kernel/tracing/user_events_status and can     14 /sys/kernel/tracing/user_events_status and can both register and write
 15 data out via /sys/kernel/tracing/user_events_d     15 data out via /sys/kernel/tracing/user_events_data.
 16                                                    16 
 17 Programs can also use /sys/kernel/tracing/dyna     17 Programs can also use /sys/kernel/tracing/dynamic_events to register and
 18 delete user based events via the u: prefix. Th     18 delete user based events via the u: prefix. The format of the command to
 19 dynamic_events is the same as the ioctl with t     19 dynamic_events is the same as the ioctl with the u: prefix applied. This
 20 requires CAP_PERFMON due to the event persisti     20 requires CAP_PERFMON due to the event persisting, otherwise -EPERM is returned.
 21                                                    21 
 22 Typically programs will register a set of even     22 Typically programs will register a set of events that they wish to expose to
 23 tools that can read trace_events (such as ftra     23 tools that can read trace_events (such as ftrace and perf). The registration
 24 process tells the kernel which address and bit     24 process tells the kernel which address and bit to reflect if any tool has
 25 enabled the event and data should be written.      25 enabled the event and data should be written. The registration will give back
 26 a write index which describes the data when a      26 a write index which describes the data when a write() or writev() is called
 27 on the /sys/kernel/tracing/user_events_data fi     27 on the /sys/kernel/tracing/user_events_data file.
 28                                                    28 
 29 The structures referenced in this document are     29 The structures referenced in this document are contained within the
 30 /include/uapi/linux/user_events.h file in the      30 /include/uapi/linux/user_events.h file in the source tree.
 31                                                    31 
 32 **NOTE:** *Both user_events_status and user_ev     32 **NOTE:** *Both user_events_status and user_events_data are under the tracefs
 33 filesystem and may be mounted at different pat     33 filesystem and may be mounted at different paths than above.*
 34                                                    34 
 35 Registering                                        35 Registering
 36 -----------                                        36 -----------
 37 Registering within a user process is done via      37 Registering within a user process is done via ioctl() out to the
 38 /sys/kernel/tracing/user_events_data file. The     38 /sys/kernel/tracing/user_events_data file. The command to issue is
 39 DIAG_IOCSREG.                                      39 DIAG_IOCSREG.
 40                                                    40 
 41 This command takes a packed struct user_reg as     41 This command takes a packed struct user_reg as an argument::
 42                                                    42 
 43   struct user_reg {                                43   struct user_reg {
 44         /* Input: Size of the user_reg structu     44         /* Input: Size of the user_reg structure being used */
 45         __u32 size;                                45         __u32 size;
 46                                                    46 
 47         /* Input: Bit in enable address to use     47         /* Input: Bit in enable address to use */
 48         __u8 enable_bit;                           48         __u8 enable_bit;
 49                                                    49 
 50         /* Input: Enable size in bytes at addr     50         /* Input: Enable size in bytes at address */
 51         __u8 enable_size;                          51         __u8 enable_size;
 52                                                    52 
 53         /* Input: Flags to use, if any */          53         /* Input: Flags to use, if any */
 54         __u16 flags;                               54         __u16 flags;
 55                                                    55 
 56         /* Input: Address to update when enabl     56         /* Input: Address to update when enabled */
 57         __u64 enable_addr;                         57         __u64 enable_addr;
 58                                                    58 
 59         /* Input: Pointer to string with event     59         /* Input: Pointer to string with event name, description and flags */
 60         __u64 name_args;                           60         __u64 name_args;
 61                                                    61 
 62         /* Output: Index of the event to use w     62         /* Output: Index of the event to use when writing data */
 63         __u32 write_index;                         63         __u32 write_index;
 64   } __attribute__((__packed__));                   64   } __attribute__((__packed__));
 65                                                    65 
 66 The struct user_reg requires all the above inp     66 The struct user_reg requires all the above inputs to be set appropriately.
 67                                                    67 
 68 + size: This must be set to sizeof(struct user     68 + size: This must be set to sizeof(struct user_reg).
 69                                                    69 
 70 + enable_bit: The bit to reflect the event sta     70 + enable_bit: The bit to reflect the event status at the address specified by
 71   enable_addr.                                     71   enable_addr.
 72                                                    72 
 73 + enable_size: The size of the value specified     73 + enable_size: The size of the value specified by enable_addr.
 74   This must be 4 (32-bit) or 8 (64-bit). 64-bi     74   This must be 4 (32-bit) or 8 (64-bit). 64-bit values are only allowed to be
 75   used on 64-bit kernels, however, 32-bit can      75   used on 64-bit kernels, however, 32-bit can be used on all kernels.
 76                                                    76 
 77 + flags: The flags to use, if any.                 77 + flags: The flags to use, if any.
 78   Callers should first attempt to use flags an     78   Callers should first attempt to use flags and retry without flags to ensure
 79   support for lower versions of the kernel. If     79   support for lower versions of the kernel. If a flag is not supported -EINVAL
 80   is returned.                                     80   is returned.
 81                                                    81 
 82 + enable_addr: The address of the value to use     82 + enable_addr: The address of the value to use to reflect event status. This
 83   must be naturally aligned and write accessib     83   must be naturally aligned and write accessible within the user program.
 84                                                    84 
 85 + name_args: The name and arguments to describ     85 + name_args: The name and arguments to describe the event, see command format
 86   for details.                                     86   for details.
 87                                                    87 
 88 The following flags are currently supported.       88 The following flags are currently supported.
 89                                                    89 
 90 + USER_EVENT_REG_PERSIST: The event will not d     90 + USER_EVENT_REG_PERSIST: The event will not delete upon the last reference
 91   closing. Callers may use this if an event sh     91   closing. Callers may use this if an event should exist even after the
 92   process closes or unregisters the event. Req     92   process closes or unregisters the event. Requires CAP_PERFMON otherwise
 93   -EPERM is returned.                              93   -EPERM is returned.
 94                                                    94 
 95 + USER_EVENT_REG_MULTI_FORMAT: The event can c << 
 96   allows programs to prevent themselves from b << 
 97   format changes and they wish to use the same << 
 98   tracepoint name will be in the new format of << 
 99   format of "name". A tracepoint will be creat << 
100   and format. This means if several processes  << 
101   they will use the same tracepoint. If yet an << 
102   but a different format than the other proces << 
103   tracepoint with a new unique id. Recording p << 
104   the various different formats of the event n << 
105   recording. The system name of the tracepoint << 
106   instead of "user_events". This prevents sing << 
107   with any multi-format event names within tra << 
108   a hex string. Recording programs should ensu << 
109   the event name they registered and has a suf << 
110   has hex characters. For example to find all  << 
111   can use the regex "^test\.[0-9a-fA-F]+$".    << 
112                                                << 
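The regex quoted above can be applied with the POSIX regex API. This is a minimal sketch, assuming an event registered under the name "test"; the helper name `is_multi_format_test_event` is hypothetical and not part of the kernel interface.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Pattern quoted in the text above, for an event registered as "test". */
static const char *const MULTI_FORMAT_PATTERN = "^test\\.[0-9a-fA-F]+$";

/* Hypothetical helper: true if name has the form "test.<hex id>". */
bool is_multi_format_test_event(const char *name)
{
    regex_t re;
    bool match;

    /* Extended syntax is required for the "+" quantifier. */
    if (regcomp(&re, MULTI_FORMAT_PATTERN, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    match = regexec(&re, name, 0, NULL, 0) == 0;
    regfree(&re);
    return match;
}
```

A recording program could run each tracepoint name it finds through such a check before attaching.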
113 Upon successful registration the following is      95 Upon successful registration the following is set.
114                                                    96 
115 + write_index: The index to use for this file      97 + write_index: The index to use for this file descriptor that represents this
116   event when writing out data. The index is un     98   event when writing out data. The index is unique to this instance of the file
117   descriptor that was used for the registratio     99   descriptor that was used for the registration. See writing data for details.
118                                                   100 
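As an illustration of the inputs described above, the following sketch fills a local mirror of struct user_reg for a 32-bit enable word. The mirror type `user_reg_mirror` and the helper `fill_user_reg` are hypothetical names used only so the example builds without kernel headers; real programs would include the uapi header and pass the struct to ioctl() with DIAG_IOCSREG, which is not done here.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of struct user_reg as listed above (assumption: real code
 * uses the definition from the uapi header instead). */
struct user_reg_mirror {
    uint32_t size;
    uint8_t  enable_bit;
    uint8_t  enable_size;
    uint16_t flags;
    uint64_t enable_addr;
    uint64_t name_args;
    uint32_t write_index;
} __attribute__((__packed__));

/* Hypothetical helper: set every input field for a 32-bit enable word. */
void fill_user_reg(struct user_reg_mirror *reg, uint32_t *enable_word,
                   uint8_t bit, const char *name_args)
{
    memset(reg, 0, sizeof(*reg));
    reg->size = sizeof(*reg);
    reg->enable_bit = bit;
    reg->enable_size = sizeof(*enable_word); /* 4 bytes: allowed on all kernels */
    reg->flags = 0;                          /* no flags requested */
    reg->enable_addr = (uint64_t)(uintptr_t)enable_word;
    reg->name_args = (uint64_t)(uintptr_t)name_args;
    /* write_index is an output and is left at 0 here. */
}
```

Using a plain uint32_t variable for the enable word keeps it naturally aligned, as enable_addr requires.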
119 User based events show up under tracefs like a    101 User based events show up under tracefs like any other event under the
120 subsystem named "user_events". This means tool    102 subsystem named "user_events". This means tools that wish to attach to the
121 events need to use /sys/kernel/tracing/events/    103 events need to use /sys/kernel/tracing/events/user_events/[name]/enable
122 or perf record -e user_events:[name] when atta    104 or perf record -e user_events:[name] when attaching/recording.
123                                                   105 
124 **NOTE:** The event subsystem name by default     106 **NOTE:** The event subsystem name by default is "user_events". Callers should
125 not assume it will always be "user_events". Op    107 not assume it will always be "user_events". Operators reserve the right in the
126 future to change the subsystem name per-proces    108 future to change the subsystem name per-process to accommodate event isolation.
127 In addition if the USER_EVENT_REG_MULTI_FORMAT << 
128 will have a unique id appended to it and the s << 
129 "user_events_multi" as described above.        << 
130                                                   109 
131 Command Format                                    110 Command Format
132 ^^^^^^^^^^^^^^                                    111 ^^^^^^^^^^^^^^
133 The command string format is as follows::         112 The command string format is as follows::
134                                                   113 
135   name[:FLAG1[,FLAG2...]] [Field1[;Field2...]]    114   name[:FLAG1[,FLAG2...]] [Field1[;Field2...]]
136                                                   115 
137 Supported Flags                                   116 Supported Flags
138 ^^^^^^^^^^^^^^^                                   117 ^^^^^^^^^^^^^^^
139 None yet                                          118 None yet
140                                                   119 
141 Field Format                                      120 Field Format
142 ^^^^^^^^^^^^                                      121 ^^^^^^^^^^^^
143 ::                                                122 ::
144                                                   123 
145   type name [size]                                124   type name [size]
146                                                   125 
147 Basic types are supported (__data_loc, u32, u6    126 Basic types are supported (__data_loc, u32, u64, int, char, char[20], etc).
148 User programs are encouraged to use clearly si    127 User programs are encouraged to use clearly sized types like u32.
149                                                   128 
150 **NOTE:** *Long is not supported since size ca    129 **NOTE:** *Long is not supported since size can vary between user and kernel.*
151                                                   130 
152 The size is only valid for types that start wi    131 The size is only valid for types that start with a struct prefix.
153 This allows user programs to describe custom s    132 This allows user programs to describe custom structs out to tools, if required.
154                                                   133 
155 For example, a struct in C that looks like thi    134 For example, a struct in C that looks like this::
156                                                   135 
157   struct mytype {                                 136   struct mytype {
158     char data[20];                                137     char data[20];
159   };                                              138   };
160                                                   139 
161 Would be represented by the following field::     140 Would be represented by the following field::
162                                                   141 
163   struct mytype myname 20                         142   struct mytype myname 20
164                                                   143 
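The command and field formats above can be assembled with plain string formatting. This is a sketch only: `build_command` is a hypothetical helper, no flags are emitted (none are supported yet), and the field strings are passed in already formatted as "type name [size]".

```c
#include <stdio.h>

/* Hypothetical helper: build "name field1;field2..." into buf.
 * Returns the string length, or -1 if buf is too small. */
int build_command(char *buf, size_t len, const char *name,
                  const char *const *fields, size_t nfields)
{
    size_t used;
    int n = snprintf(buf, len, "%s", name);

    if (n < 0 || (size_t)n >= len)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < nfields; i++) {
        /* A space separates the name from the first field, ";" the rest. */
        n = snprintf(buf + used, len - used, "%s%s",
                     i == 0 ? " " : ";", fields[i]);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

With the fields "u32 count" and "struct mytype myname 20" and the name "test", the buffer would hold the string `test u32 count;struct mytype myname 20`.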
165 Deleting                                          144 Deleting
166 --------                                          145 --------
167 Deleting an event from within a user process i    146 Deleting an event from within a user process is done via ioctl() out to the
168 /sys/kernel/tracing/user_events_data file. The    147 /sys/kernel/tracing/user_events_data file. The command to issue is
169 DIAG_IOCSDEL.                                     148 DIAG_IOCSDEL.
170                                                   149 
171 This command only requires a single string spe    150 This command only requires a single string specifying the event to delete by
172 its name. Delete will only succeed if there ar    151 its name. Delete will only succeed if there are no references left to the
173 event (in both user and kernel space). User pr    152 event (in both user and kernel space). User programs should use a separate file
174 to request deletes than the one used for regis    153 to request deletes than the one used for registration due to this.
175                                                   154 
176 **NOTE:** By default events will auto-delete w    155 **NOTE:** By default events will auto-delete when there are no references left
177 to the event. If programs do not want auto-del    156 to the event. If programs do not want auto-delete, they must use the
178 USER_EVENT_REG_PERSIST flag when registering t    157 USER_EVENT_REG_PERSIST flag when registering the event. Once that flag is used
179 the event exists until DIAG_IOCSDEL is invoked    158 the event exists until DIAG_IOCSDEL is invoked. Both register and delete of an
180 event that persists requires CAP_PERFMON, othe !! 159 event that persists requires CAP_PERFMON, otherwise -EPERM is returned.
181 there are multiple formats of the same event n << 
182 name will be attempted to be deleted. If only  << 
183 be deleted then the /sys/kernel/tracing/dynami << 
184 that specific format of the event.             << 
185                                                   160 
186 Unregistering                                     161 Unregistering
187 -------------                                     162 -------------
188 If after registering an event it is no longer     163 If after registering an event it is no longer wanted to be updated then it can
189 be disabled via ioctl() out to the /sys/kernel    164 be disabled via ioctl() out to the /sys/kernel/tracing/user_events_data file.
190 The command to issue is DIAG_IOCSUNREG. This i    165 The command to issue is DIAG_IOCSUNREG. This is different than deleting, where
191 deleting actually removes the event from the s    166 deleting actually removes the event from the system. Unregistering simply tells
192 the kernel your process is no longer intereste    167 the kernel your process is no longer interested in updates to the event.
193                                                   168 
194 This command takes a packed struct user_unreg     169 This command takes a packed struct user_unreg as an argument::
195                                                   170 
196   struct user_unreg {                             171   struct user_unreg {
197         /* Input: Size of the user_unreg struc    172         /* Input: Size of the user_unreg structure being used */
198         __u32 size;                               173         __u32 size;
199                                                   174 
200         /* Input: Bit to unregister */            175         /* Input: Bit to unregister */
201         __u8 disable_bit;                         176         __u8 disable_bit;
202                                                   177 
203         /* Input: Reserved, set to 0 */           178         /* Input: Reserved, set to 0 */
204         __u8 __reserved;                          179         __u8 __reserved;
205                                                   180 
206         /* Input: Reserved, set to 0 */           181         /* Input: Reserved, set to 0 */
207         __u16 __reserved2;                        182         __u16 __reserved2;
208                                                   183 
209         /* Input: Address to unregister */        184         /* Input: Address to unregister */
210         __u64 disable_addr;                       185         __u64 disable_addr;
211   } __attribute__((__packed__));                  186   } __attribute__((__packed__));
212                                                   187 
213 The struct user_unreg requires all the above i    188 The struct user_unreg requires all the above inputs to be set appropriately.
214                                                   189 
215 + size: This must be set to sizeof(struct user    190 + size: This must be set to sizeof(struct user_unreg).
216                                                   191 
217 + disable_bit: This must be set to the bit to     192 + disable_bit: This must be set to the bit to disable (same bit that was
218   previously registered via enable_bit).          193   previously registered via enable_bit).
219                                                   194 
220 + disable_addr: This must be set to the addres    195 + disable_addr: This must be set to the address to disable (same address that was
221   previously registered via enable_addr).         196   previously registered via enable_addr).
222                                                   197 
223 **NOTE:** Events are automatically unregistere    198 **NOTE:** Events are automatically unregistered when execve() is invoked. During
224 fork() the registered events will be retained     199 fork() the registered events will be retained and must be unregistered manually
225 in each process if wanted.                        200 in each process if wanted.
226                                                   201 
227 Status                                            202 Status
228 ------                                            203 ------
229 When tools attach/record user based events the    204 When tools attach/record user based events the status of the event is updated
230 in realtime. This allows user programs to only    205 in realtime. This allows user programs to only incur the cost of the write() or
231 writev() calls when something is actively atta    206 writev() calls when something is actively attached to the event.
232                                                   207 
233 The kernel will update the specified bit that     208 The kernel will update the specified bit that was registered for the event as
234 tools attach/detach from the event. User progr    209 tools attach/detach from the event. User programs simply check if the bit is set
235 to see if something is attached or not.           210 to see if something is attached or not.
236                                                   211 
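The bit check described above could look like the following sketch, assuming a 32-bit enable word (enable_size of 4) and a bit number below 32. The helper name `event_enabled` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the registered bit is set in the 32-bit
 * enable word. The volatile read stops the compiler from caching the
 * value, since the kernel changes it as tools attach and detach. */
bool event_enabled(const uint32_t *enable_word, uint8_t bit)
{
    uint32_t value = *(const volatile uint32_t *)enable_word;

    return (value >> bit) & 1u;
}
```

A program would typically guard its write() or writev() call with this check so that no system call is made while nothing is attached.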
237 Administrators can easily check the status of     212 Administrators can easily check the status of all registered events by reading
238 the user_events_status file directly via a ter    213 the user_events_status file directly via a terminal. The output is as follows::
239                                                   214 
240   Name [# Comments]                               215   Name [# Comments]
241   ...                                             216   ...
242                                                   217 
243   Active: ActiveCount                             218   Active: ActiveCount
244   Busy: BusyCount                                 219   Busy: BusyCount
245                                                   220 
246 For example, on a system that has a single eve    221 For example, on a system that has a single event the output looks like this::
247                                                   222 
248   test                                            223   test
249                                                   224 
250   Active: 1                                       225   Active: 1
251   Busy: 0                                         226   Busy: 0
252                                                   227 
253 If a user enables the user event via ftrace, t    228 If a user enables the user event via ftrace, the output would change to this::
254                                                   229 
255   test # Used by ftrace                           230   test # Used by ftrace
256                                                   231 
257   Active: 1                                       232   Active: 1
258   Busy: 1                                         233   Busy: 1
259                                                   234 
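For tools that want the counts rather than a terminal view, the output shown above can be parsed from a buffer. This sketch assumes the text has already been read from the status file and that the "Active:" and "Busy:" lines appear exactly as in the examples; `parse_status_counts` is a hypothetical helper.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the Active and Busy counts from status text.
 * Returns 0 on success, -1 if either line is missing or malformed. */
int parse_status_counts(const char *text, int *active, int *busy)
{
    /* Both summary lines start at the beginning of a line. */
    const char *a = strstr(text, "\nActive: ");
    const char *b = strstr(text, "\nBusy: ");

    if (!a || !b)
        return -1;
    if (sscanf(a, "\nActive: %d", active) != 1)
        return -1;
    if (sscanf(b, "\nBusy: %d", busy) != 1)
        return -1;
    return 0;
}
```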
260 Writing Data                                      235 Writing Data
261 ------------                                      236 ------------
262 After registering an event the same fd that wa    237 After registering an event the same fd that was used to register can be used
263 to write an entry for that event. The write_in    238 to write an entry for that event. The write_index returned must be at the start
264 of the data, then the remaining data is treate    239 of the data, then the remaining data is treated as the payload of the event.
265                                                   240 
266 For example, if write_index returned was 1 and    241 For example, if write_index returned was 1 and I wanted to write out an int
267 payload of the event. Then the data would have    242 payload of the event. Then the data would have to be 8 bytes (2 ints) in size,
268 with the first 4 bytes being equal to 1 and th    243 with the first 4 bytes being equal to 1 and the last 4 bytes being equal to the
269 value I want as the payload.                      244 value I want as the payload.
270                                                   245 
271 In memory this would look like this::             246 In memory this would look like this::
272                                                   247 
273   int index;                                      248   int index;
274   int payload;                                    249   int payload;
275                                                   250 
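The layout above can be produced by copying the index and the payload into one buffer before a single write(). This is a minimal sketch assuming a 4-byte int, as the text does; `pack_entry` is a hypothetical helper and the write() itself is left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: place write_index first, then the int payload.
 * buf must hold at least sizeof(uint32_t) + sizeof(int) bytes.
 * Returns the number of bytes to hand to write(). */
size_t pack_entry(unsigned char *buf, uint32_t write_index, int payload)
{
    memcpy(buf, &write_index, sizeof(write_index));
    memcpy(buf + sizeof(write_index), &payload, sizeof(payload));
    return sizeof(write_index) + sizeof(payload);
}
```

Using memcpy() avoids any alignment or aliasing concerns when the buffer is a plain byte array.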
276 User programs might have well known structs th    251 User programs might have well known structs that they wish to use to emit out
277 as payloads. In those cases writev() can be us    252 as payloads. In those cases writev() can be used, with the first vector being
278 the index and the following vector(s) being th    253 the index and the following vector(s) being the actual event payload.
279                                                   254 
280 For example, if I have a struct like this::       255 For example, if I have a struct like this::
281                                                   256 
282   struct payload {                                257   struct payload {
283         int src;                                  258         int src;
284         int dst;                                  259         int dst;
285         int flags;                                260         int flags;
286   } __attribute__((__packed__));                  261   } __attribute__((__packed__));
287                                                   262 
288 It's advised for user programs to do the follo    263 It's advised for user programs to do the following::
289                                                   264 
290   struct iovec io[2];                             265   struct iovec io[2];
291   struct payload e;                               266   struct payload e;
292                                                   267 
293   io[0].iov_base = &write_index;                  268   io[0].iov_base = &write_index;
294   io[0].iov_len = sizeof(write_index);            269   io[0].iov_len = sizeof(write_index);
295   io[1].iov_base = &e;                            270   io[1].iov_base = &e;
296   io[1].iov_len = sizeof(e);                      271   io[1].iov_len = sizeof(e);
297                                                   272 
298   writev(fd, (const struct iovec*)io, 2);         273   writev(fd, (const struct iovec*)io, 2);
299                                                   274 
300 **NOTE:** *The write_index is not emitted out     275 **NOTE:** *The write_index is not emitted out into the trace being recorded.*
301                                                   276 
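The writev() pattern above can be exercised without tracefs by pointing it at a pipe. In this sketch the pipe is only a stand-in for the user_events_data file descriptor, the write_index value of 1 is made up, and `emit_payload` and `demo_emit_to_pipe` are hypothetical names.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

struct payload {
    int src;
    int dst;
    int flags;
} __attribute__((__packed__));

/* Gather the index and the payload into one writev() call on fd. */
ssize_t emit_payload(int fd, uint32_t write_index, const struct payload *e)
{
    struct iovec io[2];

    io[0].iov_base = &write_index;
    io[0].iov_len = sizeof(write_index);
    io[1].iov_base = (void *)e; /* iov_base is not const-qualified */
    io[1].iov_len = sizeof(*e);

    return writev(fd, io, 2);
}

/* Stand-in demo: a pipe replaces the user_events_data fd. */
ssize_t demo_emit_to_pipe(void)
{
    struct payload e = { .src = 1, .dst = 2, .flags = 0 };
    int fds[2];
    ssize_t written;

    if (pipe(fds) != 0)
        return -1;
    written = emit_payload(fds[1], 1, &e);
    close(fds[0]);
    close(fds[1]);
    return written;
}
```

On success the return value is the combined size of the index and the payload, which is what a real event write would also report.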
302 Example Code                                      277 Example Code
303 ------------                                      278 ------------
304 See sample code in samples/user_events.           279 See sample code in samples/user_events.
                                                      
