=================================
Using ftrace to hook to functions
=================================

.. Copyright 2017 VMware Inc.
..   Author:   Steven Rostedt <srostedt@goodmis.org>
..  License:   The GNU Free Documentation License, Version 1.2
..               (dual licensed under the GPL v2)

Written for: 4.14

Introduction
============

The ftrace infrastructure was originally created to attach callbacks to the
beginning of functions in order to record and trace the flow of the kernel.
But callbacks to the start of a function can have other use cases, such as
live kernel patching or security monitoring. This document describes
how to use ftrace to implement your own function callbacks.


The ftrace context
==================

.. warning::

  The ability to add a callback to almost any function within the
  kernel comes with risks. A callback can be called from any context
  (normal, softirq, irq, and NMI). Callbacks can also be called just before
  going to idle, during CPU bring up and takedown, or going to user space.
  This requires extra care about what can be done inside a callback. A callback
  can be called outside the protective scope of RCU.

The ftrace infrastructure has some protections against recursions and RCU
but one must still be very careful how they use the callbacks.


The ftrace_ops structure
========================

To register a function callback, a ftrace_ops is required. This structure
is used to tell ftrace what function should be called as the callback
as well as what protections the callback will perform and not require
ftrace to handle.

There is only one field that is needed to be set when registering
an ftrace_ops with ftrace:

.. code-block:: c

    struct ftrace_ops ops = {
            .func      = my_callback_func,
            .flags     = MY_FTRACE_FLAGS,
            .private   = any_private_data_structure,
    };

Both .flags and .private are optional. Only .func is required.

To enable tracing call::

    register_ftrace_function(&ops);

To disable tracing call::

    unregister_ftrace_function(&ops);

The above is defined by including the header::

    #include <linux/ftrace.h>

The registered callback will start being called some time after the
register_ftrace_function() is called and before it returns. The exact time
that callbacks start being called is dependent upon architecture and scheduling
of services. The callback itself will have to handle any synchronization if it
must begin at an exact moment.

The unregister_ftrace_function() will guarantee that the callback is
no longer being called by functions after the unregister_ftrace_function()
returns. Note that to perform this guarantee, the unregister_ftrace_function()
may take some time to finish.


The callback function
=====================

The prototype of the callback function is as follows (as of v4.14):

.. code-block:: c

   void callback_func(unsigned long ip, unsigned long parent_ip,
                      struct ftrace_ops *op, struct pt_regs *regs);

@ip
        This is the instruction pointer of the function that is being traced.
        (where the fentry or mcount is within the function)

@parent_ip
        This is the instruction pointer of the function that called the
        function being traced (where the call of the function occurred).

@op
        This is a pointer to ftrace_ops that was used to register the callback.
        This can be used to pass data to the callback via the private pointer.

@regs
        If the FTRACE_OPS_FL_SAVE_REGS or FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED
        flags are set in the ftrace_ops structure, then this will be pointing
        to the pt_regs structure like it would be if a breakpoint was placed
        at the start of the function where ftrace was tracing. Otherwise it
        either contains garbage, or NULL.
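
The prototype and the use of the private pointer described above can be
sketched as follows. This is only an illustration written against the v4.14
prototype. The structure ``my_hook_data`` and the name ``my_callback_func``
are hypothetical and do not come from the kernel. The callback is marked
``notrace`` so that it is not itself traced, and it does no more than an
atomic increment because it may run in any context:

.. code-block:: c

   #include <linux/atomic.h>
   #include <linux/ftrace.h>

   /* Hypothetical per-ops data; the name and layout are assumptions. */
   struct my_hook_data {
           atomic_long_t hits;
   };

   static void notrace my_callback_func(unsigned long ip,
                                        unsigned long parent_ip,
                                        struct ftrace_ops *op,
                                        struct pt_regs *regs)
   {
           /* The pointer stored in ops.private before registration. */
           struct my_hook_data *data = op->private;

           /*
            * regs is deliberately not touched: without
            * FTRACE_OPS_FL_SAVE_REGS it may be NULL or garbage.
            */
           atomic_long_inc(&data->hits);
   }

Such a callback would be paired with an ftrace_ops whose .private field
points to a ``struct my_hook_data`` that outlives the registration.
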

The ftrace FLAGS
================

The ftrace_ops flags are all defined and documented in include/linux/ftrace.h.
Some of the flags are used for internal infrastructure of ftrace, but the
ones that users should be aware of are the following:

FTRACE_OPS_FL_SAVE_REGS
        If the callback requires reading or modifying the pt_regs
        passed to the callback, then it must set this flag. Registering
        a ftrace_ops with this flag set on an architecture that does not
        support passing of pt_regs to the callback will fail.

FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED
        Similar to SAVE_REGS but the registering of a
        ftrace_ops on an architecture that does not support passing of regs
        will not fail with this flag set. But the callback must check if
        regs is NULL or not to determine if the architecture supports it.

FTRACE_OPS_FL_RECURSION_SAFE
        By default, a wrapper is added around the callback to
        make sure that recursion of the function does not occur. That is,
        if a function that is called as a result of the callback's execution
        is also traced, ftrace will prevent the callback from being called
        again. But this wrapper adds some overhead, and if the callback is
        safe from recursion, it can set this flag to disable the ftrace
        protection.

        Note, if this flag is set, and recursion does occur, it could cause
        the system to crash, and possibly reboot via a triple fault.

        It is OK if another callback traces a function that is called by a
        callback that is marked recursion safe. Recursion safe callbacks
        must never trace any functions that are called by the callback
        itself or any nested functions that those functions call.

        If this flag is set, it is possible that the callback will also
        be called with preemption enabled (when CONFIG_PREEMPTION is set),
        but this is not guaranteed.

FTRACE_OPS_FL_IPMODIFY
        Requires FTRACE_OPS_FL_SAVE_REGS set. If the callback is to "hijack"
        the traced function (have another function called instead of the
        traced function), it requires setting this flag. This is what live
        kernel patching uses. Without this flag the pt_regs->ip can not be
        modified.

        Note, only one ftrace_ops with FTRACE_OPS_FL_IPMODIFY set may be
        registered to any given function at a time.

FTRACE_OPS_FL_RCU
        If this is set, then the callback will only be called by functions
        where RCU is "watching". This is required if the callback function
        performs any rcu_read_lock() operation.

        RCU stops watching when the system goes idle, the time when a CPU
        is taken down and comes back online, and when entering from kernel
        to user space and back to kernel space. During these transitions,
        a callback may be executed and RCU synchronization will not protect
        it.

FTRACE_OPS_FL_PERMANENT
        If this is set on any ftrace ops, then the tracing cannot be disabled
        by writing 0 to the proc sysctl ftrace_enabled. Equally, a callback
        with the flag set cannot be registered if ftrace_enabled is 0.

        Livepatch uses it not to lose the function redirection, so the system
        stays protected.


Filtering which functions to trace
==================================

If a callback is only to be called from specific functions, a filter must be
set up. The filters are added by name, or ip if it is known.

.. code-block:: c

   int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
                         int len, int reset);

@ops
        The ops to set the filter with

@buf
        The string that holds the function filter text.

@len
        The length of the string.

@reset
        Non-zero to reset all filters before applying this filter.

Filters denote which functions should be enabled when tracing is enabled.
If @buf is NULL and reset is set, all functions will be enabled for tracing.

The @buf can also be a glob expression to enable all functions that
match a specific pattern.

See Filter Commands in :file:`Documentation/trace/ftrace.rst`.

To just trace the schedule function:

.. code-block:: c

   ret = ftrace_set_filter(&ops, "schedule", strlen("schedule"), 0);

To add more functions, call the ftrace_set_filter() more than once with the
@reset parameter set to zero. To remove the current filter set and replace it
with new functions defined by @buf, have @reset be non-zero.

To remove all the filtered functions and trace all functions:

.. code-block:: c

   ret = ftrace_set_filter(&ops, NULL, 0, 1);


Sometimes more than one function has the same name. To trace just a specific
function in this case, ftrace_set_filter_ip() can be used.

.. code-block:: c

   ret = ftrace_set_filter_ip(&ops, ip, 0, 0);

Note that the ip must be the address where the call to fentry or mcount is
located in the function. This function is used by perf and kprobes, which
get the ip address from the user (usually using debug info from the kernel).

If a glob is used to set the filter, functions can be added to a "notrace"
list that will prevent those functions from calling the callback.
The "notrace" list takes precedence over the "filter" list. If the
two lists are non-empty and contain the same functions, the callback will not
be called by any function.

An empty "notrace" list means to allow all functions defined by the filter
to be traced.

.. code-block:: c

   int ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
                          int len, int reset);

This takes the same parameters as ftrace_set_filter() but will add the
functions it finds to not be traced. This is a separate list from the
filter list, and this function does not modify the filter list.

A non-zero @reset will clear the "notrace" list before adding functions
that match @buf to it.

Clearing the "notrace" list is the same as clearing the filter list:

.. code-block:: c

   ret = ftrace_set_notrace(&ops, NULL, 0, 1);

The filter and notrace lists may be changed at any time. If only a set of
functions should call the callback, it is best to set the filters before
registering the callback. But the changes may also happen after the callback
has been registered.

If a filter is in place, and the @reset is non-zero, and @buf contains a
matching glob to functions, the switch will happen during the time of
the ftrace_set_filter() call. At no time will all functions call the callback.

.. code-block:: c

   ftrace_set_filter(&ops, "schedule", strlen("schedule"), 1);

   register_ftrace_function(&ops);

   msleep(10);

   ftrace_set_filter(&ops, "try_to_wake_up", strlen("try_to_wake_up"), 1);

is not the same as:

.. code-block:: c

   ftrace_set_filter(&ops, "schedule", strlen("schedule"), 1);

   register_ftrace_function(&ops);

   msleep(10);

   ftrace_set_filter(&ops, NULL, 0, 1);

   ftrace_set_filter(&ops, "try_to_wake_up", strlen("try_to_wake_up"), 0);

As the latter will have a short time where all functions will call
the callback, between the time of the reset, and the time of the
new setting of the filter.
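
Putting the pieces of this document together, a minimal module could look
like the sketch below. It is untested and assumes the v4.14 prototypes shown
earlier. All names prefixed with ``my_`` are hypothetical. The filter is set
before registration, as recommended above, so the callback is never called by
every function, and the function name is kept in a writable array because
ftrace_set_filter() takes an ``unsigned char *``:

.. code-block:: c

   #include <linux/atomic.h>
   #include <linux/ftrace.h>
   #include <linux/module.h>
   #include <linux/printk.h>

   /* Hypothetical counter of how often the traced function was entered. */
   static atomic_long_t my_hits = ATOMIC_LONG_INIT(0);

   static void notrace my_callback_func(unsigned long ip,
                                        unsigned long parent_ip,
                                        struct ftrace_ops *op,
                                        struct pt_regs *regs)
   {
           /* May run in any context, so only do an atomic increment. */
           atomic_long_inc(&my_hits);
   }

   static struct ftrace_ops my_ops = {
           .func = my_callback_func,
   };

   /* Writable buffer holding the filter text. */
   static unsigned char my_filter[] = "schedule";

   static int __init my_hook_init(void)
   {
           int ret;

           /* Restrict the callback to schedule() before registering. */
           ret = ftrace_set_filter(&my_ops, my_filter,
                                   sizeof(my_filter) - 1, 1);
           if (ret)
                   return ret;

           return register_ftrace_function(&my_ops);
   }

   static void __exit my_hook_exit(void)
   {
           /* After this returns the callback is no longer called. */
           unregister_ftrace_function(&my_ops);
           pr_info("my_hook: schedule() entered %ld times\n",
                   atomic_long_read(&my_hits));
   }

   module_init(my_hook_init);
   module_exit(my_hook_exit);
   MODULE_LICENSE("GPL");

No flags are set here, so the default recursion protection described under
FTRACE_OPS_FL_RECURSION_SAFE stays in place at the cost of some overhead.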