================
Delay accounting
================

Tasks encounter delays in execution when they wait
for some kernel resource to become available, e.g. a
runnable task may wait for a free CPU to run on.

The per-task delay accounting functionality measures
the delays experienced by a task while

a) waiting for a CPU (while being runnable)
b) waiting for completion of synchronous block I/O initiated by the task
c) swapping in pages
d) in memory reclaim
e) thrashing
f) in direct compaction
g) in write-protect copy
h) handling IRQ/SOFTIRQ

and makes these statistics available to userspace through
the taskstats interface.

Such delays provide feedback for setting a task's cpu priority,
io priority and rss limit values appropriately. Long delays for
important tasks could be a trigger for raising their corresponding priorities.

The functionality, through its use of the taskstats interface, also provides
delay statistics aggregated for all tasks (or threads) belonging to a
thread group (corresponding to a traditional Unix process). This is a commonly
needed aggregation that is more efficiently done by the kernel.

Userspace utilities, particularly resource management applications, can also
aggregate delay statistics into arbitrary groups. To enable this, delay
statistics of a task are available both during its lifetime and on its
exit, ensuring that continuous and complete monitoring can be done.


Interface
---------

Delay accounting uses the taskstats interface, which is described
in detail in a separate document in this directory. Taskstats returns a
generic data structure to userspace corresponding to per-pid and per-tgid
statistics. The delay accounting functionality populates specific fields of
this structure. See

     include/uapi/linux/taskstats.h

for a description of the fields pertaining to delay accounting.
These will generally be in the form of counters returning the cumulative
delay seen for cpu, sync block I/O, swapin, memory reclaim, page cache
thrashing, direct compact, write-protect copy, IRQ/SOFTIRQ etc.

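
As a rough illustration of the per-pid versus per-tgid statistics mentioned
above, the per-tgid view can be thought of as the field-by-field sum of the
counters of the threads in the group. The Python sketch below models only that
summation. The dictionaries and their values are made up for illustration, and
of the field names only cpu_delay_total appears in this document (cpu_count is
assumed here to be its paired event counter). It does not describe how the
kernel implements the aggregation.

```python
from collections import Counter


def aggregate_threads(per_thread_stats):
    """Sum each counter across the threads of one thread group."""
    totals = Counter()
    for stats in per_thread_stats:
        totals.update(stats)  # Counter.update adds values key by key
    return dict(totals)


# Made-up per-thread readings (delay totals assumed to be in nanoseconds).
threads = [
    {"cpu_count": 5, "cpu_delay_total": 2_000_000},
    {"cpu_count": 3, "cpu_delay_total": 1_382_277},
]

group = aggregate_threads(threads)
print(group)
```

A userspace tool that groups tasks in some other way (for example by
workload) could apply the same summation to whichever per-pid records it
chooses to collect.
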

Taking the difference of two successive readings of a given
counter (say cpu_delay_total) for a task will give the delay
experienced by the task waiting for the corresponding resource
in that interval.

When a task exits, records containing the per-task statistics
are sent to userspace without requiring a command. If it is the last exiting
task of a thread group, the per-tgid statistics are also sent. More details
are given in the taskstats interface description.

The getdelays.c userspace utility in the tools/accounting directory allows
simple commands to be run and the corresponding delay statistics to be
displayed. It also serves as an example of using the taskstats interface.

Usage
-----

Compile the kernel with::

        CONFIG_TASK_DELAY_ACCT=y
        CONFIG_TASKSTATS=y

Delay accounting is disabled by default at boot up.
To enable, add::

   delayacct

to the kernel boot options. The rest of the instructions below assume this has
been done. Alternatively, use sysctl kernel.task_delayacct to switch the state
at runtime.
Note however that only tasks started after enabling it will have
delayacct information.

After the system has booted up, use a utility
similar to getdelays.c to access the delays
seen by a given task or a task group (tgid).
The utility also allows a given command to be
executed and the corresponding delays to be
seen.

General format of the getdelays command::

        getdelays [-dilv] [-t tgid] [-p pid]

Get delays, since system boot, for pid 10::

        # ./getdelays -d -p 10
        (output similar to next case)

Get sum of delays, since system boot, for all pids with tgid 5::

        # ./getdelays -d -t 5
        print delayacct stats ON
        TGID    5


        CPU             count     real total  virtual total    delay total  delay average
                            8        7000000        6872122        3382277          0.423ms
        IO              count    delay total  delay average
                            0              0          0.000ms
        SWAP            count    delay total  delay average
                            0              0          0.000ms
        RECLAIM         count    delay total  delay average
                            0              0          0.000ms
        THRASHING       count    delay total  delay average
                            0              0          0.000ms
        COMPACT         count    delay total  delay average
                            0              0          0.000ms
        WPCOPY          count    delay total  delay average
                            0              0          0.000ms
        IRQ             count    delay total  delay average
                            0              0          0.000ms

Get IO accounting for pid 1 (this works only with -p)::

        # ./getdelays -i -p 1
        printing IO accounting
        linuxrc: read=65536, write=0, cancelled_write=0

The above command can be used with -v to get more debug information.
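
The "delay average" column in the sample output above is consistent with
dividing the cumulative delay total by the event count, with totals taken to
be in nanoseconds. The following is a minimal Python sketch of that
arithmetic, together with the differencing of two successive readings
described in the Interface section. The function names and the second reading
of cpu_delay_total are hypothetical and chosen only for illustration; this is
not the getdelays.c implementation.

```python
NSEC_PER_MSEC = 1_000_000


def average_delay_ms(delay_total_ns, count):
    """Average delay per event in milliseconds; 0.0 when nothing was counted."""
    if count == 0:
        return 0.0
    return delay_total_ns / NSEC_PER_MSEC / count


def interval_delay_ns(earlier_total_ns, later_total_ns):
    """Delay accumulated between two readings of one cumulative counter."""
    return later_total_ns - earlier_total_ns


# CPU row of the sample output above: count=8, delay total=3382277.
cpu_avg = average_delay_ms(3382277, 8)
print(f"CPU delay average: {cpu_avg:.3f}ms")

# Hypothetical later reading of cpu_delay_total for the same task.
delta = interval_delay_ns(3382277, 4000000)
print(f"CPU delay in interval: {delta}ns")
```

Guarding against a zero count keeps the rows with no recorded events at
0.000ms, as in the sample output, instead of dividing by zero.
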