perf-bench(1)
=============

NAME
----
perf-bench - General framework for benchmark suites

SYNOPSIS
--------
[verse]
'perf bench' [<common options>] <subsystem> <suite> [<options>]

DESCRIPTION
-----------
This 'perf bench' command is a general framework for benchmark suites.

COMMON OPTIONS
--------------
-r::
--repeat=::
Specify number of times to repeat the run (default 10).

-f::
--format=::
Specify format style.
Currently available format styles are:

'default'::
Default style. This is mainly for human reading.
---------------------
% perf bench sched pipe                      # with no style specified
(executing 1000000 pipe operations between two tasks)
        Total time:5.855 sec
                5.855061 usecs/op
                170792 ops/sec
---------------------

'simple'::
This simple style is friendly for automated
processing by scripts.
---------------------
% perf bench --format=simple sched pipe      # specified simple
5.988
---------------------

SUBSYSTEM
---------

'sched'::
Scheduler and IPC mechanisms.

'syscall'::
System call performance (throughput).

'mem'::
Memory access performance.

'numa'::
NUMA scheduling and MM benchmarks.

'futex'::
Futex stressing benchmarks.

'epoll'::
Eventpoll (epoll) stressing benchmarks.

'internals'::
Benchmark internal perf functionality.

'uprobe'::
Benchmark overhead of uprobe + BPF.

'all'::
All benchmark subsystems.

SUITES FOR 'sched'
~~~~~~~~~~~~~~~~~~
*messaging*::
Suite for evaluating performance of scheduler and IPC mechanisms.
Based on hackbench by Rusty Russell.
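
The *messaging* example output in this page reports 400 processes for 10 groups and 800 threads for 20 groups, with 20 senders and 20 receivers per group. A minimal Python sketch of that arithmetic follows. The function name `messaging_tasks` and the two constants are hypothetical names chosen for illustration and are not part of perf; the default of 10 groups is taken from the example run with default options.

```python
# Hypothetical helper (not part of perf): number of tasks that
# 'perf bench sched messaging' starts for a given number of groups,
# based on the figures printed in this page's example output.
SENDERS_PER_GROUP = 20    # "20 sender and receiver processes per group"
RECEIVERS_PER_GROUP = 20


def messaging_tasks(groups: int = 10) -> int:
    """Return the total number of sender and receiver tasks."""
    if groups < 1:
        raise ValueError("groups must be a positive integer")
    return groups * (SENDERS_PER_GROUP + RECEIVERS_PER_GROUP)


if __name__ == "__main__":
    for groups in (10, 20):
        print(f"{groups} groups == {messaging_tasks(groups)} tasks run")
```

This can help when choosing a value for `-g`, since the task count grows by 40 for every additional group.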

Options of *messaging*
^^^^^^^^^^^^^^^^^^^^^^
-p::
--pipe::
Use pipe() instead of socketpair().

-t::
--thread::
Use multiple threads instead of multiple processes.

-g::
--group=::
Specify number of groups.

-l::
--nr_loops=::
Specify number of loops.

Example of *messaging*
^^^^^^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched messaging                 # run with default options
(20 sender and receiver processes per group)
(10 groups == 400 processes run)

      Total time:0.308 sec

% perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
(20 sender and receiver threads per group)
(20 groups == 800 threads run)

      Total time:0.582 sec
---------------------

*pipe*::
Suite for pipe() system call.
Based on pipe-test-1m.c by Ingo Molnar.

Options of *pipe*
^^^^^^^^^^^^^^^^^
-l::
--loop=::
Specify number of loops.

-G::
--cgroups=::
Names of cgroups for sender and receiver, separated by a comma.
This is useful to check cgroup context switching overhead.
Note that perf neither creates nor deletes the cgroups, so users should
make sure that the cgroups exist and are accessible before use.


Example of *pipe*
^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched pipe
(executing 1000000 pipe operations between two tasks)

        Total time:8.091 sec
                8.091833 usecs/op
                123581 ops/sec

% perf bench sched pipe -l 1000              # loop 1000
(executing 1000 pipe operations between two tasks)

        Total time:0.016 sec
                16.948000 usecs/op
                59004 ops/sec

% perf bench sched pipe -G AAA,BBB
(executing 1000000 pipe operations between cgroups)
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes

     Total time: 6.886 [sec]

       6.886208 usecs/op
         145217 ops/sec

---------------------

SUITES FOR 'syscall'
~~~~~~~~~~~~~~~~~~~~
*basic*::
Suite for evaluating performance of core system call throughput (both usecs/op and ops/sec metrics).
This uses a single thread simply doing getppid(2), which is a simple syscall where the result is not
cached by glibc.


SUITES FOR 'mem'
~~~~~~~~~~~~~~~~
*memcpy*::
Suite for evaluating performance of simple memory copy in various ways.

Options of *memcpy*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify size of memory to copy (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify function to copy (default: default).
Available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.

-l::
--nr_loops::
Repeat memcpy invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of gettimeofday syscall.

*memset*::
Suite for evaluating performance of simple memory set in various ways.

Options of *memset*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify size of memory to set (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify function to set (default: default).
Available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported.

-l::
--nr_loops::
Repeat memset invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of gettimeofday syscall.

SUITES FOR 'numa'
~~~~~~~~~~~~~~~~~
*mem*::
Suite for evaluating NUMA workloads.

SUITES FOR 'futex'
~~~~~~~~~~~~~~~~~~
*hash*::
Suite for evaluating hash tables.

*wake*::
Suite for evaluating wake calls.

*wake-parallel*::
Suite for evaluating parallel wake calls.

*requeue*::
Suite for evaluating requeue calls.

*lock-pi*::
Suite for evaluating futex lock_pi calls.

SUITES FOR 'epoll'
~~~~~~~~~~~~~~~~~~
*wait*::
Suite for evaluating concurrent epoll_wait calls.

*ctl*::
Suite for evaluating multiple epoll_ctl calls.

SUITES FOR 'internals'
~~~~~~~~~~~~~~~~~~~~~~
*synthesize*::
Suite for evaluating perf's event synthesis performance.

SEE ALSO
--------
linkperf:perf[1]
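
NOTES
-----
The 'default' and 'simple' format styles described under COMMON OPTIONS can both be consumed by scripts. The following Python sketch shows one possible way to extract the figures from captured output. The function names `parse_default_output` and `parse_simple_output` are hypothetical, the regular expressions are derived only from the sample output shown in this page, and the sketch assumes that the single number printed by the 'simple' style is the total time in seconds.

```python
import re

# Patterns derived from the sample output in this page (an assumption:
# other suites or perf versions may print different lines).
_TOTAL = re.compile(r"Total time:\s*([0-9.]+)")
_USECS = re.compile(r"([0-9.]+)\s+usecs/op")
_OPS = re.compile(r"([0-9]+)\s+ops/sec")


def parse_default_output(text: str) -> dict:
    """Extract figures from 'default'-style output; absent fields are omitted."""
    result = {}
    for key, pattern, convert in (
        ("total_sec", _TOTAL, float),
        ("usecs_per_op", _USECS, float),
        ("ops_per_sec", _OPS, int),
    ):
        match = pattern.search(text)
        if match:
            result[key] = convert(match.group(1))
    return result


def parse_simple_output(text: str) -> float:
    """Parse 'simple'-style output, assumed to be one number (seconds)."""
    return float(text.strip())
```

The text passed to these functions would typically be the captured standard output of a 'perf bench' run. For automated processing, the 'simple' style is the easier of the two to parse because it needs no pattern matching.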