perf-mem(1)
===========

NAME
----
perf-mem - Profile memory accesses

SYNOPSIS
--------
[verse]
'perf mem' [<options>] (record [<command>] | report)

DESCRIPTION
-----------
"perf mem record" runs a command and gathers memory operation data
from it, into perf.data. Perf record options are accepted and are passed through.

"perf mem report" displays the result. It invokes perf report with the
right set of options to display a memory access profile. By default, loads
and stores are sampled. Use the -t option to limit to loads or stores.

Note that on Intel systems the memory latency reported is the use-latency,
not the pure load (or store) latency. Use-latency includes any pipeline
queuing delays in addition to the memory subsystem latency.

On Arm64 this uses SPE to sample load and store operations, therefore hardware
and kernel support is required. See linkperf:perf-arm-spe[1] for a setup guide.
Due to the statistical nature of SPE sampling, not every memory operation will
be sampled.
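
As a rough illustration of the record/report workflow described above, the
following shell sketch samples loads for a command and then displays the
profile. It uses only the -t option documented on this page. The workload
`./myapp` and the helper names `run` and `perf_mem_demo` are placeholders, not
part of perf. The commands are only printed unless DRY_RUN=0 is set, because
the required hardware and kernel support may not be present.

```shell
#!/bin/sh
# Sketch: sample loads for a workload, then view the memory access profile.
# WORKLOAD is a placeholder; substitute any command.
WORKLOAD="${WORKLOAD:-./myapp}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

perf_mem_demo() {
    # Record load samples only (-t load) into perf.data.
    run perf mem -t load record $WORKLOAD
    # Display the memory access profile from perf.data.
    run perf mem -t load report
}

perf_mem_demo
```

Dropping `-t load` from both commands samples loads and stores, which is the
default.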

COMMON OPTIONS
--------------
-f::
--force::
	Don't do ownership validation.

-t::
--type=<type>::
	Select the memory operation type: load or store (default: load,store).

-v::
--verbose::
	Be more verbose (show counter open errors, etc).

-p::
--phys-data::
	Record/Report sample physical addresses.

--data-page-size::
	Record/Report sample data address page size.

RECORD OPTIONS
--------------
<command>...::
	Any command you can specify in a shell.

-e::
--event <event>::
	Event selector. Use 'perf mem record -e list' to list available events.

-K::
--all-kernel::
	Configure all used events to run in kernel space.

-U::
--all-user::
	Configure all used events to run in user space.

--ldlat <n>::
	Specify the desired latency for load events. Supported on Intel and Arm64
	processors only. Ignored on other architectures.

REPORT OPTIONS
--------------
-i::
--input=<file>::
	Input file name.

-C::
--cpu=<cpu>::
	Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a
	comma-separated list with no space: 0,1. Ranges of CPUs are specified with -
	like 0-2. Default is to monitor all CPUs.

-D::
--dump-raw-samples::
	Dump the raw decoded samples on the screen in a format that is easy to parse with
	one sample per line.

-s::
--sort=<key>::
	Group result by given key(s) - multiple keys can be specified
	in CSV format. The keys specific to memory samples are:
	symbol_daddr, symbol_iaddr, dso_daddr, locked, tlb, mem, snoop,
	dcacheline, phys_daddr, data_page_size, blocked.

	- symbol_daddr: name of the data symbol being accessed at the time of the sample
	- symbol_iaddr: name of the code symbol being executed at the time of the sample
	- dso_daddr: name of the library or module containing the data being accessed
	             at the time of the sample
	- locked: whether the bus was locked at the time of the sample
	- tlb: type of tlb access for the data at the time of the sample
	- mem: type of memory access for the data at the time of the sample
	- snoop: type of snoop (if any) for the data at the time of the sample
	- dcacheline: the cacheline the data address is on at the time of the sample
	- phys_daddr: physical address of the data being accessed at the time of the sample
	- data_page_size: the page size of the data being accessed at the time of the sample
	- blocked: reason for the blocked load access for the data at the time of the sample

	And the default sort keys are changed to local_weight, mem, sym, dso,
	symbol_daddr, dso_daddr, snoop, tlb, locked, blocked, local_ins_lat.

-T::
--type-profile::
	Show data-type profile result instead of code symbols. This requires
	the debug information and it will change the default sort keys to:
	mem, snoop, tlb, type.
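
To make the CPU list syntax accepted by --cpu concrete (comma-separated
entries, with ranges written as first-last), the following sketch expands such
a list into individual CPU numbers. The helper `expand_cpu_list` is
hypothetical and only illustrates the notation; perf parses the list itself.

```shell
#!/bin/sh
# Sketch: expand a CPU list such as "0,2-4" into one CPU number per line.
# expand_cpu_list is a hypothetical helper, not part of perf.
expand_cpu_list() {
    # Split on commas, then split each entry on "-" into range bounds.
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        # A single CPU has no range end; treat it as a one-element range.
        last="${last:-$first}"
        cpu="$first"
        while [ "$cpu" -le "$last" ]; do
            echo "$cpu"
            cpu=$((cpu + 1))
        done
    done
}

expand_cpu_list "0,2-4"
```

The list must contain no spaces, matching the form described for --cpu.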

-U::
--hide-unresolved::
	Only display entries resolved to a symbol.

-x::
--field-separator=<separator>::
	Specify the field separator used when dumping raw samples (-D option). By default,
	the separator is the space character.

In addition, all perf report options are valid for report, and all perf record
options are valid for record.

SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-arm-spe[1]