  1 perf-mem(1)                                        1 perf-mem(1)
  2 ===========                                        2 ===========
  3                                                    3
  4 NAME                                               4 NAME
  5 ----                                               5 ----
  6 perf-mem - Profile memory accesses                 6 perf-mem - Profile memory accesses
  7                                                    7
  8 SYNOPSIS                                           8 SYNOPSIS
  9 --------                                           9 --------
 10 [verse]                                           10 [verse]
 11 'perf mem' [<options>] (record [<command>] | r    11 'perf mem' [<options>] (record [<command>] | report)
 12                                                   12
 13 DESCRIPTION                                       13 DESCRIPTION
 14 -----------                                       14 -----------
 15 "perf mem record" runs a command and gathers m    15 "perf mem record" runs a command and gathers memory operation data
 16 from it, into perf.data. Perf record options a    16 from it, into perf.data. Perf record options are accepted and are passed through.
 17                                                   17
 18 "perf mem report" displays the result. It invo    18 "perf mem report" displays the result. It invokes perf report with the
 19 right set of options to display a memory acces    19 right set of options to display a memory access profile. By default, loads
 20 and stores are sampled. Use the -t option to l    20 and stores are sampled. Use the -t option to limit to loads or stores.
 21                                                   21
 22 Note that on Intel systems the memory latency     22 Note that on Intel systems the memory latency reported is the use-latency,
 23 not the pure load (or store latency). Use late    23 not the pure load (or store latency). Use latency includes any pipeline
 24 queuing delays in addition to the memory subsy !! 24 queueing delays in addition to the memory subsystem latency.
 25                                                   25
 26 On Arm64 this uses SPE to sample load and stor !! 26 OPTIONS
 27 and kernel support is required. See linkperf:p !! 27 -------
 28 Due to the statistical nature of SPE sampling, !! 28 <command>...::
 29 be sampled.                                    !! 29 Any command you can specify in a shell.
                                                   >> 30
                                                   >> 31 -i::
                                                   >> 32 --input=<file>::
                                                   >> 33 Input file name.
 30                                                   34
 31 COMMON OPTIONS                                 <<
 32 --------------                                 <<
 33 -f::                                              35 -f::
 34 --force::                                         36 --force::
 35 Don't do ownership validation                     37 Don't do ownership validation
 36                                                   38
 37 -t::                                              39 -t::
 38 --type=<type>::                                   40 --type=<type>::
 39 Select the memory operation type: load            41 Select the memory operation type: load or store (default: load,store)
 40                                                   42
 41 -v::                                           !! 43 -D::
 42 --verbose::                                    !! 44 --dump-raw-samples::
 43 Be more verbose (show counter open err         !! 45 Dump the raw decoded samples on the screen in a format that is easy to parse with
                                                   >> 46 one sample per line.
                                                   >> 47
                                                   >> 48 -x::
                                                   >> 49 --field-separator=<separator>::
                                                   >> 50 Specify the field separator used when dump raw samples (-D option). By default,
                                                   >> 51 The separator is the space character.
                                                   >> 52
                                                   >> 53 -C::
                                                   >> 54 --cpu=<cpu>::
                                                   >> 55 Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a
                                                   >> 56 comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. Default
                                                   >> 57 is to monitor all CPUS.
                                                   >> 58 -U::
                                                   >> 59 --hide-unresolved::
                                                   >> 60 Only display entries resolved to a symbol.
 44                                                   61
 45 -p::                                              62 -p::
 46 --phys-data::                                     63 --phys-data::
 47 Record/Report sample physical addresse            64 Record/Report sample physical addresses
 48                                                   65
 49 --data-page-size::                             <<
 50 Record/Report sample data address page         <<
 51                                                <<
 52 RECORD OPTIONS                                    66 RECORD OPTIONS
 53 --------------                                    67 --------------
 54 <command>...::                                 <<
 55 Any command you can specify in a shell         <<
 56                                                <<
 57 -e::                                              68 -e::
 58 --event <event>::                                 69 --event <event>::
 59 Event selector. Use 'perf mem record -            70 Event selector. Use 'perf mem record -e list' to list available events.
 60                                                   71
 61 -K::                                              72 -K::
 62 --all-kernel::                                    73 --all-kernel::
 63 Configure all used events to run in ke            74 Configure all used events to run in kernel space.
 64                                                   75
 65 -U::                                              76 -U::
 66 --all-user::                                      77 --all-user::
 67 Configure all used events to run in us            78 Configure all used events to run in user space.
 68                                                   79
 69 --ldlat <n>::                                  !! 80 -v::
 70 Specify desired latency for loads even         !! 81 --verbose::
 71 processors only. Ignored on other arch         !! 82 Be more verbose (show counter open errors, etc)
 72                                                <<
 73 REPORT OPTIONS                                 <<
 74 --------------                                 <<
 75 -i::                                           <<
 76 --input=<file>::                               <<
 77 Input file name.                               <<
 78                                                <<
 79 -C::                                           <<
 80 --cpu=<cpu>::                                  <<
 81 Monitor only on the list of CPUs provi         <<
 82 comma-separated list with no space: 0,         <<
 83 like 0-2. Default is to monitor all CP         <<
 84                                                <<
 85 -D::                                           <<
 86 --dump-raw-samples::                           <<
 87 Dump the raw decoded samples on the sc         <<
 88 one sample per line.                           <<
 89                                                <<
 90 -s::                                           <<
 91 --sort=<key>::                                 <<
 92 Group result by given key(s) - multipl         <<
 93 in CSV format. The keys are specific           <<
 94 symbol_daddr, symbol_iaddr, dso_daddr,         <<
 95 dcacheline, phys_daddr, data_page_size         <<
 96                                                <<
 97 - symbol_daddr: name of data symbol be         <<
 98 - symbol_iaddr: name of code symbol be         <<
 99 - dso_daddr: name of library or module         <<
100 on at the time of the sam                      <<
101 - locked: whether the bus was locked a         <<
102 - tlb: type of tlb access for the data         <<
103 - mem: type of memory access for the d         <<
104 - snoop: type of snoop (if any) for th         <<
105 - dcacheline: the cacheline the data a         <<
106 - phys_daddr: physical address of data         <<
107 - data_page_size: the data page size o         <<
108 - blocked: reason of blocked load acce         <<
109                                                <<
110 And the default sort keys are changed          <<
111 symbol_daddr, dso_daddr, snoop, tlb, l         <<
112                                                <<
113 -T::                                           <<
114 --type-profile::                               <<
115 Show data-type profile result instead          <<
116 the debug information and it will chan         <<
117 mem, snoop, tlb, type.                         <<
118                                                <<
119 -U::                                           <<
120 --hide-unresolved::                            <<
121 Only display entries resolved to a sym         <<
122                                                   83
123 -x::                                           !! 84 --ldlat <n>::
124 --field-separator=<separator>::                !! 85 Specify desired latency for loads event.
125 Specify the field separator used when          <<
126 The separator is the space character.          <<
127                                                   86
128 In addition, for report all perf report option    87 In addition, for report all perf report options are valid, and for record
129 all perf record options.                          88 all perf record options.
130                                                   89
131 SEE ALSO                                          90 SEE ALSO
132 --------                                          91 --------
133 linkperf:perf-record[1], linkperf:perf-report[ !! 92 linkperf:perf-record[1], linkperf:perf-report[1]
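
The --cpu option described above takes a comma-separated list with no spaces (0,1) and ranges written with a dash (0-2). As a rough illustration of that syntax only, the Python sketch below expands such a string into individual CPU numbers. The function name parse_cpu_list is hypothetical and is not part of perf; perf's own parser may accept or reject inputs differently.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "0,1" or "0-2" into sorted CPU numbers.

    Hypothetical helper for illustration; it only mirrors the syntax
    described for --cpu and is not taken from the perf sources.
    """
    # The option text says the list is written with no spaces.
    if any(ch.isspace() for ch in spec):
        raise ValueError(f"CPU list must not contain spaces: {spec!r}")

    cpus = set()
    for item in spec.split(","):
        if not item:
            raise ValueError(f"empty entry in CPU list: {spec!r}")
        if "-" in item:
            # A range such as "0-2" covers both endpoints.
            first, last = item.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError(f"descending range in CPU list: {item!r}")
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(item))
    return sorted(cpus)
```

For instance, under these assumptions parse_cpu_list("0,1") yields the CPUs 0 and 1, and parse_cpu_list("0-2") yields 0, 1 and 2.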