Linux/tools/perf/Documentation/perf-arm-spe.txt

Differences between /tools/perf/Documentation/perf-arm-spe.txt (Version linux-6.12-rc7) and /tools/perf/Documentation/perf-arm-spe.txt (Version linux-4.17.19)


perf-arm-spe(1)
================

NAME
----
perf-arm-spe - Support for Arm Statistical Pro

SYNOPSIS
--------
[verse]
'perf record' -e arm_spe//

DESCRIPTION
-----------

The SPE (Statistical Profiling Extension) feat
events down to individual instructions. Rathe
instruction to sample and then captures data f
in cycles. For loads and stores it also includ

The sampling has 5 stages:

  1. Choose an operation
  2. Collect data about the operation
  3. Optionally discard the record based on a
  4. Write the record to memory
  5. Interrupt when the buffer is full

Choose an operation
~~~~~~~~~~~~~~~~~~~

This is chosen from a sample population, for S
architectural instructions or all micro-ops. S
architecture provides a mechanism for the SPE
sample. This minimum interval is used by the d
perturbation is also added to the sampling int

Collect data about the operation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Program counter, PMU events, timings and data
Sampling ensures there is only one sampled ope

Optionally discard the record based on a filte
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Based on programmable criteria, choose whether
discarded then the flow stops here for this sa

Write the record to memory
~~~~~~~~~~~~~~~~~~~~~~~~~~

The record is appended to a memory buffer

Interrupt when the buffer is full
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the buffer fills, an interrupt is sent an
Perf saves the raw data in the perf.data file.

Opening the file
----------------

Up until this point no decoding of the SPE dat
recorded file is opened with 'perf report' or
the data, Perf generates "synthetic samples" a
recording. These samples are the same as if no
although they may have more attributes associa
just the instruction pointer, but an SPE sampl

Why Sampling?
-------------

 - Sampling, rather than tracing, cuts down th
 hardware. Only one sampled operation is in fl

 - Allows precise attribution data, including:
 addresses.

 - Allows correlation between an instruction a
 indicates which particular cache was hit, but
 different implementations can have different

However, SPE does not provide any call-graph i

Collisions
----------

When an operation is sampled while a previous
occurs. The new sample is dropped. Collisions
should be set to avoid collisions.

The 'sample_collision' PMU event can be used t
count is based on collisions _before_ filterin
number for samples dropped that would have mad
guide.

The effect of microarchitectural sampling
-----------------------------------------

If an implementation samples micro-operations
be weighted accordingly.

For example, if a given instruction A is alway
becomes twice as likely to appear in the sampl

The coarse effect of conversions, and, if appl
estimated from the 'sample_pop' and 'inst_reti
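
As a rough sketch of such weighting, the snippet below divides a raw sample
count by an assumed number of micro-operations per instruction. The helper name
'spe_weighted_count' and the per-instruction micro-operation counts are
hypothetical; the counts would have to come from knowledge of the
implementation, not from this document.

```shell
# Hypothetical helper: scale a raw sample count by the number of
# micro-operations an instruction is assumed to be converted into.
# $1 = raw sample count, $2 = assumed micro-operations per instruction.
# Uses integer arithmetic, so the result is rounded down.
spe_weighted_count() (
    samples=$1
    uops=$2
    echo $(( samples / uops ))
)

# An instruction that is twice as likely to be sampled (weight 2)
# and one that is sampled at the normal rate (weight 1).
spe_weighted_count 200 2
spe_weighted_count 100 1
```

With these made-up counts both instructions end up with the same weighted
count, even though the first one appears twice as often in the raw samples.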

Kernel Requirements
-------------------

The ARM_SPE_PMU config must be set to build as

Depending on CPU model, the kernel may need to
(kpti=off). If KPTI needs to be disabled, this
inaccessible. Try passing 'kpti=off' on the ke

For the full criteria that determine whether K
unmap_kernel_at_el0() in the kernel sources. C
are on the CPUs in kpti_safe_list, or on Arm v

The SPE interrupt must also be described by th
disabled (or isn't required to be disabled) bu
/sys/bus/event_source/devices/, then it's poss
ACPI or DT. In this case no warning will be pr
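
One way to check for this situation is to look for an SPE entry under
/sys/bus/event_source/devices/. The sketch below assumes the PMU appears there
with a name beginning with 'arm_spe'. 'spe_pmu_present' is a hypothetical
helper, and its optional directory argument exists only so that it can be
pointed at a test directory.

```shell
# Hypothetical helper: succeed if an entry whose name starts with
# "arm_spe" (an assumed naming pattern) exists in the given directory.
# $1 = directory to search, defaults to the sysfs event source list.
spe_pmu_present() (
    dir=${1:-/sys/bus/event_source/devices}
    for entry in "$dir"/arm_spe*; do
        # An unmatched glob stays literal, so test that the path exists.
        [ -e "$entry" ] && exit 0
    done
    exit 1
)

if spe_pmu_present; then
    echo "arm_spe PMU found"
else
    echo "arm_spe PMU not found"
fi
```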

Capturing SPE with perf command-line tools
------------------------------------------

You can record a session with SPE samples:

  perf record -e arm_spe// -- ./mybench

The sample period is set from the -c option, a
it's recommended to set this to a higher value

Config parameters
~~~~~~~~~~~~~~~~~

These are placed between the // in the event a
arm_spe/load_filter=1,min_latency=10/'

  branch_filter=1     - collect branches only
  event_filter=<mask> - filter on specific eve
  jitter=1            - use jitter to avoid re
  load_filter=1       - collect loads only (PM
  min_latency=<n>     - collect only samples w
  pa_enable=1         - collect physical addre
  pct_enable=1        - collect physical times
  store_filter=1      - collect stores only (P
  ts_enable=1         - enable timestamping wi

+++*+++ Latency is the total latency from the
than only the execution latency.
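
When several parameters are combined in a script, the event string can be
assembled from its parts. This is only a sketch: 'spe_event' is a hypothetical
helper that joins its arguments with commas and places them between the
slashes. It does not check that the parameter names or values are valid.

```shell
# Hypothetical helper: join key=value parameters with commas and wrap
# them in the arm_spe/.../ event syntax. No validation is performed.
spe_event() (
    IFS=,
    # "$*" joins the arguments using the first character of IFS.
    printf 'arm_spe/%s/\n' "$*"
)

spe_event load_filter=1 min_latency=10
spe_event
```

The result can then be passed to perf, for example
'perf record -e "$(spe_event load_filter=1 min_latency=10)" -- ./mybench'.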

Only some events can be filtered on; these inc

  bit 1     - instruction retired (i.e. omit s
  bit 3     - L1D refill
  bit 5     - TLB refill
  bit 7     - mispredict
  bit 11    - misaligned access

So to sample just retired instructions:

  perf record -e arm_spe/event_filter=2/ -- ./

or just mispredicted branches:

  perf record -e arm_spe/event_filter=0x80/ --
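
The mask values in these examples are simply the listed bit numbers turned into
a bitmask: bit 1 gives 2 and bit 7 gives 0x80. The sketch below does that
arithmetic. 'spe_event_filter' is a hypothetical helper, and it says nothing
about how the hardware treats a mask with more than one bit set.

```shell
# Hypothetical helper: build an event_filter mask from bit numbers
# and print it in hexadecimal.
spe_event_filter() (
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%x\n' "$mask"
)

spe_event_filter 1      # instruction retired
spe_event_filter 7      # mispredict
spe_event_filter 3 5    # L1D refill and TLB refill bits set together
```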

Viewing the data
~~~~~~~~~~~~~~~~

By default perf report and perf script will as
attributes/events of the SPE record. Because i
them, the samples in these groups are not nece
groups:

  Available samples
  0 arm_spe//
  0 dummy:u
  21 l1d-miss
  897 l1d-access
  5 llc-miss
  7 llc-access
  2 tlb-miss
  1K tlb-access
  36 branch-miss
  0 remote-access
  900 memory

The arm_spe// and dummy:u events are implement

To get a full list of unique samples that are
generate 'instruction' samples. The period opt
instruction unless you want to further downsam

  perf report --itrace=i1i

Memory access details are also stored on the s

  perf report --mem-mode

Common errors
~~~~~~~~~~~~~

 - "Cannot find PMU `arm_spe'. Missing kernel

   Module not built or loaded, KPTI not disabl
   or running on a VM. See 'Kernel Requirement

 - "Arm SPE CONTEXT packets not found in the t

   Root privilege is required to collect conte
   assigning PIDs to kernel samples. For users

 - Excessively large perf.data file size

   Increase sampling interval (see above)


SEE ALSO
--------

linkperf:perf-record[1], linkperf:perf-script[
linkperf:perf-inject[1]
