Linux/tools/perf/Documentation/perf-amd-ibs.txt


Differences between /tools/perf/Documentation/perf-amd-ibs.txt (Version linux-6.12-rc7) and /tools/perf/Documentation/perf-amd-ibs.txt (Version linux-5.12.19)


perf-amd-ibs(1)
===============

NAME
----
perf-amd-ibs - Support for AMD Instruction-Bas

SYNOPSIS
--------
[verse]
'perf record' -e ibs_op//
'perf record' -e ibs_fetch//

DESCRIPTION
-----------

Instruction-Based Sampling (IBS) provides prec
profiling support on AMD platforms. IBS has tw
Op and IBS Fetch. IBS Op sampling provides inf
execution (micro-op execution to be precise) w
hit/miss, d-TLB hit/miss, cache miss latency,
behavior etc. IBS Fetch sampling provides info
with details like i-cache hit/miss, i-TLB hit/
per-smt-thread i.e. each SMT hardware thread c

Both, IBS Op and IBS Fetch, are exposed as PMU
using the Linux perf utility. The following fi
if IBS is supported by the hardware and kernel

  /sys/bus/event_source/devices/ibs_op/
  /sys/bus/event_source/devices/ibs_fetch/

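A quick way to test for these two sysfs directories from a shell is sketched
below. The helper name `ibs_pmus_present` and its optional base-directory
argument are illustrative only and are not part of perf; just the two PMU
directory names and the default sysfs path come from the text above.

```shell
#!/bin/sh
# Report which IBS PMUs are exposed under the event_source sysfs tree.
# ibs_pmus_present is a hypothetical helper. The optional first argument
# (base directory) exists only so the check can be pointed at another tree.
ibs_pmus_present() {
    base="${1:-/sys/bus/event_source/devices}"
    found=0
    for pmu in ibs_op ibs_fetch; do
        if [ -d "$base/$pmu" ]; then
            echo "$pmu: present"
            found=$((found + 1))
        else
            echo "$pmu: missing"
        fi
    done
    # Succeed only when both PMU directories are visible.
    [ "$found" -eq 2 ]
}
```

For example, `ibs_pmus_present && echo "IBS Op and IBS Fetch are available"`
prints the final message only when both directories exist.
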
IBS Op PMU supports two events: cycles and mic
one event: fetch ops.

IBS PMUs do not have user/kernel filtering cap
CAP_SYS_ADMIN or CAP_PERFMON privilege.

IBS VS. REGULAR CORE PMU
------------------------

IBS gives samples with precise IP, i.e. the IP
no skid. Whereas the IP recorded by regular co
(sample was generated at IP X but perf would r
regular core PMU might not help for profiling
precision. Further, IBS provides additional in
question. On the other hand, regular core PMU
plethora of events, counting mode (less interf
counters, event grouping support, filtering ca

Three regular core PMU events are internally f
precise_ip attribute is set:

        -e cpu-cycles:p becomes -e ibs_op//
        -e r076:p becomes -e ibs_op//
        -e r0C1:p becomes -e ibs_op/cnt_ctl=1/

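The three mappings listed above can be written down as a small lookup, which
may help when reading scripts that mix both spellings. This is only a
restatement of the table as a shell function. The name `ibs_equivalent` is
hypothetical, and the function does not reflect how the kernel implements the
forwarding.

```shell
#!/bin/sh
# Print the IBS Op event spec that a precise core PMU event is forwarded to.
# ibs_equivalent is a hypothetical helper covering only the three events
# listed in the text. Any other input returns a non-zero status.
ibs_equivalent() {
    case "$1" in
        cpu-cycles:p|r076:p) echo "ibs_op//" ;;
        r0C1:p)              echo "ibs_op/cnt_ctl=1/" ;;
        *)                   return 1 ;;
    esac
}
```
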
EXAMPLES
--------

IBS Op PMU
~~~~~~~~~~

System-wide profile, cycles event, sampling pe

        # perf record -e ibs_op// -c 100000 -a

Per-cpu profile (cpu10), cycles event, samplin

        # perf record -e ibs_op// -c 100000 -C

Per-cpu profile (cpu10), cycles event, samplin

        # perf record -e ibs_op// -F 1000 -C 1

System-wide profile, uOps event, sampling peri

        # perf record -e ibs_op/cnt_ctl=1/ -c

Same command, but also capture IBS register ra

        # perf record -e ibs_op/cnt_ctl=1/ -c

System-wide profile, uOps event, sampling peri

        # perf record -e ibs_op/cnt_ctl=1,l3mi

Per process (upstream v6.2 onward), uOps event,

        # perf record -e ibs_op/cnt_ctl=1/ -c

Per process (upstream v6.2 onward), uOps event,

        # perf record -e ibs_op/cnt_ctl=1/ -c

To analyse recorded profile in aggregate mode

        # perf report
        /* Select a line and press 'a' to dril

To go over each sample

        # perf script

Raw dump of IBS registers when profiled with -

        # perf report -D
        /* Look for PERF_RECORD_SAMPLE */

        Example register raw dump:

        ibs_op_ctl:     000002c30006186a MaxCn
                Val 1 CntCtl 0=cycles CurCnt
        IbsOpRip:       ffffffff8204aea7
        ibs_op_data:    0000010002550001 CompT
                BrnRet 0  RipInvalid 0 BrnFuse
        ibs_op_data2:   0000000000000013 RmtNo
        ibs_op_data3:   0000000031960092 LdOp
                DcL2TlbMiss 0 DcL1TlbHit2M 1 D
                DcMiss 1 DcMisAcc 0 DcWcMemAcc
                DcMissNoMabAlloc 0 DcLinAddrVa
                DcL2TlbHit1G 0 L2Miss 1 SwPf 0
                OpDcMissOpenMemReqs 12 DcMissL
        IbsDCLinAd:     ff110008a5398920
        IbsDCPhysAd:    00000008a5398920

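To see how the decoded fields relate to the raw ibs_op_ctl value in the dump
above, the following sketch extracts a few of them with shell arithmetic. The
bit positions are an assumption taken from the kernel's
arch/x86/include/asm/amd-ibs.h register layout, not from this page, and
`decode_ibs_op_ctl` is a hypothetical helper name. It needs a shell with
64-bit arithmetic.

```shell
#!/bin/sh
# Decode selected ibs_op_ctl fields from a raw 64-bit hex value (no 0x prefix).
# Assumed layout (from the kernel header, not this page):
#   bits 0-15  MaxCnt[19:4]     bit 17  En      bit 18  Val
#   bit 19     CntCtl           bits 20-26 MaxCnt extension
#   bits 32-58 CurCnt
decode_ibs_op_ctl() {
    val=$((0x$1))
    max_lo=$(( val & 0xffff ))
    max_ext=$(( (val >> 20) & 0x7f ))
    # The programmed count omits its low 4 bits, so shift left by 4.
    max_cnt=$(( ((max_ext << 16) | max_lo) << 4 ))
    op_en=$(( (val >> 17) & 1 ))
    op_val=$(( (val >> 18) & 1 ))
    cnt_ctl=$(( (val >> 19) & 1 ))
    cur_cnt=$(( (val >> 32) & 0x7ffffff ))
    echo "MaxCnt $max_cnt En $op_en Val $op_val CntCtl $cnt_ctl CurCnt $cur_cnt"
}

decode_ibs_op_ctl 000002c30006186a
# prints: MaxCnt 100000 En 1 Val 1 CntCtl 0 CurCnt 707
```

The decoded MaxCnt of 100000 matches the `-c 100000` sampling period used in
the examples, and CntCtl 0 corresponds to the cycles event.
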
IBS applied in a real-world use case

        ~90% regression was observed in tbench
        which was counter intuitive. IBS profi
        using perf helped in identifying exact

        https://lore.kernel.org/r/202209210636

IBS Fetch PMU
~~~~~~~~~~~~~

Similar commands can be used with Fetch PMU as

System-wide profile, fetch ops event, sampling

        # perf record -e ibs_fetch// -c 100000

System-wide profile, fetch ops event, sampling

        # perf record -e ibs_fetch/rand_en=1/

        Random enable adds small degree of var
        helps in cases like long running loops
        instruction over and over because of f

etc.

PERF MEM AND PERF C2C
---------------------

perf mem is a memory access profiler tool and
cacheline analyser tool. Both of them internal
Below is a simple example of the perf mem tool

        # perf mem record -c 100000 -- make
        # perf mem report

A normal perf mem report output will provide d
However, it can also be aggregated based on ou

        # perf mem report -F mem,sample,snoop
        Samples: 3M of event 'ibs_op//', Event
        Memory access
        N/A
        L1 hit
        L2 hit
        L3 hit
        L3 hit
        RAM hit
        Remote node, same socket RAM hit
        Remote core, same node Any cache hit
        Remote core, same node Any cache hit
        Remote node, same socket Any cache hit
        Remote node, same socket Any cache hit
        Uncached hit

Please refer to their man page for more detail

SEE ALSO
--------

linkperf:perf-record[1], linkperf:perf-script[
linkperf:perf-mem[1], linkperf:perf-c2c[1]
