TOMOYO Linux Cross Reference
Linux/tools/perf/Documentation/intel-hybrid.txt

Differences between /tools/perf/Documentation/intel-hybrid.txt (Version linux-6.12-rc7) and /tools/perf/Documentation/intel-hybrid.txt (Version linux-5.14.21)


  1 Intel hybrid support                                1 Intel hybrid support
  2 --------------------                                2 --------------------
  3 Support for Intel hybrid events within perf to      3 Support for Intel hybrid events within perf tools.
  4                                                     4 
  5 For some Intel platforms, such as AlderLake, w      5 For some Intel platforms, such as AlderLake, which is hybrid platform and
  6 it consists of atom cpu and core cpu. Each cpu      6 it consists of atom cpu and core cpu. Each cpu has dedicated event list.
  7 Part of events are available on core cpu, part      7 Part of events are available on core cpu, part of events are available
  8 on atom cpu and even part of events are availa      8 on atom cpu and even part of events are available on both.
  9                                                     9 
 10 Kernel exports two new cpu pmus via sysfs:         10 Kernel exports two new cpu pmus via sysfs:
 11 /sys/devices/cpu_core                              11 /sys/devices/cpu_core
 12 /sys/devices/cpu_atom                              12 /sys/devices/cpu_atom
 13                                                    13 
 14 The 'cpus' files are created under the directo     14 The 'cpus' files are created under the directories. For example,
 15                                                    15 
 16 cat /sys/devices/cpu_core/cpus                     16 cat /sys/devices/cpu_core/cpus
 17 0-15                                               17 0-15
 18                                                    18 
 19 cat /sys/devices/cpu_atom/cpus                     19 cat /sys/devices/cpu_atom/cpus
 20 16-23                                              20 16-23
 21                                                    21 
 22 It indicates cpu0-cpu15 are core cpus and cpu1     22 It indicates cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.
 23                                                    23 
                                                   >>  24 Quickstart
                                                   >>  25 
                                                   >>  26 List hybrid event
                                                   >>  27 -----------------
                                                   >>  28 
 24 As before, use perf-list to list the symbolic      29 As before, use perf-list to list the symbolic event.
 25                                                    30 
 26 perf list                                          31 perf list
 27                                                    32 
 28 inst_retired.any                                   33 inst_retired.any
 29         [Fixed Counter: Counts the number of i     34         [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
 30 inst_retired.any                                   35 inst_retired.any
 31         [Number of instructions retired. Fixed     36         [Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]
 32                                                    37 
 33 The 'Unit: xxx' is added to brief description      38 The 'Unit: xxx' is added to brief description to indicate which pmu
 34 the event is belong to. Same event name but wi     39 the event is belong to. Same event name but with different pmu can
 35 be supported.                                      40 be supported.
 36                                                    41 
 37 Enable hybrid event with a specific pmu            42 Enable hybrid event with a specific pmu
                                                   >>  43 ---------------------------------------
 38                                                    44 
 39 To enable a core only event or atom only event     45 To enable a core only event or atom only event, following syntax is supported:
 40                                                    46 
 41         cpu_core/<event name>/                     47         cpu_core/<event name>/
 42 or                                                 48 or
 43         cpu_atom/<event name>/                     49         cpu_atom/<event name>/
 44                                                    50 
 45 For example, count the 'cycles' event on core      51 For example, count the 'cycles' event on core cpus.
 46                                                    52 
 47         perf stat -e cpu_core/cycles/              53         perf stat -e cpu_core/cycles/
 48                                                    54 
 49 Create two events for one hardware event autom     55 Create two events for one hardware event automatically
                                                   >>  56 ------------------------------------------------------
 50                                                    57 
 51 When creating one event and the event is avail     58 When creating one event and the event is available on both atom and core,
 52 two events are created automatically. One is f     59 two events are created automatically. One is for atom, the other is for
 53 core. Most of hardware events and cache events     60 core. Most of hardware events and cache events are available on both
 54 cpu_core and cpu_atom.                             61 cpu_core and cpu_atom.
 55                                                    62 
 56 For hardware events, they have pre-defined con     63 For hardware events, they have pre-defined configs (e.g. 0 for cycles).
 57 But on hybrid platform, kernel needs to know w     64 But on hybrid platform, kernel needs to know where the event comes from
 58 (from atom or from core). The original perf ev     65 (from atom or from core). The original perf event type PERF_TYPE_HARDWARE
 59 can't carry pmu information. So now this type      66 can't carry pmu information. So now this type is extended to be PMU aware
 60 type. The PMU type ID is stored at attr.config     67 type. The PMU type ID is stored at attr.config[63:32].
 61                                                    68 
 62 PMU type ID is retrieved from sysfs.               69 PMU type ID is retrieved from sysfs.
 63 /sys/devices/cpu_atom/type                         70 /sys/devices/cpu_atom/type
 64 /sys/devices/cpu_core/type                         71 /sys/devices/cpu_core/type
 65                                                    72 
 66 The new attr.config layout for PERF_TYPE_HARDW     73 The new attr.config layout for PERF_TYPE_HARDWARE:
 67                                                    74 
 68 PERF_TYPE_HARDWARE:                 0xEEEEEEEE     75 PERF_TYPE_HARDWARE:                 0xEEEEEEEE000000AA
 69                                     AA: hardwa     76                                     AA: hardware event ID
 70                                     EEEEEEEE:      77                                     EEEEEEEE: PMU type ID
 71                                                    78 
 72 Cache event is similar. The type PERF_TYPE_HW_     79 Cache event is similar. The type PERF_TYPE_HW_CACHE is extended to be
 73 PMU aware type. The PMU type ID is stored at a     80 PMU aware type. The PMU type ID is stored at attr.config[63:32].
 74                                                    81 
 75 The new attr.config layout for PERF_TYPE_HW_CA     82 The new attr.config layout for PERF_TYPE_HW_CACHE:
 76                                                    83 
 77 PERF_TYPE_HW_CACHE:                 0xEEEEEEEE     84 PERF_TYPE_HW_CACHE:                 0xEEEEEEEE00DDCCBB
 78                                     BB: hardwa     85                                     BB: hardware cache ID
 79                                     CC: hardwa     86                                     CC: hardware cache op ID
 80                                     DD: hardwa     87                                     DD: hardware cache op result ID
 81                                     EEEEEEEE:      88                                     EEEEEEEE: PMU type ID
 82                                                    89 
 83 When enabling a hardware event without specifi     90 When enabling a hardware event without specified pmu, such as,
 84 perf stat -e cycles -a (use system-wide in thi     91 perf stat -e cycles -a (use system-wide in this example), two events
 85 are created automatically.                         92 are created automatically.
 86                                                    93 
 87   --------------------------------------------     94   ------------------------------------------------------------
 88   perf_event_attr:                                 95   perf_event_attr:
 89     size                             120           96     size                             120
 90     config                           0x4000000     97     config                           0x400000000
 91     sample_type                      IDENTIFIE     98     sample_type                      IDENTIFIER
 92     read_format                      TOTAL_TIM     99     read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
 93     disabled                         1            100     disabled                         1
 94     inherit                          1            101     inherit                          1
 95     exclude_guest                    1            102     exclude_guest                    1
 96   --------------------------------------------    103   ------------------------------------------------------------
 97                                                   104 
 98 and                                               105 and
 99                                                   106 
100   --------------------------------------------    107   ------------------------------------------------------------
101   perf_event_attr:                                108   perf_event_attr:
102     size                             120          109     size                             120
103     config                           0x8000000    110     config                           0x800000000
104     sample_type                      IDENTIFIE    111     sample_type                      IDENTIFIER
105     read_format                      TOTAL_TIM    112     read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
106     disabled                         1            113     disabled                         1
107     inherit                          1            114     inherit                          1
108     exclude_guest                    1            115     exclude_guest                    1
109   --------------------------------------------    116   ------------------------------------------------------------
110                                                   117 
111 type 0 is PERF_TYPE_HARDWARE.                     118 type 0 is PERF_TYPE_HARDWARE.
112 0x4 in 0x400000000 indicates it's cpu_core pmu    119 0x4 in 0x400000000 indicates it's cpu_core pmu.
113 0x8 in 0x800000000 indicates it's cpu_atom pmu    120 0x8 in 0x800000000 indicates it's cpu_atom pmu (atom pmu type id is random).
114                                                   121 
115 The kernel creates 'cycles' (0x400000000) on c    122 The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
116 and create 'cycles' (0x800000000) on cpu16-cpu    123 and create 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus).
117                                                   124 
118 For perf-stat result, it displays two events:     125 For perf-stat result, it displays two events:
119                                                   126 
120  Performance counter stats for 'system wide':     127  Performance counter stats for 'system wide':
121                                                   128 
122            6,744,979      cpu_core/cycles/        129            6,744,979      cpu_core/cycles/
123            1,965,552      cpu_atom/cycles/        130            1,965,552      cpu_atom/cycles/
124                                                   131 
125 The first 'cycles' is core event, the second '    132 The first 'cycles' is core event, the second 'cycles' is atom event.
126                                                   133 
127 Thread mode example:                              134 Thread mode example:
                                                   >> 135 --------------------
128                                                   136 
129 perf-stat reports the scaled counts for hybrid    137 perf-stat reports the scaled counts for hybrid event and with a percentage
130 displayed. The percentage is the event's runni    138 displayed. The percentage is the event's running time/enabling time.
131                                                   139 
132 One example, 'triad_loop' runs on cpu16 (atom     140 One example, 'triad_loop' runs on cpu16 (atom core), while we can see the
133 scaled value for core cycles is 160,444,092 an    141 scaled value for core cycles is 160,444,092 and the percentage is 0.47%.
134                                                   142 
135 perf stat -e cycles \-- taskset -c 16 ./triad_ !! 143 perf stat -e cycles -- taskset -c 16 ./triad_loop
136                                                   144 
137 As previous, two events are created.              145 As previous, two events are created.
138                                                   146 
139 ----------------------------------------------    147 ------------------------------------------------------------
140 perf_event_attr:                                  148 perf_event_attr:
141   size                             120            149   size                             120
142   config                           0x400000000    150   config                           0x400000000
143   sample_type                      IDENTIFIER     151   sample_type                      IDENTIFIER
144   read_format                      TOTAL_TIME_    152   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
145   disabled                         1              153   disabled                         1
146   inherit                          1              154   inherit                          1
147   enable_on_exec                   1              155   enable_on_exec                   1
148   exclude_guest                    1              156   exclude_guest                    1
149 ----------------------------------------------    157 ------------------------------------------------------------
150                                                   158 
151 and                                               159 and
152                                                   160 
153 ----------------------------------------------    161 ------------------------------------------------------------
154 perf_event_attr:                                  162 perf_event_attr:
155   size                             120            163   size                             120
156   config                           0x800000000    164   config                           0x800000000
157   sample_type                      IDENTIFIER     165   sample_type                      IDENTIFIER
158   read_format                      TOTAL_TIME_    166   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
159   disabled                         1              167   disabled                         1
160   inherit                          1              168   inherit                          1
161   enable_on_exec                   1              169   enable_on_exec                   1
162   exclude_guest                    1              170   exclude_guest                    1
163 ----------------------------------------------    171 ------------------------------------------------------------
164                                                   172 
165  Performance counter stats for 'taskset -c 16     173  Performance counter stats for 'taskset -c 16 ./triad_loop':
166                                                   174 
167        233,066,666      cpu_core/cycles/          175        233,066,666      cpu_core/cycles/                                              (0.43%)
168        604,097,080      cpu_atom/cycles/          176        604,097,080      cpu_atom/cycles/                                              (99.57%)
169                                                   177 
170 perf-record:                                      178 perf-record:
                                                   >> 179 ------------
171                                                   180 
172 If there is no '-e' specified in perf record,     181 If there is no '-e' specified in perf record, on hybrid platform,
173 it creates two default 'cycles' and adds them     182 it creates two default 'cycles' and adds them to event list. One
174 is for core, the other is for atom.               183 is for core, the other is for atom.
175                                                   184 
176 perf-stat:                                        185 perf-stat:
                                                   >> 186 ----------
177                                                   187 
178 If there is no '-e' specified in perf stat, on    188 If there is no '-e' specified in perf stat, on hybrid platform,
179 besides of software events, following events a    189 besides of software events, following events are created and
180 added to event list in order.                     190 added to event list in order.
181                                                   191 
182 cpu_core/cycles/,                                 192 cpu_core/cycles/,
183 cpu_atom/cycles/,                                 193 cpu_atom/cycles/,
184 cpu_core/instructions/,                           194 cpu_core/instructions/,
185 cpu_atom/instructions/,                           195 cpu_atom/instructions/,
186 cpu_core/branches/,                               196 cpu_core/branches/,
187 cpu_atom/branches/,                               197 cpu_atom/branches/,
188 cpu_core/branch-misses/,                          198 cpu_core/branch-misses/,
189 cpu_atom/branch-misses/                           199 cpu_atom/branch-misses/
190                                                   200 
191 Of course, both perf-stat and perf-record supp    201 Of course, both perf-stat and perf-record support to enable
192 hybrid event with a specific pmu.                 202 hybrid event with a specific pmu.
193                                                   203 
194 e.g.                                              204 e.g.
195 perf stat -e cpu_core/cycles/                     205 perf stat -e cpu_core/cycles/
196 perf stat -e cpu_atom/cycles/                     206 perf stat -e cpu_atom/cycles/
197 perf stat -e cpu_core/r1a/                        207 perf stat -e cpu_core/r1a/
198 perf stat -e cpu_atom/L1-icache-loads/            208 perf stat -e cpu_atom/L1-icache-loads/
199 perf stat -e cpu_core/cycles/,cpu_atom/instruc    209 perf stat -e cpu_core/cycles/,cpu_atom/instructions/
200 perf stat -e '{cpu_core/cycles/,cpu_core/instr    210 perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'
201                                                   211 
202 But '{cpu_core/cycles/,cpu_atom/instructions/}    212 But '{cpu_core/cycles/,cpu_atom/instructions/}' will return
203 warning and disable grouping, because the pmus    213 warning and disable grouping, because the pmus in group are
204 not matched (cpu_core vs. cpu_atom).              214 not matched (cpu_core vs. cpu_atom).
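
The attr.config layouts shown in the diff above (PMU type ID in bits [63:32], event or cache IDs in the low bytes) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of perf itself: the function names are hypothetical, and the PMU type IDs 4 and 8 are simply the values from the example output above, which may differ on a real system (read them from /sys/devices/cpu_core/type and /sys/devices/cpu_atom/type).

```python
# Sketch of the PMU-aware attr.config layout described in the document.
# All function names here are illustrative, not perf or kernel APIs.

PMU_SHIFT = 32          # PMU type ID lives in attr.config[63:32]
PMU_MASK = 0xFFFFFFFF


def read_pmu_type(pmu_name):
    """Read a PMU type ID from sysfs, e.g. pmu_name='cpu_core'.

    Only works on a machine that exposes /sys/devices/<pmu_name>/type.
    """
    with open(f"/sys/devices/{pmu_name}/type") as f:
        return int(f.read().strip())


def make_hw_config(pmu_type, event_id):
    """PERF_TYPE_HARDWARE layout: 0xEEEEEEEE000000AA."""
    return ((pmu_type & PMU_MASK) << PMU_SHIFT) | (event_id & 0xFF)


def split_hw_config(config):
    """Return (pmu_type, event_id) from a PERF_TYPE_HARDWARE config."""
    return (config >> PMU_SHIFT) & PMU_MASK, config & 0xFF


def make_cache_config(pmu_type, cache_id, op_id, result_id):
    """PERF_TYPE_HW_CACHE layout: 0xEEEEEEEE00DDCCBB."""
    return (((pmu_type & PMU_MASK) << PMU_SHIFT)
            | ((result_id & 0xFF) << 16)   # DD: cache op result ID
            | ((op_id & 0xFF) << 8)        # CC: cache op ID
            | (cache_id & 0xFF))           # BB: cache ID


def split_cache_config(config):
    """Return (pmu_type, cache_id, op_id, result_id)."""
    return ((config >> PMU_SHIFT) & PMU_MASK,
            config & 0xFF,
            (config >> 8) & 0xFF,
            (config >> 16) & 0xFF)


if __name__ == "__main__":
    # 'cycles' has hardware event ID 0; PMU type IDs 4 and 8 are taken
    # from the example perf_event_attr dumps and are assumptions here.
    print(hex(make_hw_config(4, 0)))
    print(split_hw_config(0x800000000))
```

With the example values, the config for 'cycles' on cpu_core is 4 shifted left by 32 bits, which matches the 0x400000000 shown in the perf_event_attr dump, and 0x800000000 splits back into PMU type 8 and event ID 0.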
                                                      
