Intel hybrid support
--------------------
Support for Intel hybrid events within perf to

For some Intel platforms, such as AlderLake, w
it consists of atom cpu and core cpu. Each cpu
Part of events are available on core cpu, part
on atom cpu and even part of events are availa

Kernel exports two new cpu pmus via sysfs:
/sys/devices/cpu_core
/sys/devices/cpu_atom

The 'cpus' files are created under the directo

  cat /sys/devices/cpu_core/cpus
  0-15

  cat /sys/devices/cpu_atom/cpus
  16-23

It indicates cpu0-cpu15 are core cpus and cpu1

As before, use perf-list to list the symbolic

  perf list

  inst_retired.any
        [Fixed Counter: Counts the number of i
  inst_retired.any
        [Number of instructions retired. Fixed

The 'Unit: xxx' is added to brief description
the event belongs to. Same event name but wi
be supported.

Enable hybrid event with a specific pmu

To enable a core only event or atom only event

  cpu_core/<event name>/
or
  cpu_atom/<event name>/

For example, count the 'cycles' event on core

  perf stat -e cpu_core/cycles/

Create two events for one hardware event autom

When creating one event and the event is avail
two events are created automatically. One is f
core. Most of hardware events and cache events
cpu_core and cpu_atom.

For hardware events, they have pre-defined con
But on hybrid platform, kernel needs to know w
(from atom or from core). The original perf ev
can't carry pmu information. So now this type
type. The PMU type ID is stored at attr.config

PMU type ID is retrieved from sysfs.
/sys/devices/cpu_atom/type
/sys/devices/cpu_core/type

The new attr.config layout for PERF_TYPE_HARDW

PERF_TYPE_HARDWARE:                 0xEEEEEEEE
                                    AA: hardwa
                                    EEEEEEEE:

Cache event is similar. The type PERF_TYPE_HW_
PMU aware type.
The PMU type ID is stored at a

The new attr.config layout for PERF_TYPE_HW_CA

PERF_TYPE_HW_CACHE:                 0xEEEEEEEE
                                    BB: hardwa
                                    CC: hardwa
                                    DD: hardwa
                                    EEEEEEEE:

When enabling a hardware event without specifi
perf stat -e cycles -a (use system-wide in thi
are created automatically.

  --------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x4000000
    sample_type                      IDENTIFIE
    read_format                      TOTAL_TIM
    disabled                         1
    inherit                          1
    exclude_guest                    1
  --------------------------------------------

and

  --------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x8000000
    sample_type                      IDENTIFIE
    read_format                      TOTAL_TIM
    disabled                         1
    inherit                          1
    exclude_guest                    1
  --------------------------------------------

type 0 is PERF_TYPE_HARDWARE.
0x4 in 0x400000000 indicates it's cpu_core pmu
0x8 in 0x800000000 indicates it's cpu_atom pmu

The kernel creates 'cycles' (0x400000000) on c
and creates 'cycles' (0x800000000) on cpu16-cpu

For perf-stat result, it displays two events:

  Performance counter stats for 'system wide':

          6,744,979      cpu_core/cycles/
          1,965,552      cpu_atom/cycles/

The first 'cycles' is core event, the second '

Thread mode example:

perf-stat reports the scaled counts for hybrid
displayed. The percentage is the event's runni

One example, 'triad_loop' runs on cpu16 (atom
scaled value for core cycles is 160,444,092 an

  perf stat -e cycles \-- taskset -c 16 ./triad_

As before, two events are created.

  ----------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ----------------------------------------------

and

  ----------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ----------------------------------------------

  Performance counter stats for 'taskset -c 16

        233,066,666      cpu_core/cycles/
        604,097,080      cpu_atom/cycles/

perf-record:

If there is no '-e' specified in perf record,
it creates two default 'cycles' and adds them
is for core, the other is for atom.

perf-stat:

If there is no '-e' specified in perf stat, on
besides software events, following events a
added to event list in order.

  cpu_core/cycles/,
  cpu_atom/cycles/,
  cpu_core/instructions/,
  cpu_atom/instructions/,
  cpu_core/branches/,
  cpu_atom/branches/,
  cpu_core/branch-misses/,
  cpu_atom/branch-misses/

Of course, both perf-stat and perf-record supp
hybrid event with a specific pmu.

e.g.
  perf stat -e cpu_core/cycles/
  perf stat -e cpu_atom/cycles/
  perf stat -e cpu_core/r1a/
  perf stat -e cpu_atom/L1-icache-loads/
  perf stat -e cpu_core/cycles/,cpu_atom/instruc
  perf stat -e '{cpu_core/cycles/,cpu_core/instr

But '{cpu_core/cycles/,cpu_atom/instructions/}
warning and disable grouping, because the pmus
not matched (cpu_core vs. cpu_atom).
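
The attr.config values in the perf_event_attr dumps above (0x400000000 and
0x800000000) can be taken apart with plain bit arithmetic. The following is a
minimal Python sketch, not part of perf itself. It assumes the PMU type ID
occupies the bits above bit 32 and the event encoding the low 32 bits, which
matches the "0x4 in 0x400000000" reading given earlier. The constant and
function names are defined locally for illustration, and the PMU type IDs 4
and 8 are only the values from this document's example; on a real system they
should be read from /sys/devices/cpu_core/type and /sys/devices/cpu_atom/type.

```python
# Illustrative constants (assumptions for this sketch, defined locally).
PMU_TYPE_SHIFT = 32            # PMU type ID sits above the low 32 bits
EVENT_MASK = 0xFFFFFFFF        # low 32 bits carry the event encoding
HW_CPU_CYCLES = 0              # event ID used for 'cycles' in the dumps


def encode_config(pmu_type, event):
    """Combine a PMU type ID and an event encoding into attr.config."""
    return (pmu_type << PMU_TYPE_SHIFT) | (event & EVENT_MASK)


def decode_config(config):
    """Split attr.config into (PMU type ID, event encoding)."""
    return config >> PMU_TYPE_SHIFT, config & EVENT_MASK


def read_pmu_type(pmu_name, sysfs_root="/sys/devices"):
    """Read a PMU type ID from sysfs, e.g. pmu_name='cpu_core'.

    Only works on a kernel that exports the hybrid PMUs.
    """
    with open(f"{sysfs_root}/{pmu_name}/type") as f:
        return int(f.read().strip())


if __name__ == "__main__":
    # PMU type IDs 4 (cpu_core) and 8 (cpu_atom) are the example values
    # from this document, not fixed numbers.
    for name, pmu_type in (("cpu_core", 4), ("cpu_atom", 8)):
        config = encode_config(pmu_type, HW_CPU_CYCLES)
        print(name, hex(config), decode_config(config))
```

Keeping the PMU type ID in the upper bits leaves the existing event encoding
untouched, so a config with PMU type 0 still describes the plain,
non-PMU-aware event.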