Linux/Documentation/admin-guide/perf/hisi-pmu.rst

======================================================
HiSilicon SoC uncore Performance Monitoring Unit (PMU)
======================================================

The HiSilicon SoC chip includes various independent system device PMUs,
such as L3 cache (L3C), Hydra Home Agent (HHA) and DDRC. Each of these PMUs
has its own hardware logic to gather statistics and performance
information.

The HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster
(CCL) is made up of 4 CPU cores sharing one L3 cache; each CPU die is
called a Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has
two HHAs (0 - 1) and four DDRCs (0 - 3).

HiSilicon SoC uncore PMU driver
-------------------------------

Each device PMU has separate registers for event counting, control and
interrupt, and the PMU driver registers a separate perf PMU for each device,
such as L3C, HHA and DDRC. The available events and configuration options
are described in sysfs, see:

/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>.
The "perf list" command lists the available events from sysfs.

Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU
name appears in the event listing as hisi_sccl<sccl-id>_module<index-id>,
where "sccl-id" is the identifier of the SCCL and "index-id" is the index of
the module.

e.g. hisi_sccl3_l3c0/rd_hit_cpipe is the READ_HIT_CPIPE event of L3C index #0
in SCCL ID #3.

e.g. hisi_sccl1_hha0/rx_operations is the RX_OPERATIONS event of HHA index #0
in SCCL ID #1.

The driver also provides a "cpumask" sysfs attribute, which shows the CPU core
ID used to count the uncore PMU event.

Example usage of perf::

  $# perf list
  hisi_sccl3_l3c0/rd_hit_cpipe/ [kernel PMU event]
  ------------------------------------------
  hisi_sccl3_l3c0/wr_hit_cpipe/ [kernel PMU event]
  ------------------------------------------
  hisi_sccl1_l3c0/rd_hit_cpipe/ [kernel PMU event]
  ------------------------------------------
  hisi_sccl1_l3c0/wr_hit_cpipe/ [kernel PMU event]
  ------------------------------------------

  $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5

For HiSilicon uncore PMU v2, whose identifier is 0x30, the topology is the
same as PMU v1, but some new functions are added to the hardware.

1. The L3C PMU supports filtering by core/thread within the cluster, which can
be specified as a bitmap::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5

This will only count the operations from core/thread 0 and 1 in this cluster.
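
The tt_core bitmap can be derived from the core/thread indexes within the
cluster. The following is a minimal sketch, assuming bit N selects
core/thread N (as the 0x3 example for core/thread 0 and 1 suggests);
tt_core_mask is a hypothetical helper name:

```shell
# Build a tt_core bitmap: bit N set selects core/thread N of the cluster.
# Assumption: indexes are counted from 0 within the cluster.
tt_core_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%x\n' "$mask"
}

# Core/thread 0 and 1, matching the perf example for item 1.
tt_core_mask 0 1    # prints 0x3
```

The result could then be passed to perf as ``tt_core=$(tt_core_mask 0 1)``
in place of the literal 0x3.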

2. Tracetag allows the user to choose to count only read, write or atomic
operations via the tt_req parameter in perf. The default value counts all
operations. tt_req is 3 bits: 3'b100 represents read operations, 3'b101
represents write operations, 3'b110 represents atomic store operations and
3'b111 represents atomic non-store operations. Other values are reserved::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5

This will only count the read operations in this cluster.
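
The tt_req encodings listed in item 2 can be kept in a small lookup helper so
that scripts do not hard-code the bit patterns. This is only a sketch;
tt_req_code and the operation names it accepts are hypothetical labels chosen
here, not perf or driver identifiers:

```shell
# Map an operation type to its tt_req value as listed in item 2.
# The operation names are illustrative labels, not driver identifiers.
tt_req_code() {
    case "$1" in
        read)            echo 0x4 ;;   # 3'b100
        write)           echo 0x5 ;;   # 3'b101
        atomic_store)    echo 0x6 ;;   # 3'b110
        atomic_nonstore) echo 0x7 ;;   # 3'b111
        *)
            # Every other encoding is reserved, so refuse to guess.
            echo "tt_req_code: unknown operation '$1'" >&2
            return 1
            ;;
    esac
}

tt_req_code write    # prints 0x5
```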

3. Datasrc allows the user to check where the data comes from. It is 5 bits.
Some important codes are as follows:

- 5'b00001: comes from L3C in this die;
- 5'b01000: comes from L3C in the cross-die;
- 5'b01001: comes from L3C which is in another socket;
- 5'b01110: comes from the local DDR;
- 5'b01111: comes from the cross-die DDR;
- 5'b10000: comes from cross-socket DDR;

etc. It is mainly helpful for checking whether the data comes from the source
nearest to the CPU cores. If datasrc_cfg is used on a multi-chip system,
datasrc_skt shall also be configured in the perf command::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/,
  hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5
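
When reading such command lines it can help to translate a datasrc_cfg value
back into the data source it selects. The sketch below covers only the codes
listed in item 3; datasrc_describe is a hypothetical helper name, and any
other value is reported as not listed rather than guessed:

```shell
# Translate a datasrc_cfg value (decimal or 0x-prefixed hex) into the
# data source named in item 3. Only the codes listed there are known.
datasrc_describe() {
    case $(( $1 )) in
        1)  echo "L3C in this die" ;;        # 5'b00001
        8)  echo "L3C in the cross-die" ;;   # 5'b01000
        9)  echo "L3C in another socket" ;;  # 5'b01001
        14) echo "local DDR" ;;              # 5'b01110
        15) echo "cross-die DDR" ;;          # 5'b01111
        16) echo "cross-socket DDR" ;;       # 5'b10000
        *)  echo "not listed in this document" ;;
    esac
}

datasrc_describe 0xE    # prints: local DDR
```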

4. Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
clusters (SICL) and contain multiple I/O clusters (ICLs). Each CCL/ICL in the
SoC has a unique ID. Each ID is 11 bits and includes a 6-bit SCCL-ID and a
5-bit CCL/ICL-ID. For an I/O die, the ICL-ID is defined as follows:

- 5'b00000: I/O_MGMT_ICL;
- 5'b00001: Network_ICL;
- 5'b00011: HAC_ICL;
- 5'b10000: PCIe_ICL;
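
The 11-bit ID can be assembled from its two fields. The text gives the field
widths but not their bit positions, so the sketch below assumes that the
6-bit SCCL-ID occupies the upper bits and the 5-bit CCL/ICL-ID the lower
bits; make_cluster_id is a hypothetical helper and the layout should be
checked against the hardware documentation before use:

```shell
# Compose an 11-bit CCL/ICL ID from a 6-bit SCCL-ID and a 5-bit CCL/ICL-ID.
# Assumption (not stated in the text): SCCL-ID is bits [10:5] and
# CCL/ICL-ID is bits [4:0].
make_cluster_id() {
    printf '0x%x\n' "$(( (($1 & 0x3f) << 5) | ($2 & 0x1f) ))"
}

# Hypothetical die ID 1 combined with PCIe_ICL (5'b10000).
make_cluster_id 1 0x10    # prints 0x30
```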

5. uring_channel: UC PMU events 0x47~0x59 support filtering by tx request
uring channel. It is 2 bits. Some important codes are as follows:

- 2'b11: count the events sent to the uring_ext (MATA) channel;
- 2'b01: is the same as 2'b11;
- 2'b10: count the events sent to the uring (non-MATA) channel;
- 2'b00: default value, count the events sent to both the uring and
  uring_ext channels;

Users can configure the CCL/ICL IDs described in item 4 to count data coming
from a specific CCL/ICL by setting srcid_cmd & srcid_msk, and data destined
for a specific CCL/ICL by setting tgtid_cmd & tgtid_msk. A set bit in
srcid_msk/tgtid_msk means the PMU will not check that bit when matching
against srcid_cmd/tgtid_cmd.
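
The masking rule can be modelled in a few lines: an ID matches when it agrees
with the command value on every bit that is not set in the mask. The
following is a sketch of that rule only, not of the hardware implementation;
id_matches is a hypothetical helper, and the example values assume an ID
whose lower 5 bits hold the CCL/ICL-ID:

```shell
# Succeed when the 11-bit ID equals CMD on every bit not set in MSK.
# A set bit in MSK means "do not check this bit", as described above.
id_matches() {
    id=$1 cmd=$2 msk=$3
    [ "$(( ($id ^ $cmd) & ~$msk & 0x7ff ))" -eq 0 ]
}

# Ignore the lower 5 bits, so only the upper 6 bits must agree.
if id_matches 0x61 0x60 0x1f; then
    echo match
else
    echo "no match"
fi
# prints: match
```

With a mask of 0 every bit is checked, so the same ID and command value
would no longer match.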

If none of these options is set, the PMU works with the default values, which
apply no filter condition or ID matching, and the PMU counters return the
total counts.

The current driver does not support sampling, so "perf record" is unsupported.
Attaching to a task is also unsupported, as the events are all uncore.

Note: Please contact the maintainer for a complete list of events supported for
the PMU devices in the SoC and their information if needed.
