TOMOYO Linux Cross Reference
Linux/tools/perf/Documentation/examples.txt

Diff markup

Differences between /tools/perf/Documentation/examples.txt (Version linux-6.12-rc7) and /tools/perf/Documentation/examples.txt (Version linux-6.2.16)



                ------------------------------
                ****** perf by examples ******
                ------------------------------

[ From an e-mail by Ingo Molnar, https://lore.kernel.org/lkml/20090804195717.GA5998@elte.hu ]


First, discovery/enumeration of available counters can be done via
'perf list':

titan:~> perf list
  [...]
  kmem:kmalloc                             [Tracepoint event]
  kmem:kmem_cache_alloc                    [Tracepoint event]
  kmem:kmalloc_node                        [Tracepoint event]
  kmem:kmem_cache_alloc_node               [Tracepoint event]
  kmem:kfree                               [Tracepoint event]
  kmem:kmem_cache_free                     [Tracepoint event]
  kmem:mm_page_free                        [Tracepoint event]
  kmem:mm_page_free_batched                [Tracepoint event]
  kmem:mm_page_alloc                       [Tracepoint event]
  kmem:mm_page_alloc_zone_locked           [Tracepoint event]
  kmem:mm_page_pcpu_drain                  [Tracepoint event]
  kmem:mm_page_alloc_extfrag               [Tracepoint event]

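The listing above is regular enough to post-process. As a minimal sketch
(the helper names tracepoints() and stat_event_args() are made up for this
illustration and are not part of perf), the following Python turns a few of
the lines shown above into '-e' arguments for a later 'perf stat' run:

```python
import re

# Sample lines copied from the 'perf list' output above.
PERF_LIST_OUTPUT = """\
  kmem:kmalloc                             [Tracepoint event]
  kmem:kmem_cache_alloc                    [Tracepoint event]
  kmem:mm_page_free                        [Tracepoint event]
  kmem:mm_page_alloc                       [Tracepoint event]
"""

# Matches "<subsystem>:<event>   [Tracepoint event]".
TRACEPOINT_RE = re.compile(r"^\s*(\w+):(\w+)\s+\[Tracepoint event\]\s*$")


def tracepoints(text):
    """Return (subsystem, event) pairs for every tracepoint line in text."""
    pairs = []
    for line in text.splitlines():
        match = TRACEPOINT_RE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def stat_event_args(pairs, prefix):
    """Build '-e subsystem:event' arguments for events starting with prefix."""
    args = []
    for subsystem, event in pairs:
        if event.startswith(prefix):
            args += ["-e", f"{subsystem}:{event}"]
    return args


if __name__ == "__main__":
    # Keep only the page allocator events from the sample.
    print(stat_event_args(tracepoints(PERF_LIST_OUTPUT), "mm_page_"))
```
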
Then any (or all) of the above event sources can be activated and
measured. For example, the page alloc/free properties of a 'hackbench'
run are:

 titan:~> perf stat -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc \
   -e kmem:mm_page_free_batched -e kmem:mm_page_free ./hackbench 10
 Time: 0.575

 Performance counter stats for './hackbench 10':

          13857  kmem:mm_page_pcpu_drain
          27576  kmem:mm_page_alloc
           6025  kmem:mm_page_free_batched
          20934  kmem:mm_page_free

    0.613972165  seconds time elapsed

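The raw counts are easier to compare across runs when normalised by the
elapsed time. A small sketch, assuming the output format shown above (the
function names are illustrative only):

```python
import re

# Counter lines and elapsed time copied from the 'perf stat' run above.
PERF_STAT_OUTPUT = """\
          13857  kmem:mm_page_pcpu_drain
          27576  kmem:mm_page_alloc
           6025  kmem:mm_page_free_batched
          20934  kmem:mm_page_free

    0.613972165  seconds time elapsed
"""

COUNT_RE = re.compile(r"^\s*(\d+)\s+(\S+:\S+)")
ELAPSED_RE = re.compile(r"^\s*([\d.]+)\s+seconds time elapsed")


def parse_perf_stat(text):
    """Return ({event: count}, elapsed_seconds) parsed from perf stat text."""
    counts = {}
    elapsed = None
    for line in text.splitlines():
        match = COUNT_RE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
            continue
        match = ELAPSED_RE.match(line)
        if match:
            elapsed = float(match.group(1))
    return counts, elapsed


def rates_per_second(counts, elapsed):
    """Divide every event count by the elapsed wall-clock time."""
    return {event: count / elapsed for event, count in counts.items()}


if __name__ == "__main__":
    counts, elapsed = parse_perf_stat(PERF_STAT_OUTPUT)
    for event, rate in rates_per_second(counts, elapsed).items():
        print(f"{event:30s} {rate:12.0f} events/sec")
```
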
You can observe the statistical properties as well, by using the
'repeat the workload N times' feature of perf stat:

 titan:~> perf stat --repeat 5 -e kmem:mm_page_pcpu_drain \
   -e kmem:mm_page_alloc -e kmem:mm_page_free_batched \
   -e kmem:mm_page_free ./hackbench 10
 Time: 0.627
 Time: 0.644
 Time: 0.564
 Time: 0.559
 Time: 0.626

 Performance counter stats for './hackbench 10' (5 runs):

          12920  kmem:mm_page_pcpu_drain    ( +-   3.359% )
          25035  kmem:mm_page_alloc         ( +-   3.783% )
           6104  kmem:mm_page_free_batched  ( +-   0.934% )
          18376  kmem:mm_page_free          ( +-   4.941% )

    0.643954516  seconds time elapsed   ( +-   2.363% )

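To get a feel for what a '+-' figure of this kind expresses, the same sort
of statistic can be computed by hand from the five 'Time:' lines that
hackbench printed. The text does not say which formula perf stat uses, so
the sketch below assumes the standard error of the mean relative to the
mean; it is not expected to reproduce perf's numbers, which come from
per-run counter values that are not shown.

```python
import math

# The five 'Time:' values printed by hackbench in the run above.
HACKBENCH_TIMES = [0.627, 0.644, 0.564, 0.559, 0.626]


def mean_and_relative_error(samples):
    """Return (mean, relative standard error of the mean in percent).

    Assumption: relative error = 100 * sqrt(sample_variance / n) / mean.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    # Unbiased sample variance (divides by n - 1).
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    std_error = math.sqrt(variance / n)
    return mean, 100.0 * std_error / mean


if __name__ == "__main__":
    mean, rel = mean_and_relative_error(HACKBENCH_TIMES)
    print(f"{mean:.3f} seconds  ( +- {rel:.3f}% )")
```
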
Furthermore, these tracepoints can be used to sample the workload as
well. For example, the page allocations done by a 'git gc' run can be
captured the following way ('-c 1' sets the sample period to one event,
so every allocation is recorded):

 titan:~/git> perf record -e kmem:mm_page_alloc -c 1 ./git gc
 Counting objects: 1148, done.
 Delta compression using up to 2 threads.
 Compressing objects: 100% (450/450), done.
 Writing objects: 100% (1148/1148), done.
 Total 1148 (delta 690), reused 1148 (delta 690)
 [ perf record: Captured and wrote 0.267 MB perf.data (~11679 samples) ]

To check which functions generated page allocations:

 titan:~/git> perf report
 # Samples: 10646
 #
 # Overhead          Command               Shared Object
 # ........  ...............  ..........................
 #
    23.57%       git-repack  /lib64/libc-2.5.so
    21.81%              git  /lib64/libc-2.5.so
    14.59%              git  ./git
    11.79%       git-repack  ./git
     7.12%              git  /lib64/ld-2.5.so
     3.16%       git-repack  /lib64/libpthread-2.5.so
     2.09%       git-repack  /bin/bash
     1.97%               rm  /lib64/libc-2.5.so
     1.39%               mv  /lib64/ld-2.5.so
     1.37%               mv  /lib64/libc-2.5.so
     1.12%       git-repack  /lib64/ld-2.5.so
     0.95%               rm  /lib64/ld-2.5.so
     0.90%  git-update-serv  /lib64/libc-2.5.so
     0.73%  git-update-serv  /lib64/ld-2.5.so
     0.68%             perf  /lib64/libpthread-2.5.so
     0.64%       git-repack  /usr/lib64/libz.so.1.2.3

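Each row above is one (command, shared object) pair, so answering "how much
did git-repack allocate overall?" means adding rows up. A rough sketch,
assuming the three-column layout shown above and using only the first six
rows as sample input (overhead_by() is a name invented here):

```python
from collections import defaultdict

# First six data rows of the 'perf report' output above.
PERF_REPORT_ROWS = """\
    23.57%       git-repack  /lib64/libc-2.5.so
    21.81%              git  /lib64/libc-2.5.so
    14.59%              git  ./git
    11.79%       git-repack  ./git
     7.12%              git  /lib64/ld-2.5.so
     3.16%       git-repack  /lib64/libpthread-2.5.so
"""

COMMAND = 1        # index of the Command column
SHARED_OBJECT = 2  # index of the Shared Object column


def overhead_by(text, column):
    """Sum the Overhead percentages grouped by the given column index."""
    totals = defaultdict(float)
    for line in text.splitlines():
        fields = line.split()
        # Skip '#' header lines and anything that is not a data row.
        if len(fields) != 3 or not fields[0].endswith("%"):
            continue
        totals[fields[column]] += float(fields[0].rstrip("%"))
    return dict(totals)


if __name__ == "__main__":
    for name, total in overhead_by(PERF_REPORT_ROWS, COMMAND).items():
        print(f"{total:6.2f}%  {name}")
```

Passing SHARED_OBJECT instead of COMMAND groups the same rows by library.
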
Or to see it on a more fine-grained level:

titan:~/git> perf report --sort comm,dso,symbol
# Samples: 10646
#
# Overhead          Command               Shared Object  Symbol
# ........  ...............  ..........................  ......
#
     9.35%       git-repack  ./git                       [.] insert_obj_hash
     9.12%              git  ./git                       [.] insert_obj_hash
     7.31%              git  /lib64/libc-2.5.so          [.] memcpy
     6.34%       git-repack  /lib64/libc-2.5.so          [.] _int_malloc
     6.24%       git-repack  /lib64/libc-2.5.so          [.] memcpy
     5.82%       git-repack  /lib64/libc-2.5.so          [.] __GI___fork
     5.47%              git  /lib64/libc-2.5.so          [.] _int_malloc
     2.99%              git  /lib64/libc-2.5.so          [.] memset

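Several symbols show up once per command in the listing above. To rank
symbols regardless of which command ran them, the rows can be folded
together. The sketch below assumes the five-field layout shown above, with
'[.]' as the marker in front of the symbol name; overhead_by_symbol() is a
hypothetical helper:

```python
from collections import defaultdict

# Data rows of the 'perf report --sort comm,dso,symbol' output above.
PERF_REPORT_SYMBOL_ROWS = """\
     9.35%       git-repack  ./git                       [.] insert_obj_hash
     9.12%              git  ./git                       [.] insert_obj_hash
     7.31%              git  /lib64/libc-2.5.so          [.] memcpy
     6.34%       git-repack  /lib64/libc-2.5.so          [.] _int_malloc
     6.24%       git-repack  /lib64/libc-2.5.so          [.] memcpy
     5.82%       git-repack  /lib64/libc-2.5.so          [.] __GI___fork
     5.47%              git  /lib64/libc-2.5.so          [.] _int_malloc
     2.99%              git  /lib64/libc-2.5.so          [.] memset
"""


def overhead_by_symbol(text):
    """Return [(symbol, summed overhead)] sorted from highest to lowest."""
    totals = defaultdict(float)
    for line in text.splitlines():
        fields = line.split()
        # Expected fields: overhead, command, shared object, "[.]", symbol.
        if len(fields) != 5 or not fields[0].endswith("%"):
            continue
        totals[fields[4]] += float(fields[0][:-1])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for symbol, total in overhead_by_symbol(PERF_REPORT_SYMBOL_ROWS):
        print(f"{total:6.2f}%  {symbol}")
```
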
Furthermore, call-graph sampling of page allocations can be done too,
to see precisely what kind of page allocations there are:

 titan:~/git> perf record -g -e kmem:mm_page_alloc -c 1 ./git gc
 Counting objects: 1148, done.
 Delta compression using up to 2 threads.
 Compressing objects: 100% (450/450), done.
 Writing objects: 100% (1148/1148), done.
 Total 1148 (delta 690), reused 1148 (delta 690)
 [ perf record: Captured and wrote 0.963 MB perf.data (~42069 samples) ]

 titan:~/git> perf report -g
 # Samples: 10686
 #
 # Overhead          Command               Shared Object
 # ........  ...............  ..........................
 #
    23.25%       git-repack  /lib64/libc-2.5.so
                |
                |--50.00%-- _int_free
                |
                |--37.50%-- __GI___fork
                |          make_child
                |
                |--12.50%-- ptmalloc_unlock_all2
                |          make_child
                |
                 --6.25%-- __GI_strcpy
    21.61%              git  /lib64/libc-2.5.so
                |
                |--30.00%-- __GI_read
                |          |
                |           --83.33%-- git_config_from_file
                |                     git_config
                |                     |
   [...]

Or you can observe the whole system's page allocations for 10
seconds:

 titan:~/git> perf stat -a -e kmem:mm_page_pcpu_drain \
   -e kmem:mm_page_alloc -e kmem:mm_page_free_batched \
   -e kmem:mm_page_free sleep 10

 Performance counter stats for 'sleep 10':

         171585  kmem:mm_page_pcpu_drain
         322114  kmem:mm_page_alloc
          73623  kmem:mm_page_free_batched
         254115  kmem:mm_page_free

   10.000591410  seconds time elapsed

Or observe how much the page allocations fluctuate, via statistical
analysis done over ten 1-second intervals:

 titan:~/git> perf stat --repeat 10 -a -e kmem:mm_page_pcpu_drain \
   -e kmem:mm_page_alloc -e kmem:mm_page_free_batched \
   -e kmem:mm_page_free sleep 1

 Performance counter stats for 'sleep 1' (10 runs):

          17254  kmem:mm_page_pcpu_drain    ( +-   3.709% )
          34394  kmem:mm_page_alloc         ( +-   4.617% )
           7509  kmem:mm_page_free_batched  ( +-   4.820% )
          25653  kmem:mm_page_free          ( +-   3.672% )

    1.058135029  seconds time elapsed   ( +-   3.089% )

Or you can annotate the recorded 'git gc' run on a per-symbol basis
and check which instructions/source-code generated page allocations:

 titan:~/git> perf annotate __GI___fork
 ------------------------------------------------
  Percent |      Source code & Disassembly of libc-2.5.so
 ------------------------------------------------
          :
          :
          :      Disassembly of section .plt:
          :      Disassembly of section .text:
          :
          :      00000031a2e95560 <__fork>:
 [...]
     0.00 :        31a2e95602:   b8 38 00 00 00          mov    $0x38,%eax
     0.00 :        31a2e95607:   0f 05                   syscall
    83.42 :        31a2e95609:   48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax
     0.00 :        31a2e9560f:   0f 87 4d 01 00 00       ja     31a2e95762 <__fork+0x202>
     0.00 :        31a2e95615:   85 c0                   test   %eax,%eax

( this shows that 83.42% of __GI___fork's page allocations come from
  the 0x38 system call it performs. )

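Picking out the hot line by eye works for five instructions but not for a
long function. As a sketch, assuming the 'percent : address: bytes
instruction' layout shown above (the function names are invented for this
example), the hottest instruction and the one just before it can be
extracted like this. Note that 0x38 is 56 in decimal, which is the number
of the clone system call on x86-64 Linux.

```python
import re

# Instruction lines copied from the 'perf annotate' output above.
ANNOTATE_OUTPUT = """\
     0.00 :        31a2e95602:   b8 38 00 00 00          mov    $0x38,%eax
     0.00 :        31a2e95607:   0f 05                   syscall
    83.42 :        31a2e95609:   48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax
     0.00 :        31a2e9560f:   0f 87 4d 01 00 00       ja     31a2e95762 <__fork+0x202>
     0.00 :        31a2e95615:   85 c0                   test   %eax,%eax
"""


def parse_annotate(text):
    """Return [(percent, address, instruction)] for each instruction line."""
    rows = []
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        try:
            percent = float(parts[0])
            address = int(parts[1], 16)
        except ValueError:
            continue  # header or section line, not an instruction
        # The opcode bytes and the mnemonic are separated by 2+ spaces.
        pieces = re.split(r"\s{2,}", parts[2].strip(), maxsplit=1)
        instruction = pieces[1] if len(pieces) == 2 else pieces[0]
        rows.append((percent, address, instruction))
    return rows


def hottest_with_predecessor(rows):
    """Return (hottest row, row just before it or None)."""
    index = max(range(len(rows)), key=lambda i: rows[i][0])
    previous = rows[index - 1] if index > 0 else None
    return rows[index], previous


if __name__ == "__main__":
    hot, prev = hottest_with_predecessor(parse_annotate(ANNOTATE_OUTPUT))
    print(f"{hot[0]:.2f}% at {hot[1]:#x}: {hot[2]}")
    if prev is not None:
        print(f"preceded by {prev[1]:#x}: {prev[2]}")
```
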
etc. etc. - a lot more is possible. I could list a dozen other
use cases straight away, none of which is possible via
/proc/vmstat.

/proc/vmstat is not in the same league really, in terms of
expressive power of system analysis and performance
analysis.

All that the above results needed were those new tracepoints
in include/trace/events/kmem.h.

        Ingo

