TOMOYO Linux Cross Reference
Linux/Documentation/trace/tracepoint-analysis.rst

Differences between /Documentation/trace/tracepoint-analysis.rst (Version linux-6.11.5) and /Documentation/trace/tracepoint-analysis.rst (Version linux-5.4.284)


  1 ==============================================      1 =========================================================
  2 Notes on Analysing Behaviour Using Events and       2 Notes on Analysing Behaviour Using Events and Tracepoints
  3 ==============================================      3 =========================================================
  4 :Author: Mel Gorman (PCL information heavily b      4 :Author: Mel Gorman (PCL information heavily based on email from Ingo Molnar)
  5                                                     5 
  6 1. Introduction                                     6 1. Introduction
  7 ===============                                     7 ===============
  8                                                     8 
  9 Tracepoints (see Documentation/trace/tracepoin      9 Tracepoints (see Documentation/trace/tracepoints.rst) can be used without
 10 creating custom kernel modules to register pro     10 creating custom kernel modules to register probe functions using the event
 11 tracing infrastructure.                            11 tracing infrastructure.
 12                                                    12 
 13 Simplistically, tracepoints represent importan     13 Simplistically, tracepoints represent important events that can be
 14 taken in conjunction with other tracepoints to     14 taken in conjunction with other tracepoints to build a "Big Picture" of
 15 what is going on within the system. There are      15 what is going on within the system. There are a large number of methods for
 16 gathering and interpreting these events. Lacki     16 gathering and interpreting these events. Lacking any current Best Practices,
 17 this document describes some of the methods th     17 this document describes some of the methods that can be used.
 18                                                    18 
 19 This document assumes that debugfs is mounted      19 This document assumes that debugfs is mounted on /sys/kernel/debug and that
 20 the appropriate tracing options have been conf     20 the appropriate tracing options have been configured into the kernel. It is
 21 assumed that the PCL tool tools/perf has been      21 assumed that the PCL tool tools/perf has been installed and is in your path.
 22                                                    22 
 23 2. Listing Available Events                        23 2. Listing Available Events
 24 ===========================                        24 ===========================
 25                                                    25 
 26 2.1 Standard Utilities                             26 2.1 Standard Utilities
 27 ----------------------                             27 ----------------------
 28                                                    28 
 29 All possible events are visible from /sys/kern !!  29 All possible events are visible from /sys/kernel/debug/tracing/events. Simply
 30 calling::                                          30 calling::
 31                                                    31 
 32   $ find /sys/kernel/tracing/events -type d    !!  32   $ find /sys/kernel/debug/tracing/events -type d
 33                                                    33 
 34 will give a fair indication of the number of e     34 will give a fair indication of the number of events available.
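
The same count can be produced per subsystem with a short script. The following is a minimal sketch, assuming that every event is a directory two levels below the events root and that it contains an ``enable`` file; the function names and the demo tree are illustrative and not part of the kernel documentation.

```python
import os
import tempfile


def count_events(events_root):
    """Return {subsystem: number_of_events} for a tracefs-style events tree.

    Assumption: an event is a directory <root>/<subsystem>/<event> that
    contains a regular file called "enable".
    """
    counts = {}
    for subsystem in sorted(os.listdir(events_root)):
        sub_path = os.path.join(events_root, subsystem)
        if not os.path.isdir(sub_path):
            continue
        n = 0
        for event in os.listdir(sub_path):
            if os.path.isfile(os.path.join(sub_path, event, "enable")):
                n += 1
        if n:
            counts[subsystem] = n
    return counts


def make_demo_tree(root, layout):
    """Build a fake events tree; layout maps subsystem -> list of event names."""
    for subsystem, events in layout.items():
        for event in events:
            event_dir = os.path.join(root, subsystem, event)
            os.makedirs(event_dir)
            with open(os.path.join(event_dir, "enable"), "w") as fh:
                fh.write("0")


if __name__ == "__main__":
    # Demonstrate on a throwaway tree so that no root access is needed.
    with tempfile.TemporaryDirectory() as demo_root:
        make_demo_tree(demo_root, {"kmem": ["mm_page_alloc", "mm_page_free"],
                                   "ext4": ["ext4_free_inode"]})
        for name, n in count_events(demo_root).items():
            print(name, n)
```

On a real system, ``count_events()`` would be pointed at the events directory named above, which normally requires root to read.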
 35                                                    35 
 36 2.2 PCL (Performance Counters for Linux)           36 2.2 PCL (Performance Counters for Linux)
 37 ----------------------------------------           37 ----------------------------------------
 38                                                    38 
 39 Discovery and enumeration of all counters and      39 Discovery and enumeration of all counters and events, including tracepoints,
 40 are available with the perf tool. Getting a li     40 are available with the perf tool. Getting a list of available events is a
 41 simple case of::                                   41 simple case of::
 42                                                    42 
 43   $ perf list 2>&1 | grep Tracepoint               43   $ perf list 2>&1 | grep Tracepoint
 44   ext4:ext4_free_inode                     [Tr     44   ext4:ext4_free_inode                     [Tracepoint event]
 45   ext4:ext4_request_inode                  [Tr     45   ext4:ext4_request_inode                  [Tracepoint event]
 46   ext4:ext4_allocate_inode                 [Tr     46   ext4:ext4_allocate_inode                 [Tracepoint event]
 47   ext4:ext4_write_begin                    [Tr     47   ext4:ext4_write_begin                    [Tracepoint event]
 48   ext4:ext4_ordered_write_end              [Tr     48   ext4:ext4_ordered_write_end              [Tracepoint event]
 49   [ .... remaining output snipped .... ]           49   [ .... remaining output snipped .... ]
 50                                                    50 
 51                                                    51 
 52 3. Enabling Events                                 52 3. Enabling Events
 53 ==================                                 53 ==================
 54                                                    54 
 55 3.1 System-Wide Event Enabling                     55 3.1 System-Wide Event Enabling
 56 ------------------------------                     56 ------------------------------
 57                                                    57 
 58 See Documentation/trace/events.rst for a prope     58 See Documentation/trace/events.rst for a proper description on how events
 59 can be enabled system-wide. A short example of     59 can be enabled system-wide. A short example of enabling all events related
 60 to page allocation would look something like::     60 to page allocation would look something like::
 61                                                    61 
 62   $ for i in `find /sys/kernel/tracing/events  !!  62   $ for i in `find /sys/kernel/debug/tracing/events -name "enable" | grep mm_`; do echo 1 > $i; done
 63                                                    63 
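
The shell loop can also be written as a small helper, which makes it easier to record which events were switched on. This is a sketch under the assumption that each event directory holds an ``enable`` file accepting ``1`` or ``0``; ``enable_matching`` is a hypothetical name. Like the ``grep mm_`` above, it matches on the path, here taken relative to the events root.

```python
import os


def enable_matching(events_root, needle, value="1"):
    """Write value to each "enable" file whose relative path contains needle.

    Returns the sorted list of relative paths that were written, so the
    caller can later restore them by calling again with value="0".
    """
    touched = []
    for dirpath, _dirnames, filenames in os.walk(events_root):
        if "enable" not in filenames:
            continue
        path = os.path.join(dirpath, "enable")
        rel = os.path.relpath(path, events_root)
        if needle in rel:
            with open(path, "w") as fh:
                fh.write(value)
            touched.append(rel)
    return sorted(touched)
```

Writing to the real tracing directory requires root. Calling the helper a second time with ``value="0"`` disables the same set of events.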
 64 3.2 System-Wide Event Enabling with SystemTap      64 3.2 System-Wide Event Enabling with SystemTap
 65 ---------------------------------------------      65 ---------------------------------------------
 66                                                    66 
 67 In SystemTap, tracepoints are accessible using     67 In SystemTap, tracepoints are accessible using the kernel.trace() function
 68 call. The following is an example that reports     68 call. The following is an example that reports every 5 seconds what processes
 69 were allocating the pages.                         69 were allocating the pages.
 70 ::                                                 70 ::
 71                                                    71 
 72   global page_allocs                               72   global page_allocs
 73                                                    73 
 74   probe kernel.trace("mm_page_alloc") {            74   probe kernel.trace("mm_page_alloc") {
 75         page_allocs[execname()]++                  75         page_allocs[execname()]++
 76   }                                                76   }
 77                                                    77 
 78   function print_count() {                         78   function print_count() {
 79         printf ("%-25s %-s\n", "#Pages Allocat     79         printf ("%-25s %-s\n", "#Pages Allocated", "Process Name")
 80         foreach (proc in page_allocs-)             80         foreach (proc in page_allocs-)
 81                 printf("%-25d %s\n", page_allo     81                 printf("%-25d %s\n", page_allocs[proc], proc)
 82         printf ("\n")                              82         printf ("\n")
 83         delete page_allocs                         83         delete page_allocs
 84   }                                                84   }
 85                                                    85 
 86   probe timer.s(5) {                               86   probe timer.s(5) {
 87           print_count()                            87           print_count()
 88   }                                                88   }
 89                                                    89 
 90 3.3 System-Wide Event Enabling with PCL            90 3.3 System-Wide Event Enabling with PCL
 91 ---------------------------------------            91 ---------------------------------------
 92                                                    92 
 93 By specifying the -a switch and analysing slee     93 By specifying the -a switch and analysing sleep, the system-wide events
 94 for a duration of time can be examined.            94 for a duration of time can be examined.
 95 ::                                                 95 ::
 96                                                    96 
 97  $ perf stat -a \                                  97  $ perf stat -a \
 98         -e kmem:mm_page_alloc -e kmem:mm_page_     98         -e kmem:mm_page_alloc -e kmem:mm_page_free \
 99         -e kmem:mm_page_free_batched \             99         -e kmem:mm_page_free_batched \
100         sleep 10                                  100         sleep 10
101  Performance counter stats for 'sleep 10':        101  Performance counter stats for 'sleep 10':
102                                                   102 
103            9630  kmem:mm_page_alloc               103            9630  kmem:mm_page_alloc
104            2143  kmem:mm_page_free                104            2143  kmem:mm_page_free
105            7424  kmem:mm_page_free_batched        105            7424  kmem:mm_page_free_batched
106                                                   106 
107    10.002577764  seconds time elapsed             107    10.002577764  seconds time elapsed
108                                                   108 
109 Similarly, one could execute a shell and exit     109 Similarly, one could execute a shell and exit it as desired to get a report
110 at that point.                                    110 at that point.
111                                                   111 
112 3.4 Local Event Enabling                          112 3.4 Local Event Enabling
113 ------------------------                          113 ------------------------
114                                                   114 
115 Documentation/trace/ftrace.rst describes how t    115 Documentation/trace/ftrace.rst describes how to enable events on a per-thread
116 basis using set_ftrace_pid.                       116 basis using set_ftrace_pid.
117                                                   117 
118 3.5 Local Event Enablement with PCL               118 3.5 Local Event Enablement with PCL
119 -----------------------------------               119 -----------------------------------
120                                                   120 
121 Events can be activated and tracked for the du    121 Events can be activated and tracked for the duration of a process on a local
122 basis using PCL such as follows.                  122 basis using PCL such as follows.
123 ::                                                123 ::
124                                                   124 
125   $ perf stat -e kmem:mm_page_alloc -e kmem:mm    125   $ perf stat -e kmem:mm_page_alloc -e kmem:mm_page_free \
126                  -e kmem:mm_page_free_batched     126                  -e kmem:mm_page_free_batched ./hackbench 10
127   Time: 0.909                                     127   Time: 0.909
128                                                   128 
129     Performance counter stats for './hackbench    129     Performance counter stats for './hackbench 10':
130                                                   130 
131           17803  kmem:mm_page_alloc               131           17803  kmem:mm_page_alloc
132           12398  kmem:mm_page_free                132           12398  kmem:mm_page_free
133            4827  kmem:mm_page_free_batched        133            4827  kmem:mm_page_free_batched
134                                                   134 
135     0.973913387  seconds time elapsed             135     0.973913387  seconds time elapsed
136                                                   136 
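
When the counts are needed by a script and not by a person, the human-readable report can be parsed. The following is a minimal sketch that assumes the count lines look like the ones shown above (a number followed by a ``subsystem:event`` name); the layout of the report may differ between perf versions, and ``parse_perf_stat`` is an illustrative name.

```python
import re

# A count line is taken to be: optional spaces, digits (possibly with
# thousands separators), spaces, then a "subsystem:event" name.
COUNT_LINE = re.compile(r"^\s*([\d,]+)\s+(\S+:\S+)")


def parse_perf_stat(text):
    """Return {event_name: count} for every count line found in text."""
    counts = {}
    for line in text.splitlines():
        m = COUNT_LINE.match(line)
        if m:
            counts[m.group(2)] = int(m.group(1).replace(",", ""))
    return counts


# Report text copied from the hackbench example above.
SAMPLE = """\
  Time: 0.909

    Performance counter stats for './hackbench 10':

          17803  kmem:mm_page_alloc
          12398  kmem:mm_page_free
           4827  kmem:mm_page_free_batched

    0.973913387  seconds time elapsed
"""

if __name__ == "__main__":
    for name, count in parse_perf_stat(SAMPLE).items():
        print(name, count)
```

Lines such as the elapsed time are skipped because they do not contain a ``subsystem:event`` name.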
137 4. Event Filtering                                137 4. Event Filtering
138 ==================                                138 ==================
139                                                   139 
140 Documentation/trace/ftrace.rst covers in-depth    140 Documentation/trace/ftrace.rst covers in-depth how to filter events in
141 ftrace.  Obviously using grep and awk of trace    141 ftrace.  Obviously using grep and awk of trace_pipe is an option as well
142 as any script reading trace_pipe.                 142 as any script reading trace_pipe.
143                                                   143 
144 5. Analysing Event Variances with PCL             144 5. Analysing Event Variances with PCL
145 =====================================             145 =====================================
146                                                   146 
147 Any workload can exhibit variances between run    147 Any workload can exhibit variances between runs and it can be important
148 to know what the standard deviation is. By and    148 to know what the standard deviation is. By and large, this is left to the
149 performance analyst to do it by hand. In the e    149 performance analyst to do it by hand. In the event that the discrete event
150 occurrences are useful to the performance anal    150 occurrences are useful to the performance analyst, then perf can be used.
151 ::                                                151 ::
152                                                   152 
153   $ perf stat --repeat 5 -e kmem:mm_page_alloc    153   $ perf stat --repeat 5 -e kmem:mm_page_alloc -e kmem:mm_page_free \
154                         -e kmem:mm_page_free_b    154                         -e kmem:mm_page_free_batched ./hackbench 10
155   Time: 0.890                                     155   Time: 0.890
156   Time: 0.895                                     156   Time: 0.895
157   Time: 0.915                                     157   Time: 0.915
158   Time: 1.001                                     158   Time: 1.001
159   Time: 0.899                                     159   Time: 0.899
160                                                   160 
161    Performance counter stats for './hackbench     161    Performance counter stats for './hackbench 10' (5 runs):
162                                                   162 
163           16630  kmem:mm_page_alloc         (     163           16630  kmem:mm_page_alloc         ( +-   3.542% )
164           11486  kmem:mm_page_free          (     164           11486  kmem:mm_page_free          ( +-   4.771% )
165            4730  kmem:mm_page_free_batched  (     165            4730  kmem:mm_page_free_batched  ( +-   2.325% )
166                                                   166 
167     0.982653002  seconds time elapsed   ( +-      167     0.982653002  seconds time elapsed   ( +-   1.448% )
168                                                   168 
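
For values that perf does not summarise, such as the "Time:" lines printed by hackbench, the spread can be calculated by hand. The following is a sketch using the five times from the run above. It reports both the sample standard deviation and the standard error of the mean as percentages; which of the two a given perf version prints in its ``+-`` column is not assumed here, so treat the output as a cross-check and not as a reimplementation of perf.

```python
import math
import statistics


def summarise(samples):
    """Return the mean and two relative spread measures for the samples."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)          # sample deviation (n - 1)
    sem = stdev / math.sqrt(len(samples))      # standard error of the mean
    return {
        "mean": mean,
        "stdev": stdev,
        "stdev_pct": 100.0 * stdev / mean,
        "sem_pct": 100.0 * sem / mean,
    }


# The "Time:" values printed by the five hackbench runs above.
HACKBENCH_TIMES = [0.890, 0.895, 0.915, 1.001, 0.899]

if __name__ == "__main__":
    for key, value in summarise(HACKBENCH_TIMES).items():
        print("%-10s %.4f" % (key, value))
```

Note how the single slow run (1.001) dominates the deviation, which is the kind of outlier a repeated measurement is meant to expose.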
169 In the event that some higher-level event is r    169 In the event that some higher-level event is required that depends on some
170 aggregation of discrete events, then a script     170 aggregation of discrete events, then a script would need to be developed.
171                                                   171 
172 Using --repeat, it is also possible to view ho    172 Using --repeat, it is also possible to view how events are fluctuating over
173 time on a system-wide basis using -a and sleep    173 time on a system-wide basis using -a and sleep.
174 ::                                                174 ::
175                                                   175 
176   $ perf stat -e kmem:mm_page_alloc -e kmem:mm    176   $ perf stat -e kmem:mm_page_alloc -e kmem:mm_page_free \
177                 -e kmem:mm_page_free_batched \    177                 -e kmem:mm_page_free_batched \
178                 -a --repeat 10 \                  178                 -a --repeat 10 \
179                 sleep 1                           179                 sleep 1
180   Performance counter stats for 'sleep 1' (10     180   Performance counter stats for 'sleep 1' (10 runs):
181                                                   181 
182            1066  kmem:mm_page_alloc         (     182            1066  kmem:mm_page_alloc         ( +-  26.148% )
183             182  kmem:mm_page_free          (     183             182  kmem:mm_page_free          ( +-   5.464% )
184             890  kmem:mm_page_free_batched  (     184             890  kmem:mm_page_free_batched  ( +-  30.079% )
185                                                   185 
186     1.002251757  seconds time elapsed   ( +-      186     1.002251757  seconds time elapsed   ( +-   0.005% )
187                                                   187 
188 6. Higher-Level Analysis with Helper Scripts      188 6. Higher-Level Analysis with Helper Scripts
189 ============================================      189 ============================================
190                                                   190 
191 When events are enabled the events that are tr    191 When events are enabled the events that are triggering can be read from
192 /sys/kernel/tracing/trace_pipe in human-readab !! 192 /sys/kernel/debug/tracing/trace_pipe in human-readable format although binary
193 options exist as well. By post-processing the     193 options exist as well. By post-processing the output, further information can
194 be gathered on-line as appropriate. Examples o    194 be gathered on-line as appropriate. Examples of post-processing might include
195                                                   195 
196   - Reading information from /proc for the PID    196   - Reading information from /proc for the PID that triggered the event
197   - Deriving a higher-level event from a serie    197   - Deriving a higher-level event from a series of lower-level events.
198   - Calculating latencies between two events      198   - Calculating latencies between two events
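
As an illustration of the last item, latencies can be calculated by pairing a start event with the next end event from the same PID. This is a sketch only: the regular expression assumes the usual human-readable layout (``comm-pid [cpu] flags timestamp: event:``), which varies between kernel versions, and the event names ``demo_begin`` and ``demo_end`` and the sample lines are made up for the example.

```python
import re

# Assumed layout of a human-readable trace line; the flags field is optional.
TRACE_LINE = re.compile(
    r"^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?:\S+\s+)?(?P<ts>\d+\.\d+):\s+(?P<event>\w+):"
)


def latencies(lines, start_event, end_event):
    """Return [(pid, seconds)] for each start/end pair seen from the same PID."""
    pending = {}   # pid -> timestamp of the most recent unmatched start
    result = []
    for line in lines:
        m = TRACE_LINE.match(line)
        if not m:
            continue
        pid = int(m.group("pid"))
        ts = float(m.group("ts"))
        if m.group("event") == start_event:
            pending[pid] = ts
        elif m.group("event") == end_event and pid in pending:
            result.append((pid, ts - pending.pop(pid)))
    return result


# Made-up trace lines with hypothetical event names.
SAMPLE = [
    "  worker-101  [000] ....  10.000100: demo_begin: id=1",
    "  worker-102  [001] ....  10.000200: demo_begin: id=2",
    "  worker-101  [000] ....  10.000350: demo_end: id=1",
    "  worker-102  [001] ....  10.000900: demo_end: id=2",
]

if __name__ == "__main__":
    for pid, delta in latencies(SAMPLE, "demo_begin", "demo_end"):
        print("pid %d: %.6f s" % (pid, delta))
```

An end event with no preceding start from the same PID is ignored, which is the usual situation when tracing begins in the middle of an operation.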
199                                                   199 
200 Documentation/trace/postprocess/trace-pageallo    200 Documentation/trace/postprocess/trace-pagealloc-postprocess.pl is an example
201 script that can read trace_pipe from STDIN or     201 script that can read trace_pipe from STDIN or a copy of a trace. When used
202 on-line, it can be interrupted once to generat    202 on-line, it can be interrupted once to generate a report without exiting
203 and twice to exit.                                203 and twice to exit.
204                                                   204 
205 Simplistically, the script just reads STDIN an    205 Simplistically, the script just reads STDIN and counts up events but it
206 also can do more such as                          206 also can do more such as
207                                                   207 
208   - Derive high-level events from many low-lev    208   - Derive high-level events from many low-level events. If a number of pages
209     are freed to the main allocator from the p    209     are freed to the main allocator from the per-CPU lists, it recognises
210     that as one per-CPU drain even though ther    210     that as one per-CPU drain even though there is no specific tracepoint
211     for that event                                211     for that event
212   - It can aggregate based on PID or process n    212   - It can aggregate based on PID or process name
213   - In the event memory is getting externally     213   - In the event memory is getting externally fragmented, it reports
214     on whether the fragmentation event was sev    214     on whether the fragmentation event was severe or moderate.
215   - When receiving an event about a PID, it ca    215   - When receiving an event about a PID, it can record who the parent was so
216     that if large numbers of events are coming    216     that if large numbers of events are coming from very short-lived
217     processes, the parent process responsible     217     processes, the parent process responsible for creating all the helpers
218     can be identified                             218     can be identified
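
Two of the ideas in this list can be sketched in a few lines. This is not the Perl script's implementation. The line layout, the sample lines and the use of ``mm_page_pcpu_drain`` as the low-level event name are assumptions for illustration, so adjust the name to whatever the running kernel exposes. ``aggregate()`` counts events per PID or per process name, and ``count_runs()`` collapses each uninterrupted run of one event from a PID into a single higher-level event.

```python
import re
from collections import Counter

# Assumed layout of a human-readable trace line; the flags field is optional.
TRACE_LINE = re.compile(
    r"^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?:\S+\s+)?(?P<ts>\d+\.\d+):\s+(?P<event>\w+):"
)


def aggregate(lines, by_pid=True):
    """Count events per (comm, pid), or per comm when by_pid is False."""
    counts = Counter()
    for line in lines:
        m = TRACE_LINE.match(line)
        if m:
            key = (m.group("comm"), int(m.group("pid"))) if by_pid else m.group("comm")
            counts[key] += 1
    return counts


def count_runs(lines, event):
    """Count, per PID, how many separate runs of `event` were seen."""
    in_run = {}        # pid -> True while the last event from pid was `event`
    runs = Counter()
    for line in lines:
        m = TRACE_LINE.match(line)
        if not m:
            continue
        pid = int(m.group("pid"))
        is_target = m.group("event") == event
        if is_target and not in_run.get(pid, False):
            runs[pid] += 1
        in_run[pid] = is_target
    return runs


# Made-up trace lines for the demonstration.
SAMPLE = [
    "  Xorg-300  [000] ....  5.000001: mm_page_alloc: order=0",
    "  Xorg-300  [000] ....  5.000002: mm_page_pcpu_drain: order=0",
    "  Xorg-300  [000] ....  5.000003: mm_page_pcpu_drain: order=0",
    "  Xorg-300  [000] ....  5.000004: mm_page_alloc: order=0",
    "  Xorg-300  [000] ....  5.000005: mm_page_pcpu_drain: order=0",
    "  bash-41   [001] ....  5.000006: mm_page_alloc: order=0",
    "  bash-42   [001] ....  5.000007: mm_page_alloc: order=0",
]

if __name__ == "__main__":
    print(dict(aggregate(SAMPLE, by_pid=False)))
    print(dict(count_runs(SAMPLE, "mm_page_pcpu_drain")))
```

Aggregating by name is what makes many short-lived helpers with different PIDs show up as one entry.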
219                                                   219 
220 7. Lower-Level Analysis with PCL                  220 7. Lower-Level Analysis with PCL
221 ================================                  221 ================================
222                                                   222 
223 There may also be a requirement to identify wh    223 There may also be a requirement to identify what functions within a program
224 were generating events within the kernel. To b    224 were generating events within the kernel. To begin this sort of analysis, the
225 data must be recorded. At the time of writing,    225 data must be recorded. At the time of writing, this required root:
226 ::                                                226 ::
227                                                   227 
228   $ perf record -c 1 \                            228   $ perf record -c 1 \
229         -e kmem:mm_page_alloc -e kmem:mm_page_    229         -e kmem:mm_page_alloc -e kmem:mm_page_free \
230         -e kmem:mm_page_free_batched \            230         -e kmem:mm_page_free_batched \
231         ./hackbench 10                            231         ./hackbench 10
232   Time: 0.894                                     232   Time: 0.894
233   [ perf record: Captured and wrote 0.733 MB p    233   [ perf record: Captured and wrote 0.733 MB perf.data (~32010 samples) ]
234                                                   234 
235 Note the use of '-c 1' to set the event period    235 Note the use of '-c 1' to set the event period to sample. The default sample
236 period is quite high to minimise overhead but     236 period is quite high to minimise overhead but the information collected can be
237 very coarse as a result.                          237 very coarse as a result.
238                                                   238 
239 This record outputted a file called perf.data     239 This record outputted a file called perf.data which can be analysed using
240 perf report.                                      240 perf report.
241 ::                                                241 ::
242                                                   242 
243   $ perf report                                   243   $ perf report
244   # Samples: 30922                                244   # Samples: 30922
245   #                                               245   #
246   # Overhead    Command                     Sh    246   # Overhead    Command                     Shared Object
247   # ........  .........  .....................    247   # ........  .........  ................................
248   #                                               248   #
249       87.27%  hackbench  [vdso]                   249       87.27%  hackbench  [vdso]
250        6.85%  hackbench  /lib/i686/cmov/libc-2    250        6.85%  hackbench  /lib/i686/cmov/libc-2.9.so
251        2.62%  hackbench  /lib/ld-2.9.so           251        2.62%  hackbench  /lib/ld-2.9.so
252        1.52%       perf  [vdso]                   252        1.52%       perf  [vdso]
253        1.22%  hackbench  ./hackbench              253        1.22%  hackbench  ./hackbench
254        0.48%  hackbench  [kernel]                 254        0.48%  hackbench  [kernel]
255        0.02%       perf  /lib/i686/cmov/libc-2    255        0.02%       perf  /lib/i686/cmov/libc-2.9.so
256        0.01%       perf  /usr/bin/perf            256        0.01%       perf  /usr/bin/perf
257        0.01%       perf  /lib/ld-2.9.so           257        0.01%       perf  /lib/ld-2.9.so
258        0.00%  hackbench  /lib/i686/cmov/libpth    258        0.00%  hackbench  /lib/i686/cmov/libpthread-2.9.so
259   #                                               259   #
260   # (For more details, try: perf report --sort    260   # (For more details, try: perf report --sort comm,dso,symbol)
261   #                                               261   #
262                                                   262 
263 According to this, the vast majority of events    263 According to this, the vast majority of events were triggered
264 within the VDSO. With simple binaries, this wi    264 within the VDSO. With simple binaries, this will often be the case so let's
265 take a slightly different example. In the cour    265 take a slightly different example. In the course of writing this, it was
266 noticed that X was generating an insane amount    266 noticed that X was generating an insane amount of page allocations so let's look
267 at it:                                            267 at it:
268 ::                                                268 ::
269                                                   269 
270   $ perf record -c 1 -f \                         270   $ perf record -c 1 -f \
271                 -e kmem:mm_page_alloc -e kmem:    271                 -e kmem:mm_page_alloc -e kmem:mm_page_free \
272                 -e kmem:mm_page_free_batched \    272                 -e kmem:mm_page_free_batched \
273                 -p `pidof X`                      273                 -p `pidof X`
274                                                   274 
275 This was interrupted after a few seconds and      275 This was interrupted after a few seconds and
276 ::                                                276 ::
277                                                   277 
278   $ perf report                                   278   $ perf report
279   # Samples: 27666                                279   # Samples: 27666
280   #                                               280   #
281   # Overhead  Command                             281   # Overhead  Command                            Shared Object
282   # ........  .......  .......................    282   # ........  .......  .......................................
283   #                                               283   #
284       51.95%     Xorg  [vdso]                     284       51.95%     Xorg  [vdso]
285       47.95%     Xorg  /opt/gfx-test/lib/libpi    285       47.95%     Xorg  /opt/gfx-test/lib/libpixman-1.so.0.13.1
286        0.09%     Xorg  /lib/i686/cmov/libc-2.9    286        0.09%     Xorg  /lib/i686/cmov/libc-2.9.so
287        0.01%     Xorg  [kernel]                   287        0.01%     Xorg  [kernel]
288   #                                               288   #
289   # (For more details, try: perf report --sort    289   # (For more details, try: perf report --sort comm,dso,symbol)
290   #                                               290   #
291                                                   291 
292 So, almost half of the events are occurring in    292 So, almost half of the events are occurring in a library. To get an idea which
293 symbol:                                           293 symbol:
294 ::                                                294 ::
295                                                   295 
296   $ perf report --sort comm,dso,symbol            296   $ perf report --sort comm,dso,symbol
297   # Samples: 27666                                297   # Samples: 27666
298   #                                               298   #
299   # Overhead  Command                             299   # Overhead  Command                            Shared Object  Symbol
300   # ........  .......  .......................    300   # ........  .......  .......................................  ......
301   #                                               301   #
302       51.95%     Xorg  [vdso]                     302       51.95%     Xorg  [vdso]                                   [.] 0x000000ffffe424
303       47.93%     Xorg  /opt/gfx-test/lib/libpi    303       47.93%     Xorg  /opt/gfx-test/lib/libpixman-1.so.0.13.1  [.] pixmanFillsse2
304        0.09%     Xorg  /lib/i686/cmov/libc-2.9    304        0.09%     Xorg  /lib/i686/cmov/libc-2.9.so               [.] _int_malloc
305        0.01%     Xorg  /opt/gfx-test/lib/libpi    305        0.01%     Xorg  /opt/gfx-test/lib/libpixman-1.so.0.13.1  [.] pixman_region32_copy_f
306        0.01%     Xorg  [kernel]                   306        0.01%     Xorg  [kernel]                                 [k] read_hpet
307        0.01%     Xorg  /opt/gfx-test/lib/libpi    307        0.01%     Xorg  /opt/gfx-test/lib/libpixman-1.so.0.13.1  [.] get_fast_path
308        0.00%     Xorg  [kernel]                   308        0.00%     Xorg  [kernel]                                 [k] ftrace_trace_userstack
309                                                   309 
310 To see where within the function pixmanFillsse    310 To see where within the function pixmanFillsse2 things are going wrong:
311 ::                                                311 ::
312                                                   312 
313   $ perf annotate pixmanFillsse2                  313   $ perf annotate pixmanFillsse2
314   [ ... ]                                         314   [ ... ]
315     0.00 :         34eeb:       0f 18 08          315     0.00 :         34eeb:       0f 18 08                prefetcht0 (%eax)
316          :      }                                 316          :      }
317          :                                        317          :
318          :      extern __inline void __attribu    318          :      extern __inline void __attribute__((__gnu_inline__, __always_inline__, _
319          :      _mm_store_si128 (__m128i *__P,    319          :      _mm_store_si128 (__m128i *__P, __m128i __B) :      {
320          :        *__P = __B;                     320          :        *__P = __B;
321    12.40 :         34eee:       66 0f 7f 80 40    321    12.40 :         34eee:       66 0f 7f 80 40 ff ff    movdqa %xmm0,-0xc0(%eax)
322     0.00 :         34ef5:       ff                322     0.00 :         34ef5:       ff
323    12.40 :         34ef6:       66 0f 7f 80 50    323    12.40 :         34ef6:       66 0f 7f 80 50 ff ff    movdqa %xmm0,-0xb0(%eax)
324     0.00 :         34efd:       ff                324     0.00 :         34efd:       ff
325    12.39 :         34efe:       66 0f 7f 80 60    325    12.39 :         34efe:       66 0f 7f 80 60 ff ff    movdqa %xmm0,-0xa0(%eax)
326     0.00 :         34f05:       ff                326     0.00 :         34f05:       ff
327    12.67 :         34f06:       66 0f 7f 80 70    327    12.67 :         34f06:       66 0f 7f 80 70 ff ff    movdqa %xmm0,-0x90(%eax)
328     0.00 :         34f0d:       ff                328     0.00 :         34f0d:       ff
329    12.58 :         34f0e:       66 0f 7f 40 80    329    12.58 :         34f0e:       66 0f 7f 40 80          movdqa %xmm0,-0x80(%eax)
330    12.31 :         34f13:       66 0f 7f 40 90    330    12.31 :         34f13:       66 0f 7f 40 90          movdqa %xmm0,-0x70(%eax)
331    12.40 :         34f18:       66 0f 7f 40 a0    331    12.40 :         34f18:       66 0f 7f 40 a0          movdqa %xmm0,-0x60(%eax)
332    12.31 :         34f1d:       66 0f 7f 40 b0    332    12.31 :         34f1d:       66 0f 7f 40 b0          movdqa %xmm0,-0x50(%eax)
333                                                   333 
334 At a glance, it looks like the time is being s    334 At a glance, it looks like the time is being spent copying pixmaps to
335 the card.  Further investigation would be need    335 the card.  Further investigation would be needed to determine why pixmaps
336 are being copied around so much but a starting    336 are being copied around so much but a starting point would be to take an
337 ancient build of libpixman out of the library     337 ancient build of libpixman out of the library path where it was totally
338 forgotten about from months ago!                  338 forgotten about from months ago!
                                                      
