Linux/Documentation/arch/x86/orc-unwinder.rst

.. SPDX-License-Identifier: GPL-2.0

============
ORC unwinder
============

Overview
========

The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
similar in concept to a DWARF unwinder.  The difference is that the
format of the ORC data is much simpler than DWARF, which in turn allows
the ORC unwinder to be much simpler and faster.

The ORC data consists of unwind tables which are generated by objtool.
They contain out-of-band data which is used by the in-kernel ORC
unwinder.  Objtool generates the ORC data by first doing compile-time
stack metadata validation (CONFIG_STACK_VALIDATION).  After analyzing
all the code paths of a .o file, it determines information about the
stack state at each instruction address in the file and outputs that
information to the .orc_unwind and .orc_unwind_ip sections.

The per-object ORC sections are combined at link time and are sorted and
post-processed at boot time.  The unwinder uses the resulting data to
correlate instruction addresses with their stack states at run time.


ORC vs frame pointers
=====================

With frame pointers enabled, GCC adds instrumentation code to every
function in the kernel.  The kernel's .text size increases by about
3.2%, resulting in a broad kernel-wide slowdown.  Measurements by Mel
Gorman [1]_ have shown a slowdown of 5-10% for some workloads.

In contrast, the ORC unwinder has no effect on text size or runtime
performance, because the debuginfo is out of band.  So if you disable
frame pointers and enable the ORC unwinder, you get a nice performance
improvement across the board, and still have reliable stack traces.

Ingo Molnar says:

  "Note that it's not just a performance improvement, but also an
  instruction cache locality improvement: 3.2% .text savings almost
  directly transform into a similarly sized reduction in cache
  footprint. That can transform to even higher speedups for workloads
  whose cache locality is borderline."

Another benefit of ORC compared to frame pointers is that it can
reliably unwind across interrupts and exceptions.  Frame pointer based
unwinds can sometimes skip the caller of the interrupted function, if it
was a leaf function or if the interrupt hit before the frame pointer was
saved.

The main disadvantage of the ORC unwinder compared to frame pointers is
that it needs more memory to store the ORC unwind tables: roughly 2-4MB
depending on the kernel config.


ORC vs DWARF
============

ORC debuginfo's advantage over DWARF itself is that it's much simpler.
It gets rid of the complex DWARF CFI state machine and also gets rid of
the tracking of unnecessary registers.  This allows the unwinder to be
much simpler, meaning fewer bugs, which is especially important for
mission critical oops code.

The simpler debuginfo format also enables the unwinder to be much faster
than DWARF, which is important for perf and lockdep.  In a basic
performance test by Jiri Slaby [2]_, the ORC unwinder was about 20x
faster than an out-of-tree DWARF unwinder.  (Note: That measurement was
taken before some performance tweaks were added, which doubled
performance, so the speedup over DWARF may be closer to 40x.)

The ORC data format does have a few downsides compared to DWARF.  ORC
unwind tables take up ~50% more RAM (+1.3MB on an x86 defconfig kernel)
than DWARF-based eh_frame tables.

Another potential downside is that, as GCC evolves, it's conceivable
that the ORC data may end up being *too* simple to describe the state of
the stack for certain optimizations.  But IMO this is unlikely because
GCC saves the frame pointer for any unusual stack adjustments it does,
so I suspect we'll really only ever need to keep track of the stack
pointer and the frame pointer between call frames.  But even if we do
end up having to track all the registers DWARF tracks, at least we will
still be able to control the format, e.g. no complex state machines.
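
To make "only the stack pointer and the frame pointer" concrete, here is a
minimal sketch of a single unwind step in userspace C.  The struct
``simple_orc_entry``, the ``regs`` struct, and the simulated stack are all
hypothetical simplifications for illustration; they are not the kernel's
actual ``orc_entry`` layout or unwinder code.  The sketch assumes an x86-64
style frame where the return address sits just below the caller's stack
pointer value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ORC entry (not the kernel layout). */
enum base_reg { BASE_SP, BASE_BP };

struct simple_orc_entry {
    enum base_reg cfa_base; /* register the caller's SP is computed from */
    int16_t cfa_offset;     /* caller's SP = base register + cfa_offset */
    int has_saved_bp;       /* nonzero if the caller's BP was pushed */
    int16_t bp_offset;      /* caller's BP is stored at caller's SP + bp_offset */
};

struct regs {
    uint64_t ip, sp, bp;
};

/* Read one 64-bit word from a simulated stack that starts at stack_base. */
static uint64_t read_stack(const uint64_t *stack, uint64_t stack_base,
                           uint64_t addr)
{
    assert(addr >= stack_base);
    return stack[(addr - stack_base) / 8];
}

/* Compute the caller's registers from the current ones and one entry. */
static struct regs unwind_step(const struct simple_orc_entry *e,
                               struct regs cur,
                               const uint64_t *stack, uint64_t stack_base)
{
    uint64_t base = (e->cfa_base == BASE_SP) ? cur.sp : cur.bp;
    uint64_t caller_sp = base + e->cfa_offset;
    struct regs prev;

    /* The return address was pushed just below the caller's SP. */
    prev.ip = read_stack(stack, stack_base, caller_sp - 8);
    prev.sp = caller_sp;
    /* If BP was not saved, the caller's BP is still in the register. */
    prev.bp = e->has_saved_bp
        ? read_stack(stack, stack_base, caller_sp + e->bp_offset)
        : cur.bp;
    return prev;
}
```

Note that the step is a fixed computation over two registers, with no state
machine to interpret.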


ORC unwind table generation
===========================

The ORC data is generated by objtool.  With the existing compile-time
stack metadata validation feature, objtool already follows all code
paths, and so it already has all the information it needs to be able to
generate ORC data from scratch.  So it's an easy step to go from stack
validation to ORC data generation.

It should be possible to instead generate the ORC data with a simple
tool which converts DWARF to ORC data.  However, such a solution would
be incomplete due to the kernel's extensive use of asm, inline asm, and
special sections like exception tables.

That could be rectified by manually annotating those special code paths
using GNU assembler .cfi annotations in .S files, and homegrown
annotations for inline asm in .c files.  But asm annotations were tried
in the past and were found to be unmaintainable.  They were often
incorrect/incomplete and made the code harder to read and keep updated.
And based on looking at glibc code, annotating inline asm in .c files
might be even worse.

Objtool still needs a few annotations, but only in code which does
unusual things to the stack like entry code.  And even then, far fewer
annotations are needed than what DWARF would need, so they're much more
maintainable than DWARF CFI annotations.

So the advantages of using objtool to generate ORC data are that it
gives more accurate debuginfo, with very few annotations.  It also
insulates the kernel from toolchain bugs, which can be very painful to
deal with in the kernel since we often have to work around issues in
older versions of the toolchain for years.

The downside is that the unwinder now becomes dependent on objtool's
ability to reverse engineer GCC code flow.  If GCC optimizations become
too complicated for objtool to follow, the ORC data generation might
stop working or become incomplete.  (It's worth noting that livepatch
already has such a dependency on objtool's ability to follow GCC code
flow.)

If newer versions of GCC come up with some optimizations which break
objtool, we may need to revisit the current implementation.  Some
possible solutions would be asking GCC to make the optimizations more
palatable, or having objtool use DWARF as an additional input, or
creating a GCC plugin to assist objtool with its analysis.  But for now,
objtool follows GCC code quite well.


Unwinder implementation details
===============================

Objtool generates the ORC data by integrating with the compile-time
stack metadata validation feature, which is described in detail in
tools/objtool/Documentation/objtool.txt.  After analyzing all
the code paths of a .o file, it creates an array of orc_entry structs,
and a parallel array of instruction addresses associated with those
structs, and writes them to the .orc_unwind and .orc_unwind_ip sections
respectively.
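
As an illustration of the parallel-array idea, the sketch below searches a
sorted array of instruction addresses and returns an index that can then be
used to read the matching entry from a second array.  The names
``demo_orc``, ``find_entry``, and the use of absolute 64-bit addresses are
assumptions made for this example, not the kernel's actual types or
encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload; stands in for whatever the real entry holds. */
struct demo_orc {
    int16_t sp_offset;
};

/*
 * Return the index of the last address in ips[] that is <= addr, or -1 if
 * addr precedes every entry.  ips[] must be sorted in ascending order.
 * The same index selects the matching element of the parallel entry array.
 */
static long find_entry(const uint64_t *ips, size_t n, uint64_t addr)
{
    size_t lo = 0, hi = n; /* search window is [lo, hi) */

    assert(n == 0 || ips != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (ips[mid] <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (long)lo - 1;
}
```

Each entry describes the stack state from its own address up to the next
entry's address, which is why the search looks for the last address at or
below the target rather than an exact match.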

The ORC data is split into the two arrays for performance reasons, to
make the searchable part of the data (.orc_unwind_ip) more compact.  The
arrays are sorted in parallel at boot time.
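
"Sorted in parallel" means that whenever two addresses change places, the
corresponding entries must change places too, so that index ``i`` keeps
referring to the same pair.  The sketch below shows this with a plain
insertion sort.  The sorting algorithm, the ``pair_orc`` struct, and the
function name are assumptions chosen for brevity; the kernel's own sorting
code is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload carried alongside each address. */
struct pair_orc {
    int16_t sp_offset;
};

/*
 * Sort ips[] in ascending order and apply every move to entries[] as well,
 * so that ips[i] and entries[i] stay paired.
 */
static void sort_parallel(uint64_t *ips, struct pair_orc *entries, size_t n)
{
    assert(n == 0 || (ips != NULL && entries != NULL));
    for (size_t i = 1; i < n; i++) {
        uint64_t ip = ips[i];
        struct pair_orc entry = entries[i];
        size_t j = i;

        /* Shift larger addresses, and their entries, one slot right. */
        while (j > 0 && ips[j - 1] > ip) {
            ips[j] = ips[j - 1];
            entries[j] = entries[j - 1];
            j--;
        }
        ips[j] = ip;
        entries[j] = entry;
    }
}
```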

Performance is further improved by the use of a fast lookup table which
is created at runtime.  The fast lookup table associates a given address
with a range of indices for the .orc_unwind table, so that only a small
subset of the table needs to be searched.
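
One way to picture such a lookup table is to divide the text range into
fixed-size blocks and record, for each block boundary, the index of the
entry that covers it.  A lookup then only has to search between the indices
recorded for the two boundaries around the address.  The block size, the
function names, and the table layout below are assumptions for this sketch
and may differ from the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 256u /* hypothetical block granularity in bytes */

/* Last index in [lo, hi) whose address is <= addr, or -1 if there is none. */
static long search_range(const uint64_t *ips, size_t lo, size_t hi,
                         uint64_t addr)
{
    size_t first = lo;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (ips[mid] <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo == first ? -1 : (long)lo - 1;
}

/*
 * Fill table[0..nblocks] so that table[b] is the index of the last entry at
 * or before the start of block b (or 0 if no entry is that early).
 */
static void build_lookup(const uint64_t *ips, size_t n, uint64_t text_start,
                         size_t nblocks, size_t *table)
{
    size_t idx = 0;

    assert(n > 0);
    for (size_t b = 0; b <= nblocks; b++) {
        uint64_t block_start = text_start + (uint64_t)b * BLOCK_SIZE;

        while (idx + 1 < n && ips[idx + 1] <= block_start)
            idx++;
        table[b] = idx;
    }
}

/* Find the entry covering addr by searching only one block's index range. */
static long fast_find(const uint64_t *ips, const size_t *table,
                      size_t nblocks, uint64_t text_start, uint64_t addr)
{
    size_t b;

    if (addr < text_start)
        return -1;
    b = (size_t)((addr - text_start) / BLOCK_SIZE);
    if (b >= nblocks)
        return -1;
    return search_range(ips, table[b], table[b + 1] + 1, addr);
}
```

The table costs one index per block, and in exchange the binary search runs
over a handful of entries instead of the whole array.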


Etymology
=========

Orcs, fearsome creatures of medieval folklore, are the Dwarves' natural
enemies.  Similarly, the ORC unwinder was created in opposition to the
complexity and slowness of DWARF.

"Although Orcs rarely consider multiple solutions to a problem, they do
excel at getting things done because they are creatures of action, not
thought." [3]_  Similarly, unlike the esoteric DWARF unwinder, the
veracious ORC unwinder wastes no time or silicon effort decoding
variable-length zero-extended unsigned-integer byte-coded
state-machine-based debug information entries.

Similar to how Orcs frequently unravel the well-intentioned plans of
their adversaries, the ORC unwinder frequently unravels stacks with
brutal, unyielding efficiency.

ORC stands for Oops Rewind Capability.


.. [1] https://lore.kernel.org/r/20170602104048.jkkzssljsompjdwy@suse.de
.. [2] https://lore.kernel.org/r/d2ca5435-6386-29b8-db87-7f227c2b713a@suse.cz
.. [3] http://dustin.wikidot.com/half-orcs-and-orcs
