Assembler Annotations
=====================

Copyright (c) 2017-2019 Jiri Slaby

This document describes the new macros for annotation of data and code in
assembly. In particular, it contains information about ``SYM_FUNC_START``,
``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.

Rationale
---------
Some code like entries, trampolines, or boot code needs to be written in
assembly. The same as in C, such code is grouped into functions and
accompanied with data. Standard assemblers do not force users into precisely
marking these pieces as code, data, or even specifying their length.
Nevertheless, assemblers provide developers with such annotations to aid
debuggers throughout assembly. On top of that, developers also want to mark
some functions as *global* in order to be visible outside of their translation
units.

Over time, the Linux kernel has adopted macros from various projects (like
``binutils``) to facilitate such annotations. So for historic reasons,
developers have been using ``ENTRY``, ``END``, ``ENDPROC``, and other
annotations in assembly. Due to the lack of their documentation, the macros
are used in rather wrong contexts at some locations. Clearly, ``ENTRY`` was
intended to denote the beginning of global symbols (be it data or code).
``END`` used to mark the end of data or end of special functions with
*non-standard* calling convention. In contrast, ``ENDPROC`` should annotate
only ends of *standard* functions.

When these macros are used correctly, they help assemblers generate a nice
object with both sizes and types set correctly.
For example, the result of
``arch/x86/lib/putuser.S``::

   Num:    Value          Size Type    Bind   Vis      Ndx Name
    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8

This is not only important for debugging purposes. When there are properly
annotated objects like this, tools can be run on them to generate more useful
information. In particular, on properly annotated objects, ``objtool`` can be
run to check and fix the object if needed. Currently, ``objtool`` can report
missing frame pointer setup/destruction in functions. It can also
automatically generate annotations for the ORC unwinder
(Documentation/arch/x86/orc-unwinder.rst)
for most code. Both of these are especially important to support reliable
stack traces which are in turn necessary for kernel live patching
(Documentation/livepatch/livepatch.rst).

Caveat and Discussion
---------------------
As one might realize, there were only three macros previously. That is indeed
insufficient to cover all the combinations of cases:

* standard/non-standard function
* code/data
* global/local symbol

There was a discussion_ and instead of extending the current ``ENTRY/END*``
macros, it was decided that brand new macros should be introduced instead::

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?

.. _discussion: https://lore.kernel.org/r/20170217104757.28588-1-jslaby@suse.cz

Macros Description
------------------

The new macros are prefixed with the ``SYM_`` prefix and can be divided into
three main groups:

1. ``SYM_FUNC_*`` -- to annotate C-like functions.
   This means functions with
   standard C calling conventions. For example, on x86, this means that the
   stack contains a return address at the predefined place and a return from
   the function can happen in a standard way. When frame pointers are enabled,
   save/restore of frame pointer shall happen at the start/end of a function,
   respectively, too.

   Checking tools like ``objtool`` should ensure such marked functions conform
   to these rules. The tools can also easily annotate these functions with
   debugging information (like *ORC data*) automatically.

2. ``SYM_CODE_*`` -- special functions called with special stack. Be it
   interrupt handlers with special stack content, trampolines, or startup
   functions.

   Checking tools mostly ignore checking of these functions. But some debug
   information still can be generated automatically. For correct debug data,
   this code needs hints like ``UNWIND_HINT_REGS`` provided by developers.

3. ``SYM_DATA*`` -- obviously data belonging to ``.data`` sections and not to
   ``.text``. Data do not contain instructions, so they have to be treated
   specially by the tools: they should not treat the bytes as instructions,
   nor assign any debug information to them.

Instruction Macros
~~~~~~~~~~~~~~~~~~
This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.

``objtool`` requires that all code must be contained in an ELF symbol. Symbol
names that have a ``.L`` prefix do not emit symbol table entries. ``.L``
prefixed symbols can be used within a code region, but should be avoided for
denoting a range of code via ``SYM_*_START/END`` annotations.

* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
  most frequent markings**. They are used for functions with standard calling
  conventions -- global and local. Like in C, they both align the functions to
  architecture specific ``__ALIGN`` bytes.
  There are also ``_NOALIGN`` variants
  for special cases where developers do not want this implicit alignment.

  ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN`` markings are
  also offered as an assembler counterpart to the *weak* attribute known from
  C.

  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
  the sequence of instructions as a function and records its size in the
  generated object file. Second, it also eases checking and processing such
  object files as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like in the following
  example, having some asm instructions in between the macros, of course::

    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)

  In fact, this kind of annotation corresponds to the now deprecated ``ENTRY``
  and ``ENDPROC`` macros.

* ``SYM_FUNC_ALIAS``, ``SYM_FUNC_ALIAS_LOCAL``, and ``SYM_FUNC_ALIAS_WEAK`` can
  be used to define multiple names for a function. The typical use is::

    SYM_FUNC_START(__memset)
        ... asm insns ...
    SYM_FUNC_END(__memset)
    SYM_FUNC_ALIAS(memset, __memset)

  In this example, one can call ``__memset`` or ``memset`` with the same
  result, except the debug information for the instructions is generated to
  the object file only once -- for the non-``ALIAS`` case.

* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
  special cases -- if you know what you are doing. This is used exclusively
  for interrupt handlers and similar where the calling convention is not the C
  one. ``_NOALIGN`` variants exist too. The use is the same as for the ``FUNC``
  category above::

    SYM_CODE_START_LOCAL(bad_put_user)
        ... asm insns ...
    SYM_CODE_END(bad_put_user)

  Again, every ``SYM_CODE_START*`` **shall** be coupled with ``SYM_CODE_END``.

  To some extent, this category corresponds to deprecated ``ENTRY`` and
  ``END``. Except ``END`` had several other meanings too.

* ``SYM_INNER_LABEL*`` macros denote a label placed between some
  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END``. Such labels are very
  similar to C labels, except they can be made global. An example of use::

    SYM_CODE_START(ftrace_caller)
        /* save_mcount_regs fills in first two parameters */
        ...

    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
        /* Load the ftrace_ops into the 3rd parameter */
        ...

    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
        call ftrace_stub
        ...
        retq
    SYM_CODE_END(ftrace_caller)

Data Macros
~~~~~~~~~~~
Similar to instructions, there are a couple of macros to describe data in the
assembly.

* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
  and shall be used in conjunction with either ``SYM_DATA_END``, or
  ``SYM_DATA_END_LABEL``. The latter also adds a label to the end, so that
  people can use ``lstack`` and (local) ``lstack_end`` in the following
  example::

    SYM_DATA_START_LOCAL(lstack)
        .skip 4096
    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly one-line
  data::

    SYM_DATA(HEAP,     .long rm_heap)
    SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
  internally.

Support Macros
~~~~~~~~~~~~~~
All of the above ultimately reduce to some invocation of ``SYM_START``,
``SYM_END``, or ``SYM_ENTRY``. Normally, developers should avoid using
these directly.

Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote linkage of a
symbol marked by them.
They are used either in ``_LABEL`` variants of the
earlier macros, or in ``SYM_START``.


Overriding Macros
~~~~~~~~~~~~~~~~~
Architectures can also override any of the macros in their own
``asm/linkage.h``, including macros specifying the type of a symbol
(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``). As every macro
described in this file is surrounded by ``#ifndef`` + ``#endif``, it is enough
to define the macros differently in the aforementioned architecture-dependent
header.
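
As a rough illustration of the override mechanism, the fragment below shows how
an architecture header and the generic header could interact. It is a sketch
under stated assumptions, not a copy of any in-tree file: the ``%function``
value chosen for ``SYM_T_FUNC`` is hypothetical, and the generic default of
``STT_FUNC`` is assumed here for illustration::

    /* Architecture-specific asm/linkage.h (hypothetical override) */
    #define SYM_T_FUNC      %function

    /* Generic header, processed after the architecture one (assumed shape) */
    #ifndef SYM_T_FUNC              /* skipped: already defined above */
    #define SYM_T_FUNC      STT_FUNC
    #endif

Because the guard only checks whether the name is already defined, the
architecture definition takes effect as long as its header is processed first.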