.. SPDX-License-Identifier: GPL-2.0

=============
Kernel Stacks
=============

Kernel stacks on x86-64 bit
===========================

Most of the text from Keith Owens, hacked by A

x86_64 page size (PAGE_SIZE) is 4K.

Like all other architectures, x86_64 has a ker
active thread. These thread stacks are THREAD
These stacks contain useful data as long as a
zombie. While the thread is in user space the
except for the thread_info structure at the bo

In addition to the per thread stacks, there ar
associated with each CPU. These stacks are on
is in control on that CPU; when a CPU returns
specialized stacks contain no useful data. Th

* Interrupt stack. IRQ_STACK_SIZE

  Used for external hardware interrupts. If t
  hardware interrupt (i.e. not a nested hardwa
  kernel switches from the current task to the
  the split thread and interrupt stacks on i38
  for kernel interrupt processing without havi
  of every per thread stack.

  The interrupt stack is also used when proces

Switching to the kernel interrupt stack is don
per CPU interrupt nest counter. This is needed
hardware stacks cannot nest without races.

x86_64 also has a feature which is not availab
to automatically switch to a new stack for des
double fault or NMI, which makes it easier to
events on x86_64. This feature is called the
(IST). There can be up to 7 IST entries per C
index into the Task State Segment (TSS). The I
point to dedicated stacks; each stack can be a

An IST is selected by a non-zero value in the
interrupt-gate descriptor. When an interrupt
loads such a descriptor, the hardware automati
pointer based on the IST value, then invokes t
the interrupt came from user mode, then the in
will switch back to the per-thread stack. If
nested IST interrupts then the handler must ad
entry to and exit from the interrupt handler.
done, e.g. for debug exceptions.)
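
As a rough illustration of the selection rule described above, the
following C sketch models how a stack pointer is chosen when a gate is
taken. The types ``tss_model`` and ``gate_model`` and the function
``select_stack()`` are hypothetical names invented for this example; they
are not kernel or hardware definitions. The fallback for a gate without an
IST index (RSP0 on entry from user mode, otherwise the current stack) is
included only to make the model complete.

```c
#include <stdint.h>

#define IST_ENTRIES 7   /* up to 7 IST entries per CPU */

/* Hypothetical, simplified view of the stack pointers held in the TSS. */
struct tss_model {
    uint64_t rsp0;               /* per-thread kernel stack for user -> kernel entry */
    uint64_t ist[IST_ENTRIES];   /* IST1..IST7 stored at ist[0]..ist[6] */
};

/* Hypothetical, simplified interrupt-gate descriptor: only the IST field. */
struct gate_model {
    uint8_t ist;                 /* 0 = no IST, 1..7 = index into the IST */
};

/*
 * Return the stack pointer the handler would start on in this model.
 * A non-zero IST index always wins, regardless of where the event
 * came from; otherwise the stack only changes on entry from user mode.
 */
uint64_t select_stack(const struct tss_model *tss,
                      const struct gate_model *gate,
                      uint64_t current_rsp, int from_user)
{
    unsigned int idx = gate->ist & 0x7;   /* the IST field is 3 bits wide */

    if (idx != 0)
        return tss->ist[idx - 1];

    return from_user ? tss->rsp0 : current_rsp;
}
```

In this model a gate with ``ist = 2`` starts on ``tss->ist[1]`` whether
the event arrived in user or kernel mode, which is the property that lets
such events run on a known-good stack.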

Events with different IST codes (i.e. with dif
nested. For example, a debug interrupt can sa
NMI. arch/x86_64/kernel/entry.S::paranoidentr
pointers on entry to and exit from all IST eve
IST events with the same code to be nested. H
stack size allocated to an IST assumes no nest
If that assumption is ever broken then the sta

The currently assigned IST stacks are:

* ESTACK_DF. EXCEPTION_STKSZ (PAGE_SIZE).

  Used for interrupt 8 - Double Fault Exceptio

  Invoked when handling one exception causes a
  when the kernel is very confused (e.g. kerne
  Using a separate stack allows the kernel to
  in many cases to still output an oops.

* ESTACK_NMI. EXCEPTION_STKSZ (PAGE_SIZE).

  Used for non-maskable interrupts (NMI).

  NMI can be delivered at any time, including
  middle of switching stacks. Using IST for N
  assumptions about the previous state of the

* ESTACK_DB. EXCEPTION_STKSZ (PAGE_SIZE).

  Used for hardware debug interrupts (interrup
  debug interrupts (INT3).

  When debugging a kernel, debug interrupts (b
  software) can occur at any time. Using IST
  avoids making assumptions about the previous
  stack.

  To handle nested #DB correctly there exist t
  #DB entry the IST stack pointer for #DB is sw
  so a nested #DB starts from a clean stack. T
  the IST stack pointer to a guard hole to catc

* ESTACK_MCE. EXCEPTION_STKSZ (PAGE_SIZE).

  Used for interrupt 18 - Machine Check Except

  MCE can be delivered at any time, including
  middle of switching stacks. Using IST for M
  assumptions about the previous state of the

For more details see the Intel IA32 or AMD AMD


Printing backtraces on x86
==========================

The question about the '?' preceding function
keeps popping up, here's an in-depth explanatio
stares at print_context_stack() and the whole
arch/x86/kernel/dumpstack.c.

Adapted from Ingo's mail, Message-ID: <20150521

We always scan the full kernel stack for retur
the kernel stack(s) [1]_, from stack top to st
anything that 'looks like' a kernel text addre

If it fits into the frame pointer chain, we pr
mark, knowing that it's part of the real backt

If the address does not fit into our expected
still print it, but we print a '?'. It can mea

- either the address is not part of the call
  values on the kernel stack, from earlier fu
  the common case.

- or it is part of the call chain, but the fr
  up properly within the function, so we don'

This way we will always print out the real cal
entries), regardless of whether the frame poin
or not - but in most cases we'll get the call
entries printed are strictly in stack order, s
information from that as well.

The most important property of this method is
information: we always strive to print _all_ a
that look like kernel text addresses, so if de
we still print out the real call chain as well
marks than ideal.

.. [1] For things like IRQ and IST stacks, we
       the right order, and try to cross from
       reconstructing the call chain. This wor
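
The scanning scheme described in this section can be sketched with a toy
model in C. This is only an illustration under stated assumptions, not the
code in arch/x86/kernel/dumpstack.c: the stack is modelled as an array of
words, a frame at index ``fp`` stores the index of the caller's frame in
``stack[fp]`` and the return address in ``stack[fp + 1]``, and the text
range ``TEXT_START``/``TEXT_END`` as well as the names ``bt_entry``,
``looks_like_text()``, ``scan_stack()`` and ``print_backtrace()`` are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bounds of "kernel text" in this toy model. */
#define TEXT_START 0x1000u
#define TEXT_END   0x2000u

struct bt_entry {
    uintptr_t addr;
    bool reliable;   /* true: part of the frame pointer chain; false: printed with '?' */
};

static bool looks_like_text(uintptr_t word)
{
    return word >= TEXT_START && word < TEXT_END;
}

/*
 * Scan every word from stack top (index 0) to stack bottom. Each word
 * that looks like a text address is recorded; it is marked reliable only
 * if it sits in the return-address slot of the current frame in the chain.
 */
size_t scan_stack(const uintptr_t *stack, size_t nwords, size_t fp,
                  struct bt_entry *out, size_t max_out)
{
    size_t n = 0;

    for (size_t i = 0; i < nwords && n < max_out; i++) {
        bool is_ret_slot = (fp + 1 < nwords) && (i == fp + 1);

        if (looks_like_text(stack[i])) {
            out[n].addr = stack[i];
            out[n].reliable = is_ret_slot;
            n++;
        }

        if (is_ret_slot) {
            /* Follow the chain; a bogus link ends it, but scanning continues. */
            size_t next = (size_t)stack[fp];
            fp = (next > fp + 1 && next < nwords) ? next : nwords;
        }
    }
    return n;
}

/* Print entries in stack order, prefixing unreliable ones with '?'. */
void print_backtrace(const struct bt_entry *entries, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf(" %s0x%lx\n", entries[i].reliable ? "" : "? ",
               (unsigned long)entries[i].addr);
}
```

Because the loop never stops when the chain breaks, every text-like word
is still reported; a broken chain only costs extra question marks, which
is the property the text above emphasises.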