.. SPDX-License-Identifier: GPL-2.0

===============
Boot Interrupts
===============

:Author: - Sean V Kelley <sean.v.kelley@linux.intel.com>

Overview
========

On PCI Express, interrupts are represented with either MSI or inbound
interrupt messages (Assert_INTx/Deassert_INTx). The integrated IO-APIC in a
given Core IO converts the legacy interrupt messages from PCI Express to
MSI interrupts.  If the IO-APIC is disabled (via the mask bits in the
IO-APIC table entries), the messages are routed to the legacy PCH. This
in-band interrupt mechanism was traditionally necessary for systems that
did not support the IO-APIC and for boot. Intel in the past has used the
term "boot interrupts" to describe this mechanism. Further, the PCI Express
protocol describes this in-band legacy wire-interrupt INTx mechanism for
I/O devices to signal PCI-style level interrupts. The subsequent paragraphs
describe problems with the Core IO handling of INTx message routing to the
PCH and mitigation within BIOS and the OS.
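
The routing decision described above can be pictured with a small
user-space model. This is only an illustrative sketch, not kernel code:
the names ``intx_destination`` and ``route_intx`` are hypothetical, and
the model assumes that the mask bit of the matching IO-APIC table entry
is the only input to the decision.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical model: where an inbound INTx message ends up. */
   enum intx_destination {
           INTX_TO_MSI,    /* converted to MSI by the integrated IO-APIC */
           INTX_TO_PCH,    /* forwarded to the legacy PCH (boot interrupt) */
   };

   /*
    * Assumption for this sketch: the IO-APIC table entry's mask bit alone
    * decides the route. A masked entry means the integrated IO-APIC does
    * not deliver the interrupt, so the message goes to the legacy PCH.
    */
   static enum intx_destination route_intx(bool ioapic_entry_masked)
   {
           return ioapic_entry_masked ? INTX_TO_PCH : INTX_TO_MSI;
   }

With the entry masked, ``route_intx(true)`` yields ``INTX_TO_PCH``, which
is the boot interrupt path; with the entry unmasked the message is
converted to MSI instead.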


Issue
=====

When in-band legacy INTx messages are forwarded to the PCH, they in turn
trigger a new interrupt for which the OS likely lacks a handler. When
interrupts go unhandled over time, they are tracked by the Linux kernel as
Spurious Interrupts. The IRQ will be disabled by the Linux kernel after it
reaches a specific count with the error "nobody cared". This disabled IRQ
now prevents valid usage by an existing interrupt which may happen to share
the IRQ line::

  irq 19: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 2988 Comm: irq/34-nipalk Tainted: 4.14.87-rt49-02410-g4a640ec-dirty #1
  Hardware name: National Instruments NI PXIe-8880/NI PXIe-8880, BIOS 2.1.5f1 01/09/2020
  Call Trace:

  <IRQ>
   ? dump_stack+0x46/0x5e
   ? __report_bad_irq+0x2e/0xb0
   ? note_interrupt+0x242/0x290
   ? nNIKAL100_memoryRead16+0x8/0x10 [nikal]
   ? handle_irq_event_percpu+0x55/0x70
   ? handle_irq_event+0x4f/0x80
   ? handle_fasteoi_irq+0x81/0x180
   ? handle_irq+0x1c/0x30
   ? do_IRQ+0x41/0xd0
   ? common_interrupt+0x84/0x84
  </IRQ>

  handlers:
  irq_default_primary_handler threaded usb_hcd_irq
  Disabling IRQ #19


Conditions
==========

The use of threaded interrupts is the most likely condition to trigger
this problem today. Threaded interrupts may not be reenabled after the IRQ
handler wakes. These "one shot" conditions mean that the threaded interrupt
needs to keep the interrupt line masked until the threaded handler has run.
Especially when dealing with high data rate interrupts, the thread needs to
run to completion; otherwise some handlers will end up in stack overflows
since the interrupt of the issuing device is still active.

Affected Chipsets
=================

The legacy interrupt forwarding mechanism exists today in a number of
devices including but not limited to chipsets from AMD/ATI, Broadcom, and
Intel. Changes made through the mitigations below have been applied to
drivers/pci/quirks.c.

Starting with ICX there are no longer any IO-APICs in the Core IO's
devices.  IO-APIC is only in the PCH.  Devices connected to the Core IO's
PCIe Root Ports will use native MSI/MSI-X mechanisms.

Mitigations
===========

The mitigations take the form of PCI quirks. The preference has been to
first identify and make use of a means to disable the routing to the PCH.
In such a case a quirk to disable boot interrupt generation can be
added. [1]_

Intel® 6300ESB I/O Controller Hub
  Alternate Base Address Register:
   BIE: Boot Interrupt Enable

          ==  ===========================
          0   Boot interrupt is enabled.
          1   Boot interrupt is disabled.
          ==  ===========================

Intel® Sandy Bridge through Sky Lake based Xeon servers:
  Coherent Interface Protocol Interrupt Control
   dis_intx_route2pch/dis_intx_route2ich/dis_intx_route2dmi2:
          When this bit is set, local INTx messages received from the
          Intel® Quick Data DMA/PCI Express ports are not routed to legacy
          PCH - they are either converted into MSI via the integrated IO-APIC
          (if the IO-APIC mask bit is clear in the appropriate entries)
          or cause no further action (when mask bit is set).

In the absence of a way to directly disable the routing, another approach
has been to make use of PCI Interrupt pin to INTx routing tables for
purposes of redirecting the interrupt handler to the rerouted interrupt
line by default.  Therefore, on chipsets where this INTx routing cannot be
disabled, the Linux kernel will reroute the valid interrupt to its legacy
interrupt. This redirection of the handler will prevent the occurrence of
the spurious interrupt detection which would ordinarily disable the IRQ
line due to excessive unhandled counts. [2]_

The config option X86_REROUTE_FOR_BROKEN_BOOT_IRQS exists to enable (or
disable) the redirection of the interrupt handler to the PCH interrupt
line. The option can be overridden by either pci=ioapicreroute or
pci=noioapicreroute. [3]_


More Documentation
==================

There is an overview of the legacy interrupt handling in several datasheets
(6300ESB and 6700PXH below). While largely the same, they provide insight
into the evolution of its handling with chipsets.

Example of disabling of the boot interrupt
------------------------------------------

      - Intel® 6300ESB I/O Controller Hub (Document # 300641-004US)
        5.7.3 Boot Interrupt
        https://www.intel.com/content/dam/doc/datasheet/6300esb-io-controller-hub-datasheet.pdf

      - Intel® Xeon® Processor E5-1600/2400/2600/4600 v3 Product Families
        Datasheet - Volume 2: Registers (Document # 330784-003)
        6.6.41 cipintrc Coherent Interface Protocol Interrupt Control
        https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Example of handler rerouting
----------------------------

      - Intel® 6700PXH 64-bit PCI Hub (Document # 302628)
        2.15.2 PCI Express Legacy INTx Support and Boot Interrupt
        https://www.intel.com/content/dam/doc/datasheet/6700pxh-64-bit-pci-hub-datasheet.pdf


If you have any legacy PCI interrupt questions that aren't answered, email me.

Cheers,
    Sean V Kelley
    sean.v.kelley@linux.intel.com

.. [1] https://lore.kernel.org/r/12131949181903-git-send-email-sassmann@suse.de/
.. [2] https://lore.kernel.org/r/12131949182094-git-send-email-sassmann@suse.de/
.. [3] https://lore.kernel.org/r/487C8EA7.6020205@suse.de/