PCIe Device AER statistics
--------------------------

These attributes show up under all the devices that are AER capable. These
statistical counters indicate the errors "as seen/reported by the device".
Note that this may mean that if an endpoint is causing problems, the AER
counters may increment at its link partner (e.g. root port) because the
errors may be "seen" / reported by the link partner and not the
problematic endpoint itself (which may report all counters as 0 as it never
saw any problems).

What:           /sys/bus/pci/devices/<dev>/aer_dev_correctable
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of correctable errors seen and reported by this
                PCI device using ERR_COR. Note that because multiple errors
                may be reported using a single ERR_COR message,
                TOTAL_ERR_COR at the end of the file may not match the actual
                total of all the errors in the file.
                Sample output::

                    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable
                    Receiver Error 2
                    Bad TLP 0
                    Bad DLLP 0
                    RELAY_NUM Rollover 0
                    Replay Timer Timeout 0
                    Advisory Non-Fatal 0
                    Corrected Internal Error 0
                    Header Log Overflow 0
                    TOTAL_ERR_COR 2

What:           /sys/bus/pci/devices/<dev>/aer_dev_fatal
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of uncorrectable fatal errors seen and reported by this
                PCI device using ERR_FATAL. Note that because multiple errors
                may be reported using a single ERR_FATAL message,
                TOTAL_ERR_FATAL at the end of the file may not match the actual
                total of all the errors in the file.
                Sample output::

                    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal
                    Undefined 0
                    Data Link Protocol 0
                    Surprise Down Error 0
                    Poisoned TLP 0
                    Flow Control Protocol 0
                    Completion Timeout 0
                    Completer Abort 0
                    Unexpected Completion 0
                    Receiver Overflow 0
                    Malformed TLP 0
                    ECRC 0
                    Unsupported Request 0
                    ACS Violation 0
                    Uncorrectable Internal Error 0
                    MC Blocked TLP 0
                    AtomicOp Egress Blocked 0
                    TLP Prefix Blocked Error 0
                    TOTAL_ERR_FATAL 0

What:           /sys/bus/pci/devices/<dev>/aer_dev_nonfatal
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of uncorrectable nonfatal errors seen and reported by this
                PCI device using ERR_NONFATAL. Note that because multiple
                errors may be reported using a single ERR_NONFATAL message,
                TOTAL_ERR_NONFATAL at the end of the file may not match the
                actual total of all the errors in the file.
                Sample output::

                    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal
                    Undefined 0
                    Data Link Protocol 0
                    Surprise Down Error 0
                    Poisoned TLP 0
                    Flow Control Protocol 0
                    Completion Timeout 0
                    Completer Abort 0
                    Unexpected Completion 0
                    Receiver Overflow 0
                    Malformed TLP 0
                    ECRC 0
                    Unsupported Request 0
                    ACS Violation 0
                    Uncorrectable Internal Error 0
                    MC Blocked TLP 0
                    AtomicOp Egress Blocked 0
                    TLP Prefix Blocked Error 0
                    TOTAL_ERR_NONFATAL 0

PCIe Rootport AER statistics
----------------------------

These attributes show up only under the rootports (or root complex event
collectors) that are AER capable. They indicate the number of error messages
"reported to" the rootport. Note that the rootports also transmit (internally)
the ERR_* messages for errors seen by the internal rootport PCI device, so
these counters include them and are thus cumulative of all the error messages
on the PCI hierarchy originating at that root port.

What:           /sys/bus/pci/devices/<dev>/aer_rootport_total_err_cor
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_COR messages reported to rootport.

What:           /sys/bus/pci/devices/<dev>/aer_rootport_total_err_fatal
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_FATAL messages reported to rootport.

What:           /sys/bus/pci/devices/<dev>/aer_rootport_total_err_nonfatal
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_NONFATAL messages reported to rootport.
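
The aer_dev_* files described above list one "<error name> <count>" pair per
line, ending with a TOTAL_ERR_* line. As a rough illustration only, a userspace
reader might parse that text as sketched below. The helper names
(parse_aer_stats, split_total, read_aer_stats) are hypothetical and not part of
any kernel or library API, and the sketch assumes the line layout matches the
sample output shown in this file.

```python
from pathlib import Path


def parse_aer_stats(text):
    """Parse the text of an aer_dev_* file into {error name: count}."""
    counters = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Error names contain spaces, so split on the last whitespace only.
        name, count = line.rsplit(None, 1)
        counters[name] = int(count)
    return counters


def split_total(counters):
    """Separate the TOTAL_ERR_* message count from the per-error counters."""
    per_error = {k: v for k, v in counters.items()
                 if not k.startswith("TOTAL_ERR_")}
    totals = [v for k, v in counters.items() if k.startswith("TOTAL_ERR_")]
    return per_error, (totals[0] if totals else None)


def read_aer_stats(device_dir, attribute="aer_dev_correctable"):
    """Read and parse one attribute from a device directory in sysfs.

    Hypothetical convenience wrapper; device_dir is a path such as the
    one used in the sample output of this file.
    """
    return parse_aer_stats((Path(device_dir) / attribute).read_text())


if __name__ == "__main__":
    # Text taken from the aer_dev_correctable sample output in this file.
    sample = (
        "Receiver Error 2\n"
        "Bad TLP 0\n"
        "Bad DLLP 0\n"
        "RELAY_NUM Rollover 0\n"
        "Replay Timer Timeout 0\n"
        "Advisory Non-Fatal 0\n"
        "Corrected Internal Error 0\n"
        "Header Log Overflow 0\n"
        "TOTAL_ERR_COR 2\n"
    )
    per_error, total = split_total(parse_aer_stats(sample))
    print("sum of per-error counters:", sum(per_error.values()))
    print("TOTAL_ERR_COR:", total)
```

Keeping the TOTAL_ERR_* value separate from the per-error counters reflects the
note in the descriptions: the total counts messages, while a single message may
report several errors, so a reader should not assume the two figures agree.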