=========================
Hardware Latency Detector
=========================

Introduction
-------------

The tracer hwlat_detector is a special purpose tracer that is used to
detect large system latencies induced by the behavior of certain underlying
hardware or firmware, independent of Linux itself. The code was developed
originally to detect SMIs (System Management Interrupts) on x86 systems,
however there is nothing x86 specific about this patchset. It was
originally written for use by the "RT" patch since the Real Time
kernel is highly latency sensitive.

SMIs are not serviced by the Linux kernel, which means that it does not
even know that they are occurring. SMIs are instead set up by BIOS code
and are serviced by BIOS code, usually for "critical" events such as
management of thermal sensors and fans. Sometimes though, SMIs are used for
other tasks and those tasks can spend an inordinate amount of time in the
handler (sometimes measured in milliseconds). Obviously this is a problem if
you are trying to keep event service latencies down in the microsecond range.


The hardware latency detector works by hogging one of the CPUs for configurable
amounts of time (with interrupts disabled), polling the CPU Time Stamp Counter
for some period, then looking for gaps in the TSC data. Any gap indicates a
time when the polling was interrupted and since the interrupts are disabled,
the only thing that could do that would be an SMI or other hardware hiccup
(or an NMI, but those can be tracked).

Note that the hwlat detector should *NEVER* be used in a production environment.
It is intended to be run manually to determine if the hardware platform has a
problem with long system firmware service routines.

Usage
------

Write the ASCII text "hwlat" into the current_tracer file of the tracing system
(mounted at /sys/kernel/tracing). It is possible to redefine the threshold in
microseconds (us) above which latency spikes will be taken into account.


Example::

	# echo hwlat > /sys/kernel/tracing/current_tracer
	# echo 100 > /sys/kernel/tracing/tracing_thresh

The /sys/kernel/tracing/hwlat_detector interface contains the following files:

  - width - time period to sample with CPUs held (usecs)
    must be less than the total window size (enforced)
  - window - total period of sampling, width being inside (usecs)

By default the width is set to 500,000 and window to 1,000,000, meaning that
for every 1,000,000 usecs (1s) the hwlat detector will spin for 500,000 usecs
(0.5s). If tracing_thresh contains zero when hwlat tracer is enabled, it will
change to a default of 10 usecs. If any latency that exceeds the threshold is
observed then the data will be written to the tracing ring buffer.

The minimum sleep time between periods is 1 millisecond, even if width
is less than 1 millisecond apart from window, to allow the system to not
be totally starved.

If tracing_thresh was zero when hwlat detector was started, it will be set
back to zero if another tracer is loaded.
Note, the last value in
tracing_thresh that hwlat detector had will be saved and this value will
be restored in tracing_thresh if it is still zero when hwlat detector is
started again.

The following tracing directory files are used by the hwlat_detector:

in /sys/kernel/tracing:

 - tracing_thresh        - minimum latency value to be considered (usecs)
 - tracing_max_latency   - maximum hardware latency actually observed (usecs)
 - tracing_cpumask       - the CPUs to move the hwlat thread across
 - hwlat_detector/width  - specified amount of time to spin within window (usecs)
 - hwlat_detector/window - amount of time between (width) runs (usecs)
 - hwlat_detector/mode   - the thread mode

By default, one hwlat detector kernel thread migrates across each CPU
specified in tracing_cpumask at the beginning of a new window, in a
round-robin fashion. To limit the migration, either modify tracing_cpumask,
or modify the CPU affinity of the hwlat kernel thread (named [hwlatd])
directly, and the migration will stop. The behavior can also be changed by
changing the thread mode; the available options are:

 - none:        do not force migration
 - round-robin: migrate across each specified CPU
 - per-cpu:     create one thread for each CPU
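
Because width must stay below window at every step, the order of the two
writes matters when both values change. The following is a minimal sketch of
one way to handle that, not part of any kernel tooling: ``hwlat_configure`` is
a hypothetical helper name, and the ``TRACING`` variable is an assumption added
so the function can be tried against a scratch directory instead of the real
tracing mount.

```shell
#!/bin/sh
# Sketch only: hwlat_configure and TRACING are illustrative names,
# not an interface provided by the kernel.
TRACING="${TRACING:-/sys/kernel/tracing}"

# hwlat_configure WIDTH WINDOW (both in usecs)
hwlat_configure() {
    width="$1"
    window="$2"
    dir="$TRACING/hwlat_detector"

    # width must be less than window, so reject bad input up front.
    if [ "$width" -ge "$window" ]; then
        echo "width ($width) must be less than window ($window)" >&2
        return 1
    fi

    cur_window="$(cat "$dir/window")" || return 1

    if [ "$width" -lt "$cur_window" ]; then
        # New width fits inside the current window: write width first.
        echo "$width" > "$dir/width" || return 1
        echo "$window" > "$dir/window" || return 1
    else
        # New width would not fit the current window: grow window first.
        echo "$window" > "$dir/window" || return 1
        echo "$width" > "$dir/width" || return 1
    fi
}
```

With ``TRACING`` left at its default and run as root, a call such as
``hwlat_configure 200000 1000000`` would request a 0.2s spin in every 1s
window; the tracer itself is still enabled by writing "hwlat" into
current_tracer.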