======================================================
Net DIM - Generic Network Dynamic Interrupt Moderation
======================================================

:Author: Tal Gilboa <talgi@mellanox.com>

.. contents:: :depth: 2

Assumptions
===========

This document assumes the reader has basic knowledge of network drivers
and of interrupt moderation in general.


Introduction
============

Dynamic Interrupt Moderation (DIM) (in networking) refers to changing the
interrupt moderation configuration of a channel in order to optimize packet
processing. The mechanism includes an algorithm which decides if and how to
change moderation parameters for a channel, usually by performing an analysis on
runtime data sampled from the system. Net DIM is such a mechanism. In each
iteration of the algorithm, it analyses a given sample of the data, compares it
to the previous sample and, if required, changes some of the interrupt
moderation configuration fields. The data sample is composed of data
bandwidth, the number of packets and the number of events. The time between
samples is also measured. Net DIM compares the current and the previous data and
returns an adjusted interrupt moderation configuration object. In some cases,
the algorithm might decide not to change anything. The configuration fields are
the minimum duration (microseconds) allowed between events and the maximum
number of wanted packets per event. The Net DIM algorithm prioritizes
increasing bandwidth over reducing the interrupt rate.


Net DIM Algorithm
=================

Each iteration of the Net DIM algorithm follows these steps:

#. Calculates a new data sample.
#. Compares it to the previous sample.
#. Makes a decision - suggests interrupt moderation configuration fields.
#. Schedules a work function, which applies the suggested configuration.

The first two steps are straightforward: both the new and the previous data are
supplied by the driver registered to Net DIM. The previous data is the new data
supplied to the previous iteration. The comparison step checks the difference
between the new and previous data and decides on the result of the algorithm's
last step. A step is classified as "better" if bandwidth increases and as
"worse" if bandwidth decreases. If there is no change in bandwidth, the packet
rate is compared in a similar fashion - increase == "better" and decrease ==
"worse". If there is no change in the packet rate either, the interrupt rate is
compared. Here the algorithm tries to optimize for a lower interrupt rate, so an
increase in the interrupt rate is considered "worse" and a decrease is
considered "better". Step #2 has an optimization for avoiding false results: it
only considers a difference between samples as valid if it is greater than a
certain percentage. Also, since Net DIM does not measure anything by itself, it
assumes the data provided by the driver is valid.

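The comparison order described above can be sketched in plain userspace C.
This is a minimal illustration and not the kernel implementation: all
``sketch_`` names and the structure layout are hypothetical, and the 10%
significance threshold is an assumption made for the example (the text above
only states "a certain percentage").

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical per-millisecond rates derived from one data sample */
  struct sketch_rates {
        uint64_t bytes_per_ms;
        uint64_t packets_per_ms;
        uint64_t events_per_ms;
  };

  enum sketch_result { SKETCH_SAME, SKETCH_BETTER, SKETCH_WORSE };

  /* Assumed threshold: a difference of 10% or less is treated as noise */
  #define SKETCH_MIN_DIFF_PERCENT 10

  /* Return nonzero if val differs from ref by more than the threshold */
  static int sketch_significant(uint64_t val, uint64_t ref)
  {
        uint64_t diff = val > ref ? val - ref : ref - val;

        if (ref == 0)
                return val != 0;
        /* Overflow of 100 * diff is ignored in this sketch */
        return (100 * diff) / ref > SKETCH_MIN_DIFF_PERCENT;
  }

  /* Compare bandwidth first, then packet rate, then interrupt rate */
  enum sketch_result sketch_compare(const struct sketch_rates *curr,
                                    const struct sketch_rates *prev)
  {
        if (sketch_significant(curr->bytes_per_ms, prev->bytes_per_ms))
                return curr->bytes_per_ms > prev->bytes_per_ms ?
                       SKETCH_BETTER : SKETCH_WORSE;

        if (sketch_significant(curr->packets_per_ms, prev->packets_per_ms))
                return curr->packets_per_ms > prev->packets_per_ms ?
                       SKETCH_BETTER : SKETCH_WORSE;

        /* For interrupts, fewer events per millisecond is the better result */
        if (sketch_significant(curr->events_per_ms, prev->events_per_ms))
                return curr->events_per_ms < prev->events_per_ms ?
                       SKETCH_BETTER : SKETCH_WORSE;

        return SKETCH_SAME;
  }

In this sketch the interrupt rate only matters when neither bandwidth nor
packet rate changed significantly, which mirrors the priority order given in
the text.
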
Step #3 decides on the suggested configuration based on the result from step #2
and the internal state of the algorithm. The states reflect the "direction" of
the algorithm: is it going left (reducing moderation), right (increasing
moderation) or standing still. Another optimization is that if a decision
to stay still is made multiple times, the interval between iterations of the
algorithm increases in order to reduce calculation overhead. Also, after
"parking" on one of the leftmost or rightmost decisions, the algorithm may
decide to verify this decision by taking a step in the other direction. This is
done in order to avoid getting stuck in a "deep sleep" scenario. Once a
decision is made, an interrupt moderation configuration is selected from
the predefined profiles.

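The idea of walking left or right over an array of predefined profiles can be
sketched as follows. This is a deliberately simplified userspace model and not
the kernel's decision logic: the ``sketch_`` names, the profile values and the
rules for leaving the parked state are assumptions made for illustration, and
the growing interval and the "verify by stepping back" behaviour described above
are left out.

.. code-block:: c

  #define SKETCH_NUM_PROFILES 5

  enum sketch_result { SKETCH_SAME, SKETCH_BETTER, SKETCH_WORSE };
  enum sketch_dir { SKETCH_LEFT = -1, SKETCH_PARKED = 0, SKETCH_RIGHT = 1 };

  /* Hypothetical moderation configuration: usec between events, max packets */
  struct sketch_moder {
        unsigned int usec;
        unsigned int pkts;
  };

  /* Illustrative values only; moderation increases from left to right */
  const struct sketch_moder sketch_profiles[SKETCH_NUM_PROFILES] = {
        { 1, 64 }, { 8, 64 }, { 64, 64 }, { 128, 64 }, { 256, 64 },
  };

  struct sketch_state {
        int profile_ix;         /* index of the currently selected profile */
        enum sketch_dir dir;    /* current direction of the search */
  };

  /* Pick the next profile from the comparison result and the current state */
  const struct sketch_moder *sketch_decide(struct sketch_state *st,
                                           enum sketch_result res)
  {
        int next;

        /* Nothing changed: stand still on the current profile */
        if (res == SKETCH_SAME) {
                st->dir = SKETCH_PARKED;
                return &sketch_profiles[st->profile_ix];
        }

        if (st->dir == SKETCH_PARKED)
                /* Assumed rule: leave parking leftwards unless already leftmost */
                st->dir = st->profile_ix ? SKETCH_LEFT : SKETCH_RIGHT;
        else if (res == SKETCH_WORSE)
                /* The last step made things worse: turn around */
                st->dir = st->dir == SKETCH_LEFT ? SKETCH_RIGHT : SKETCH_LEFT;

        next = st->profile_ix + (int)st->dir;
        if (next < 0 || next >= SKETCH_NUM_PROFILES) {
                /* Reached an edge of the profile array: park there */
                st->dir = SKETCH_PARKED;
                return &sketch_profiles[st->profile_ix];
        }

        st->profile_ix = next;
        return &sketch_profiles[st->profile_ix];
  }

A caller would feed the result of the comparison step into ``sketch_decide()``
on every iteration and hand the returned configuration to the work function.
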
The last step is to notify the registered driver that it should apply the
suggested configuration. This is done by scheduling a work function, defined by
the Net DIM API and provided by the registered driver.

As you can see, Net DIM itself does not actively interact with the system. It
would have trouble making the correct decisions if the wrong data is supplied to
it and it would be useless if the work function did not apply the suggested
configuration. This does, however, allow the registered driver some room for
manoeuvre as it may provide partial data or ignore the algorithm's suggestion
under some conditions.


Registering a Network Device to DIM
===================================

The Net DIM API exposes the main function net_dim().
This function is the entry point to the Net
DIM algorithm and has to be called every time the driver would like to check if
it should change interrupt moderation parameters. The driver should provide two
data structures: :c:type:`struct dim <dim>` and
:c:type:`struct dim_sample <dim_sample>`. :c:type:`struct dim <dim>`
describes the state of DIM for a specific object (RX queue, TX queue,
other queues, etc.). This includes the currently selected profile, previous data
samples, the callback function provided by the driver and more.
:c:type:`struct dim_sample <dim_sample>` describes a data sample,
which will be compared to the data sample stored in :c:type:`struct dim <dim>`
in order to decide on the algorithm's next
step. The sample should include bytes, packets and interrupts, measured by
the driver.

In order to use Net DIM from a networking driver, the driver needs to call the
main net_dim() function. The recommended method is to call net_dim() on each
interrupt. Since Net DIM has built-in moderation and might decide to skip
iterations under certain conditions, there is no need to moderate the net_dim()
calls as well. As mentioned above, the driver needs to provide an object of type
:c:type:`struct dim <dim>` to the net_dim() function call. It is advised for
each entity using Net DIM to hold a :c:type:`struct dim <dim>` as part of its
data structure and use it as the main Net DIM API object.
The :c:type:`struct dim_sample <dim_sample>` should hold the latest
byte, packet and interrupt counts. There is no need to perform any
calculations; just include the raw data.

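Because the driver supplies raw counters, rates have to be derived from the
difference between two samples and the time between them. The following
userspace sketch shows one way to do that. It is not the kernel code: the
``sketch_`` names, the field layout and the choice of per-millisecond units are
assumptions made for the example, and counter wraparound is not handled.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical raw sample: a timestamp plus cumulative counters */
  struct sketch_sample {
        uint64_t time_us;
        uint64_t bytes;
        uint64_t packets;
        uint64_t events;
  };

  /* Hypothetical rates computed between two raw samples */
  struct sketch_rates {
        uint64_t bytes_per_ms;
        uint64_t packets_per_ms;
        uint64_t events_per_ms;
  };

  /* Returns 0 on success, -1 if no time elapsed between the samples */
  int sketch_calc_rates(const struct sketch_sample *start,
                        const struct sketch_sample *end,
                        struct sketch_rates *out)
  {
        uint64_t delta_us = end->time_us - start->time_us;

        if (delta_us == 0)
                return -1;

        /* Scale each counter delta from "per delta_us" to "per millisecond" */
        out->bytes_per_ms = (end->bytes - start->bytes) * 1000 / delta_us;
        out->packets_per_ms = (end->packets - start->packets) * 1000 / delta_us;
        out->events_per_ms = (end->events - start->events) * 1000 / delta_us;
        return 0;
  }

For example, 3000000 bytes, 2000 packets and 20 events counted over 2000
microseconds yield 1500000 bytes, 1000 packets and 10 events per millisecond.
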
The net_dim() call itself does not return anything. Instead, Net DIM relies on
the driver to provide a callback function, which is called when the algorithm
decides to make a change in the interrupt moderation parameters. This callback
is scheduled and runs in a separate thread in order not to add overhead to
the data flow. After the work is done, the Net DIM algorithm needs to be set to
the proper state in order to move to the next iteration.


Example
=======

The following code demonstrates how to register a driver to Net DIM. The
example is not complete, but it should make the outline of the usage clear.

.. code-block:: c

  #include <linux/dim.h>

  /* Callback for net DIM to schedule on a decision to change moderation */
  void my_driver_do_dim_work(struct work_struct *work)
  {
        /* Get struct dim from struct work_struct */
        struct dim *dim = container_of(work, struct dim,
                                       work);
        /* Do interrupt moderation related stuff */
        ...

        /* Signal net DIM work is done and it should move to next iteration */
        dim->state = DIM_START_MEASURE;
  }

  /* My driver's interrupt handler */
  int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
  {
        ...
        /* A struct to hold current measured data */
        struct dim_sample dim_sample;
        ...
        /* Initialize data sample struct with current data */
        dim_update_sample(my_entity->events,
                          my_entity->packets,
                          my_entity->bytes,
                          &dim_sample);
        /* Call net DIM */
        net_dim(&my_entity->dim, dim_sample);
        ...
  }

  /* My entity's initialization function (my_entity was already allocated) */
  int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
  {
        ...
        /* Initialize struct work_struct with my driver's callback function */
        INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
        ...
  }


Tuning DIM
==========

Net DIM serves a range of network devices and delivers excellent acceleration
benefits. However, some of the preset DIM profiles do not suit the
specifications of every network device, and such a profile mismatch has been
identified as one cause of suboptimal performance on DIM-enabled network
devices.

To address this issue, Net DIM introduces a per-device control to modify and
access a device's ``rx-profile`` and ``tx-profile`` parameters.

Assume that the target network device is named ethx, and that ethx only declares
support for RX profile setting and supports modification of the ``usec`` field
and the ``pkts`` field (see the data structure
:c:type:`struct dim_cq_moder <dim_cq_moder>`).

Starting from an RX DIM profile in which all values are 64, you can use ethtool
to modify it::

    $ ethtool -C ethx rx-profile 1,1,n_2,2,n_3,n,n_n,4,n_n,n,n

``n`` means do not modify this field, and ``_`` separates structure
elements of the profile array.

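To make the string format concrete, the following userspace sketch splits such
a profile string into entries. It is not ethtool's or the kernel's parser: the
``sketch_`` names are hypothetical, and the field order (``usec``, ``pkts``,
``comps``) and the three fields per entry are assumptions based on the query
output shown in this section.

.. code-block:: c

  #include <stdlib.h>

  #define SKETCH_KEEP   (-1L)   /* field given as "n": leave it unchanged */
  #define SKETCH_FIELDS 3       /* assumed order: usec, pkts, comps */

  struct sketch_entry {
        long field[SKETCH_FIELDS];
  };

  /*
   * Parse a string such as "1,1,n_2,2,n" into at most max entries.
   * Returns the number of entries parsed, or -1 on malformed input.
   */
  int sketch_parse_profile(const char *s, struct sketch_entry *out, int max)
  {
        int count = 0;

        while (count < max) {
                for (int f = 0; f < SKETCH_FIELDS; f++) {
                        if (*s == 'n') {
                                out[count].field[f] = SKETCH_KEEP;
                                s++;
                        } else {
                                char *end;
                                long v = strtol(s, &end, 10);

                                if (end == s || v < 0)
                                        return -1;
                                out[count].field[f] = v;
                                s = end;
                        }
                        /* Fields inside one entry are separated by ',' */
                        if (f < SKETCH_FIELDS - 1) {
                                if (*s != ',')
                                        return -1;
                                s++;
                        }
                }
                count++;

                if (*s == '\0')
                        return count;
                /* Entries of the profile array are separated by '_' */
                if (*s != '_')
                        return -1;
                s++;
        }

        return -1;      /* more entries than the caller allowed */
  }

Parsing the string from the ethtool command above with room for five entries
yields five entries, in which every ``n`` is recorded as ``SKETCH_KEEP`` so
that the caller can skip those fields.
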
Query the current profiles using::

    $ ethtool -c ethx
    ...
    rx-profile:
    {.usec =   1, .pkts =   1, .comps = n/a,},
    {.usec =   2, .pkts =   2, .comps = n/a,},
    {.usec =   3, .pkts =  64, .comps = n/a,},
    {.usec =  64, .pkts =   4, .comps = n/a,},
    {.usec =  64, .pkts =  64, .comps = n/a,}
    tx-profile:   n/a

If the network device does not support specific fields of DIM profiles,
``n/a`` is displayed for the corresponding fields. Attempting to modify an
``n/a`` field reports an error.


Dynamic Interrupt Moderation (DIM) library API
==============================================

.. kernel-doc:: include/linux/dim.h
    :internal:
