TOMOYO Linux Cross Reference
Linux/Documentation/driver-api/xillybus.rst

Differences between /Documentation/driver-api/xillybus.rst (Version linux-6.12-rc7) and /Documentation/driver-api/xillybus.rst (Version linux-6.8.12)


==========================================
Xillybus driver for generic FPGA interface
==========================================

:Author: Eli Billauer, Xillybus Ltd. (http://xillybus.com)
:Email:  eli.billauer@gmail.com or as advertised on Xillybus' site.

.. Contents:

 - Introduction
  -- Background
  -- Xillybus Overview

 - Usage
  -- User interface
  -- Synchronization
  -- Seekable pipes

 - Internals
  -- Source code organization
  -- Pipe attributes
  -- Host never reads from the FPGA
  -- Channels, pipes, and the message channel
  -- Data streaming
  -- Data granularity
  -- Probing
  -- Buffer allocation
  -- The "nonempty" message (supporting poll)


Introduction
============

Background
----------

An FPGA (Field Programmable Gate Array) is a piece of logic hardware, which
can be programmed to become virtually anything that is usually found as a
dedicated chipset: For instance, a display adapter, network interface card,
or even a processor with its peripherals. FPGAs are the LEGO of hardware:
Based upon certain building blocks, you make your own toys the way you like
them. It's usually pointless to reimplement something that is already
available on the market as a chipset, so FPGAs are mostly used when some
special functionality is needed, and the production volume is relatively low
(hence not justifying the development of an ASIC).

The challenge with FPGAs is that everything is implemented at a very low
level, even lower than assembly language. In order to allow FPGA designers to
focus on their specific project, and not reinvent the wheel over and over
again, pre-designed building blocks, IP cores, are often used. These are the
FPGA parallels of library functions. IP cores may implement certain
mathematical functions, a functional unit (e.g. a USB interface), an entire
processor (e.g. ARM) or anything that might come in handy. Think of them as a
building block, with electrical wires dangling on the sides for connection to
other blocks.

One of the daunting tasks in FPGA design is communicating with a full-blown
operating system (actually, with the processor running it): Implementing the
low-level bus protocol and the somewhat higher-level interface with the host
(registers, interrupts, DMA etc.) is a project in itself. When the FPGA's
function is a well-known one (e.g. a video adapter card, or a NIC), it can
make sense to design the FPGA's interface logic specifically for the project.
A special driver is then written to present the FPGA as a well-known interface
to the kernel and/or user space. In that case, there is no reason to treat the
FPGA differently than any device on the bus.

However, it's common that the desired data communication doesn't fit any
well-known peripheral function. Also, the effort of designing an elegant
abstraction for the data exchange is often considered too big. In those cases,
a quicker and possibly less elegant solution is sought: The driver is
effectively written as a user space program, leaving the kernel space part
with just elementary data transport. This still requires designing some
interface logic for the FPGA, and writing a simple ad-hoc driver for the
kernel.

Xillybus Overview
-----------------

Xillybus is an IP core and a Linux driver. Together, they form a kit for
elementary data transport between an FPGA and the host, providing pipe-like
data streams with a straightforward user interface. It's intended as a
low-effort solution for mixed FPGA-host projects, for which it makes sense to
have the project-specific part of the driver running in a user-space program.

Since the communication requirements may vary significantly from one FPGA
project to another (the number of data pipes needed in each direction and
their attributes), there isn't one specific chunk of logic being the Xillybus
IP core. Rather, the IP core is configured and built based upon a
specification given by its end user.

Xillybus presents independent data streams, which resemble pipes or TCP/IP
communication to the user. At the host side, a character device file is used
just like any pipe file. On the FPGA side, hardware FIFOs are used to stream
the data. This is contrary to a common method of communicating through
fixed-sized buffers (even though such buffers are used by Xillybus under the
hood). There may be more than a hundred of these streams on a single IP core,
but also no more than one, depending on the configuration.

In order to ease the deployment of the Xillybus IP core, it contains a simple
data structure which completely defines the core's configuration. The Linux
driver fetches this data structure during its initialization process, and sets
up the DMA buffers and character devices accordingly. As a result, a single
driver is used to work out of the box with any Xillybus IP core.

The data structure just mentioned should not be confused with PCI's
configuration space or the Flattened Device Tree.

Usage
=====

User interface
--------------

On the host, all interaction with Xillybus is done through /dev/xillybus_*
device files, which are generated automatically as the driver loads. The
names of these files depend on the IP core that is loaded in the FPGA (see
Probing below). To communicate with the FPGA, open the device file that
corresponds to the hardware FIFO you want to send data to or receive data
from, and use plain write() or read() calls, just like with a regular pipe.
In particular, it makes perfect sense to go::

        $ cat mydata > /dev/xillybus_thisfifo

        $ cat /dev/xillybus_thatfifo > hisdata

possibly pressing CTRL-C at some stage, even though the xillybus_* pipes have
the capability to send an EOF (but may not use it).

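As a rough illustration of the pipe-like usage described above, the following
Python sketch does what the cat commands do, using plain read() and write()
calls on file descriptors. The function name copy_stream and the default chunk
size are assumptions made for this example only; they are not part of
Xillybus.

```python
import os


def copy_stream(src_path, dst_path, chunk_size=4096):
    """Copy bytes from src_path to dst_path until EOF; return the byte count.

    Either path may be a regular file or a device file such as the
    placeholder /dev/xillybus_thisfifo used in the text.
    """
    total = 0
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY)
        try:
            while True:
                chunk = os.read(src, chunk_size)
                if not chunk:  # an empty result means EOF
                    break
                view = memoryview(chunk)
                while view:  # write() may accept fewer bytes than offered
                    written = os.write(dst, view)
                    view = view[written:]
                total += len(chunk)
        finally:
            os.close(dst)
    finally:
        os.close(src)
    return total
```

For example, copy_stream("mydata", "/dev/xillybus_thisfifo") would correspond
to the first cat command above, assuming such a device file exists.
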
The driver and hardware are designed to behave sensibly as pipes, including:

* Supporting non-blocking I/O (by setting O_NONBLOCK on open() ).

* Supporting poll() and select().

* Being bandwidth efficient under load (using DMA) but also handling small
  pieces of data sent across (like TCP/IP) by autoflushing.

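The first two items in the list above can be sketched from user space as
follows. This is a minimal example under stated assumptions: the helper names
open_nonblocking and read_when_ready, the buffer size and the timeout are all
hypothetical, and the code relies only on the generic O_NONBLOCK and poll()
semantics, not on anything specific to Xillybus.

```python
import os
import select


def open_nonblocking(path):
    """Open a device file for reading with O_NONBLOCK set."""
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def read_when_ready(fd, max_bytes=4096, timeout_ms=1000):
    """Wait with poll() until fd is readable, then read once.

    Returns the bytes read, or None if nothing arrived before the timeout.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    if not poller.poll(timeout_ms):  # an empty list means the wait timed out
        return None
    try:
        return os.read(fd, max_bytes)
    except BlockingIOError:
        # The data was gone by the time read() was called
        return None
```

A caller would typically loop on read_when_ready() and do other work whenever
it returns None.
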
A device file can be read only, write only or bidirectional. Bidirectional
device files are treated like two independent pipes (except for sharing a
"channel" structure in the implementation code).

Synchronization
---------------

Xillybus pipes are configured (on the IP core) to be either synchronous or
asynchronous. For a synchronous pipe, write() returns successfully only after
some data has been submitted and acknowledged by the FPGA. This slows down
bulk data transfers, and is nearly impossible to use with streams that
require data at a constant rate: There is no data transmitted to the FPGA
between write() calls, in particular when the process loses the CPU.

When a pipe is configured as asynchronous, write() returns as soon as there
is enough room in the DMA buffers to store any of the data.

For FPGA to host pipes, asynchronous pipes allow data transfer from the FPGA
as soon as the respective device file is opened, regardless of whether the
data has been requested by a read() call. On synchronous pipes, only the
amount of data requested by a read() call is transmitted.

In summary, for synchronous pipes, data between the host and FPGA is
transmitted only to satisfy the read() or write() call currently handled
by the driver, and those calls wait for the transmission to complete before
returning.

Note that the synchronization attribute has nothing to do with the possibility
that read() or write() completes fewer bytes than requested. There is a
separate configuration flag ("allowpartial") that determines whether such a
partial completion is allowed.

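When partial completion is allowed, a user-space program that needs a fixed
number of bytes has to loop, as with any UNIX pipe. The following is a minimal
sketch of such a loop; the name read_exact is an assumption made for this
example and does not come from the driver.

```python
import os


def read_exact(fd, count):
    """Keep calling read() until `count` bytes have arrived or EOF is hit.

    Returns the collected bytes, which are shorter than `count` only if
    the stream ended early.
    """
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:  # EOF
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The same pattern applies to write(): advance through the data by the number
of bytes each call reports as written.
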
Seekable pipes
--------------

A synchronous pipe can be configured to have the stream's position exposed
to the user logic at the FPGA. Such a pipe is also seekable on the host API.
With this feature, a memory or register interface can be attached on the
FPGA side to the seekable stream. Reading or writing to a certain address in
the attached memory is done by seeking to the desired address, and calling
read() or write() as required.


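The seek-then-access pattern described above can be sketched as follows. The
helper names read_at and write_at are hypothetical, and the sketch assumes
that fd refers to a device file opened for a pipe that was configured as
seekable; it uses only the standard lseek(), read() and write() calls.

```python
import os


def read_at(fd, address, length):
    """Seek to `address` and read up to `length` bytes from there."""
    os.lseek(fd, address, os.SEEK_SET)
    return os.read(fd, length)


def write_at(fd, address, data):
    """Seek to `address` and write `data`; return the bytes written."""
    os.lseek(fd, address, os.SEEK_SET)
    return os.write(fd, data)
```

With a register interface attached on the FPGA side, write_at(fd, 8, value)
followed by read_at(fd, 8, len(value)) would write and read back whatever is
mapped at address 8 of that interface.
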
Internals
=========

Source code organization
------------------------

The Xillybus driver consists of a core module, xillybus_core.c, and modules
that depend on the specific bus interface (xillybus_of.c and xillybus_pcie.c).

The bus-specific modules are those probed when a suitable device is found by
the kernel. Since the DMA mapping and synchronization functions, which are bus
dependent by their nature, are used by the core module, a
xilly_endpoint_hardware structure is passed to the core module on
initialization. This structure is populated with pointers to wrapper functions
which execute the DMA-related operations on the bus.

Pipe attributes
---------------

Each pipe has a number of attributes which are set when the FPGA component
(IP core) is built. They are fetched from the IDT (the data structure which
defines the core's configuration, see Probing below) by xilly_setupchannels()
in xillybus_core.c as follows:

* is_writebuf: The pipe's direction. A non-zero value means it's an FPGA to
  host pipe (the FPGA "writes").

* channelnum: The pipe's identification number in communication between the
  host and FPGA.

* format: The underlying data width. See Data Granularity below.

* allowpartial: A non-zero value means that a read() or write() (whichever
  applies) may return with less than the requested number of bytes. The common
  choice is a non-zero value, to match standard UNIX behavior.

* synchronous: A non-zero value means that the pipe is synchronous. See
  Synchronization above.

* bufsize: Each DMA buffer's size. Always a power of two.

* bufnum: The number of buffers allocated for this pipe. Always a power of two.

* exclusive_open: A non-zero value forces exclusive opening of the associated
  device file. If the device file is bidirectional, and already opened only in
  one direction, the opposite direction may be opened once.

* seekable: A non-zero value indicates that the pipe is seekable. See
  Seekable pipes above.

* supports_nonempty: A non-zero value (which is typical) indicates that the
  hardware will send the messages that are necessary to support select() and
  poll() for this pipe.

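To make the list above easier to keep in mind, the attributes can be modeled
as a small record. The following Python sketch is purely illustrative: it is
not the driver's data structure and says nothing about how the IDT encodes
these fields. The field names are taken from the list, while the class name,
the choice to store "format" as a width in bits and the total_buffer_bytes
helper are assumptions made for this example.

```python
from dataclasses import dataclass


def _is_power_of_two(value):
    """Return True for 1, 2, 4, 8, ..."""
    return value > 0 and (value & (value - 1)) == 0


@dataclass(frozen=True)
class PipeAttributes:
    """Illustrative model of one pipe's attributes (not driver code)."""
    is_writebuf: bool        # True: FPGA to host pipe (the FPGA "writes")
    channelnum: int          # pipe's identification number
    format: int              # data width, stored in bits here (assumption)
    allowpartial: bool       # read()/write() may complete partially
    synchronous: bool        # pipe is synchronous
    bufsize: int             # size of each DMA buffer, a power of two
    bufnum: int              # number of DMA buffers, a power of two
    exclusive_open: bool     # device file may be opened only once
    seekable: bool           # pipe is seekable
    supports_nonempty: bool  # hardware supports select() and poll()

    def __post_init__(self):
        if self.format not in (8, 16, 32):
            raise ValueError("format must be 8, 16 or 32 bits")
        if not _is_power_of_two(self.bufsize):
            raise ValueError("bufsize must be a power of two")
        if not _is_power_of_two(self.bufnum):
            raise ValueError("bufnum must be a power of two")

    @property
    def total_buffer_bytes(self):
        """Total DMA buffer space of this pipe (hypothetical helper)."""
        return self.bufsize * self.bufnum
```

For instance, a pipe with four buffers of 4096 bytes each has 16384 bytes of
DMA buffer space in this model.
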
Host never reads from the FPGA
------------------------------

Even though PCI Express is hotpluggable in general, a typical motherboard
doesn't expect a card to go away all of a sudden. But since the PCIe card
is based upon reprogrammable logic, a sudden disappearance from the bus is
quite likely as a result of an accidental reprogramming of the FPGA while the
host is up. In practice, nothing happens immediately in such a situation. But
if the host attempts to read from an address that is mapped to the PCI Express
device, that leads to an immediate freeze of the system on some motherboards,
even though the PCIe standard requires a graceful recovery.

In order to avoid these freezes, the Xillybus driver refrains completely from
reading from the device's register space. All communication from the FPGA to
the host is done through DMA. In particular, the Interrupt Service Routine
doesn't follow the common practice of checking a status register when it's
invoked. Rather, the FPGA prepares a small buffer which contains short
messages, which inform the host what the interrupt was about.

This mechanism is used on non-PCIe buses as well for the sake of uniformity.


Channels, pipes, and the message channel
----------------------------------------

Each of the (possibly bidirectional) pipes presented to the user is allocated
a data channel between the FPGA and the host. The distinction between channels
and pipes is necessary only because of channel 0, which is used for
interrupt-related messages from the FPGA, and has no pipe attached to it.

Data streaming
--------------

Even though a non-segmented data stream is presented to the user at both
sides, the implementation relies on a set of DMA buffers which is allocated
for each channel. For the sake of illustration, let's take the FPGA to host
direction: As data streams into the respective channel's interface in the
FPGA, the Xillybus IP core writes it to one of the DMA buffers. When the
buffer is full, the FPGA informs the host about that (appending a
XILLYMSG_OPCODE_RELEASEBUF message to channel 0 and sending an interrupt if
necessary). The host responds by making the data available for reading through
the character device. When all data has been read, the host writes on the
FPGA's buffer control register, allowing the buffer's overwriting. Flow
control mechanisms exist on both sides to prevent underflows and overflows.

This is not good enough for creating a TCP/IP-like stream: If the data flow
stops momentarily before a DMA buffer is filled, the intuitive expectation is
that the partial data in the buffer will arrive anyhow, despite the buffer not
being completed. This is implemented by adding a field in the
XILLYMSG_OPCODE_RELEASEBUF message, through which the FPGA informs not just
which buffer is submitted, but how much data it contains.

286 But the FPGA will submit a partially filled bu    286 But the FPGA will submit a partially filled buffer only if directed to do so
287 by the host. This situation occurs when the re    287 by the host. This situation occurs when the read() method has been blocking
288 for XILLY_RX_TIMEOUT jiffies (currently 10 ms)    288 for XILLY_RX_TIMEOUT jiffies (currently 10 ms), after which the host commands
289 the FPGA to submit a DMA buffer as soon as it     289 the FPGA to submit a DMA buffer as soon as it can. This timeout mechanism
290 balances between bus bandwidth efficiency (pre    290 balances between bus bandwidth efficiency (preventing a lot of partially
291 filled buffers being sent) and a latency held     291 filled buffers being sent) and a latency held fairly low for tails of data.

A similar setting is used in the host to FPGA direction. The handling of
partial DMA buffers is somewhat different, though. The user can tell the
driver to submit all data it has in the buffers to the FPGA, by issuing a
write() with the byte count set to zero. This is similar to a flush request,
but it doesn't block. There is also an autoflushing mechanism, which triggers
an equivalent flush roughly XILLY_RX_TIMEOUT jiffies after the last write().
This allows the user to be oblivious to the underlying buffering mechanism
and yet enjoy a stream-like interface.
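
The zero-length write() described above can be issued from user space with
an ordinary system call. The following is a minimal sketch, not code from the
driver; the helper name xilly_flush_hint() is hypothetical, and the file
descriptor is assumed to refer to an already opened host to FPGA device file.

```c
#include <unistd.h>

/*
 * Hypothetical helper: ask the driver to submit whatever it currently holds
 * in its buffers for this stream. A write() with a byte count of zero is the
 * non-blocking flush request described in the text; it does not wait for the
 * data to reach the FPGA.
 *
 * Returns 0 on success, -1 on error (errno is set by write()).
 */
static int xilly_flush_hint(int fd)
{
    /* The buffer pointer is not dereferenced when the count is zero. */
    return write(fd, "", 0) < 0 ? -1 : 0;
}
```

A program that needs low latency for a short message would typically call
this right after its last write() instead of waiting for the autoflush.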

Note that the issue of partial buffer flushing is irrelevant for pipes having
the "synchronous" attribute nonzero, since synchronous pipes don't allow data
to lie around in the DMA buffers between read() and write() anyhow.

Data granularity
----------------

The data arrives or is sent at the FPGA as 8, 16 or 32 bit wide words, as
configured by the "format" attribute. Whenever possible, the driver attempts
to hide this when the pipe is accessed differently from its natural alignment.
For example, reading single bytes from a pipe with 32 bit granularity works
with no issues. Writing single bytes to pipes with 16 or 32 bit granularity
will also work, but the driver can't send partially completed words to the
FPGA, so the transmission of up to one word may be held until it's fully
occupied with user data.

This somewhat complicates the handling of host to FPGA streams, because
when a buffer is flushed, it may contain up to 3 bytes that don't form a word
in the FPGA, and hence can't be sent. To prevent loss of data, these leftover
bytes need to be moved to the next buffer. The parts in xillybus_core.c
that mention "leftovers" in some way are related to this complication.
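
The arithmetic behind the leftovers can be illustrated as follows. This is
only a sketch of the idea, not the code in xillybus_core.c; the structure and
function names are hypothetical, and the word width is assumed to be given in
bytes (1, 2 or 4 for 8, 16 or 32 bit words).

```c
#include <stddef.h>

/* Hypothetical result type: how a buffer's fill level splits on a flush. */
struct flush_split {
    size_t sendable;   /* bytes that form complete words and can be sent */
    size_t leftovers;  /* trailing bytes to carry over to the next buffer */
};

/*
 * Split the number of bytes currently in a buffer into whole words and a
 * tail. word_bytes must be a power of two (1, 2 or 4), so the remainder
 * can be taken with a bit mask.
 */
static struct flush_split split_for_flush(size_t filled, size_t word_bytes)
{
    struct flush_split s;

    s.leftovers = filled & (word_bytes - 1);
    s.sendable = filled - s.leftovers;
    return s;
}
```

With 11 bytes in the buffer and 32 bit words, 8 bytes can be sent and 3 bytes
are left over, which is the worst case mentioned above.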

Probing
-------

As mentioned earlier, the number of pipes that are created when the driver
loads and their attributes depend on the Xillybus IP core in the FPGA. During
the driver's initialization, a blob containing configuration info, the
Interface Description Table (IDT), is sent from the FPGA to the host. The
bootstrap process is done in three phases:

1. Acquire the length of the IDT, so a buffer can be allocated for it. This
   is done by sending a quiesce command to the device, since the acknowledge
   for this command contains the IDT's buffer length.

2. Acquire the IDT itself.

3. Create the interfaces according to the IDT.

Buffer allocation
-----------------

In order to simplify the logic that prevents illegal boundary crossings of
PCIe packets, the following rule applies: If a buffer is smaller than 4kB,
it must not cross a 4kB boundary. Otherwise, it must be 4kB aligned. The
xilly_setupchannels() function allocates these buffers by requesting whole
pages from the kernel, and dividing them into DMA buffers as necessary. Since
all buffers' sizes are powers of two, it's possible to pack any set of such
buffers, with a maximal waste of one page of memory.
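
The placement rule can be written as a small predicate. The following is a
sketch for illustration only; the function name and the BOUNDARY_4K constant
are hypothetical and do not appear in the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BOUNDARY_4K 4096u  /* assumption: the 4kB boundary from the rule above */

/*
 * Check whether a buffer at bus address addr with the given size obeys the
 * rule: buffers smaller than 4kB must not cross a 4kB boundary, and all
 * other buffers must start on a 4kB boundary.
 */
static bool dma_buf_placement_ok(uint64_t addr, size_t size)
{
    if (size == 0)
        return false;

    if (size < BOUNDARY_4K) {
        /* First and last byte must fall within the same 4kB block. */
        return (addr / BOUNDARY_4K) == ((addr + size - 1) / BOUNDARY_4K);
    }

    return (addr % BOUNDARY_4K) == 0;
}
```

For example, a 1kB buffer at offset 0x1800 is acceptable, while the same
buffer at offset 0x1e00 would straddle the boundary at 0x2000 and is not.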

All buffers are allocated when the driver is loaded. This is necessary,
since large contiguous physical memory segments are sometimes requested,
which are more likely to be available when the system is freshly booted.

The allocation of buffer memory takes place in the same order as the buffers
appear in the IDT. The driver relies on a rule that the pipes are sorted with
decreasing buffer size in the IDT. If a requested buffer is larger than or
equal to a page, the necessary number of pages is requested from the kernel,
and these are used for this buffer. If the requested buffer is smaller than a
page, one single page is requested from the kernel, and that page is partially
used. Or, if there already is a partially used page at hand, the buffer is
packed into that page. It can be shown that all pages requested from the
kernel (except possibly for the last) are 100% utilized this way.
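
The packing argument can be checked with a small user-space simulation. This
is a sketch under stated assumptions, not the driver's allocator: the page
size is taken to be 4096 bytes, all sizes are powers of two given in
decreasing order, and the names pack_result and pack_buffers() are
hypothetical.

```c
#include <stddef.h>

#define SIM_PAGE_SIZE 4096u  /* assumption: page size used by the simulation */

struct pack_result {
    size_t bytes_requested;  /* memory taken from the kernel, in whole pages */
    size_t bytes_used;       /* memory actually handed out as DMA buffers */
};

/*
 * Simulate the allocation order described above. sizes[] must hold powers
 * of two sorted in decreasing order, as the IDT rule requires.
 */
static struct pack_result pack_buffers(const size_t *sizes, size_t count)
{
    struct pack_result r = { 0, 0 };
    size_t left_in_page = 0;  /* unused tail of the partially used page */

    for (size_t i = 0; i < count; i++) {
        size_t sz = sizes[i];

        if (sz >= SIM_PAGE_SIZE) {
            /* A power of two >= page size is a whole number of pages. */
            r.bytes_requested += sz;
        } else {
            if (left_in_page < sz) {
                /* No partially used page with room at hand: take a new one. */
                r.bytes_requested += SIM_PAGE_SIZE;
                left_in_page = SIM_PAGE_SIZE;
            }
            left_in_page -= sz;
        }
        r.bytes_used += sz;
    }
    return r;
}
```

Because the sizes only decrease, the free space in the partially used page is
always a multiple of the next buffer's size, so a new page is requested only
when the previous one is completely full. The difference between
bytes_requested and bytes_used is therefore always less than one page.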

The "nonempty" message (supporting poll)
----------------------------------------

In order to support the "poll" method (and hence select()), there is a small
catch regarding the FPGA to host direction: The FPGA may have filled a DMA
buffer with some data, but not submitted that buffer. If the host waited for
the buffer's submission by the FPGA, there would be a possibility that the
FPGA side has sent data, but a select() call would still block, because the
host has not received any notification about this. This is solved with
XILLYMSG_OPCODE_NONEMPTY messages sent by the FPGA when a channel goes from
completely empty to containing some data.

These messages are used only to support poll() and select(). The IP core can
be configured not to send them, which slightly reduces the bandwidth consumed
by messages.
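
From the user space side, this is what makes the usual poll() pattern work on
an FPGA to host device file. The following is a minimal sketch using only
standard POSIX calls; the helper name read_when_ready() and its return
convention are hypothetical, and the file descriptor is assumed to refer to an
already opened device file whose IP core sends the "nonempty" messages.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for data to become available
 * on fd, then read it.
 *
 * Returns the number of bytes read (0 means end of file), -1 on error
 * (errno is set by poll() or read()), or -2 if the timeout expired.
 */
static ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return -2;  /* nothing became readable within the timeout */

    /* Readable, hung up or in error: let read() report the outcome. */
    return read(fd, buf, len);
}
```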
                                                      
