==========================================
Xillybus driver for generic FPGA interface
==========================================

:Author: Eli Billauer, Xillybus Ltd. (http://xillybus.com)
:Email:  eli.billauer@gmail.com or as advertised on Xillybus' site.

.. Contents:

 - Introduction
  -- Background
  -- Xillybus Overview

 - Usage
  -- User interface
  -- Synchronization
  -- Seekable pipes

 - Internals
  -- Source code organization
  -- Pipe attributes
  -- Host never reads from the FPGA
  -- Channels, pipes, and the message channel
  -- Data streaming
  -- Data granularity
  -- Probing
  -- Buffer allocation
  -- The "nonempty" message (supporting poll)


Introduction
============

Background
----------

An FPGA (Field Programmable Gate Array) is a piece of logic hardware, which
can be programmed to become virtually anything that is usually found as a
dedicated chipset: For instance, a display adapter, network interface card,
or even a processor with its peripherals. FPGAs are the LEGO of hardware:
Based upon certain building blocks, you make your own toys the way you like
them. It's usually pointless to reimplement something that is already
available on the market as a chipset, so FPGAs are mostly used when some
special functionality is needed, and the production volume is relatively low
(hence not justifying the development of an ASIC).

The challenge with FPGAs is that everything is implemented at a very low
level, even lower than assembly language. In order to allow FPGA designers to
focus on their specific project, and not reinvent the wheel over and over
again, pre-designed building blocks, IP cores, are often used. These are the
FPGA parallels of library functions. IP cores may implement certain
mathematical functions, a functional unit (e.g. a USB interface), an entire
processor (e.g. ARM) or anything that might come in handy. Think of them as a
building block, with electrical wires dangling on the sides for connection to
other blocks.

One of the daunting tasks in FPGA design is communicating with a full-blown
operating system (actually, with the processor running it): Implementing the
low-level bus protocol and the somewhat higher-level interface with the host
(registers, interrupts, DMA etc.) is a project in itself. When the FPGA's
function is a well-known one (e.g. a video adapter card, or a NIC), it can
make sense to design the FPGA's interface logic specifically for the project.
A special driver is then written to present the FPGA as a well-known interface
to the kernel and/or user space. In that case, there is no reason to treat the
FPGA differently than any device on the bus.

It's common, however, that the desired data communication doesn't fit any
well-known peripheral function. Also, the effort of designing an elegant
abstraction for the data exchange is often considered too big. In those cases,
a quicker and possibly less elegant solution is sought: The driver is
effectively written as a user space program, leaving the kernel space part
with just elementary data transport. This still requires designing some
interface logic for the FPGA, and writing a simple ad-hoc driver for the
kernel.

Xillybus Overview
-----------------

Xillybus is an IP core and a Linux driver. Together, they form a kit for
elementary data transport between an FPGA and the host, providing pipe-like
data streams with a straightforward user interface. It's intended as a
low-effort solution for mixed FPGA-host projects, for which it makes sense to
have the project-specific part of the driver running in a user-space program.

Since the communication requirements may vary significantly from one FPGA
project to another (the number of data pipes needed in each direction and
their attributes), there isn't one specific chunk of logic being the Xillybus
IP core. Rather, the IP core is configured and built based upon a
specification given by its end user.

Xillybus presents independent data streams, which resemble pipes or TCP/IP
communication to the user. At the host side, a character device file is used
just like any pipe file. On the FPGA side, hardware FIFOs are used to stream
the data. This is contrary to a common method of communicating through
fixed-sized buffers (even though such buffers are used by Xillybus under the
hood). There may be more than a hundred of these streams on a single IP core,
or as few as one, depending on the configuration.

In order to ease the deployment of the Xillybus IP core, it contains a simple
data structure which completely defines the core's configuration. The Linux
driver fetches this data structure during its initialization process, and sets
up the DMA buffers and character devices accordingly. As a result, a single
driver is used to work out of the box with any Xillybus IP core.

The data structure just mentioned should not be confused with PCI's
configuration space or the Flattened Device Tree.

Usage
=====

User interface
--------------

On the host, all interface with Xillybus is done through /dev/xillybus_*
device files, which are generated automatically as the driver loads. The
names of these files depend on the IP core that is loaded in the FPGA (see
Probing below). To communicate with the FPGA, open the device file that
corresponds to the hardware FIFO you want to send data to or receive data
from, and use plain write() or read() calls, just like with a regular pipe. In
particular, it makes perfect sense to go::

	$ cat mydata > /dev/xillybus_thisfifo

	$ cat /dev/xillybus_thatfifo > hisdata

possibly pressing CTRL-C at some stage, even though the xillybus_* pipes have
the capability to send an EOF (but may not use it).

The driver and hardware are designed to behave sensibly as pipes, including:

* Supporting non-blocking I/O (by setting O_NONBLOCK on open() ).

* Supporting poll() and select().

* Being bandwidth efficient under load (using DMA) but also handling small
  pieces of data sent across (like TCP/IP) by autoflushing.

A device file can be read only, write only or bidirectional. Bidirectional
device files are treated like two independent pipes (except for sharing a
"channel" structure in the implementation code).
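
Since the device files behave like pipes, a user-space program can treat them
with the usual read() and write() loops. The following is a minimal sketch,
not part of the driver: the helper names write_all() and read_full() are
hypothetical, and the code only assumes standard POSIX behavior, namely that
either call may complete fewer bytes than requested.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write exactly len bytes to fd, retrying after
 * partial writes and EINTR. Returns 0 on success, -1 on error.
 */
int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* Interrupted by a signal: retry */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper: read up to len bytes from fd, stopping early only
 * at EOF. Returns the number of bytes read, or -1 on error.
 */
ssize_t read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, p + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			break;		/* EOF */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

The file descriptor would typically come from open() on one of the device
files, for example /dev/xillybus_thisfifo with O_WRONLY for write_all().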

Synchronization
---------------

Xillybus pipes are configured (on the IP core) to be either synchronous or
asynchronous. For a synchronous pipe, write() returns successfully only after
some data has been submitted and acknowledged by the FPGA. This slows down
bulk data transfers, and is nearly impossible to use with streams that
require data at a constant rate: There is no data transmitted to the FPGA
between write() calls, in particular when the process loses the CPU.

When a pipe is configured asynchronous, write() returns as soon as there is
enough room in the buffers to store any of the data.

For FPGA to host pipes, asynchronous pipes allow data transfer from the FPGA
as soon as the respective device file is opened, regardless of whether the
data has been requested by a read() call. On synchronous pipes, only the
amount of data requested by a read() call is transmitted.

In summary, for synchronous pipes, data between the host and FPGA is
transmitted only to satisfy the read() or write() call currently handled
by the driver, and those calls wait for the transmission to complete before
returning.

Note that the synchronization attribute has nothing to do with the possibility
that read() or write() completes fewer bytes than requested. There is a
separate configuration flag ("allowpartial") that determines whether such a
partial completion is allowed.

Seekable pipes
--------------

A synchronous pipe can be configured to have the stream's position exposed
to the user logic at the FPGA. Such a pipe is also seekable on the host API.
With this feature, a memory or register interface can be attached on the
FPGA side to the seekable stream. Reading or writing to a certain address in
the attached memory is done by seeking to the desired address, and calling
read() or write() as required.
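
The seek-then-access pattern can be sketched in user space as follows. This
is an illustration only: mem_open(), mem_write_at() and mem_read_at() are
hypothetical helper names, and the path passed to mem_open() is assumed to
be a device file of a pipe that was configured as seekable.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a seekable device file for both directions. */
int mem_open(const char *path)
{
	return open(path, O_RDWR);
}

/* Write len bytes at byte offset addr. Returns 0 on success, -1 on error. */
int mem_write_at(int fd, off_t addr, const void *buf, size_t len)
{
	if (lseek(fd, addr, SEEK_SET) == (off_t)-1)
		return -1;
	return write(fd, buf, len) == (ssize_t)len ? 0 : -1;
}

/* Read len bytes from byte offset addr. Returns 0 on success, -1 on error. */
int mem_read_at(int fd, off_t addr, void *buf, size_t len)
{
	if (lseek(fd, addr, SEEK_SET) == (off_t)-1)
		return -1;
	return read(fd, buf, len) == (ssize_t)len ? 0 : -1;
}
```

For brevity, this sketch reports a short read() or write() as an error
rather than retrying.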


Internals
=========

Source code organization
------------------------

The Xillybus driver consists of a core module, xillybus_core.c, and modules
that depend on the specific bus interface (xillybus_of.c and xillybus_pcie.c).

The bus specific modules are those probed when a suitable device is found by
the kernel. Since the DMA mapping and synchronization functions, which are bus
dependent by their nature, are used by the core module, a
xilly_endpoint_hardware structure is passed to the core module on
initialization. This structure is populated with pointers to wrapper functions
which execute the DMA-related operations on the bus.

Pipe attributes
---------------

Each pipe has a number of attributes which are set when the FPGA component
(IP core) is built. They are fetched from the IDT (the data structure which
defines the core's configuration, see Probing below) by xilly_setupchannels()
in xillybus_core.c as follows:

* is_writebuf: The pipe's direction. A non-zero value means it's an FPGA to
  host pipe (the FPGA "writes").

* channelnum: The pipe's identification number in communication between the
  host and FPGA.

* format: The underlying data width. See Data granularity below.

* allowpartial: A non-zero value means that a read() or write() (whichever
  applies) may return with less than the requested number of bytes. The common
  choice is a non-zero value, to match standard UNIX behavior.

* synchronous: A non-zero value means that the pipe is synchronous. See
  Synchronization above.

* bufsize: Each DMA buffer's size. Always a power of two.

* bufnum: The number of buffers allocated for this pipe. Always a power of two.

* exclusive_open: A non-zero value forces exclusive opening of the associated
  device file. If the device file is bidirectional, and already opened only in
  one direction, the opposite direction may be opened once.

* seekable: A non-zero value indicates that the pipe is seekable. See
  Seekable pipes above.

* supports_nonempty: A non-zero value (which is typical) indicates that the
  hardware will send the messages that are necessary to support select() and
  poll() for this pipe.

Host never reads from the FPGA
------------------------------

Even though PCI Express is hotpluggable in general, a typical motherboard
doesn't expect a card to go away all of a sudden. But since the PCIe card
is based upon reprogrammable logic, a sudden disappearance from the bus is
quite likely as a result of an accidental reprogramming of the FPGA while the
host is up. In practice, nothing happens immediately in such a situation. But
if the host attempts to read from an address that is mapped to the PCI Express
device, that leads to an immediate freeze of the system on some motherboards,
even though the PCIe standard requires a graceful recovery.

In order to avoid these freezes, the Xillybus driver refrains completely from
reading from the device's register space. All communication from the FPGA to
the host is done through DMA. In particular, the Interrupt Service Routine
doesn't follow the common practice of checking a status register when it's
invoked. Rather, the FPGA prepares a small buffer which contains short
messages, which inform the host what the interrupt was about.

This mechanism is used on non-PCIe buses as well for the sake of uniformity.


Channels, pipes, and the message channel
----------------------------------------

Each of the (possibly bidirectional) pipes presented to the user is allocated
a data channel between the FPGA and the host. The distinction between channels
and pipes is necessary only because of channel 0, which is used for
interrupt-related messages from the FPGA, and has no pipe attached to it.

Data streaming
--------------

Even though a non-segmented data stream is presented to the user at both
sides, the implementation relies on a set of DMA buffers which is allocated
for each channel. For the sake of illustration, let's take the FPGA to host
direction: As data streams into the respective channel's interface in the
FPGA, the Xillybus IP core writes it to one of the DMA buffers. When the
buffer is full, the FPGA informs the host about that (appending a
XILLYMSG_OPCODE_RELEASEBUF message to channel 0 and sending an interrupt if
necessary). The host responds by making the data available for reading through
the character device. When all data has been read, the host writes to the
FPGA's buffer control register, allowing the buffer to be overwritten. Flow
control mechanisms exist on both sides to prevent underflows and overflows.

This is not good enough for creating a TCP/IP-like stream: If the data flow
stops momentarily before a DMA buffer is filled, the intuitive expectation is
that the partial data in the buffer will arrive anyhow, despite the buffer not
being completed. This is implemented by adding a field in the
XILLYMSG_OPCODE_RELEASEBUF message, through which the FPGA informs not just
which buffer is submitted, but how much data it contains.

But the FPGA will submit a partially filled buffer only if directed to do so
by the host. This situation occurs when the read() method has been blocking
for XILLY_RX_TIMEOUT jiffies (currently 10 ms), after which the host commands
the FPGA to submit a DMA buffer as soon as it can. This timeout mechanism
balances between bus bandwidth efficiency (preventing a lot of partially
filled buffers being sent) and a latency held fairly low for tails of data.

A similar setting is used in the host to FPGA direction. The handling of
partial DMA buffers is somewhat different, though. The user can tell the
driver to submit all data it has in the buffers to the FPGA, by issuing a
write() with the byte count set to zero. This is similar to a flush request,
but it doesn't block. There is also an autoflushing mechanism, which triggers
an equivalent flush roughly XILLY_RX_TIMEOUT jiffies after the last write().
This allows the user to be oblivious about the underlying buffering mechanism
and yet enjoy a stream-like interface.

Note that the issue of partial buffer flushing is irrelevant for pipes having
the "synchronous" attribute nonzero, since synchronous pipes don't allow data
to lie around in the DMA buffers between read() and write() anyhow.

Data granularity
----------------

The data arrives or is sent at the FPGA as 8, 16 or 32 bit wide words, as
configured by the "format" attribute. Whenever possible, the driver attempts
to hide this when the pipe is accessed differently from its natural alignment.
For example, reading single bytes from a pipe with 32 bit granularity works
with no issues. Writing single bytes to pipes with 16 or 32 bit granularity
will also work, but the driver can't send partially completed words to the
FPGA, so the transmission of up to one word may be held until it's fully
occupied with user data.

This somewhat complicates the handling of host to FPGA streams, because
when a buffer is flushed, it may contain up to 3 bytes that don't form a word
in the FPGA, and hence can't be sent. To prevent loss of data, these leftover
bytes need to be moved to the next buffer. The parts in xillybus_core.c
that mention "leftovers" in some way are related to this complication.

Probing
-------

As mentioned earlier, the number of pipes that are created when the driver
loads and their attributes depend on the Xillybus IP core in the FPGA. During
the driver's initialization, a blob containing configuration info, the
Interface Description Table (IDT), is sent from the FPGA to the host. The
bootstrap process is done in three phases:

1. Acquire the length of the IDT, so a buffer can be allocated for it. This
   is done by sending a quiesce command to the device, since the acknowledge
   for this command contains the IDT's buffer length.

2. Acquire the IDT itself.

3. Create the interfaces according to the IDT.

Buffer allocation
-----------------

In order to simplify the logic that prevents illegal boundary crossings of
PCIe packets, the following rule applies: If a buffer is smaller than 4kB,
it must not cross a 4kB boundary. Otherwise, it must be 4kB aligned. The
xilly_setupchannels() function allocates these buffers by requesting whole
pages from the kernel, and dividing them into DMA buffers as necessary. Since
all buffers' sizes are powers of two, it's possible to pack any set of such
buffers, with a maximal waste of one page of memory.
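
The placement rule can be expressed as a simple predicate. The sketch below
is an illustration and is not taken from the driver: placement_ok() and
BOUNDARY are hypothetical names, and addr stands for the buffer's bus
address.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOUNDARY 4096u	/* The 4kB boundary mentioned in the rule */

/* Return true if a buffer of the given size at addr obeys the rule. */
bool placement_ok(uint64_t addr, uint32_t size)
{
	if (size == 0)
		return false;
	if (size < BOUNDARY)
		/* First and last byte must lie in the same 4kB block */
		return (addr / BOUNDARY) == ((addr + size - 1) / BOUNDARY);
	return (addr % BOUNDARY) == 0;	/* Must be 4kB aligned */
}
```

For example, a 2kB buffer at offset 0x1800 stays within one 4kB block and
passes, while the same buffer at 0x1c00 crosses the boundary at 0x2000 and
fails.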

All buffers are allocated when the driver is loaded. This is necessary,
since large contiguous physical memory segments are sometimes requested,
which are more likely to be available when the system is freshly booted.

The allocation of buffer memory takes place in the same order they appear in
the IDT. The driver relies on a rule that the pipes are sorted with decreasing
buffer size in the IDT. If a requested buffer is larger than or equal to a
page, the necessary number of pages is requested from the kernel, and these
are used for this buffer. If the requested buffer is smaller than a page, one
single page is requested from the kernel, and that page is partially used.
Or, if there already is a partially used page at hand, the buffer is packed
into that page. It can be shown that all pages requested from the kernel
(except possibly for the last) are 100% utilized this way.
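
The packing scheme can be simulated with a few lines of C. This is a sketch
under stated assumptions and not the driver's code: pages_needed() and
PAGE_BYTES are hypothetical names, the page size is assumed to be 4096
bytes, and the buffer sizes are assumed to be powers of two listed in
decreasing order, as the rule above requires.

```c
#include <stddef.h>

#define PAGE_BYTES 4096u	/* Assumed page size for this sketch */

/* Count the pages requested for n buffers, sizes in decreasing order. */
size_t pages_needed(const unsigned int *sizes, size_t n)
{
	size_t pages = 0;
	unsigned int left = 0;	/* Free bytes in the partially used page */
	size_t i;

	for (i = 0; i < n; i++) {
		if (sizes[i] >= PAGE_BYTES) {
			/* Whole pages are dedicated to this buffer */
			pages += sizes[i] / PAGE_BYTES;
			continue;
		}
		if (left < sizes[i]) {
			/* No partially used page at hand: request one */
			pages++;
			left = PAGE_BYTES;
		}
		left -= sizes[i];	/* Pack the buffer into the page */
	}
	return pages;
}
```

With decreasing power-of-two sizes, the free space in the partially used
page is always a multiple of the next buffer's size, so a new page is
requested only when the previous one is completely full.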

The "nonempty" message (supporting poll)
----------------------------------------

In order to support the "poll" method (and hence select() ), there is a small
catch regarding the FPGA to host direction: The FPGA may have filled a DMA
buffer with some data, but not submitted that buffer. If the host waited for
the buffer's submission by the FPGA, there would be a possibility that the
FPGA side has sent data, but a select() call would still block, because the
host has not received any notification about this. This is solved with
XILLYMSG_OPCODE_NONEMPTY messages sent by the FPGA when a channel goes from
completely empty to containing some data.

These messages are used only to support poll() and select(). The IP core can
be configured not to send them for a slight reduction of bandwidth.
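
From the user-space side, this is what makes the usual poll() pattern work on
an FPGA to host pipe. The following is a minimal sketch using only standard
POSIX calls; read_when_ready() is a hypothetical helper name, and fd is
assumed to be an open descriptor of a pipe whose supports_nonempty attribute
is non-zero.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for data, then read at most len bytes.
 * Returns the number of bytes read, 0 on timeout or EOF, -1 on error.
 */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int rc = poll(&pfd, 1, timeout_ms);

	if (rc <= 0)
		return rc;	/* 0: timeout, -1: poll() failed */
	if (!(pfd.revents & (POLLIN | POLLHUP)))
		return -1;	/* POLLERR or POLLNVAL */
	return read(fd, buf, len);
}
```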