==============================
PXA/MMP - DMA Slave controller
==============================

Constraints
===========

a) Transfers hot queuing
A driver submitting a transfer and issuing it should be guaranteed that the
transfer is queued even on a running DMA channel.
This implies that the queuing doesn't wait for the end of the previous
transfer, and that the descriptor chaining is not only done in the irq/tasklet
code triggered by the end of the transfer.
A transfer which is submitted and issued on a phy doesn't wait for a phy to
stop and restart, but is submitted on a "running channel". The other
drivers, especially mmp_pdma, waited for the phy to stop before relaunching
a new transfer.

b) All transfers having asked for confirmation should be signaled
Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback call.
This implies that even if an irq/tasklet is triggered by the end of tx1, but
at the time of irq/dma tx2 is already finished, tx1->complete() and
tx2->complete() should be called.

c) Channel running state
A driver should be able to query if a channel is running or not. For the
multimedia case, such as video capture, if a transfer is submitted and then
a check of the DMA channel reports a "stopped channel", the transfer should
not be issued until the next "start of frame interrupt", hence the need to
know if a channel is in running or stopped state.

d) Bandwidth guarantee
The PXA architecture has 4 levels of DMAs priorities : high, normal, low.
The high priorities get twice as much bandwidth as the normal, which get twice
as much as the low priorities.
A driver should be able to request a priority, especially the real-time
ones such as pxa_camera with (big) throughputs.

Design
======
a) Virtual channels
Same concept as in the sa11x0 driver, i.e. a driver is assigned a "virtual
channel" linked to the requestor line, and the physical DMA channel is
assigned on the fly when the transfer is issued.
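
The on-the-fly assignment described for virtual channels can be pictured with
a small userspace model. This is only a sketch under stated assumptions: the
names ``phy_chan``, ``virt_chan``, ``vchan_issue()``, ``vchan_release()`` and
the pool size ``NR_PHYS`` are hypothetical and are not taken from the driver,
and the first-free selection policy is a simplification that ignores
priorities.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PHYS 2  /* hypothetical pool size, chosen for the example */

struct virt_chan;

struct phy_chan {
	struct virt_chan *owner; /* NULL while the physical channel is free */
};

struct virt_chan {
	int requestor_line;      /* fixed when the client obtains the channel */
	struct phy_chan *phy;    /* NULL until a transfer is issued */
};

/* Bind a free physical channel at issue time; returns its index or -1. */
static int vchan_issue(struct phy_chan pool[], struct virt_chan *vc)
{
	if (vc->phy)
		return (int)(vc->phy - pool); /* already bound, keep it */

	for (int i = 0; i < NR_PHYS; i++) {
		if (!pool[i].owner) {
			pool[i].owner = vc;
			vc->phy = &pool[i];
			return i;
		}
	}
	return -1; /* every physical channel is busy */
}

/* Give the physical channel back once the virtual channel goes idle. */
static void vchan_release(struct virt_chan *vc)
{
	if (vc->phy) {
		vc->phy->owner = NULL;
		vc->phy = NULL;
	}
}
```

In this model a virtual channel costs nothing while idle: more clients than
physical channels can hold a virtual channel, and only those with issued
transfers occupy hardware.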

b) Transfer anatomy for a scatter-gather transfer

::

   +------------+-----+---------------+----------------+-----------------+
   | desc-sg[0] | ... | desc-sg[last] | status updater | finisher/linker |
   +------------+-----+---------------+----------------+-----------------+

This structure is pointed to by dma->sg_cpu.
The descriptors are used as follows :

  - desc-sg[i]: i-th descriptor, transferring the i-th sg
    element to the video buffer scatter gather

  - status updater
    Transfers a single u32 to a well known dma coherent memory to leave
    a trace that this transfer is done. The "well known" is unique per
    physical channel, meaning that a read of this value will tell which
    is the last finished transfer at that point in time.

  - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN

  - linker: has ddadr= desc-sg[0] of next transfer, dcmd=0

c) Transfers hot-chaining
Suppose the running chain is:

::

     Buffer 1              Buffer 2
     +---------+----+---+  +----+----+----+---+
     | d0 | .. | dN | l |  | d0 | .. | dN | f |
     +---------+----+-|-+  ^----+----+----+---+
                      |    |
                      +----+

After a call to dmaengine_submit(b3), the chain will look like:

::

     Buffer 1              Buffer 2              Buffer 3
     +---------+----+---+  +----+----+----+---+  +----+----+----+---+
     | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
     +---------+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
                      |    |                |    |
                      +----+                +----+
                                            new_link

If while new_link was created the DMA channel stopped, it is _not_
restarted. Hot-chaining doesn't break the assumption that
dma_async_issue_pending() is to be used to ensure the transfer is actually
started.

One exception to this rule :

- if Buffer1 and Buffer2 had all their addresses 8 bytes aligned

- and if Buffer3 has at least one address not 4 bytes aligned

- then hot-chaining cannot happen, as the channel must be stopped, the
  "align bit" must be set, and the channel restarted. As a consequence,
  in this specific case, if the DMA is already running in aligned mode,
  such a transfer's tx_submit() will queue it on the submitted queue
  without chaining it.
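
The hot-chaining rule and its alignment exception can be summarised by the
following userspace model. It is a sketch, not the driver's code:
``struct xfer``, ``struct chan`` and ``try_hot_chain()`` are hypothetical
names, a ``NULL`` link stands for a finisher descriptor and a non-``NULL``
link for a linker descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct xfer {
	bool aligned;       /* true if every address of the transfer is aligned */
	struct xfer *link;  /* NULL: ends with a finisher, else linker target */
};

struct chan {
	struct xfer *tail;     /* last transfer of the running chain */
	bool running;          /* physical channel currently running */
	bool misaligned_mode;  /* models the "align bit" being set */
};

/*
 * Try to append tx to a running chain. Returns true when tx was
 * hot-chained, false when it has to wait on the submitted queue.
 * The running state is never changed here: a stopped channel is only
 * started again by the issue-pending path.
 */
static bool try_hot_chain(struct chan *c, struct xfer *tx)
{
	tx->link = NULL; /* a new transfer always ends with a finisher */

	if (!c->running || !c->tail)
		return false; /* nothing running to chain onto */

	/* Unaligned transfer on a channel in aligned mode: stop, set bit, restart. */
	if (!tx->aligned && !c->misaligned_mode)
		return false;

	c->tail->link = tx; /* previous finisher becomes a linker (new_link) */
	c->tail = tx;
	return true;
}
```

With Buffer 1 linked to Buffer 2 and the channel running in aligned mode, an
aligned Buffer 3 is chained at once, whereas an unaligned one is refused and
leaves the chain untouched.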

d) Transfers completion updater
Each time a transfer is completed on a channel, an interrupt might be
generated or not, up to the client's request. But in each case, the last
descriptor of a transfer, the "status updater", will write the latest
transfer being completed into the physical channel's completion mark.

This will speed up residue calculation, for large transfers such as video
buffers which hold around 6k descriptors or more. This also allows without
any lock to find out what is the latest completed transfer in a running
DMA chain.

e) Transfers completion, irq and tasklet
When a transfer flagged as "DMA_PREP_INTERRUPT" is finished, the dma irq
is raised. Upon this interrupt, a tasklet is scheduled for the physical
channel.

The tasklet is responsible for :

- reading the physical channel last updater mark

- calling all the transfer callbacks of finished transfers, based on
  that mark, and each transfer flags.
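
The two tasklet duties listed above can be sketched in userspace as follows.
The model assumes, for illustration only, that the completion mark is a
monotonically increasing u32 cookie written by each status updater (counter
wrap-around is ignored); ``struct xfer``, ``tasklet_complete()`` and
``count_cb()`` are hypothetical names, and ``want_irq`` stands in for the
DMA_PREP_INTERRUPT flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct xfer {
	uint32_t cookie;              /* value its status updater writes */
	bool want_irq;                /* models DMA_PREP_INTERRUPT */
	bool completed;               /* already handled by the tasklet */
	void (*callback)(void *arg);  /* client completion callback */
	void *arg;
};

/* Example callback: counts how many completions were signaled. */
static void count_cb(void *arg)
{
	(*(int *)arg)++;
}

/*
 * Walk the issued transfers and complete every one whose cookie is not
 * newer than the mark read from the channel. One call may therefore
 * signal several transfers. Returns the number of callbacks invoked.
 */
static int tasklet_complete(struct xfer *issued, size_t n, uint32_t mark)
{
	int calls = 0;

	for (size_t i = 0; i < n; i++) {
		struct xfer *tx = &issued[i];

		if (tx->completed || tx->cookie > mark)
			continue; /* handled earlier, or not finished yet */

		tx->completed = true;
		if (tx->want_irq && tx->callback) {
			tx->callback(tx->arg);
			calls++;
		}
	}
	return calls;
}
```

This is how a single interrupt raised for tx1 still leads to both
tx1->complete() and tx2->complete() when tx2 has finished by the time the
tasklet reads the mark.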

If a transfer is completed while this handling is done, a dma irq will
be raised, and the tasklet will be scheduled once again, having a new
updater mark.

f) Residue
Residue granularity will be descriptor based. The issued but not completed
transfers will be scanned for all of their descriptors against the
currently running descriptor.

g) Most complicated case of driver's tx queues
The most tricky situation is when :

 - there are not "acked" transfers (tx0)

 - a driver submitted an aligned tx1, not chained

 - a driver submitted an aligned tx2 => tx2 is cold chained to tx1

 - a driver issued tx1+tx2 => channel is running in aligned mode

 - a driver submitted an aligned tx3 => tx3 is hot-chained

 - a driver submitted an unaligned tx4 => tx4 is put in submitted queue,
   not chained

 - a driver issued tx4 => tx4 is put in issued queue, not chained

 - a driver submitted an aligned tx5 => tx5 is put in submitted queue, not
   chained

 - a driver submitted an aligned tx6 => tx6 is put in submitted queue,
   cold chained to tx5

This translates into (after tx4 is issued) :

 - issued queue

::

      +-----+ +-----+ +-----+ +-----+
      | tx1 | | tx2 | | tx3 | | tx4 |
      +---|-+ ^---|-+ ^-----+ +-----+
          |   |   |   |
          +---+   +---+
        - submitted queue
      +-----+ +-----+
      | tx5 | | tx6 |
      +---|-+ ^-----+
          |   |
          +---+

- completed queue : empty

- allocated queue : tx0

It should be noted that after tx3 is completed, the channel is stopped, and
restarted in "unaligned mode" to handle tx4.

Author: Robert Jarzmik <robert.jarzmik@free.fr>