========================
MMC Asynchronous Request
========================

Rationale
=========

How significant is the cache maintenance overhead?

It depends. Fast eMMC and multiple cache levels with speculative cache
pre-fetch make the cache overhead relatively significant. If the DMA
preparations for the next request are done in parallel with the current
transfer, the DMA preparation overhead does not affect the MMC performance.

The intention of non-blocking (asynchronous) MMC requests is to minimize the
time between when one MMC request ends and the next one begins.

Using mmc_wait_for_req(), the MMC controller is idle while dma_map_sg() and
dma_unmap_sg() run. Using non-blocking MMC requests makes it possible to
prepare the caches for the next job in parallel with an active MMC request.
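
For illustration, a minimal sketch of the blocking flow. Only
mmc_wait_for_req() is the real core API here; the prepare_dma() and
finish_dma() helpers are hypothetical stand-ins for the mapping work a host
driver normally does around its request handler::

  /*
   * Blocking issue loop: mapping, transfer and unmapping are fully
   * serialized, so the controller is idle while dma_map_sg() and
   * dma_unmap_sg() (and the cache maintenance they imply) run.
   */
  static void issue_requests_blocking(struct mmc_host *host,
                                      struct mmc_request *mrqs, int nr)
  {
          int i;

          for (i = 0; i < nr; i++) {
                  prepare_dma(&mrqs[i]);            /* dma_map_sg() + descriptor setup */
                  mmc_wait_for_req(host, &mrqs[i]); /* blocks until the transfer is done */
                  finish_dma(&mrqs[i]);             /* dma_unmap_sg() */
          }
  }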

MMC block driver
================

The mmc_blk_issue_rw_rq() in the MMC block driver is made non-blocking.

The increase in throughput is proportional to the time it takes to prepare
a request (the major part of the preparation is dma_map_sg() and
dma_unmap_sg()) and to how fast the memory is. The faster the MMC/SD is,
the more significant the request preparation time becomes. The expected
performance gain is roughly 5% for large writes and 10% for large reads on
a platform with an L2 cache. In power-save mode, when the clocks run at a
lower frequency, the DMA preparation may cost even more. As long as these
slower preparations run in parallel with the transfer, performance won't be
affected.
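
As a purely illustrative calculation (the numbers below are made up, not
measurements): if a large read keeps the bus busy for 10 ms and
dma_map_sg()/dma_unmap_sg() plus descriptor setup take about 1 ms, issuing
requests strictly back to back costs roughly 11 ms per request, whereas
preparing the next request during the current transfer hides that 1 ms and
brings the cost back to about 10 ms -- in the region of the 10% gain quoted
above.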

Details on measurements from IOZone and mmc_test
================================================

https://wiki.linaro.org/WorkingGroups/Kernel/Specs/StoragePerfMMC-async-req

MMC core API extension
======================

There is one new public function, mmc_start_req().

It starts a new MMC command request for a host. The function isn't
truly non-blocking: if there is an ongoing async request, it waits for
that request to complete, starts the new one and returns; it doesn't
wait for the new request to complete. If there is no ongoing request,
it starts the new request and returns immediately.
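
As an illustration, a sketch of the intended usage: the caller keeps issuing
requests and lets mmc_start_req() overlap the preparation of request N+1 with
the transfer of request N. The build_async_req() and complete_async_req()
helpers are hypothetical, and the exact mmc_start_req() signature has varied
between kernel versions (newer kernels report status through an
enum mmc_blk_status pointer), so treat this as pseudocode rather than a
verbatim API reference::

  static void issue_requests_async(struct mmc_host *host, int nr)
  {
          struct mmc_async_req *next, *done;
          int i, err;

          for (i = 0; i < nr; i++) {
                  /* fill in the mmc_request/mmc_async_req for slot i (hypothetical) */
                  next = build_async_req(i);
                  /*
                   * The core prepares 'next' (pre_req(), dma_map_sg(), descriptors),
                   * waits only for the previously started request, if any, and then
                   * starts 'next' without waiting for it to complete.
                   */
                  done = mmc_start_req(host, next, &err);
                  if (done)
                          complete_async_req(done, err);
          }

          /* flush the last in-flight request: start nothing new, just wait */
          done = mmc_start_req(host, NULL, &err);
          if (done)
                  complete_async_req(done, err);
  }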

MMC host extensions
===================

There are two optional members in mmc_host_ops -- pre_req() and post_req() --
that the host driver may implement in order to move work to before and after
the point where the actual mmc_host_ops.request() function is called.

In the DMA case, pre_req() may do dma_map_sg() and prepare the DMA
descriptor, and post_req() runs dma_unmap_sg().
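
A minimal sketch of such hooks for a DMA-capable host driver is shown below.
The struct my_host type, the my_request() handler and the use of host_cookie
as a "was mapped in pre_req()" marker are illustrative only; the pre_req()
prototype with an is_first_req argument reflects the interface as described
here (later kernels dropped that argument)::

  static void my_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
                         bool is_first_req)
  {
          struct my_host *host = mmc_priv(mmc);   /* driver-private data (illustrative) */
          struct mmc_data *data = mrq->data;
          enum dma_data_direction dir;

          if (!data)
                  return;

          dir = (data->flags & MMC_DATA_WRITE) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;

          /* map the scatterlist now, while the previous transfer still runs */
          if (dma_map_sg(host->dev, data->sg, data->sg_len, dir) > 0)
                  data->host_cookie = 1;  /* remember that the mapping was done here */
  }

  static void my_post_req(struct mmc_host *mmc, struct mmc_request *mrq, int err)
  {
          struct my_host *host = mmc_priv(mmc);
          struct mmc_data *data = mrq->data;
          enum dma_data_direction dir;

          if (!data || !data->host_cookie)
                  return;

          dir = (data->flags & MMC_DATA_WRITE) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
          dma_unmap_sg(host->dev, data->sg, data->sg_len, dir);
          data->host_cookie = 0;
  }

  static const struct mmc_host_ops my_ops = {
          .request  = my_request,   /* the driver's existing request handler */
          .pre_req  = my_pre_req,
          .post_req = my_post_req,
  };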

Optimize for the first request
==============================

The first request in a series of requests can't be prepared in parallel
with the previous transfer, since there is no previous request.

The argument is_first_req in pre_req() indicates that there is no previous
request. The host driver may optimize for this scenario to minimize
the performance loss. One way to do so is to split the current request
into two chunks, prepare the first chunk and start the request, and
finally prepare the second chunk and start the transfer.

Pseudocode to handle the is_first_req scenario with minimal prepare overhead::

  if (is_first_req && req->size > threshold) {
          /* start the MMC transfer for the complete transfer size */
          mmc_start_command(MMC_CMD_TRANSFER_FULL_SIZE);

          /*
           * Begin to prepare DMA while the command is being processed
           * by the MMC controller. The first chunk of the request should
           * take the same time to prepare as the "MMC process command
           * time". If the prepare time exceeds the MMC command time the
           * transfer is delayed; as a guesstimate, use a maximum of 4k
           * as the first chunk size.
           */
          prepare_1st_chunk_for_dma(req);
          /* flush the pending descriptor to the DMAC (dmaengine.h) */
          dma_issue_pending(req->dma_desc);

          prepare_2nd_chunk_for_dma(req);
          /*
           * The second issue_pending should be called before the MMC
           * controller runs out of the first chunk. If it runs out of
           * the first data chunk before this call, the transfer is
           * delayed.
           */
          dma_issue_pending(req->dma_desc);
  }
