TOMOYO Linux Cross Reference
Linux/Documentation/sound/designs/compress-offload.rst

Differences between /Documentation/sound/designs/compress-offload.rst (Version linux-6.12-rc7) and /Documentation/sound/designs/compress-offload.rst (Version linux-4.12.14)


  1 =========================                           1 =========================
  2 ALSA Compress-Offload API                           2 ALSA Compress-Offload API
  3 =========================                           3 =========================
  4                                                     4 
  5 Pierre-Louis.Bossart <pierre-louis.bossart@linu      5 Pierre-Louis.Bossart <pierre-louis.bossart@linux.intel.com>
  6                                                     6 
  7 Vinod Koul <vinod.koul@linux.intel.com>              7 Vinod Koul <vinod.koul@linux.intel.com>
  8                                                     8 
  9                                                     9 
 10 Overview                                           10 Overview
 11 ========                                           11 ========
 12 Since its early days, the ALSA API was defined     12 Since its early days, the ALSA API was defined with PCM support or
 13 constant bitrates payloads such as IEC61937 in     13 constant bitrates payloads such as IEC61937 in mind. Arguments and
 14 returned values in frames are the norm, making     14 returned values in frames are the norm, making it a challenge to
 15 extend the existing API to compressed data str     15 extend the existing API to compressed data streams.
 16                                                    16 
 17 In recent years, audio digital signal processo     17 In recent years, audio digital signal processors (DSP) were integrated
 18 in system-on-chip designs, and DSPs are also i     18 in system-on-chip designs, and DSPs are also integrated in audio
 19 codecs. Processing compressed data on such DSP     19 codecs. Processing compressed data on such DSPs results in a dramatic
 20 reduction of power consumption compared to hos     20 reduction of power consumption compared to host-based
 21 processing. Support for such hardware has not      21 processing. Support for such hardware has not been very good in Linux,
 22 mostly because of a lack of a generic API avai     22 mostly because of a lack of a generic API available in the mainline
 23 kernel.                                            23 kernel.
 24                                                    24 
 25 Rather than requiring a compatibility break wi     25 Rather than requiring a compatibility break with an API change of the
 26 ALSA PCM interface, a new 'Compressed Data' AP     26 ALSA PCM interface, a new 'Compressed Data' API is introduced to
 27 provide a control and data-streaming interface     27 provide a control and data-streaming interface for audio DSPs.
 28                                                    28 
 29 The design of this API was inspired by the 2-y     29 The design of this API was inspired by the 2-year experience with the
 30 Intel Moorestown SOC, with many corrections re     30 Intel Moorestown SOC, with many corrections required to upstream the
 31 API in the mainline kernel instead of the stag     31 API in the mainline kernel instead of the staging tree and make it
 32 usable by others.                                  32 usable by others.
 33                                                    33 
 34                                                    34 
 35 Requirements                                       35 Requirements
 36 ============                                       36 ============
 37 The main requirements are:                         37 The main requirements are:
 38                                                    38 
 39 - separation between byte counts and time. Com     39 - separation between byte counts and time. Compressed formats may have
 40   a header per file, per frame, or no header a     40   a header per file, per frame, or no header at all. The payload size
 41   may vary from frame-to-frame. As a result, i     41   may vary from frame-to-frame. As a result, it is not possible to
 42   estimate reliably the duration of audio buff     42   estimate reliably the duration of audio buffers when handling
 43   compressed data. Dedicated mechanisms are re     43   compressed data. Dedicated mechanisms are required to allow for
 44   reliable audio-video synchronization, which      44   reliable audio-video synchronization, which requires precise
 45   reporting of the number of samples rendered      45   reporting of the number of samples rendered at any given time.
 46                                                    46 
 47 - Handling of multiple formats. PCM data only      47 - Handling of multiple formats. PCM data only requires a specification
 48   of the sampling rate, number of channels and     48   of the sampling rate, number of channels and bits per sample. In
 49   contrast, compressed data comes in a variety     49   contrast, compressed data comes in a variety of formats. Audio DSPs
 50   may also provide support for a limited numbe     50   may also provide support for a limited number of audio encoders and
 51   decoders embedded in firmware, or may suppor     51   decoders embedded in firmware, or may support more choices through
 52   dynamic download of libraries.                   52   dynamic download of libraries.
 53                                                    53 
 54 - Focus on main formats. This API provides sup     54 - Focus on main formats. This API provides support for the most
 55   popular formats used for audio and video cap     55   popular formats used for audio and video capture and playback. It is
 56   likely that as audio compression technology      56   likely that as audio compression technology advances, new formats
 57   will be added.                                   57   will be added.
 58                                                    58 
 59 - Handling of multiple configurations. Even fo     59 - Handling of multiple configurations. Even for a given format like
 60   AAC, some implementations may support AAC mu     60   AAC, some implementations may support AAC multichannel but HE-AAC
 61   stereo. Likewise WMA10 level M3 may require      61   stereo. Likewise WMA10 level M3 may require too much memory and cpu
 62   cycles. The new API needs to provide a gener     62   cycles. The new API needs to provide a generic way of listing these
 63   formats.                                         63   formats.
 64                                                    64 
 65 - Rendering/Grabbing only. This API does not p     65 - Rendering/Grabbing only. This API does not provide any means of
 66   hardware acceleration, where PCM samples are     66   hardware acceleration, where PCM samples are provided back to
 67   user-space for additional processing. This A     67   user-space for additional processing. This API focuses instead on
 68   streaming compressed data to a DSP, with the     68   streaming compressed data to a DSP, with the assumption that the
 69   decoded samples are routed to a physical out     69   decoded samples are routed to a physical output or logical back-end.
 70                                                    70 
 71 - Complexity hiding. Existing user-space multi     71 - Complexity hiding. Existing user-space multimedia frameworks all
 72   have existing enums/structures for each comp     72   have existing enums/structures for each compressed format. This new
 73   API assumes the existence of a platform-spec     73   API assumes the existence of a platform-specific compatibility layer
 74   to expose, translate and make use of the cap     74   to expose, translate and make use of the capabilities of the audio
 75   DSP, eg. Android HAL or PulseAudio sinks. By     75   DSP, eg. Android HAL or PulseAudio sinks. By construction, regular
 76   applications are not supposed to make use of     76   applications are not supposed to make use of this API.
 77                                                    77 
 78                                                    78 
 79 Design                                             79 Design
 80 ======                                             80 ======
 81 The new API shares a number of concepts with t     81 The new API shares a number of concepts with the PCM API for flow
 82 control. Start, pause, resume, drain and stop      82 control. Start, pause, resume, drain and stop commands have the same
 83 semantics no matter what the content is.           83 semantics no matter what the content is.
 84                                                    84 
 85 The concept of memory ring buffer divided in a     85 The concept of memory ring buffer divided in a set of fragments is
 86 borrowed from the ALSA PCM API. However, only      86 borrowed from the ALSA PCM API. However, only sizes in bytes can be
 87 specified.                                         87 specified.
 88                                                    88 
 89 Seeks/trick modes are assumed to be handled by     89 Seeks/trick modes are assumed to be handled by the host.
 90                                                    90 
 91 The notion of rewinds/forwards is not supporte     91 The notion of rewinds/forwards is not supported. Data committed to the
 92 ring buffer cannot be invalidated, except when     92 ring buffer cannot be invalidated, except when dropping all buffers.
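
The byte-only ring buffer accounting described above can be sketched as follows. This is a minimal model under stated assumptions, not the kernel implementation: `struct ring_model` and the `ring_*` helpers are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a ring buffer made of `fragments` fragments of
 * `fragment_size` bytes each. All positions are byte counts, never frames. */
struct ring_model {
    size_t fragment_size;   /* bytes per fragment */
    size_t fragments;       /* number of fragments */
    size_t app_bytes;       /* total bytes committed by the application */
    size_t dsp_bytes;       /* total bytes consumed by the DSP */
};

static size_t ring_capacity(const struct ring_model *r)
{
    return r->fragment_size * r->fragments;
}

/* Bytes that can still be written without overwriting unconsumed data. */
static size_t ring_avail(const struct ring_model *r)
{
    assert(r->app_bytes >= r->dsp_bytes);
    return ring_capacity(r) - (r->app_bytes - r->dsp_bytes);
}

/* Commit up to `count` bytes and return how many were accepted.
 * There is deliberately no rewind helper: committed data stays committed. */
static size_t ring_write(struct ring_model *r, size_t count)
{
    size_t avail = ring_avail(r);
    size_t n = count < avail ? count : avail;

    r->app_bytes += n;
    return n;
}

/* The DSP side consumes queued bytes; clamp to what is actually queued. */
static size_t ring_consume(struct ring_model *r, size_t count)
{
    size_t queued = r->app_bytes - r->dsp_bytes;
    size_t n = count < queued ? count : queued;

    r->dsp_bytes += n;
    return n;
}
```

With four 4096-byte fragments, a 20000-byte write is clamped to the 16384-byte capacity, and space only reappears once the DSP side has consumed data.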
 93                                                    93 
 94 The Compressed Data API does not make any assu     94 The Compressed Data API does not make any assumptions on how the data
 95 is transmitted to the audio DSP. DMA transfers     95 is transmitted to the audio DSP. DMA transfers from main memory to an
 96 embedded audio cluster or to a SPI interface f     96 embedded audio cluster or to a SPI interface for external DSPs are
 97 possible. As in the ALSA PCM case, a core set      97 possible. As in the ALSA PCM case, a core set of routines is exposed;
 98 each driver implementer will have to write sup     98 each driver implementer will have to write support for a set of
 99 mandatory routines and possibly make use of op     99 mandatory routines and possibly make use of optional ones.
100                                                   100 
101 The main additions are                            101 The main additions are
102                                                   102 
103 get_caps                                          103 get_caps
104   This routine returns the list of audio forma    104   This routine returns the list of audio formats supported. Querying the
105   codecs on a capture stream will return encod    105   codecs on a capture stream will return encoders, decoders will be
106   listed for playback streams.                    106   listed for playback streams.
107                                                   107 
108 get_codec_caps                                    108 get_codec_caps
109   For each codec, this routine returns a list     109   For each codec, this routine returns a list of
110   capabilities. The intent is to make sure all    110   capabilities. The intent is to make sure all the capabilities
111   correspond to valid settings, and to minimiz    111   correspond to valid settings, and to minimize the risks of
112   configuration failures. For example, for a c    112   configuration failures. For example, for a complex codec such as AAC,
113   the number of channels supported may depend     113   the number of channels supported may depend on a specific profile. If
114   the capabilities were exposed with a single     114   the capabilities were exposed with a single descriptor, it may happen
115   that a specific combination of profiles/chan    115   that a specific combination of profiles/channels/formats may not be
116   supported. Likewise, embedded DSPs have limi    116   supported. Likewise, embedded DSPs have limited memory and cpu cycles,
117   it is likely that some implementations make     117   it is likely that some implementations make the list of capabilities
118   dynamic and dependent on existing workloads.    118   dynamic and dependent on existing workloads. In addition to codec
119   settings, this routine returns the minimum b    119   settings, this routine returns the minimum buffer size handled by the
120   implementation. This information can be a fu    120   implementation. This information can be a function of the DMA buffer
121   sizes, the number of bytes required to synch    121   sizes, the number of bytes required to synchronize, etc, and can be
122   used by userspace to define how much needs t    122   used by userspace to define how much needs to be written in the ring
123   buffer before playback can start.               123   buffer before playback can start.
124                                                   124 
125 set_params                                        125 set_params
126   This routine sets the configuration chosen f    126   This routine sets the configuration chosen for a specific codec. The
127   most important field in the parameters is th    127   most important field in the parameters is the codec type; in most
128   cases decoders will ignore other fields, whi    128   cases decoders will ignore other fields, while encoders will strictly
129   comply to the settings                          129   comply to the settings
130                                                   130 
131 get_params                                        131 get_params
132   This routines returns the actual settings us    132   This routines returns the actual settings used by the DSP. Changes to
133   the settings should remain the exception.       133   the settings should remain the exception.
134                                                   134 
135 get_timestamp                                     135 get_timestamp
136   The timestamp becomes a multiple field struc    136   The timestamp becomes a multiple field structure. It lists the number
137   of bytes transferred, the number of samples     137   of bytes transferred, the number of samples processed and the number
138   of samples rendered/grabbed. All these value    138   of samples rendered/grabbed. All these values can be used to determine
139   the average bitrate, figure out if the ring     139   the average bitrate, figure out if the ring buffer needs to be
140   refilled or the delay due to decoding/encodi    140   refilled or the delay due to decoding/encoding/io on the DSP.
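
As a rough illustration of how the timestamp fields listed above could be combined, the sketch below derives an average bitrate and a DSP delay. The structure and field names are hypothetical and do not reproduce the kernel's timestamp structure. The sampling rate is assumed to be known from the configured parameters, and the bitrate is only an approximation because transferred bytes may still be buffered on the DSP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the multi-field timestamp; names are illustrative. */
struct tstamp_model {
    uint64_t bytes_transferred;  /* compressed bytes sent to the DSP */
    uint64_t samples_processed;  /* samples decoded/encoded so far */
    uint64_t samples_rendered;   /* samples actually rendered/grabbed */
    uint32_t sampling_rate;      /* Hz, assumed known from the parameters */
};

/* Approximate average bitrate in bits per second.
 * Returns 0 when nothing has been processed yet. */
static uint64_t avg_bitrate_bps(const struct tstamp_model *t)
{
    if (t->samples_processed == 0)
        return 0;
    return t->bytes_transferred * 8 * t->sampling_rate / t->samples_processed;
}

/* Delay inside the DSP in milliseconds: processed but not yet rendered. */
static uint64_t dsp_delay_ms(const struct tstamp_model *t)
{
    assert(t->sampling_rate > 0);
    assert(t->samples_processed >= t->samples_rendered);
    return (t->samples_processed - t->samples_rendered) * 1000
           / t->sampling_rate;
}
```

For example, 160000 bytes transferred for 441000 samples processed at 44100 Hz corresponds to an average of 128000 bits per second.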
141                                                   141 
142 Note that the list of codecs/profiles/modes wa    142 Note that the list of codecs/profiles/modes was derived from the
143 OpenMAX AL specification instead of reinventin    143 OpenMAX AL specification instead of reinventing the wheel.
144 Modifications include:                            144 Modifications include:
145 - Addition of FLAC and IEC formats                145 - Addition of FLAC and IEC formats
146 - Merge of encoder/decoder capabilities           146 - Merge of encoder/decoder capabilities
147 - Profiles/modes listed as bitmasks to make de    147 - Profiles/modes listed as bitmasks to make descriptors more compact
148 - Addition of set_params for decoders (missing    148 - Addition of set_params for decoders (missing in OpenMAX AL)
149 - Addition of AMR/AMR-WB encoding modes (missi    149 - Addition of AMR/AMR-WB encoding modes (missing in OpenMAX AL)
150 - Addition of format information for WMA          150 - Addition of format information for WMA
151 - Addition of encoding options when required (    151 - Addition of encoding options when required (derived from OpenMAX IL)
152 - Addition of rateControlSupported (missing in    152 - Addition of rateControlSupported (missing in OpenMAX AL)
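
The idea of several descriptors per codec, with profiles expressed as bitmasks, might look like the sketch below. The `PROFILE_*` values, `struct codec_desc_model` and `codec_supports()` are assumptions made for illustration. The real definitions live in the kernel UAPI headers and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical profile bits, chosen only for this example. */
#define PROFILE_AAC_LC  (1u << 0)
#define PROFILE_HE_AAC  (1u << 1)

/* One descriptor groups settings that are valid together. */
struct codec_desc_model {
    uint32_t max_channels;
    uint32_t profiles;      /* bitmask of PROFILE_* values */
};

/* A codec exposes several descriptors, so multichannel can be advertised
 * for one profile while another profile is limited to stereo. */
static bool codec_supports(const struct codec_desc_model *descs, size_t n,
                           uint32_t profile, uint32_t channels)
{
    assert(descs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if ((descs[i].profiles & profile) &&
            channels <= descs[i].max_channels)
            return true;
    }
    return false;
}
```

A single merged descriptor (6 channels, both profiles) would wrongly advertise 6-channel HE-AAC. Two descriptors keep the valid combinations separate.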
153                                                   153 
154 State Machine                                  << 
155 =============                                  << 
156                                                << 
157 The compressed audio stream state machine is d << 
158                                                << 
159                                         +----- << 
160                                         |      << 
161                                         |   OP << 
162                                         |      << 
163                                         +----- << 
164                                              | << 
165                                              | << 
166                                              | << 
167                                              | << 
168                                              v << 
169          compr_free()                  +------ << 
170   +------------------------------------|       << 
171   |                                    |   SET << 
172   |          +-------------------------|       << 
173   |          |       compr_write()     +------ << 
174   |          |                              ^  << 
175   |          |                              |  << 
176   |          |                              |  << 
177   |          |                              |  << 
178   |          |                              |  << 
179   |          |                         +------ << 
180   |          |                         |       << 
181   |          |                         |   DRA << 
182   |          |                         |       << 
183   |          |                         +------ << 
184   |          |                              ^  << 
185   |          |                              |  << 
186   |          |                              |  << 
187   |          |                              |  << 
188   |          v                              |  << 
189   |    +----------+                    +------ << 
190   |    |          |    compr_start()   |       << 
191   |    | PREPARE  |------------------->|  RUNN << 
192   |    |          |                    |       << 
193   |    +----------+                    +------ << 
194   |          |                            |    << 
195   |          |compr_free()                |    << 
196   |          |              compr_pause() |    << 
197   |          |                            |    << 
198   |          v                            v    << 
199   |    +----------+                   +------- << 
200   |    |          |                   |        << 
201   +--->|   FREE   |                   |  PAUSE << 
202        |          |                   |        << 
203        +----------+                   +------- << 
204                                                << 
205                                                   154 
206 Gapless Playback                                  155 Gapless Playback
207 ================                                  156 ================
208 When playing thru an album, the decoders have     157 When playing thru an album, the decoders have the ability to skip the encoder
209 delay and padding and directly move from one t    158 delay and padding and directly move from one track content to another. The end
210 user can perceive this as gapless playback as     159 user can perceive this as gapless playback as we don't have silence while
211 switching from one track to another               160 switching from one track to another
212                                                   161 
213 Also, there might be low-intensity noises due     162 Also, there might be low-intensity noises due to encoding. Perfect gapless is
214 difficult to reach with all types of compresse    163 difficult to reach with all types of compressed data, but works fine with most
215 music content. The decoder needs to know the e    164 music content. The decoder needs to know the encoder delay and encoder padding.
216 So we need to pass this to DSP. This metadata     165 So we need to pass this to DSP. This metadata is extracted from ID3/MP4 headers
217 and are not present by default in the bitstrea    166 and are not present by default in the bitstream, hence the need for a new
218 interface to pass this information to the DSP.    167 interface to pass this information to the DSP. Also DSP and userspace needs to
219 switch from one track to another and start usi    168 switch from one track to another and start using data for second track.
220                                                   169 
221 The main additions are:                           170 The main additions are:
222                                                   171 
223 set_metadata                                      172 set_metadata
224   This routine sets the encoder delay and enco    173   This routine sets the encoder delay and encoder padding. This can be used by
225   decoder to strip the silence. This needs to     174   decoder to strip the silence. This needs to be set before the data in the track
226   is written.                                     175   is written.
227                                                   176 
228 set_next_track                                    177 set_next_track
229   This routine tells DSP that metadata and wri    178   This routine tells DSP that metadata and write operation sent after this would
230   correspond to subsequent track                  179   correspond to subsequent track
231                                                   180 
232 partial drain                                     181 partial drain
233   This is called when end of file is reached.     182   This is called when end of file is reached. The userspace can inform DSP that
234   EOF is reached and now DSP can start skippin    183   EOF is reached and now DSP can start skipping padding delay. Also next write
235   data would belong to next track                 184   data would belong to next track
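
To make the role of encoder delay and padding concrete, here is a small sketch of the trimming a decoder could apply once it has received this metadata. It assumes both values are counted in samples. `struct gapless_model` and both helpers are hypothetical and say nothing about how a given DSP firmware implements the trimming.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical gapless metadata; both values are counted in samples. */
struct gapless_model {
    uint32_t encoder_delay;    /* priming samples at the start of the track */
    uint32_t encoder_padding;  /* filler samples at the end of the track */
};

/* Samples left once delay and padding are stripped from `decoded_total`
 * decoded samples. Returns 0 if the metadata exceeds the track length. */
static uint64_t gapless_valid_samples(const struct gapless_model *m,
                                      uint64_t decoded_total)
{
    uint64_t trimmed = (uint64_t)m->encoder_delay + m->encoder_padding;

    return decoded_total > trimmed ? decoded_total - trimmed : 0;
}

/* Nonzero if decoded sample `index` (0-based) should be rendered. */
static int gapless_keep_sample(const struct gapless_model *m,
                               uint64_t decoded_total, uint64_t index)
{
    uint64_t end;

    assert(index < decoded_total);
    end = decoded_total > m->encoder_padding
          ? decoded_total - m->encoder_padding : 0;
    return index >= m->encoder_delay && index < end;
}
```

With a delay of 576 samples and a padding of 1000 samples, a track that decodes to 100000 samples renders 98424 of them, starting at decoded sample 576.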
236                                                   185 
237 Sequence flow for gapless would be:               186 Sequence flow for gapless would be:
238 - Open                                            187 - Open
239 - Get caps / codec caps                           188 - Get caps / codec caps
240 - Set params                                      189 - Set params
241 - Set metadata of the first track                 190 - Set metadata of the first track
242 - Fill data of the first track                    191 - Fill data of the first track
243 - Trigger start                                   192 - Trigger start
244 - User-space finished sending all,                193 - User-space finished sending all,
245 - Indicate next track data by sending set_next    194 - Indicate next track data by sending set_next_track
246 - Set metadata of the next track                  195 - Set metadata of the next track
247 - then call partial_drain to flush most of buf    196 - then call partial_drain to flush most of buffer in DSP
248 - Fill data of the next track                     197 - Fill data of the next track
249 - DSP switches to second track                    198 - DSP switches to second track
250                                                   199 
251 (note: order for partial_drain and write for n    200 (note: order for partial_drain and write for next track can be reversed as well)
252                                                   201 
253 Gapless Playback SM                            << 
254 ===================                            << 
255                                                << 
256 For Gapless, we move from running state to par << 
257 with setting of meta_data and signalling for n << 
258                                                << 
259                                                << 
260                                         +----- << 
261                 compr_drain_notify()    |      << 
262               +------------------------>|  RUN << 
263               |                         |      << 
264               |                         +----- << 
265               |                              | << 
266               |                              | << 
267               |                              | << 
268               |                              | << 
269               |                              V << 
270               |                         +----- << 
271               |    compr_set_params()   |      << 
272               |             +-----------|NEXT_ << 
273               |             |           |      << 
274               |             |           +--+-- << 
275               |             |              | | << 
276               |             +--------------+ | << 
277               |                              | << 
278               |                              | << 
279               |                              | << 
280               |                              V << 
281               |                         +----- << 
282               |                         |      << 
283               +------------------------ | PART << 
284                                         |  DRA << 
285                                         +----- << 
286                                                   202 
287 Not supported                                     203 Not supported
288 =============                                     204 =============
289 - Support for VoIP/circuit-switched calls is n    205 - Support for VoIP/circuit-switched calls is not the target of this
290   API. Support for dynamic bit-rate changes wo    206   API. Support for dynamic bit-rate changes would require a tight
291   coupling between the DSP and the host stack,    207   coupling between the DSP and the host stack, limiting power savings.
292                                                   208 
293 - Packet-loss concealment is not supported. Th    209 - Packet-loss concealment is not supported. This would require an
294   additional interface to let the decoder synt    210   additional interface to let the decoder synthesize data when frames
295   are lost during transmission. This may be ad    211   are lost during transmission. This may be added in the future.
296                                                   212 
297 - Volume control/routing is not handled by thi    213 - Volume control/routing is not handled by this API. Devices exposing a
298   compressed data interface will be considered    214   compressed data interface will be considered as regular ALSA devices;
299   volume changes and routing information will     215   volume changes and routing information will be provided with regular
300   ALSA kcontrols.                                 216   ALSA kcontrols.
301                                                   217 
302 - Embedded audio effects. Such effects should     218 - Embedded audio effects. Such effects should be enabled in the same
303   manner, no matter if the input was PCM or co    219   manner, no matter if the input was PCM or compressed.
304                                                   220 
305 - multichannel IEC encoding. Unclear if this i    221 - multichannel IEC encoding. Unclear if this is required.
306                                                   222 
307 - Encoding/decoding acceleration is not suppor    223 - Encoding/decoding acceleration is not supported as mentioned
308   above. It is possible to route the output of    224   above. It is possible to route the output of a decoder to a capture
309   stream, or even implement transcoding capabi    225   stream, or even implement transcoding capabilities. This routing
310   would be enabled with ALSA kcontrols.           226   would be enabled with ALSA kcontrols.
311                                                   227 
312 - Audio policy/resource management. This API d    228 - Audio policy/resource management. This API does not provide any
313   hooks to query the utilization of the audio     229   hooks to query the utilization of the audio DSP, nor any preemption
314   mechanisms.                                     230   mechanisms.
315                                                   231 
316 - No notion of underrun/overrun. Since the byt    232 - No notion of underrun/overrun. Since the bytes written are compressed
317   in nature and data written/read doesn't tran    233   in nature and data written/read doesn't translate directly to
318   rendered output in time, this does not deal     234   rendered output in time, this does not deal with underrun/overrun and
319   maybe dealt in user-library                     235   maybe dealt in user-library
320                                                   236 
321                                                   237 
322 Credits                                           238 Credits
323 =======                                           239 =======
324 - Mark Brown and Liam Girdwood for discussions    240 - Mark Brown and Liam Girdwood for discussions on the need for this API
325 - Harsha Priya for her work on intel_sst compr    241 - Harsha Priya for her work on intel_sst compressed API
326 - Rakesh Ughreja for valuable feedback            242 - Rakesh Ughreja for valuable feedback
327 - Sing Nallasellan, Sikkandar Madar and Prasan    243 - Sing Nallasellan, Sikkandar Madar and Prasanna Samaga for
328   demonstrating and quantifying the benefits o    244   demonstrating and quantifying the benefits of audio offload on a
329   real platform.                                  245   real platform.
                                                      
