========================
SoundWire Error Handling
========================

The SoundWire PHY was designed with care: errors on the bus should be very
unlikely and, if they do happen, should be limited to single-bit errors.
Examples of this design can be found in the synchronization mechanism (sync
loss after two errors) and the short CRCs used for the Bulk Register Access.

The errors can be detected with multiple mechanisms:

1. Bus clash or parity errors: This mechanism relies on low-level detectors
   that are independent of the payload and usages, and they cover both control
   and audio data. The current implementation only logs such errors.
   Improvements could be invalidating an entire programming sequence and
   restarting from a known position. When such errors occur outside of a
   control/command sequence, the SoundWire protocol enables no concealment or
   recovery for audio data. The location of the error also impacts its
   audibility (most-significant bits will be more impacted in PCM), and after
   a number of such errors are detected the bus might be reset. Note that bus
   clashes due to programming errors (two streams using the same bit slots)
   cannot be distinguished from bus clashes due to electrical issues during
   the transmit/receive transition, although a recurring bus clash when audio
   is enabled is an indication of a bus allocation issue. The interrupt
   mechanism can also help identify Slaves which detected a Bus Clash or a
   Parity Error, but they may not be responsible for the errors, so resetting
   them individually is not a viable recovery strategy.

2. Command status: Each command is associated with a status, which only
   covers transmission of the data between devices. The ACK status indicates
   that the command was received and will be executed by the end of the
   current frame. A NAK indicates that the command was in error and will not
   be applied. In case of bad programming (command sent to a non-existent
   Slave or to a non-implemented register) or an electrical issue, the absence
   of a response signals that the command was ignored. Some Master
   implementations allow for a command to be retransmitted several times. If
   the retransmission fails, backtracking and restarting the entire
   programming sequence might be a solution. Alternatively, some
   implementations might directly issue a bus reset and re-enumerate all
   devices.

3. Timeouts: In a number of cases such as ChannelPrepare or
   ClockStopPrepare, the bus driver is supposed to poll a register field until
   it transitions to a NotFinished value of zero. The MIPI SoundWire spec 1.1
   does not define timeouts, but the MIPI SoundWire DisCo document adds
   recommendations on timeouts. If such configurations do not complete, the
   driver will return -ETIMEDOUT. Such timeouts are symptoms of a faulty
   Slave device and are likely impossible to recover from.

Errors during global reconfiguration sequences are extremely difficult to
handle:

1. BankSwitch: An error during the last command issuing a BankSwitch is
   difficult to backtrack from. Retransmitting the BankSwitch command may be
   possible in a single-segment setup, but this can lead to synchronization
   problems when enabling multiple bus segments (a command with side effects
   such as frame reconfiguration would be handled at different times). A
   global hard-reset might be the best solution.

Note that SoundWire does not provide a mechanism to detect illegal values
written in valid registers. In a number of cases the standard even mentions
that the Slave might behave in implementation-defined ways. The bus
implementation does not provide a recovery mechanism for such errors. Slave
or Master driver implementers are responsible for writing valid values in
valid registers and for implementing additional range checking if needed.
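
The command status and timeout handling described above can be illustrated
with a minimal, standalone C sketch. It is not the kernel implementation:
every ``demo_`` name is hypothetical, the transfer and register-read hooks
stand in for real bus accesses, and the mapping of a final NAK to -EIO and of
an ignored command to -ENODATA is an assumption made for the example. The
polling helper counts polls rather than measuring time so that its behaviour
is deterministic.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command outcomes mirroring the three cases in the text. */
enum demo_cmd_status {
	DEMO_CMD_ACK,     /* received, executed by the end of the frame */
	DEMO_CMD_NAK,     /* command was in error and is not applied */
	DEMO_CMD_IGNORED, /* no response: the command was ignored */
};

/* Hypothetical hook that sends one command and reports its status. */
typedef enum demo_cmd_status (*demo_xfer_fn)(void *ctx);

/* Hypothetical hook that reads a NotFinished field (zero means done). */
typedef int (*demo_read_fn)(void *ctx);

/*
 * Send a command and retransmit it up to max_retries more times.
 * Returns 0 on ACK. If every attempt fails, returns -EIO when the last
 * status was NAK and -ENODATA when the last command was ignored
 * (an assumed mapping); the caller then decides whether to restart
 * the programming sequence or reset the bus.
 */
static int demo_send_with_retry(demo_xfer_fn xfer, void *ctx,
				unsigned int max_retries)
{
	enum demo_cmd_status st = DEMO_CMD_IGNORED;
	unsigned int attempt;

	for (attempt = 0; attempt <= max_retries; attempt++) {
		st = xfer(ctx);
		if (st == DEMO_CMD_ACK)
			return 0;
	}
	return st == DEMO_CMD_NAK ? -EIO : -ENODATA;
}

/*
 * Poll a NotFinished field until it reads zero, giving up after
 * max_polls reads and returning -ETIMEDOUT.
 */
static int demo_poll_not_finished(demo_read_fn read_field, void *ctx,
				  unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (read_field(ctx) == 0)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Scripted fake bus: replays a fixed list of statuses, then goes silent. */
struct demo_script {
	const enum demo_cmd_status *replies;
	size_t len;
	size_t pos;
};

static enum demo_cmd_status demo_script_xfer(void *ctx)
{
	struct demo_script *s = ctx;

	if (s->pos >= s->len)
		return DEMO_CMD_IGNORED;
	return s->replies[s->pos++];
}

/* Fake Slave that reports NotFinished for a fixed number of reads. */
struct demo_busy {
	unsigned int polls_left;
};

static int demo_busy_read(void *ctx)
{
	struct demo_busy *b = ctx;

	if (b->polls_left == 0)
		return 0;
	b->polls_left--;
	return 1;
}
```

With a scripted reply sequence of NAK, no response, ACK, a call to
``demo_send_with_retry()`` with two retries succeeds on the third attempt,
while the same sequence with one retry fails with the ignored-command error.
A real driver would bound the polling loop by elapsed time, using the
timeouts recommended by the DisCo document, instead of a poll count.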