.. SPDX-License-Identifier: GPL-2.0

=======
SCSI EH
=======

This document describes SCSI midlayer error handling infrastructure.
Please refer to Documentation/scsi/scsi_mid_low_api.rst for more
information regarding SCSI midlayer.

.. TABLE OF CONTENTS

   [1] How SCSI commands travel through the midlayer and to EH
       [1-1] struct scsi_cmnd
       [1-2] How do scmd's get completed?
           [1-2-1] Completing a scmd w/ scsi_done
           [1-2-2] Completing a scmd w/ timeout
       [1-3] Asynchronous command aborts
       [1-4] How EH takes over
   [2] How SCSI EH works
       [2-1] EH through fine-grained callbacks
           [2-1-1] Overview
           [2-1-2] Flow of scmds through EH
           [2-1-3] Flow of control
       [2-2] EH through transportt->eh_strategy_handler()
           [2-2-1] Pre transportt->eh_strategy_handler() SCSI midlayer conditions
           [2-2-2] Post transportt->eh_strategy_handler() SCSI midlayer conditions
           [2-2-3] Things to consider


1. How SCSI commands travel through the midlayer and to EH
==========================================================

1.1 struct scsi_cmnd
--------------------

Each SCSI command is represented with struct scsi_cmnd (== scmd).  A
scmd has two list_head's to link itself into lists.  The two are
scmd->list and scmd->eh_entry.  The former is used for the free list or
the per-device allocated scmd list and is not of much interest to this
EH discussion.  The latter is used for completion and EH lists and,
unless otherwise stated, scmds are always linked using scmd->eh_entry
in this discussion.


1.2 How do scmd's get completed?
--------------------------------

Once LLDD gets hold of a scmd, either the LLDD will complete the
command by calling the scsi_done callback passed from midlayer when
invoking hostt->queuecommand() or the block layer will time it out.


1.2.1 Completing a scmd w/ scsi_done
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For all non-EH commands, scsi_done() is the completion callback.  It
just calls blk_complete_request() to delete the block layer timer and
raise SCSI_SOFTIRQ.

The SCSI_SOFTIRQ handler, scsi_softirq, calls scsi_decide_disposition(),
which looks at the scmd->result value and sense data to determine what
to do with the command.

 - SUCCESS

        scsi_finish_command() is invoked for the command.  The
        function does some maintenance chores and then calls
        scsi_io_completion() to finish the I/O.
        scsi_io_completion() then notifies the block layer on
        the completed request by calling blk_end_request and
        friends or figures out what to do with the remainder
        of the data in case of an error.

 - NEEDS_RETRY

 - ADD_TO_MLQUEUE

        scmd is requeued to blk queue.

 - otherwise

        scsi_eh_scmd_add(scmd) is invoked for the command.  See
        [1-4] for details of this function.


1.2.2 Completing a scmd w/ timeout
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The timeout handler is scsi_times_out().  When a timeout occurs, this
function

 1. invokes optional hostt->eh_timed_out() callback.  Return value can
    be one of

    - BLK_EH_RESET_TIMER
        This indicates that more time is required to finish the
        command.  Timer is restarted.  This action is counted as a
        retry and only allowed scmd->allowed + 1(!) times.  Once the
        limit is reached, action for BLK_EH_DONE is taken instead.

    - BLK_EH_DONE
        eh_timed_out() callback did not handle the command.
        Step #2 is taken.

 2. scsi_abort_command() is invoked to schedule an asynchronous abort.
    Asynchronous aborts are not invoked for commands for which the
    SCSI_EH_ABORT_SCHEDULED flag is set (this indicates that the command
    had already been aborted once, and this is a retry which failed),
    or when the EH deadline is expired.  In these cases Step #3 is taken.

 3. scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD) is invoked for the
    command.  See [1-4] for more information.

1.3 Asynchronous command aborts
-------------------------------

 After a timeout occurs a command abort is scheduled from
 scsi_abort_command().  If the abort is successful the command
 will either be retried (if the number of retries is not exhausted)
 or terminated with DID_TIME_OUT.

 Otherwise scsi_eh_scmd_add() is invoked for the command.
 See [1-4] for more information.

1.4 How EH takes over
---------------------

scmds enter EH via scsi_eh_scmd_add(), which does the following.

 1. Links scmd->eh_entry to shost->eh_cmd_q

 2. Sets SHOST_RECOVERY bit in shost->shost_state

 3. Increments shost->host_failed

 4. Wakes up SCSI EH thread if shost->host_busy == shost->host_failed

As can be seen above, once any scmd is added to shost->eh_cmd_q,
SHOST_RECOVERY shost_state bit is turned on.  This prevents any new
scmd from being issued from blk queue to the host; eventually, all
scmds on the host either complete normally, fail and get added to
eh_cmd_q, or time out and get added to shost->eh_cmd_q.

If all scmds either complete or fail, the number of in-flight scmds
becomes equal to the number of failed scmds - i.e. shost->host_busy ==
shost->host_failed.  This wakes up SCSI EH thread.  So, once woken up,
SCSI EH thread can expect that all in-flight commands have failed and
are linked on shost->eh_cmd_q.

Note that this does not mean lower layers are quiescent.  If a LLDD
completed a scmd with error status, the LLDD and lower layers are
assumed to forget about the scmd at that point.  However, if a scmd
has timed out, unless hostt->eh_timed_out() made lower layers forget
about the scmd, which currently no LLDD does, the command is still
active as far as lower layers are concerned and completion could
occur at any time.  Of course, all such completions are ignored as the
timer has already expired.

We'll talk about how SCSI EH takes actions to abort - make LLDD
forget about - timed out scmds later.


2. How SCSI EH works
====================

LLDD's can implement SCSI EH actions in one of the following two
ways.

 - Fine-grained EH callbacks
        LLDD can implement fine-grained EH callbacks and let SCSI
        midlayer drive error handling and call appropriate callbacks.
        This will be discussed further in [2-1].

 - eh_strategy_handler() callback
        This is one big callback which should perform whole error
        handling.  As such, it should do all chores the SCSI midlayer
        performs during recovery.  This will be discussed in [2-2].

Once recovery is complete, SCSI EH resumes normal operation by
calling scsi_restart_operations(), which

 1. Checks if door locking is needed and locks door.

 2. Clears SHOST_RECOVERY shost_state bit

 3. Wakes up waiters on shost->host_wait.  This occurs if someone
    calls scsi_block_when_processing_errors() on the host.
    (*QUESTION* why is it needed?  All operations will be blocked
    anyway after it reaches blk queue.)

 4. Kicks the queues of all devices on the host.


2.1 EH through fine-grained callbacks
-------------------------------------

2.1.1 Overview
^^^^^^^^^^^^^^

If eh_strategy_handler() is not present, SCSI midlayer takes charge
of driving error handling.  EH has two goals: make LLDD, host and
device forget about timed out scmds, and make them ready for new
commands.  A scmd is said to be recovered if the scmd is forgotten by
lower layers and lower layers are ready to process or fail the scmd
again.

To achieve these goals, EH performs recovery actions with increasing
severity.  Some actions are performed by issuing SCSI commands and
others are performed by invoking one of the following fine-grained
hostt EH callbacks.  Callbacks may be omitted and omitted ones are
considered to fail always.

::

    int (* eh_abort_handler)(struct scsi_cmnd *);
    int (* eh_device_reset_handler)(struct scsi_cmnd *);
    int (* eh_bus_reset_handler)(struct scsi_cmnd *);
    int (* eh_host_reset_handler)(struct scsi_cmnd *);

Higher-severity actions are taken only when lower-severity actions
cannot recover some of the failed scmds.  Also, note that failure of
the highest-severity action means EH failure and results in offlining
of all unrecovered devices.

During recovery, the following rules are followed

 - Recovery actions are performed on failed scmds on the to do list,
   eh_work_q.  If a recovery action succeeds for a scmd, recovered
   scmds are removed from eh_work_q.

   Note that a single recovery action on a scmd can recover multiple
   scmds.  e.g. resetting a device recovers all failed scmds on the
   device.

 - Higher severity actions are taken iff eh_work_q is not empty after
   lower severity actions are complete.

 - EH reuses failed scmds to issue commands for recovery.  For
   timed-out scmds, SCSI EH ensures that LLDD forgets about a scmd
   before reusing it for EH commands.

When a scmd is recovered, the scmd is moved from eh_work_q to EH
local eh_done_q using scsi_eh_finish_cmd().  After all scmds are
recovered (eh_work_q is empty), scsi_eh_flush_done_q() is invoked to
either retry or error-finish (notify upper layer of failure) recovered
scmds.

A scmd is retried iff its sdev is still online (not offlined during
EH), REQ_FAILFAST is not set and ++scmd->retries is less than
scmd->allowed.


2.1.2 Flow of scmds through EH
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 1. Error completion / time out

    :ACTION: scsi_eh_scmd_add() is invoked for scmd

        - add scmd to shost->eh_cmd_q
        - set SHOST_RECOVERY
        - shost->host_failed++

    :LOCKING: shost->host_lock

 2. EH starts

    :ACTION: move all scmds to EH's local eh_work_q.  shost->eh_cmd_q
             is cleared.

    :LOCKING: shost->host_lock (not strictly necessary, just for
              consistency)

 3. scmd recovered

    :ACTION: scsi_eh_finish_cmd() is invoked to EH-finish scmd

        - scsi_setup_cmd_retry()
        - move from local eh_work_q to local eh_done_q

    :LOCKING: none

    :CONCURRENCY: at most one thread per separate eh_work_q to
                  keep queue manipulation lockless

 4. EH completes

    :ACTION: scsi_eh_flush_done_q() retries scmds or notifies upper
             layer of failure.  May be called concurrently but must
             have no more than one thread per separate eh_work_q to
             manipulate the queue locklessly

             - scmd is removed from eh_done_q and scmd->eh_entry is cleared
             - if retry is necessary, scmd is requeued using
               scsi_queue_insert()
             - otherwise, scsi_finish_command() is invoked for scmd
             - zero shost->host_failed

    :LOCKING: queue or finish function performs appropriate locking


2.1.3 Flow of control
^^^^^^^^^^^^^^^^^^^^^

EH through fine-grained callbacks starts from scsi_unjam_host().

``scsi_unjam_host``

    1. Lock shost->host_lock, splice_init shost->eh_cmd_q into local
       eh_work_q and unlock host_lock.  Note that shost->eh_cmd_q is
       cleared by this action.

    2. Invoke scsi_eh_get_sense.

    ``scsi_eh_get_sense``

        This action is taken for each error-completed
        (!SCSI_EH_CANCEL_CMD) command without valid sense data.  Most
        SCSI transports/LLDDs automatically acquire sense data on
        command failures (autosense).  Autosense is recommended for
        performance reasons and as sense information could get out of
        sync between occurrence of CHECK CONDITION and this action.

        Note that if autosense is not supported, scmd->sense_buffer
        contains invalid sense data when error-completing the scmd
        with scsi_done().  scsi_decide_disposition() always returns
        FAILED in such cases thus invoking SCSI EH.  When the scmd
        reaches here, sense data is acquired and
        scsi_decide_disposition() is called again.

        1. Invoke scsi_request_sense() which issues REQUEST_SENSE
           command.  If it fails, no action.  Note that taking no
           action causes higher-severity recovery to be taken for the
           scmd.

        2. Invoke scsi_decide_disposition() on the scmd

           - SUCCESS
                scmd->retries is set to scmd->allowed preventing
                scsi_eh_flush_done_q() from retrying the scmd and
                scsi_eh_finish_cmd() is invoked.

           - NEEDS_RETRY
                scsi_eh_finish_cmd() invoked

           - otherwise
                No action.

    3. If !list_empty(&eh_work_q), invoke scsi_eh_abort_cmds().

    ``scsi_eh_abort_cmds``

        This action is taken for each timed out command when
        no_async_abort is enabled in the host template.
        hostt->eh_abort_handler() is invoked for each scmd.  The
        handler returns SUCCESS if it has succeeded in making LLDD and
        all related hardware forget about the scmd.

        If a timed out scmd is successfully aborted and the sdev is
        either offline or ready, scsi_eh_finish_cmd() is invoked for
        the scmd.  Otherwise, the scmd is left in eh_work_q for
        higher-severity actions.

        Note that both offline and ready status mean that the sdev is
        ready to process new scmds, where processing also implies
        immediate failing; thus, if a sdev is in one of the two
        states, no further recovery action is needed.

        Device readiness is tested using scsi_eh_tur() which issues
        TEST_UNIT_READY command.  Note that the scmd must have been
        aborted successfully before reusing it for TEST_UNIT_READY.

    4. If !list_empty(&eh_work_q), invoke scsi_eh_ready_devs()

    ``scsi_eh_ready_devs``

        This function takes four increasingly more severe measures to
        make failed sdevs ready for new commands.

        1. Invoke scsi_eh_stu()

        ``scsi_eh_stu``

            For each sdev which has failed scmds with valid sense data
            of which scsi_check_sense()'s verdict is FAILED,
            START_STOP_UNIT command is issued w/ start=1.  Note that
            as we explicitly choose error-completed scmds, it is known
            that lower layers have forgotten about the scmd and we can
            reuse it for STU.

            If STU succeeds and the sdev is either offline or ready,
            all failed scmds on the sdev are EH-finished with
            scsi_eh_finish_cmd().

            *NOTE* If hostt->eh_abort_handler() isn't implemented or
            failed, we may still have timed out scmds at this point
            and STU doesn't make lower layers forget about those
            scmds.  Yet, this function EH-finishes all scmds on the
            sdev if STU succeeds, leaving lower layers in an
            inconsistent state.  It seems that STU action should be
            taken only when a sdev has no timed out scmd.

        2. If !list_empty(&eh_work_q), invoke scsi_eh_bus_device_reset().

        ``scsi_eh_bus_device_reset``

            This action is very similar to scsi_eh_stu() except that,
            instead of issuing STU, hostt->eh_device_reset_handler()
            is used.  Also, as we're not issuing SCSI commands and
            resetting clears all scmds on the sdev, there is no need
            to choose error-completed scmds.

        3. If !list_empty(&eh_work_q), invoke scsi_eh_bus_reset()

        ``scsi_eh_bus_reset``

            hostt->eh_bus_reset_handler() is invoked for each channel
            with failed scmds.  If bus reset succeeds, all failed
            scmds on all ready or offline sdevs on the channel are
            EH-finished.

        4. If !list_empty(&eh_work_q), invoke scsi_eh_host_reset()

        ``scsi_eh_host_reset``

            This is the last resort.  hostt->eh_host_reset_handler()
            is invoked.  If host reset succeeds, all failed scmds on
            all ready or offline sdevs on the host are EH-finished.

        5. If !list_empty(&eh_work_q), invoke scsi_eh_offline_sdevs()

        ``scsi_eh_offline_sdevs``

            Take all sdevs which still have unrecovered scmds offline
            and EH-finish the scmds.

    5. Invoke scsi_eh_flush_done_q().

    ``scsi_eh_flush_done_q``

        At this point all scmds are recovered (or given up) and
        put on eh_done_q by scsi_eh_finish_cmd().  This function
        flushes eh_done_q by either retrying or notifying upper
        layer of failure of the scmds.


2.2 EH through transportt->eh_strategy_handler()
------------------------------------------------

transportt->eh_strategy_handler() is invoked in the place of
scsi_unjam_host() and it is responsible for the whole recovery process.
On completion, the handler should have made lower layers forget about
all failed scmds and either ready for new commands or offline.  Also,
it should perform SCSI EH maintenance chores to maintain integrity of
SCSI midlayer.  IOW, of the steps described in [2-1-2], all steps
except for #1 must be implemented by eh_strategy_handler().
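
As a rough illustration of that division of labour, the following is a
user-space toy model, not kernel code.  Every ``toy_`` name, structure
and field is hypothetical and only stands in for the midlayer object
it is named after; the "abort" is a counter rather than a real
hostt->eh_abort_handler() call.  It assumes the retry rule from
[2-1-1] (``++retries < allowed``) and ignores sdev state and
REQ_FAILFAST.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are kernel types. */
struct toy_scmd {
    struct toy_scmd *next;   /* plays the role of scmd->eh_entry */
    int timed_out;           /* 1 if lower layers may still own the command */
    int retries, allowed;
    int requeued, finished;  /* outcome recorded by the toy handler */
};

struct toy_host {
    struct toy_scmd *eh_cmd_q;  /* failed commands, linked via ->next */
    int host_failed;
    int host_busy;
    int aborts_issued;          /* counts toy "abort" operations */
};

/* Step #1 of [2-1-2]: done by the midlayer before the handler runs. */
void toy_eh_scmd_add(struct toy_host *h, struct toy_scmd *c)
{
    c->next = h->eh_cmd_q;
    h->eh_cmd_q = c;
    h->host_failed++;
}

/* Steps #2 to #4 of [2-1-2]: what the handler itself has to cover. */
void toy_strategy_handler(struct toy_host *h)
{
    struct toy_scmd *work;

    /* Entry condition: every in-flight command has failed. */
    assert(h->host_failed == h->host_busy);

    /* Step #2: take all failed commands off the host queue. */
    work = h->eh_cmd_q;
    h->eh_cmd_q = NULL;

    while (work) {
        struct toy_scmd *c = work;

        work = c->next;
        c->next = NULL;          /* the eh_entry link is cleared */

        /* Step #3: make lower layers forget timed out commands first. */
        if (c->timed_out) {
            h->aborts_issued++;
            c->timed_out = 0;
        }

        /* Step #4: retry or report failure to the upper layer. */
        if (++c->retries < c->allowed)
            c->requeued = 1;
        else
            c->finished = 1;
    }

    h->host_failed = 0;
}
```

In this model a command with ``allowed = 3`` is requeued after its
first failure, while one with ``allowed = 1`` is finished with an
error; either way the host queue ends up empty and host_failed is
zero.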


2.2.1 Pre transportt->eh_strategy_handler() SCSI midlayer conditions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The following conditions are true on entry to the handler.

 - Each failed scmd's eh_flags field is set appropriately.

 - Each failed scmd is linked on shost->eh_cmd_q by scmd->eh_entry.

 - SHOST_RECOVERY is set.

 - shost->host_failed == shost->host_busy


2.2.2 Post transportt->eh_strategy_handler() SCSI midlayer conditions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The following conditions must be true on exit from the handler.

 - shost->host_failed is zero.

 - Each scmd is in such a state that scsi_setup_cmd_retry() on the
   scmd doesn't make any difference.

 - shost->eh_cmd_q is cleared.

 - Each scmd->eh_entry is cleared.

 - Either scsi_queue_insert() or scsi_finish_command() is called on
   each scmd.  Note that the handler is free to use scmd->retries and
   ->allowed to limit the number of retries.
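
The exit conditions lend themselves to a simple self-check while
developing a handler.  The sketch below is a user-space model under
stated assumptions: the ``toy_`` structures, fields and flag names are
invented for illustration and do not exist in the kernel, and the
scsi_setup_cmd_retry() condition is not modeled at all.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the state named in the list above. */
struct toy_cmd {
    int on_eh_list;   /* nonzero while scmd->eh_entry is still linked */
    int requeued;     /* a scsi_queue_insert() equivalent was called */
    int finished;     /* a scsi_finish_command() equivalent was called */
};

struct toy_shost {
    int host_failed;
    int eh_cmd_q_len; /* entries still left on shost->eh_cmd_q */
};

enum {
    TOY_BAD_HOST_FAILED = 1 << 0,  /* host_failed is not zero */
    TOY_BAD_EH_CMD_Q    = 1 << 1,  /* eh_cmd_q is not empty */
    TOY_BAD_EH_ENTRY    = 1 << 2,  /* some eh_entry is still linked */
    TOY_BAD_DISPOSITION = 1 << 3,  /* a scmd was neither (or both) requeued/finished */
};

/* Returns 0 when every modeled exit condition holds, else a bitmask. */
unsigned int toy_check_exit(const struct toy_shost *h,
                            const struct toy_cmd *cmds, size_t n)
{
    unsigned int bad = 0;
    size_t i;

    if (h->host_failed != 0)
        bad |= TOY_BAD_HOST_FAILED;
    if (h->eh_cmd_q_len != 0)
        bad |= TOY_BAD_EH_CMD_Q;

    for (i = 0; i < n; i++) {
        if (cmds[i].on_eh_list)
            bad |= TOY_BAD_EH_ENTRY;
        /* Exactly one of requeue or finish must have happened. */
        if (!cmds[i].requeued == !cmds[i].finished)
            bad |= TOY_BAD_DISPOSITION;
    }
    return bad;
}
```

A bitmask is used instead of a plain pass/fail result so that a
failing run shows which of the conditions was missed.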


2.2.3 Things to consider
^^^^^^^^^^^^^^^^^^^^^^^^

 - Know that timed out scmds are still active on lower layers.  Make
   lower layers forget about them before doing anything else with
   those scmds.

 - For consistency, when accessing/modifying shost data structure,
   grab shost->host_lock.

 - On completion, each failed sdev must have forgotten about all
   active scmds.

 - On completion, each failed sdev must be ready for new commands or
   offline.


Tejun Heo
htejun@gmail.com

11th September 2005