==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches. That means the devices signal I/O completion to the
operating system before data actually has hit the non-volatile storage. This
behavior obviously speeds up various workloads, but it means the operating
system needs to force data out to the non-volatile storage when it performs
a data integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device. These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started.
This explicitly guarantees that previously completed write requests are on
non-volatile storage before the flagged bio starts. In addition the
REQ_PREFLUSH flag can be set on an otherwise empty bio structure, which causes
only an explicit cache flush without any dependent I/O. It is recommended to
use the blkdev_issue_flush() helper for a pure cache flush.


Forced Unit Access
------------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.


Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry if the underlying devices need any explicit cache flushing and how
the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.
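
The guarantees of the two flags can be illustrated with a small userspace
model. This is only a sketch and not kernel code: the names model_dev,
model_write(), model_flush(), model_power_loss() and the MODEL_* bit values are
hypothetical and exist only for this illustration. It assumes a device whose
writes land in a volatile cache unless told otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag values for this model only; the kernel's differ. */
#define MODEL_PREFLUSH (1u << 0)
#define MODEL_FUA      (1u << 1)

#define NBLOCKS 8

/* A toy device: writes land in a volatile cache until flushed. */
struct model_dev {
	int  stable[NBLOCKS];	/* contents of non-volatile storage */
	int  cached[NBLOCKS];	/* contents of the volatile write back cache */
	bool dirty[NBLOCKS];	/* true if cached[] is newer than stable[] */
};

/* Flush: push every dirty cached block to non-volatile storage. */
static void model_flush(struct model_dev *d)
{
	for (int i = 0; i < NBLOCKS; i++) {
		if (d->dirty[i]) {
			d->stable[i] = d->cached[i];
			d->dirty[i] = false;
		}
	}
}

/* Write one block, honouring the two flags as the text describes them. */
static void model_write(struct model_dev *d, int blk, int val, unsigned flags)
{
	assert(blk >= 0 && blk < NBLOCKS);

	if (flags & MODEL_PREFLUSH)
		model_flush(d);		/* earlier writes become stable first */

	if (flags & MODEL_FUA) {
		d->stable[blk] = val;	/* stable before completion is signaled */
		d->dirty[blk] = false;
	} else {
		d->cached[blk] = val;	/* completion signaled from the cache */
		d->dirty[blk] = true;
	}
}

/* Power loss: everything that only lived in the volatile cache is gone. */
static void model_power_loss(struct model_dev *d)
{
	memset(d->dirty, 0, sizeof(d->dirty));
}
```

In this model, a journal-style commit would issue its data writes with no
flags and then write the commit block with MODEL_PREFLUSH | MODEL_FUA. After
model_power_loss() the data blocks and the commit block are all present in
stable[], while a later unflagged write is not.
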

Feature settings for block drivers
----------------------------------

For devices that do not support volatile write caches there is no driver
support required: the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.

For devices with volatile write caches the driver needs to tell the block layer
that it supports flushing caches by setting the

   BLK_FEAT_WRITE_CACHE

flag in the features field of the queue_limits structure. For devices that
also support the FUA bit the block layer needs to be told to pass on the
REQ_FUA bit by also setting the

   BLK_FEAT_FUA

flag in the features field of the queue_limits structure.

Implementation details for bio based block drivers
--------------------------------------------------

For bio based drivers the REQ_PREFLUSH and REQ_FUA bits are simply passed on to
the driver if the driver sets the BLK_FEAT_WRITE_CACHE flag, and the driver
needs to handle them.
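
What a bio based driver ends up seeing can be modeled in userspace as a small
function. This is a sketch under stated assumptions, not the block layer's
actual code: model_bio_flags(), struct model_result and the MODEL_* bit values
are hypothetical names invented for this illustration. MODEL_FEAT_FUA is
deliberately not consulted, because for bio based drivers REQ_FUA is passed on
whether or not BLK_FEAT_FUA is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this model only; the kernel's differ. */
#define MODEL_REQ_PREFLUSH     (1u << 0)
#define MODEL_REQ_FUA          (1u << 1)
#define MODEL_FEAT_WRITE_CACHE (1u << 0)
#define MODEL_FEAT_FUA         (1u << 1)	/* unused for bio based drivers */

struct model_result {
	bool     completed_early;	/* finished before entering the driver */
	unsigned flags;			/* flags left on the bio for the driver */
};

static struct model_result model_bio_flags(unsigned features, unsigned flags,
					   bool has_payload)
{
	struct model_result r = { false, flags };

	/* The model only knows about the two cache control bits. */
	assert((flags & ~(MODEL_REQ_PREFLUSH | MODEL_REQ_FUA)) == 0);

	if (!(features & MODEL_FEAT_WRITE_CACHE)) {
		/* No volatile cache: nothing to flush, writes are stable. */
		r.flags &= ~(MODEL_REQ_PREFLUSH | MODEL_REQ_FUA);
		if ((flags & MODEL_REQ_PREFLUSH) && !has_payload)
			r.completed_early = true;
	}
	/* With a write cache, both bits reach the driver unchanged. */
	return r;
}
```

Keeping the decision in one function of (features, flags, has_payload) makes
the three cases easy to compare: an empty flush on a cacheless device never
reaches the driver, a flagged write on a cacheless device arrives with both
bits cleared, and a device with a write cache sees the bits exactly as the
filesystem set them.
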

*NOTE*: The REQ_FUA bit also gets passed on when the BLK_FEAT_FUA flag is
_not_ set. Any bio based driver that sets BLK_FEAT_WRITE_CACHE also needs to
handle REQ_FUA.

For remapping drivers the REQ_FUA bit needs to be propagated to underlying
devices, and a global flush needs to be implemented for bios with the
REQ_PREFLUSH bit set.

Implementation details for blk-mq drivers
-----------------------------------------

When the BLK_FEAT_WRITE_CACHE flag is set, REQ_OP_WRITE | REQ_PREFLUSH requests
with a payload are automatically turned into a sequence of a REQ_OP_FLUSH
request followed by the actual write by the block layer.

When the BLK_FEAT_FUA flag is set, the REQ_FUA bit is simply passed on for the
REQ_OP_WRITE request; otherwise the block layer sends a REQ_OP_FLUSH request
after the write request completes for bio submissions with the REQ_FUA
bit set.
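
The request sequence a blk-mq driver receives for a write with a payload can
be sketched as a userspace model. The function model_blkmq_steps(), the
enum model_step values and the MODEL_* bit values are hypothetical and serve
only this illustration; the real block layer implements this as a state
machine with more cases than are shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values for this model only; the kernel's differ. */
#define MODEL_REQ_PREFLUSH     (1u << 0)
#define MODEL_REQ_FUA          (1u << 1)
#define MODEL_FEAT_WRITE_CACHE (1u << 0)
#define MODEL_FEAT_FUA         (1u << 1)

enum model_step {
	STEP_FLUSH,		/* stands for a REQ_OP_FLUSH request */
	STEP_WRITE,		/* stands for a plain REQ_OP_WRITE request */
	STEP_WRITE_FUA,		/* stands for REQ_OP_WRITE with REQ_FUA kept */
};

#define MODEL_MAX_STEPS 3

/* Fill out[] with the requests the driver sees; return how many. */
static size_t model_blkmq_steps(unsigned features, unsigned flags,
				enum model_step out[MODEL_MAX_STEPS])
{
	size_t n = 0;

	if (!(features & MODEL_FEAT_WRITE_CACHE)) {
		out[n++] = STEP_WRITE;	/* no cache: both bits are stripped */
		return n;
	}

	if (flags & MODEL_REQ_PREFLUSH)
		out[n++] = STEP_FLUSH;	/* flush before the actual write */

	if ((flags & MODEL_REQ_FUA) && (features & MODEL_FEAT_FUA)) {
		out[n++] = STEP_WRITE_FUA;	/* device handles FUA itself */
	} else {
		out[n++] = STEP_WRITE;
		if (flags & MODEL_REQ_FUA)
			out[n++] = STEP_FLUSH;	/* emulate FUA after the write */
	}

	assert(n <= MODEL_MAX_STEPS);
	return n;
}
```

For a write carrying both bits, a device advertising only
MODEL_FEAT_WRITE_CACHE gets three requests in this model (flush, write, flush),
while one that also advertises MODEL_FEAT_FUA gets two (flush, then a write
with the FUA bit kept).
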