========
zsmalloc
========

This allocator is designed for use with zram. Thus, the allocator is
supposed to work well under low memory conditions. In particular, it
never attempts higher-order page allocations, which are very likely to
fail under memory pressure. On the other hand, if we used only single
(0-order) pages, the allocator would suffer from very high fragmentation:
any object of size PAGE_SIZE/2 or larger would occupy an entire page.
This was one of the major issues with its predecessor (xvmalloc).

To overcome these issues, zsmalloc allocates a group of 0-order pages
and links them together using various 'struct page' fields. These linked
pages act as a single higher-order page, i.e. an object can span 0-order
page boundaries. The code refers to these linked pages as a single entity
called a zspage.

For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
since this satisfies the requirements of all its current users (in the
worst case, a page is incompressible and is thus stored "as-is", i.e. in
uncompressed form). For allocation requests larger than this size, failure
is returned (see zs_malloc()).

Additionally, zs_malloc() does not return a dereferenceable pointer.
Instead, it returns an opaque handle (unsigned long) which encodes the actual
location of the allocated object. The reason for this indirection is that
zsmalloc does not keep zspages permanently mapped since that would cause
issues on 32-bit systems where the VA region for kernel space mappings
is very small. So, before using the allocated memory, the object has to
be mapped using zs_map_object() to get a usable pointer and subsequently
unmapped using zs_unmap_object().
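
The handle indirection can be pictured with a toy model. The Python sketch
below is not kernel code and does not reflect the real handle encoding:
``ToyPool``, ``encode_handle()``, ``decode_handle()`` and ``OBJ_INDEX_BITS``
are hypothetical names chosen for illustration. It only shows the idea that a
handle is a plain integer that has to be translated into a usable view before
the object can be accessed.

```python
# Toy model of an opaque handle; NOT the kernel's real encoding.
OBJ_INDEX_BITS = 8  # hypothetical width of the object-index field


def encode_handle(page_index, obj_index):
    """Pack a page index and an object index into one integer handle."""
    return (page_index << OBJ_INDEX_BITS) | obj_index


def decode_handle(handle):
    """Recover (page_index, obj_index) from a handle."""
    return handle >> OBJ_INDEX_BITS, handle & ((1 << OBJ_INDEX_BITS) - 1)


class ToyPool:
    """Hands out fixed-size slots and identifies them by handle only."""

    def __init__(self, obj_size, objs_per_page):
        self.obj_size = obj_size
        self.objs_per_page = objs_per_page
        self.pages = []      # backing storage, one bytearray per "page"
        self.allocated = 0   # number of slots handed out so far

    def malloc(self):
        """Reserve the next free slot and return its handle."""
        page_index, obj_index = divmod(self.allocated, self.objs_per_page)
        if page_index == len(self.pages):
            self.pages.append(bytearray(self.obj_size * self.objs_per_page))
        self.allocated += 1
        return encode_handle(page_index, obj_index)

    def map_object(self, handle):
        """Translate a handle into a writable view of the object's bytes."""
        page_index, obj_index = decode_handle(handle)
        start = obj_index * self.obj_size
        return memoryview(self.pages[page_index])[start:start + self.obj_size]


pool = ToyPool(obj_size=512, objs_per_page=8)
handle = pool.malloc()
view = pool.map_object(handle)  # "map" the object before use
view[:5] = b"hello"
view.release()                  # "unmap" it when done
```

The caller never stores a pointer, only the integer handle, so the pool is
free to decide when and how the backing memory is made accessible.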

stat
====

With CONFIG_ZSMALLOC_STAT, we can see zsmalloc internal information via
``/sys/kernel/debug/zsmalloc/<user name>``. Here is a sample of stat output::

 # cat /sys/kernel/debug/zsmalloc/zram0/classes

 class  size       10%       20%       30%       40%       50%       60%       70%       80%       90%       99%      100% obj_allocated   obj_used pages_used pages_per_zspage freeable
    ...
    ...
    30   512         0        12         4         1         0         1         0         0         1         0       414          3464       3346        433                1       14
    31   528         2         7         2         2         1         0         1         0         0         2       117          4154       3793        536                4       44
    32   544         6         3         4         1         2         1         0         0         0         1       260          4170       3965        556                2       26
    ...
    ...


class
        index
size
        size of the objects the zspage stores
10%
        the number of zspages with usage ratio less than 10% (see below)
20%
        the number of zspages with usage ratio between 10% and 20%
30%
        the number of zspages with usage ratio between 20% and 30%
40%
        the number of zspages with usage ratio between 30% and 40%
50%
        the number of zspages with usage ratio between 40% and 50%
60%
        the number of zspages with usage ratio between 50% and 60%
70%
        the number of zspages with usage ratio between 60% and 70%
80%
        the number of zspages with usage ratio between 70% and 80%
90%
        the number of zspages with usage ratio between 80% and 90%
99%
        the number of zspages with usage ratio between 90% and 99%
100%
        the number of zspages with usage ratio 100%
obj_allocated
        the number of object slots allocated (the capacity of the class's zspages)
obj_used
        the number of object slots currently in use by the user
pages_used
        the number of pages allocated for the class
pages_per_zspage
        the number of 0-order pages that make up a zspage
freeable
        the approximate number of pages that compaction of the class can free

Each zspage maintains an inuse counter which keeps track of the number of
objects stored in the zspage. The inuse counter determines the zspage's
"fullness group", which is calculated as the ratio of the "inuse" objects to
the total number of objects the zspage can hold (objs_per_zspage). The
closer the inuse counter is to objs_per_zspage, the better.
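
The mapping from the inuse counter to a stat column can be sketched as
follows. This is a minimal illustration based on the column descriptions
above, not a copy of the kernel code: ``usage_column()`` is a hypothetical
helper, it only handles non-empty zspages, and the treatment of exact
boundaries (a ratio of exactly 10% is counted in the 20% column) is an
assumption.

```python
def usage_column(inuse, objs_per_zspage):
    """Return the stat column a non-empty zspage is counted in."""
    if not 1 <= inuse <= objs_per_zspage:
        raise ValueError("inuse must be between 1 and objs_per_zspage")
    if inuse == objs_per_zspage:
        return "100%"
    ratio = 100 * inuse // objs_per_zspage  # integer percentage
    if ratio >= 90:
        return "99%"
    # Assumption: a ratio of exactly N*10 percent falls into the next column up.
    return f"{(ratio // 10 + 1) * 10}%"


# A zspage that can hold 8 objects, at different fill levels.
for inuse in (1, 4, 7, 8):
    print(inuse, usage_column(inuse, objs_per_zspage=8))
```

Note that a zspage holding a single object out of many possible still lands in
the 10% column, even though its integer ratio rounds down to 0.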

Internals
=========

zsmalloc has 255 size classes, each of which can hold a number of zspages.
Each zspage can contain up to ZSMALLOC_CHAIN_SIZE physical (0-order) pages.
The optimal zspage chain size for each size class is calculated during the
creation of the zsmalloc pool (see calculate_zspage_chain_size()).
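
The class listings in this document are consistent with object sizes that
start at 32 bytes and grow in steps of 16 bytes (for example, class 30 stores
512-byte objects and class 254 stores 4096-byte objects). The sketch below
encodes that relationship. The constants are assumptions inferred from those
listings, which appear to come from a system with 4 KiB pages, and the helper
names are hypothetical.

```python
# Assumptions inferred from the sample listings (4 KiB pages).
MIN_ALLOC_SIZE = 32    # object size of class 0
SIZE_CLASS_DELTA = 16  # spacing between consecutive classes
NUM_CLASSES = 255


def class_size(index):
    """Object size stored by the size class with the given index."""
    if not 0 <= index < NUM_CLASSES:
        raise ValueError("no such size class")
    return MIN_ALLOC_SIZE + index * SIZE_CLASS_DELTA


def class_index(size):
    """Smallest class index whose object size fits `size` (before merging)."""
    if size > class_size(NUM_CLASSES - 1):
        raise ValueError("object too large")
    if size <= MIN_ALLOC_SIZE:
        return 0
    # Ceiling division: round the size up to the next class boundary.
    return -(-(size - MIN_ALLOC_SIZE) // SIZE_CLASS_DELTA)


print(class_size(30), class_size(254))
print(class_index(1560), class_index(1568))
```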

As an optimization, zsmalloc merges size classes that have similar
characteristics in terms of the number of pages per zspage and the number
of objects that each zspage can store.

For instance, consider the following size classes::

  class  size       10%   ....    100% obj_allocated   obj_used pages_used pages_per_zspage freeable
  ...
     94  1536        0    ....       0             0          0          0                3        0
    100  1632        0    ....       0             0          0          0                2        0
  ...


Size classes #95-99 are merged with size class #100. This means that when we
need to store an object of size, say, 1568 bytes, we end up using size class
#100 instead of size class #96. Size class #100 is meant for objects of size
1632 bytes, so each object of size 1568 bytes wastes 1632-1568=64 bytes.

Size class #100 consists of zspages with 2 physical pages each, which can
hold a total of 5 objects. If we need to store 13 objects of size 1568, we
end up allocating three zspages, or 6 physical pages.
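
The arithmetic behind this example can be checked with a few lines of Python.
This is only a sketch: it assumes 4 KiB pages, and ``objs_per_zspage()`` and
``pages_needed()`` are hypothetical helpers, not kernel functions.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as in the listings


def objs_per_zspage(class_size, pages_per_zspage):
    """Number of whole objects that fit into one zspage."""
    return pages_per_zspage * PAGE_SIZE // class_size


def pages_needed(num_objects, class_size, pages_per_zspage):
    """Physical pages consumed when only whole zspages can be allocated."""
    capacity = objs_per_zspage(class_size, pages_per_zspage)
    zspages = -(-num_objects // capacity)  # ceiling division
    return zspages * pages_per_zspage


print(objs_per_zspage(1632, 2))   # 5 objects per class #100 zspage
print(pages_needed(13, 1632, 2))  # 6 physical pages for 13 objects
print(13 * (1632 - 1568))         # 832 bytes lost to the larger class size
```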

However, if we take a closer look at size class #96 (which is meant for
objects of size 1568 bytes) and trace ``calculate_zspage_chain_size()``, we
find that the optimal zspage configuration for this class is a chain
of 5 physical pages::

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

This means that a class #96 configuration with 5 physical pages can store 13
objects of size 1568 in a single zspage, using a total of 5 physical pages.
This is more efficient than the class #100 configuration, which would use 6
physical pages to store the same number of objects.
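
The table above can be reproduced with a short calculation. The sketch below
assumes 4 KiB pages and picks the chain size with the fewest wasted bytes. It
mirrors the arithmetic of the table and is not guaranteed to match
``calculate_zspage_chain_size()`` line for line. ``chain_stats()`` and
``best_chain_size()`` are hypothetical names.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def chain_stats(class_size, max_pages):
    """Return (pages, wasted_bytes, used_percent) for each chain size."""
    rows = []
    for pages in range(1, max_pages + 1):
        total = pages * PAGE_SIZE
        wasted = total % class_size          # tail that fits no whole object
        used = (total - wasted) * 100 // total
        rows.append((pages, wasted, used))
    return rows


def best_chain_size(class_size, max_pages):
    """Chain size with the least waste; the shortest chain wins ties."""
    return min(chain_stats(class_size, max_pages), key=lambda row: row[1])[0]


for pages, wasted, used in chain_stats(1568, 5):
    print(f"{pages:>2} {wasted:>5} {used:>3}")
print(best_chain_size(1568, 4), best_chain_size(1568, 5))
```

With a limit of 4 pages the best choice for 1568-byte objects is a 2-page
chain, which is why class #96 shares its characteristics with class #100 in
the earlier listing. Raising the limit to 5 pages changes the result.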

As the zspage chain size for class #96 increases, its key characteristics,
such as pages per zspage and objects per zspage, also change. This leads to
fewer class mergers, resulting in a more compact grouping of classes, which
reduces memory wastage.

Let's take a closer look at the bottom of ``/sys/kernel/debug/zsmalloc/zramX/classes``::

  class  size       10%   ....    100% obj_allocated   obj_used pages_used pages_per_zspage freeable

  ...
    202  3264         0   ..         0             0          0          0                4        0
    254  4096         0   ..         0             0          0          0                1        0
  ...

Size class #202 stores objects of size 3264 bytes and has a maximum of 4 pages
per zspage. Any object larger than 3264 bytes is considered huge and belongs
to size class #254, which stores each object in its own physical page (objects
in huge classes do not share pages).

Increasing the zspage chain size also results in a higher watermark
for the huge size class and fewer huge classes overall. This allows for more
efficient storage of large objects.

For a zspage chain size of 8, the huge class watermark becomes 3632 bytes::

  class  size       10%   ....    100% obj_allocated   obj_used pages_used pages_per_zspage freeable

  ...
    202  3264         0   ..         0             0          0          0                4        0
    211  3408         0   ..         0             0          0          0                5        0
    217  3504         0   ..         0             0          0          0                6        0
    222  3584         0   ..         0             0          0          0                7        0
    225  3632         0   ..         0             0          0          0                8        0
    254  4096         0   ..         0             0          0          0                1        0
  ...

For a zspage chain size of 16, the huge class watermark becomes 3840 bytes::

  class  size       10%   ....    100% obj_allocated   obj_used pages_used pages_per_zspage freeable

  ...
    202  3264         0   ..         0             0          0          0                4        0
    206  3328         0   ..         0             0          0          0               13        0
    207  3344         0   ..         0             0          0          0                9        0
    208  3360         0   ..         0             0          0          0               14        0
    211  3408         0   ..         0             0          0          0                5        0
    212  3424         0   ..         0             0          0          0               16        0
    214  3456         0   ..         0             0          0          0               11        0
    217  3504         0   ..         0             0          0          0                6        0
    219  3536         0   ..         0             0          0          0               13        0
    222  3584         0   ..         0             0          0          0                7        0
    223  3600         0   ..         0             0          0          0               15        0
    225  3632         0   ..         0             0          0          0                8        0
    228  3680         0   ..         0             0          0          0                9        0
    230  3712         0   ..         0             0          0          0               10        0
    232  3744         0   ..         0             0          0          0               11        0
    234  3776         0   ..         0             0          0          0               12        0
    235  3792         0   ..         0             0          0          0               13        0
    236  3808         0   ..         0             0          0          0               14        0
    238  3840         0   ..         0             0          0          0               15        0
    254  4096         0   ..         0             0          0          0                1        0
  ...
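
The huge class watermarks quoted above (3264, 3632 and 3840 bytes) can be
recomputed with a brute-force search. The sketch below rests on several
assumptions that are not stated in this document: 4 KiB pages, object sizes
from 32 to 4096 bytes in steps of 16, a chain size chosen to minimise wasted
bytes, and a class counting as huge when its zspage is a single page holding
a single object. All function names are hypothetical.

```python
PAGE_SIZE = 4096       # assumption: 4 KiB pages
MIN_ALLOC_SIZE = 32    # assumption: smallest class size
SIZE_CLASS_DELTA = 16  # assumption: spacing between class sizes


def best_chain_size(class_size, max_pages):
    """Chain size with the least wasted bytes; the shortest chain wins ties."""
    best_pages, min_waste = 1, None
    for pages in range(1, max_pages + 1):
        waste = pages * PAGE_SIZE % class_size
        if min_waste is None or waste < min_waste:
            best_pages, min_waste = pages, waste
    return best_pages


def is_huge(class_size, max_pages):
    """Assumed definition: one page per zspage and one object per zspage."""
    pages = best_chain_size(class_size, max_pages)
    return pages == 1 and PAGE_SIZE // class_size == 1


def huge_watermark(max_pages):
    """Largest object size that does not fall into a huge class."""
    sizes = range(MIN_ALLOC_SIZE, PAGE_SIZE + 1, SIZE_CLASS_DELTA)
    return max(size for size in sizes if not is_huge(size, max_pages))


for chain_size in (4, 8, 16):
    print(chain_size, huge_watermark(chain_size))
```

Under these assumptions the search yields 3264, 3632 and 3840 bytes for chain
sizes of 4, 8 and 16, matching the listings.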

Overall, the effect of the zspage chain size on the zsmalloc pool configuration::

  pages per zspage   number of size classes (clusters)   huge size class watermark
         4                        69                               3264
         5                        86                               3408
         6                        93                               3504
         7                       112                               3584
         8                       123                               3632
         9                       140                               3680
        10                       143                               3712
        11                       159                               3744
        12                       164                               3776
        13                       180                               3792
        14                       183                               3808
        15                       188                               3840
        16                       191                               3840


A synthetic test
----------------

zram used as storage for build artifacts (Linux kernel compilation).

* ``CONFIG_ZSMALLOC_CHAIN_SIZE=4``

  zsmalloc classes stats::

    class  size       10%   ....    100% obj_allocated   obj_used pages_used pages_per_zspage freeable

    ...
    Total              13   ..        51        413836     412973     159955                         3

  zram mm_stat::

    1691783168 628083717 655175680        0 655175680       60        0    34048    34049


* ``CONFIG_ZSMALLOC_CHAIN_SIZE=8``

  zsmalloc classes stats::

    class  size       10%   ....    100% obj_allocated   obj_used pages_used pages_per_zspage freeable

    ...
    Total              18   ..        87        414852     412978     156666                         0

  zram mm_stat::

    1691803648 627793930 641703936        0 641703936       60        0    33591    33591

Using larger zspage chains may result in using fewer physical pages, as seen
in the example, where the number of physical pages used decreased from 159955
to 156666. At the same time, the maximum zsmalloc pool memory usage went down
from 655175680 to 641703936 bytes.
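
The two sets of figures can be cross-checked against each other. The short
calculation below assumes 4 KiB pages. Under that assumption the byte counts
reported by zram are exactly the page counts from the zsmalloc totals
multiplied by the page size.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Keyed by CONFIG_ZSMALLOC_CHAIN_SIZE, values taken from the listings above.
pages_used = {4: 159955, 8: 156666}          # "Total" rows, pages_used column
mem_used_max = {4: 655175680, 8: 641703936}  # bytes, from zram mm_stat

pages_saved = pages_used[4] - pages_used[8]
bytes_saved = mem_used_max[4] - mem_used_max[8]
percent_saved = 100 * pages_saved / pages_used[4]

print(f"pages saved: {pages_saved}")
print(f"bytes saved: {bytes_saved} ({bytes_saved // PAGE_SIZE} pages)")
print(f"relative saving: {percent_saved:.2f}%")
```

In this test the larger chain size saves 3289 pages, roughly 2% of the pool.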

However, this advantage may be offset by the potential for increased system
memory pressure (as some zspages have larger chain sizes) in cases where there
is heavy internal fragmentation and zspool compaction is unable to relocate
objects and release zspages. In these cases, it is recommended to decrease
the limit on the size of the zspage chains (as specified by the
CONFIG_ZSMALLOC_CHAIN_SIZE option).

Functions
=========

.. kernel-doc:: mm/zsmalloc.c