~ [ source navigation ] ~ [ diff markup ] ~ [ identifier search ] ~

TOMOYO Linux Cross Reference
Linux/Documentation/bpf/map_hash.rst

Version: ~ [ linux-6.12-rc7 ] ~ [ linux-6.11.7 ] ~ [ linux-6.10.14 ] ~ [ linux-6.9.12 ] ~ [ linux-6.8.12 ] ~ [ linux-6.7.12 ] ~ [ linux-6.6.60 ] ~ [ linux-6.5.13 ] ~ [ linux-6.4.16 ] ~ [ linux-6.3.13 ] ~ [ linux-6.2.16 ] ~ [ linux-6.1.116 ] ~ [ linux-6.0.19 ] ~ [ linux-5.19.17 ] ~ [ linux-5.18.19 ] ~ [ linux-5.17.15 ] ~ [ linux-5.16.20 ] ~ [ linux-5.15.171 ] ~ [ linux-5.14.21 ] ~ [ linux-5.13.19 ] ~ [ linux-5.12.19 ] ~ [ linux-5.11.22 ] ~ [ linux-5.10.229 ] ~ [ linux-5.9.16 ] ~ [ linux-5.8.18 ] ~ [ linux-5.7.19 ] ~ [ linux-5.6.19 ] ~ [ linux-5.5.19 ] ~ [ linux-5.4.285 ] ~ [ linux-5.3.18 ] ~ [ linux-5.2.21 ] ~ [ linux-5.1.21 ] ~ [ linux-5.0.21 ] ~ [ linux-4.20.17 ] ~ [ linux-4.19.323 ] ~ [ linux-4.18.20 ] ~ [ linux-4.17.19 ] ~ [ linux-4.16.18 ] ~ [ linux-4.15.18 ] ~ [ linux-4.14.336 ] ~ [ linux-4.13.16 ] ~ [ linux-4.12.14 ] ~ [ linux-4.11.12 ] ~ [ linux-4.10.17 ] ~ [ linux-4.9.337 ] ~ [ linux-4.4.302 ] ~ [ linux-3.10.108 ] ~ [ linux-2.6.32.71 ] ~ [ linux-2.6.0 ] ~ [ linux-2.4.37.11 ] ~ [ unix-v6-master ] ~ [ ccs-tools-1.8.12 ] ~ [ policy-sample ] ~
Architecture: ~ [ i386 ] ~ [ alpha ] ~ [ m68k ] ~ [ mips ] ~ [ ppc ] ~ [ sparc ] ~ [ sparc64 ] ~

Diff markup

Differences between /Documentation/bpf/map_hash.rst (Version linux-6.12-rc7) and /Documentation/bpf/map_hash.rst (Version linux-6.11.7)


  1 .. SPDX-License-Identifier: GPL-2.0-only            1 .. SPDX-License-Identifier: GPL-2.0-only
  2 .. Copyright (C) 2022 Red Hat, Inc.                 2 .. Copyright (C) 2022 Red Hat, Inc.
  3 .. Copyright (C) 2022-2023 Isovalent, Inc.          3 .. Copyright (C) 2022-2023 Isovalent, Inc.
  4                                                     4 
  5 ==============================================      5 ===============================================
  6 BPF_MAP_TYPE_HASH, with PERCPU and LRU Variant      6 BPF_MAP_TYPE_HASH, with PERCPU and LRU Variants
  7 ==============================================      7 ===============================================
  8                                                     8 
  9 .. note::                                           9 .. note::
 10    - ``BPF_MAP_TYPE_HASH`` was introduced in k     10    - ``BPF_MAP_TYPE_HASH`` was introduced in kernel version 3.19
 11    - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduc     11    - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduced in version 4.6
 12    - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_     12    - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
 13      were introduced in version 4.10               13      were introduced in version 4.10
 14                                                    14 
 15 ``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCP     15 ``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCPU_HASH`` provide general
 16 purpose hash map storage. Both the key and the     16 purpose hash map storage. Both the key and the value can be structs,
 17 allowing for composite keys and values.            17 allowing for composite keys and values.
 18                                                    18 
 19 The kernel is responsible for allocating and f     19 The kernel is responsible for allocating and freeing key/value pairs, up
 20 to the max_entries limit that you specify. Has     20 to the max_entries limit that you specify. Hash maps use pre-allocation
 21 of hash table elements by default. The ``BPF_F     21 of hash table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be
 22 used to disable pre-allocation when it is too      22 used to disable pre-allocation when it is too memory expensive.
 23                                                    23 
 24 ``BPF_MAP_TYPE_PERCPU_HASH`` provides a separa     24 ``BPF_MAP_TYPE_PERCPU_HASH`` provides a separate value slot per
 25 CPU. The per-cpu values are stored internally      25 CPU. The per-cpu values are stored internally in an array.
 26                                                    26 
 27 The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TY     27 The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
 28 variants add LRU semantics to their respective     28 variants add LRU semantics to their respective hash tables. An LRU hash
 29 will automatically evict the least recently us     29 will automatically evict the least recently used entries when the hash
 30 table reaches capacity. An LRU hash maintains      30 table reaches capacity. An LRU hash maintains an internal LRU list that
 31 is used to select elements for eviction. This      31 is used to select elements for eviction. This internal LRU list is
 32 shared across CPUs but it is possible to reque     32 shared across CPUs but it is possible to request a per CPU LRU list with
 33 the ``BPF_F_NO_COMMON_LRU`` flag when calling      33 the ``BPF_F_NO_COMMON_LRU`` flag when calling ``bpf_map_create``.  The
 34 following table outlines the properties of LRU     34 following table outlines the properties of LRU maps depending on the a
 35 map type and the flags used to create the map.     35 map type and the flags used to create the map.
 36                                                    36 
 37 ======================== =====================     37 ======================== ========================= ================================
 38 Flag                     ``BPF_MAP_TYPE_LRU_HA     38 Flag                     ``BPF_MAP_TYPE_LRU_HASH`` ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
 39 ======================== =====================     39 ======================== ========================= ================================
 40 **BPF_F_NO_COMMON_LRU**  Per-CPU LRU, global m     40 **BPF_F_NO_COMMON_LRU**  Per-CPU LRU, global map   Per-CPU LRU, per-cpu map
 41 **!BPF_F_NO_COMMON_LRU** Global LRU, global ma     41 **!BPF_F_NO_COMMON_LRU** Global LRU, global map    Global LRU, per-cpu map
 42 ======================== =====================     42 ======================== ========================= ================================
 43                                                    43 
 44 Usage                                              44 Usage
 45 =====                                              45 =====
 46                                                    46 
 47 Kernel BPF                                         47 Kernel BPF
 48 ----------                                         48 ----------
 49                                                    49 
 50 bpf_map_update_elem()                              50 bpf_map_update_elem()
 51 ~~~~~~~~~~~~~~~~~~~~~                              51 ~~~~~~~~~~~~~~~~~~~~~
 52                                                    52 
 53 .. code-block:: c                                  53 .. code-block:: c
 54                                                    54 
 55    long bpf_map_update_elem(struct bpf_map *ma     55    long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)
 56                                                    56 
 57 Hash entries can be added or updated using the     57 Hash entries can be added or updated using the ``bpf_map_update_elem()``
 58 helper. This helper replaces existing elements     58 helper. This helper replaces existing elements atomically. The ``flags``
 59 parameter can be used to control the update be     59 parameter can be used to control the update behaviour:
 60                                                    60 
 61 - ``BPF_ANY`` will create a new element or upd     61 - ``BPF_ANY`` will create a new element or update an existing element
 62 - ``BPF_NOEXIST`` will create a new element on     62 - ``BPF_NOEXIST`` will create a new element only if one did not already
 63   exist                                            63   exist
 64 - ``BPF_EXIST`` will update an existing elemen     64 - ``BPF_EXIST`` will update an existing element
 65                                                    65 
 66 ``bpf_map_update_elem()`` returns 0 on success     66 ``bpf_map_update_elem()`` returns 0 on success, or negative error in
 67 case of failure.                                   67 case of failure.
 68                                                    68 
 69 bpf_map_lookup_elem()                              69 bpf_map_lookup_elem()
 70 ~~~~~~~~~~~~~~~~~~~~~                              70 ~~~~~~~~~~~~~~~~~~~~~
 71                                                    71 
 72 .. code-block:: c                                  72 .. code-block:: c
 73                                                    73 
 74    void *bpf_map_lookup_elem(struct bpf_map *m     74    void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
 75                                                    75 
 76 Hash entries can be retrieved using the ``bpf_     76 Hash entries can be retrieved using the ``bpf_map_lookup_elem()``
 77 helper. This helper returns a pointer to the v     77 helper. This helper returns a pointer to the value associated with
 78 ``key``, or ``NULL`` if no entry was found.        78 ``key``, or ``NULL`` if no entry was found.
 79                                                    79 
 80 bpf_map_delete_elem()                              80 bpf_map_delete_elem()
 81 ~~~~~~~~~~~~~~~~~~~~~                              81 ~~~~~~~~~~~~~~~~~~~~~
 82                                                    82 
 83 .. code-block:: c                                  83 .. code-block:: c
 84                                                    84 
 85    long bpf_map_delete_elem(struct bpf_map *ma     85    long bpf_map_delete_elem(struct bpf_map *map, const void *key)
 86                                                    86 
 87 Hash entries can be deleted using the ``bpf_ma     87 Hash entries can be deleted using the ``bpf_map_delete_elem()``
 88 helper. This helper will return 0 on success,      88 helper. This helper will return 0 on success, or negative error in case
 89 of failure.                                        89 of failure.
 90                                                    90 
 91 Per CPU Hashes                                     91 Per CPU Hashes
 92 --------------                                     92 --------------
 93                                                    93 
 94 For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP     94 For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
 95 the ``bpf_map_update_elem()`` and ``bpf_map_lo     95 the ``bpf_map_update_elem()`` and ``bpf_map_lookup_elem()`` helpers
 96 automatically access the hash slot for the cur     96 automatically access the hash slot for the current CPU.
 97                                                    97 
 98 bpf_map_lookup_percpu_elem()                       98 bpf_map_lookup_percpu_elem()
 99 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~                       99 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
100                                                   100 
101 .. code-block:: c                                 101 .. code-block:: c
102                                                   102 
103    void *bpf_map_lookup_percpu_elem(struct bpf    103    void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)
104                                                   104 
105 The ``bpf_map_lookup_percpu_elem()`` helper ca    105 The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the
106 value in the hash slot for a specific CPU. Ret    106 value in the hash slot for a specific CPU. Returns value associated with
107 ``key`` on ``cpu`` , or ``NULL`` if no entry w    107 ``key`` on ``cpu`` , or ``NULL`` if no entry was found or ``cpu`` is
108 invalid.                                          108 invalid.
109                                                   109 
110 Concurrency                                       110 Concurrency
111 -----------                                       111 -----------
112                                                   112 
113 Values stored in ``BPF_MAP_TYPE_HASH`` can be     113 Values stored in ``BPF_MAP_TYPE_HASH`` can be accessed concurrently by
114 programs running on different CPUs.  Since Ker    114 programs running on different CPUs.  Since Kernel version 5.1, the BPF
115 infrastructure provides ``struct bpf_spin_lock    115 infrastructure provides ``struct bpf_spin_lock`` to synchronise access.
116 See ``tools/testing/selftests/bpf/progs/test_s    116 See ``tools/testing/selftests/bpf/progs/test_spin_lock.c``.
117                                                   117 
118 Userspace                                         118 Userspace
119 ---------                                         119 ---------
120                                                   120 
121 bpf_map_get_next_key()                            121 bpf_map_get_next_key()
122 ~~~~~~~~~~~~~~~~~~~~~~                            122 ~~~~~~~~~~~~~~~~~~~~~~
123                                                   123 
124 .. code-block:: c                                 124 .. code-block:: c
125                                                   125 
126    int bpf_map_get_next_key(int fd, const void    126    int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key)
127                                                   127 
128 In userspace, it is possible to iterate throug    128 In userspace, it is possible to iterate through the keys of a hash using
129 libbpf's ``bpf_map_get_next_key()`` function.     129 libbpf's ``bpf_map_get_next_key()`` function. The first key can be fetched by
130 calling ``bpf_map_get_next_key()`` with ``cur_    130 calling ``bpf_map_get_next_key()`` with ``cur_key`` set to
131 ``NULL``. Subsequent calls will fetch the next    131 ``NULL``. Subsequent calls will fetch the next key that follows the
132 current key. ``bpf_map_get_next_key()`` return    132 current key. ``bpf_map_get_next_key()`` returns 0 on success, -ENOENT if
133 cur_key is the last key in the hash, or negati    133 cur_key is the last key in the hash, or negative error in case of
134 failure.                                          134 failure.
135                                                   135 
136 Note that if ``cur_key`` gets deleted then ``b    136 Note that if ``cur_key`` gets deleted then ``bpf_map_get_next_key()``
137 will instead return the *first* key in the has    137 will instead return the *first* key in the hash table which is
138 undesirable. It is recommended to use batched     138 undesirable. It is recommended to use batched lookup if there is going
139 to be key deletion intermixed with ``bpf_map_g    139 to be key deletion intermixed with ``bpf_map_get_next_key()``.
140                                                   140 
141 Examples                                          141 Examples
142 ========                                          142 ========
143                                                   143 
144 Please see the ``tools/testing/selftests/bpf``    144 Please see the ``tools/testing/selftests/bpf`` directory for functional
145 examples.  The code snippets below demonstrate    145 examples.  The code snippets below demonstrates API usage.
146                                                   146 
147 This example shows how to declare an LRU Hash     147 This example shows how to declare an LRU Hash with a struct key and a
148 struct value.                                     148 struct value.
149                                                   149 
150 .. code-block:: c                                 150 .. code-block:: c
151                                                   151 
152     #include <linux/bpf.h>                        152     #include <linux/bpf.h>
153     #include <bpf/bpf_helpers.h>                  153     #include <bpf/bpf_helpers.h>
154                                                   154 
155     struct key {                                  155     struct key {
156         __u32 srcip;                              156         __u32 srcip;
157     };                                            157     };
158                                                   158 
159     struct value {                                159     struct value {
160         __u64 packets;                            160         __u64 packets;
161         __u64 bytes;                              161         __u64 bytes;
162     };                                            162     };
163                                                   163 
164     struct {                                      164     struct {
165             __uint(type, BPF_MAP_TYPE_LRU_HASH    165             __uint(type, BPF_MAP_TYPE_LRU_HASH);
166             __uint(max_entries, 32);              166             __uint(max_entries, 32);
167             __type(key, struct key);              167             __type(key, struct key);
168             __type(value, struct value);          168             __type(value, struct value);
169     } packet_stats SEC(".maps");                  169     } packet_stats SEC(".maps");
170                                                   170 
171 This example shows how to create or update has    171 This example shows how to create or update hash values using atomic
172 instructions:                                     172 instructions:
173                                                   173 
174 .. code-block:: c                                 174 .. code-block:: c
175                                                   175 
176     static void update_stats(__u32 srcip, int     176     static void update_stats(__u32 srcip, int bytes)
177     {                                             177     {
178             struct key key = {                    178             struct key key = {
179                     .srcip = srcip,               179                     .srcip = srcip,
180             };                                    180             };
181             struct value *value = bpf_map_look    181             struct value *value = bpf_map_lookup_elem(&packet_stats, &key);
182                                                   182 
183             if (value) {                          183             if (value) {
184                     __sync_fetch_and_add(&valu    184                     __sync_fetch_and_add(&value->packets, 1);
185                     __sync_fetch_and_add(&valu    185                     __sync_fetch_and_add(&value->bytes, bytes);
186             } else {                              186             } else {
187                     struct value newval = { 1,    187                     struct value newval = { 1, bytes };
188                                                   188 
189                     bpf_map_update_elem(&packe    189                     bpf_map_update_elem(&packet_stats, &key, &newval, BPF_NOEXIST);
190             }                                     190             }
191     }                                             191     }
192                                                   192 
193 Userspace walking the map elements from the ma    193 Userspace walking the map elements from the map declared above:
194                                                   194 
195 .. code-block:: c                                 195 .. code-block:: c
196                                                   196 
197     #include <bpf/libbpf.h>                       197     #include <bpf/libbpf.h>
198     #include <bpf/bpf.h>                          198     #include <bpf/bpf.h>
199                                                   199 
200     static void walk_hash_elements(int map_fd)    200     static void walk_hash_elements(int map_fd)
201     {                                             201     {
202             struct key *cur_key = NULL;           202             struct key *cur_key = NULL;
203             struct key next_key;                  203             struct key next_key;
204             struct value value;                   204             struct value value;
205             int err;                              205             int err;
206                                                   206 
207             for (;;) {                            207             for (;;) {
208                     err = bpf_map_get_next_key    208                     err = bpf_map_get_next_key(map_fd, cur_key, &next_key);
209                     if (err)                      209                     if (err)
210                             break;                210                             break;
211                                                   211 
212                     bpf_map_lookup_elem(map_fd    212                     bpf_map_lookup_elem(map_fd, &next_key, &value);
213                                                   213 
214                     // Use key and value here     214                     // Use key and value here
215                                                   215 
216                     cur_key = &next_key;          216                     cur_key = &next_key;
217             }                                     217             }
218     }                                             218     }
219                                                   219 
220 Internals                                         220 Internals
221 =========                                         221 =========
222                                                   222 
223 This section of the document is targeted at Li    223 This section of the document is targeted at Linux developers and describes
224 aspects of the map implementations that are no    224 aspects of the map implementations that are not considered stable ABI. The
225 following details are subject to change in fut    225 following details are subject to change in future versions of the kernel.
226                                                   226 
227 ``BPF_MAP_TYPE_LRU_HASH`` and variants            227 ``BPF_MAP_TYPE_LRU_HASH`` and variants
228 --------------------------------------            228 --------------------------------------
229                                                   229 
230 Updating elements in LRU maps may trigger evic    230 Updating elements in LRU maps may trigger eviction behaviour when the capacity
231 of the map is reached. There are various steps    231 of the map is reached. There are various steps that the update algorithm
232 attempts in order to enforce the LRU property     232 attempts in order to enforce the LRU property which have increasing impacts on
233 other CPUs involved in the following operation    233 other CPUs involved in the following operation attempts:
234                                                   234 
235 - Attempt to use CPU-local state to batch oper    235 - Attempt to use CPU-local state to batch operations
236 - Attempt to fetch free nodes from global list    236 - Attempt to fetch free nodes from global lists
237 - Attempt to pull any node from a global list     237 - Attempt to pull any node from a global list and remove it from the hashmap
238 - Attempt to pull any node from any CPU's list    238 - Attempt to pull any node from any CPU's list and remove it from the hashmap
239                                                   239 
240 This algorithm is described visually in the fo    240 This algorithm is described visually in the following diagram. See the
241 description in commit 3a08c2fd7634 ("bpf: LRU     241 description in commit 3a08c2fd7634 ("bpf: LRU List") for a full explanation of
242 the corresponding operations:                     242 the corresponding operations:
243                                                   243 
244 .. kernel-figure::  map_lru_hash_update.dot       244 .. kernel-figure::  map_lru_hash_update.dot
245    :alt:    Diagram outlining the LRU eviction    245    :alt:    Diagram outlining the LRU eviction steps taken during map update.
246                                                   246 
247    LRU hash eviction during map update for ``B    247    LRU hash eviction during map update for ``BPF_MAP_TYPE_LRU_HASH`` and
248    variants. See the dot file source for kerne    248    variants. See the dot file source for kernel function name code references.
249                                                   249 
250 Map updates start from the oval in the top rig    250 Map updates start from the oval in the top right "begin ``bpf_map_update()``"
251 and progress through the graph towards the bot    251 and progress through the graph towards the bottom where the result may be
252 either a successful update or a failure with v    252 either a successful update or a failure with various error codes. The key in
253 the top right provides indicators for which lo    253 the top right provides indicators for which locks may be involved in specific
254 operations. This is intended as a visual hint     254 operations. This is intended as a visual hint for reasoning about how map
255 contention may impact update operations, thoug    255 contention may impact update operations, though the map type and flags may
256 impact the actual contention on those locks, b    256 impact the actual contention on those locks, based on the logic described in
257 the table above. For instance, if the map is c    257 the table above. For instance, if the map is created with type
258 ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` and flags ``B    258 ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` and flags ``BPF_F_NO_COMMON_LRU`` then all map
259 properties would be per-cpu.                      259 properties would be per-cpu.
                                                      

~ [ source navigation ] ~ [ diff markup ] ~ [ identifier search ] ~

kernel.org | git.kernel.org | LWN.net | Project Home | SVN repository | Mail admin

Linux® is a registered trademark of Linus Torvalds in the United States and other countries.
TOMOYO® is a registered trademark of NTT DATA CORPORATION.

sflogo.php