1 .. SPDX-License-Identifier: GPL-2.0-only 2 .. Copyright (C) 2022 Red Hat, Inc. 3 .. Copyright (C) 2022-2023 Isovalent, Inc. 4 5 ============================================== 6 BPF_MAP_TYPE_HASH, with PERCPU and LRU Variant 7 ============================================== 8 9 .. note:: 10 - ``BPF_MAP_TYPE_HASH`` was introduced in k 11 - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduc 12 - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_ 13 were introduced in version 4.10 14 15 ``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCP 16 purpose hash map storage. Both the key and the 17 allowing for composite keys and values. 18 19 The kernel is responsible for allocating and f 20 to the max_entries limit that you specify. Has 21 of hash table elements by default. The ``BPF_F 22 used to disable pre-allocation when it is too 23 24 ``BPF_MAP_TYPE_PERCPU_HASH`` provides a separa 25 CPU. The per-cpu values are stored internally 26 27 The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TY 28 variants add LRU semantics to their respective 29 will automatically evict the least recently us 30 table reaches capacity. An LRU hash maintains 31 is used to select elements for eviction. This 32 shared across CPUs but it is possible to reque 33 the ``BPF_F_NO_COMMON_LRU`` flag when calling 34 following table outlines the properties of LRU 35 map type and the flags used to create the map. 36 37 ======================== ===================== 38 Flag ``BPF_MAP_TYPE_LRU_HA 39 ======================== ===================== 40 **BPF_F_NO_COMMON_LRU** Per-CPU LRU, global m 41 **!BPF_F_NO_COMMON_LRU** Global LRU, global ma 42 ======================== ===================== 43 44 Usage 45 ===== 46 47 Kernel BPF 48 ---------- 49 50 bpf_map_update_elem() 51 ~~~~~~~~~~~~~~~~~~~~~ 52 53 .. code-block:: c 54 55 long bpf_map_update_elem(struct bpf_map *ma 56 57 Hash entries can be added or updated using the 58 helper. This helper replaces existing elements 59 parameter can be used to control the update be 60 61 - ``BPF_ANY`` will create a new element or upd 62 - ``BPF_NOEXIST`` will create a new element on 63 exist 64 - ``BPF_EXIST`` will update an existing elemen 65 66 ``bpf_map_update_elem()`` returns 0 on success 67 case of failure. 68 69 bpf_map_lookup_elem() 70 ~~~~~~~~~~~~~~~~~~~~~ 71 72 .. code-block:: c 73 74 void *bpf_map_lookup_elem(struct bpf_map *m 75 76 Hash entries can be retrieved using the ``bpf_ 77 helper. This helper returns a pointer to the v 78 ``key``, or ``NULL`` if no entry was found. 79 80 bpf_map_delete_elem() 81 ~~~~~~~~~~~~~~~~~~~~~ 82 83 .. code-block:: c 84 85 long bpf_map_delete_elem(struct bpf_map *ma 86 87 Hash entries can be deleted using the ``bpf_ma 88 helper. This helper will return 0 on success, 89 of failure. 90 91 Per CPU Hashes 92 -------------- 93 94 For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP 95 the ``bpf_map_update_elem()`` and ``bpf_map_lo 96 automatically access the hash slot for the cur 97 98 bpf_map_lookup_percpu_elem() 99 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 100 101 .. code-block:: c 102 103 void *bpf_map_lookup_percpu_elem(struct bpf 104 105 The ``bpf_map_lookup_percpu_elem()`` helper ca 106 value in the hash slot for a specific CPU. Ret 107 ``key`` on ``cpu`` , or ``NULL`` if no entry w 108 invalid. 109 110 Concurrency 111 ----------- 112 113 Values stored in ``BPF_MAP_TYPE_HASH`` can be 114 programs running on different CPUs. Since Ker 115 infrastructure provides ``struct bpf_spin_lock 116 See ``tools/testing/selftests/bpf/progs/test_s 117 118 Userspace 119 --------- 120 121 bpf_map_get_next_key() 122 ~~~~~~~~~~~~~~~~~~~~~~ 123 124 .. code-block:: c 125 126 int bpf_map_get_next_key(int fd, const void 127 128 In userspace, it is possible to iterate throug 129 libbpf's ``bpf_map_get_next_key()`` function. 130 calling ``bpf_map_get_next_key()`` with ``cur_ 131 ``NULL``. Subsequent calls will fetch the next 132 current key. ``bpf_map_get_next_key()`` return 133 cur_key is the last key in the hash, or negati 134 failure. 135 136 Note that if ``cur_key`` gets deleted then ``b 137 will instead return the *first* key in the has 138 undesirable. It is recommended to use batched 139 to be key deletion intermixed with ``bpf_map_g 140 141 Examples 142 ======== 143 144 Please see the ``tools/testing/selftests/bpf`` 145 examples. The code snippets below demonstrate 146 147 This example shows how to declare an LRU Hash 148 struct value. 149 150 .. code-block:: c 151 152 #include <linux/bpf.h> 153 #include <bpf/bpf_helpers.h> 154 155 struct key { 156 __u32 srcip; 157 }; 158 159 struct value { 160 __u64 packets; 161 __u64 bytes; 162 }; 163 164 struct { 165 __uint(type, BPF_MAP_TYPE_LRU_HASH 166 __uint(max_entries, 32); 167 __type(key, struct key); 168 __type(value, struct value); 169 } packet_stats SEC(".maps"); 170 171 This example shows how to create or update has 172 instructions: 173 174 .. code-block:: c 175 176 static void update_stats(__u32 srcip, int 177 { 178 struct key key = { 179 .srcip = srcip, 180 }; 181 struct value *value = bpf_map_look 182 183 if (value) { 184 __sync_fetch_and_add(&valu 185 __sync_fetch_and_add(&valu 186 } else { 187 struct value newval = { 1, 188 189 bpf_map_update_elem(&packe 190 } 191 } 192 193 Userspace walking the map elements from the ma 194 195 .. code-block:: c 196 197 #include <bpf/libbpf.h> 198 #include <bpf/bpf.h> 199 200 static void walk_hash_elements(int map_fd) 201 { 202 struct key *cur_key = NULL; 203 struct key next_key; 204 struct value value; 205 int err; 206 207 for (;;) { 208 err = bpf_map_get_next_key 209 if (err) 210 break; 211 212 bpf_map_lookup_elem(map_fd 213 214 // Use key and value here 215 216 cur_key = &next_key; 217 } 218 } 219 220 Internals 221 ========= 222 223 This section of the document is targeted at Li 224 aspects of the map implementations that are no 225 following details are subject to change in fut 226 227 ``BPF_MAP_TYPE_LRU_HASH`` and variants 228 -------------------------------------- 229 230 Updating elements in LRU maps may trigger evic 231 of the map is reached. There are various steps 232 attempts in order to enforce the LRU property 233 other CPUs involved in the following operation 234 235 - Attempt to use CPU-local state to batch oper 236 - Attempt to fetch free nodes from global list 237 - Attempt to pull any node from a global list 238 - Attempt to pull any node from any CPU's list 239 240 This algorithm is described visually in the fo 241 description in commit 3a08c2fd7634 ("bpf: LRU 242 the corresponding operations: 243 244 .. kernel-figure:: map_lru_hash_update.dot 245 :alt: Diagram outlining the LRU eviction 246 247 LRU hash eviction during map update for ``B 248 variants. See the dot file source for kerne 249 250 Map updates start from the oval in the top rig 251 and progress through the graph towards the bot 252 either a successful update or a failure with v 253 the top right provides indicators for which lo 254 operations. This is intended as a visual hint 255 contention may impact update operations, thoug 256 impact the actual contention on those locks, b 257 the table above. For instance, if the map is c 258 ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` and flags ``B 259 properties would be per-cpu.
Linux® is a registered trademark of Linus Torvalds in the United States and other countries.
TOMOYO® is a registered trademark of NTT DATA CORPORATION.