.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2020 Google LLC.

===========================
BPF_MAP_TYPE_CGROUP_STORAGE
===========================

The ``BPF_MAP_TYPE_CGROUP_STORAGE`` map type represents a local fixed-size
storage. It is only available with ``CONFIG_CGROUP_BPF``, and to programs that
attach to cgroups; the programs are made available by the same Kconfig. The
storage is identified by the cgroup the program is attached to.

The map provides local storage at the cgroup that the BPF program is attached
to. It provides faster and simpler access than a general-purpose hash
table, which performs a hash table lookup and requires the user to track live
cgroups on their own.

This document describes the usage and semantics of the
``BPF_MAP_TYPE_CGROUP_STORAGE`` map type. Some of its behaviors were changed in
Linux 5.9, and this document describes the differences.

Usage
=====

The map uses a key of type either ``__u64 cgroup_inode_id`` or
``struct bpf_cgroup_storage_key``, declared in ``linux/bpf.h``::

    struct bpf_cgroup_storage_key {
            __u64 cgroup_inode_id;
            __u32 attach_type;
    };

``cgroup_inode_id`` is the inode id of the cgroup directory.
``attach_type`` is the program's attach type.
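
As a minimal userspace sketch of obtaining a value for ``cgroup_inode_id``,
the helper below reads the inode number of a cgroup directory with ``stat()``.
The helper name ``cgroup_dir_inode_id`` is hypothetical, and the sketch assumes
that the inode number reported by ``stat()`` is the id the map expects; the
kernel selftests appear to obtain the id through ``name_to_handle_at()``
instead.

```c
#include <stdint.h>
#include <sys/stat.h>

/*
 * Return the inode number of the cgroup directory at `path`,
 * or 0 if the path cannot be stat'ed or is not a directory.
 *
 * Assumption: the inode number reported by stat() is the value
 * expected in cgroup_inode_id. The helper name is hypothetical.
 */
static uint64_t cgroup_dir_inode_id(const char *path)
{
        struct stat st;

        if (stat(path, &st) != 0)
                return 0;
        if (!S_ISDIR(st.st_mode))
                return 0;
        return (uint64_t)st.st_ino;
}
```

The returned value would then be passed as the ``cgrp`` argument of a lookup
helper such as the ``map_lookup()`` functions in the Examples section.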

Linux 5.9 added support for ``__u64 cgroup_inode_id`` as the key type.
When this key type is used, all attach types of the particular cgroup and
map share the same storage. Otherwise, if the key type is
``struct bpf_cgroup_storage_key``, programs of different attach types
are isolated and see different storages.
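
The difference between the two key types can be illustrated with a small
userspace model. This is only a sketch of the matching rule described above,
not the kernel's implementation; ``struct storage_key`` is a local mirror of
``struct bpf_cgroup_storage_key`` and ``storage_key_matches`` is a
hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of struct bpf_cgroup_storage_key, so that the sketch
 * builds without kernel headers. */
struct storage_key {
        uint64_t cgroup_inode_id;
        uint32_t attach_type;
};

/*
 * Model of the matching rule: with a __u64 key (attach type shared),
 * only the cgroup is compared; with the struct key, both the cgroup
 * and the attach type must match.
 */
static bool storage_key_matches(const struct storage_key *a,
                                const struct storage_key *b,
                                bool attach_type_shared)
{
        if (a->cgroup_inode_id != b->cgroup_inode_id)
                return false;
        if (attach_type_shared)
                return true;
        return a->attach_type == b->attach_type;
}
```

In this model, two programs attached to the same cgroup with different attach
types resolve to the same storage only when ``attach_type_shared`` is true.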

To access the storage in a program, use ``bpf_get_local_storage``::

    void *bpf_get_local_storage(void *map, u64 flags)

``flags`` is reserved for future use and must be 0.

There is no implicit synchronization. Storages of ``BPF_MAP_TYPE_CGROUP_STORAGE``
can be accessed by multiple programs across different CPUs, and users should
take care of synchronization themselves. The bpf infrastructure provides
``struct bpf_spin_lock`` to synchronize the storage. See
``tools/testing/selftests/bpf/progs/test_spin_lock.c``.

Examples
========

Usage with key type as ``struct bpf_cgroup_storage_key``::

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>    /* SEC(), __uint(), __type(), helpers */

    struct {
            __uint(type, BPF_MAP_TYPE_CGROUP_STORAGE);
            __type(key, struct bpf_cgroup_storage_key);
            __type(value, __u32);
    } cgroup_storage SEC(".maps");

    /* The section name is only an example attach point. */
    SEC("cgroup_skb/egress")
    int program(struct __sk_buff *skb)
    {
            /* Storage of the cgroup and attach type this program runs for. */
            __u32 *ptr = bpf_get_local_storage(&cgroup_storage, 0);
            __sync_fetch_and_add(ptr, 1);

            return 0;
    }

Userspace accessing map declared above::

    #include <linux/bpf.h>
    #include <bpf/bpf.h>            /* bpf_map_lookup_elem() */
    #include <bpf/libbpf.h>         /* bpf_map__fd() */

    __u32 map_lookup(struct bpf_map *map, __u64 cgrp, enum bpf_attach_type type)
    {
            struct bpf_cgroup_storage_key key = {
                    .cgroup_inode_id = cgrp,
                    .attach_type = type,
            };
            __u32 value = 0;
            bpf_map_lookup_elem(bpf_map__fd(map), &key, &value);
            // error checking omitted
            return value;
    }

Alternatively, using just ``__u64 cgroup_inode_id`` as key type::

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>    /* SEC(), __uint(), __type(), helpers */

    struct {
            __uint(type, BPF_MAP_TYPE_CGROUP_STORAGE);
            __type(key, __u64);
            __type(value, __u32);
    } cgroup_storage SEC(".maps");

    /* The section name is only an example attach point. */
    SEC("cgroup_skb/egress")
    int program(struct __sk_buff *skb)
    {
            /* Storage shared by all attach types of this cgroup. */
            __u32 *ptr = bpf_get_local_storage(&cgroup_storage, 0);
            __sync_fetch_and_add(ptr, 1);

            return 0;
    }

And userspace::

    #include <linux/bpf.h>
    #include <bpf/bpf.h>            /* bpf_map_lookup_elem() */
    #include <bpf/libbpf.h>         /* bpf_map__fd() */

    /* type is unused here: the attach type is not part of the key. */
    __u32 map_lookup(struct bpf_map *map, __u64 cgrp, enum bpf_attach_type type)
    {
            __u32 value = 0;
            bpf_map_lookup_elem(bpf_map__fd(map), &cgrp, &value);
            // error checking omitted
            return value;
    }

Semantics
=========

``BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE`` is a variant of this map type. The
per-CPU variant has a different memory region for each CPU for each
storage. The non-per-CPU variant has the same memory region for each storage.
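
When userspace reads a per-CPU storage, it typically has to combine the values
from all CPUs. The sketch below sums a ``__u32`` counter over a lookup buffer.
It assumes the usual per-CPU map convention, which this document does not spell
out: a lookup fills one slot per possible CPU, and each slot is padded to a
multiple of 8 bytes. The names ``percpu_slot_size`` and ``percpu_sum_u32`` are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: each per-CPU slot is padded to a multiple of 8 bytes. */
static size_t percpu_slot_size(size_t value_size)
{
        return (value_size + 7) & ~(size_t)7;
}

/* Sum a 32-bit counter over `ncpus` slots of a per-CPU lookup buffer. */
static uint64_t percpu_sum_u32(const void *buf, int ncpus)
{
        const unsigned char *p = buf;
        size_t stride = percpu_slot_size(sizeof(uint32_t));
        uint64_t total = 0;
        int cpu;

        for (cpu = 0; cpu < ncpus; cpu++) {
                uint32_t v;

                /* The counter sits at the start of each padded slot. */
                memcpy(&v, p + (size_t)cpu * stride, sizeof(v));
                total += v;
        }
        return total;
}
```

The buffer passed to the lookup call would need room for
``ncpus * percpu_slot_size(sizeof(__u32))`` bytes, where ``ncpus`` is the
number of possible CPUs.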

Prior to Linux 5.9, the lifetime of a storage was precisely per-attachment, and
for a single ``CGROUP_STORAGE`` map, there could be at most one program loaded
that used the map. A program could be attached to multiple cgroups or have
multiple attach types, and each attach created a fresh zeroed storage. The
storage was freed upon detach.

There was a one-to-one association between the map of each type (per-CPU and
non-per-CPU) and the BPF program during load verification time. As a result,
each map could only be used by one BPF program and each BPF program could only
use one storage map of each type. Because a map could only be used by one BPF
program, sharing this cgroup's storage with other BPF programs was
impossible.

Since Linux 5.9, storage can be shared by multiple programs. When a program is
attached to a cgroup, the kernel creates a new storage only if the map
does not already contain an entry for the cgroup and attach type pair;
otherwise the old storage is reused for the new attachment. If the map is
attach type shared, then the attach type is simply ignored during comparison.
Storage is freed only when either the map or the cgroup it is attached to is
being freed. Detaching will not directly free the storage, but it may cause the
reference count of the map to reach zero and indirectly free all storage in the
map.
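
The reuse-on-attach behavior since Linux 5.9 can be modelled in a few lines of
userspace C. This is an illustrative sketch for a map keyed by cgroup and
attach type, not kernel code; all ``model_`` names and the fixed table size are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ENTRIES 8

struct model_entry {
        bool in_use;
        uint64_t cgroup_inode_id;
        uint32_t attach_type;
        uint32_t value;         /* the storage itself */
};

struct model_map {
        struct model_entry entries[MODEL_MAX_ENTRIES];
};

/*
 * Model of attaching a program: reuse the storage for the
 * (cgroup, attach type) pair if the map already has one, otherwise
 * create a zeroed storage. Returns NULL if the table is full.
 */
static uint32_t *model_attach(struct model_map *map, uint64_t cgroup,
                              uint32_t attach_type)
{
        struct model_entry *free_slot = NULL;
        size_t i;

        for (i = 0; i < MODEL_MAX_ENTRIES; i++) {
                struct model_entry *e = &map->entries[i];

                if (!e->in_use) {
                        if (!free_slot)
                                free_slot = e;
                        continue;
                }
                if (e->cgroup_inode_id == cgroup &&
                    e->attach_type == attach_type)
                        return &e->value;       /* reuse old storage */
        }
        if (!free_slot)
                return NULL;

        free_slot->in_use = true;
        free_slot->cgroup_inode_id = cgroup;
        free_slot->attach_type = attach_type;
        free_slot->value = 0;                   /* fresh zeroed storage */
        return &free_slot->value;
}
```

The model has no detach operation that frees an entry, which mirrors the rule
that detaching does not directly free storage: a second attach for the same
pair sees the value left behind by the first.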

The map is not associated with any BPF program, thus making sharing possible.
However, a BPF program can still only associate with one map of each type
(per-CPU and non-per-CPU). A BPF program cannot use more than one
``BPF_MAP_TYPE_CGROUP_STORAGE`` or more than one
``BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE``.

In all versions, userspace may use the attach parameters of the cgroup and
attach type pair in ``struct bpf_cgroup_storage_key`` as the key to the BPF map
APIs to read or update the storage for a given attachment. For attach type
shared storages (Linux 5.9 and later), only the first value in the struct, the
cgroup inode id, is used during comparison, so userspace may just specify a
``__u64`` directly.

The storage is bound at attach time. Even if the program is attached to a
parent cgroup and triggers in a child, the storage still belongs to the parent.

Userspace cannot create a new entry in the map or delete an existing entry.
Program test runs always use a temporary storage.