.. SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)

=====================
BPF sk_lookup program
=====================

The BPF sk_lookup program type (``BPF_PROG_TYPE_SK_LOOKUP``) introduces
programmability into the socket lookup performed by the transport layer when a
packet is to be delivered locally.

When invoked, a BPF sk_lookup program can select a socket that will receive the
incoming packet by calling the ``bpf_sk_assign()`` BPF helper function.

Hooks for a common attach point (``BPF_SK_LOOKUP``) exist for both TCP and UDP.

Motivation
==========

The BPF sk_lookup program type was introduced to address setup scenarios where
binding sockets to an address with the ``bind()`` socket call is impractical,
such as:

1. receiving connections on a range of IP addresses, e.g. 192.0.2.0/24, when
   binding to a wildcard address ``INADDR_ANY`` is not possible due to a port
   conflict,
2. receiving connections on all or a wide range of ports, i.e. an L7 proxy use
   case.

Such setups would require creating and ``bind()``'ing one socket to each IP
address/port pair in the range, leading to resource consumption and potential
latency spikes during socket lookup.

Attachment
==========

A BPF sk_lookup program can be attached to a network namespace with the
``bpf(BPF_LINK_CREATE, ...)`` syscall using the ``BPF_SK_LOOKUP`` attach type
and a netns FD as attachment ``target_fd``.

Multiple programs can be attached to one network namespace. Programs will be
invoked in the same order as they were attached.

Hooks
=====

The attached BPF sk_lookup programs run whenever the transport layer needs to
find a listening (TCP) or an unconnected (UDP) socket for an incoming packet.

Incoming traffic to established (TCP) and connected (UDP) sockets is delivered
as usual without triggering the BPF sk_lookup hook.

The attached BPF programs must return either an ``SK_PASS`` or an ``SK_DROP``
verdict code. As for other BPF program types that are network filters,
``SK_PASS`` signifies that the socket lookup should continue on to regular
hashtable-based lookup, while ``SK_DROP`` causes the transport layer to drop the
packet.

A BPF sk_lookup program can also select a socket to receive the packet by
calling the ``bpf_sk_assign()`` BPF helper. Typically, the program looks up a
socket in a map holding sockets, such as ``SOCKMAP`` or ``SOCKHASH``, and passes
a ``struct bpf_sock *`` to the ``bpf_sk_assign()`` helper to record the
selection. Selecting a socket only takes effect if the program has terminated
with the ``SK_PASS`` code.

When multiple programs are attached, the end result is determined from the
return codes of all the programs according to the following rules:

1. If any program returned ``SK_PASS`` and selected a valid socket, the socket
   is used as the result of the socket lookup.
2. If more than one program returned ``SK_PASS`` and selected a socket, the last
   selection takes effect.
3. If any program returned ``SK_DROP``, and no program returned ``SK_PASS`` and
   selected a socket, socket lookup fails.
4. If all programs returned ``SK_PASS`` and none of them selected a socket,
   socket lookup continues on.

API
===

In its context, an instance of ``struct bpf_sk_lookup``, a BPF sk_lookup program
receives information about the packet that triggered the socket lookup. Namely:

* IP version (``AF_INET`` or ``AF_INET6``),
* L4 protocol identifier (``IPPROTO_TCP`` or ``IPPROTO_UDP``),
* source and destination IP addresses,
* source and destination L4 ports,
* the socket that has been selected with ``bpf_sk_assign()``.
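
As a rough illustration of matching on the destination address carried in the
context, the sketch below checks whether an IPv4 address falls inside a prefix
such as 192.0.2.0/24 (the first scenario under Motivation). It is plain
userspace C rather than a BPF program. ``ipv4_in_prefix()`` is a hypothetical
helper, not a kernel or libbpf API, and the sketch assumes both addresses are
held in network byte order.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel or libbpf API): return true when the
 * IPv4 address addr_be lies inside the prefix net_be/prefix_len.
 * Assumption: both addresses are in network byte order.
 */
static bool ipv4_in_prefix(uint32_t addr_be, uint32_t net_be,
			   unsigned int prefix_len)
{
	uint32_t mask;

	if (prefix_len > 32)
		return false;

	/* Shifting a 32-bit value by 32 is undefined, so treat /0 separately. */
	mask = prefix_len ? ~UINT32_C(0) << (32 - prefix_len) : 0;

	/* Compare the network parts in host byte order. */
	return (ntohl(addr_be) & mask) == (ntohl(net_be) & mask);
}
```

For instance, ``ipv4_in_prefix(inet_addr("192.0.2.17"), inet_addr("192.0.2.0"), 24)``
evaluates to true, while the same call with ``"192.0.3.1"`` as the first address
evaluates to false.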

Refer to the ``struct bpf_sk_lookup`` declaration in the ``linux/bpf.h`` user
API header, and to the `bpf-helpers(7)
<https://man7.org/linux/man-pages/man7/bpf-helpers.7.html>`_ man-page section
for ``bpf_sk_assign()``, for details.

Example
=======

See ``tools/testing/selftests/bpf/prog_tests/sk_lookup.c`` for the reference
implementation.
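
As a complement to the selftest, the four rules for combining verdicts listed
under Hooks can be modeled in plain userspace C. This is only a sketch of the
documented rules, not the kernel implementation. All names in it
(``model_verdict``, ``prog_result``, ``lookup_outcome``, ``combine()``,
``NO_SOCKET``) are hypothetical, and sockets are represented by integer IDs
purely for illustration.

```c
#include <stddef.h>

#define NO_SOCKET (-1)	/* hypothetical marker: program selected no socket */

/* Hypothetical stand-ins for the SK_DROP / SK_PASS verdict codes. */
enum model_verdict { MODEL_DROP, MODEL_PASS };

/* What one attached program did: its verdict and its selection, if any. */
struct prog_result {
	enum model_verdict verdict;
	int selected_sk;	/* socket ID, or NO_SOCKET */
};

enum outcome_kind { LOOKUP_USE_SELECTED, LOOKUP_FAIL, LOOKUP_CONTINUE };

struct lookup_outcome {
	enum outcome_kind kind;
	int sk;			/* meaningful only for LOOKUP_USE_SELECTED */
};

/* Combine per-program results, given in attachment (invocation) order. */
static struct lookup_outcome combine(const struct prog_result *res, size_t n)
{
	struct lookup_outcome out = { LOOKUP_CONTINUE, NO_SOCKET };
	int dropped = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (res[i].verdict == MODEL_DROP) {
			/* A selection made before returning a drop has no effect. */
			dropped = 1;
			continue;
		}
		/* Rules 1 and 2: a pass with a selection counts, last one wins. */
		if (res[i].selected_sk != NO_SOCKET)
			out.sk = res[i].selected_sk;
	}

	if (out.sk != NO_SOCKET)
		out.kind = LOOKUP_USE_SELECTED;	/* rule 1 */
	else if (dropped)
		out.kind = LOOKUP_FAIL;		/* rule 3 */
	/* Otherwise rule 4: every program passed without selecting. */

	return out;
}
```

In this model, results of pass with socket 5, drop, and pass with socket 9 (in
that order) yield ``LOOKUP_USE_SELECTED`` with socket 9, whereas a pass without
a selection followed by a drop yields ``LOOKUP_FAIL``.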