.. SPDX-License-Identifier: GPL-2.0

============================
BPF_PROG_TYPE_FLOW_DISSECTOR
============================

Overview
========

The flow dissector is a routine that parses metadata out of packets. It is
used in various places in the networking subsystem (RFS, flow hash, etc.).

The BPF flow dissector is an attempt to reimplement the C-based flow dissector
logic in BPF to gain all the benefits of the BPF verifier (namely, limits on
the number of instructions and tail calls).

API
===

BPF flow dissector programs operate on an ``__sk_buff``. However, only a
limited set of fields is allowed: ``data``, ``data_end`` and ``flow_keys``.
``flow_keys`` points to a ``struct bpf_flow_keys``, which contains the flow
dissector input and output arguments.

The inputs are:
  * ``nhoff`` - initial offset of the networking header
  * ``thoff`` - initial offset of the transport header, initialized to ``nhoff``
  * ``n_proto`` - L3 protocol type, parsed out of the L2 header
  * ``flags`` - optional flags

A flow dissector BPF program should fill out the rest of the ``struct
bpf_flow_keys`` fields. The input arguments ``nhoff``, ``thoff`` and
``n_proto`` should also be adjusted accordingly.

The return code of the BPF program is either ``BPF_OK`` to indicate successful
dissection, or ``BPF_DROP`` to indicate a parsing error.
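
The input/output contract described above can be illustrated with a small
userspace model. This is only a sketch, not the kernel API: ``struct
sketch_flow_keys``, ``SKETCH_OK``, ``SKETCH_DROP`` and ``sketch_dissect_ipv4``
are hypothetical stand-ins for ``struct bpf_flow_keys``, ``BPF_OK``,
``BPF_DROP`` and a real BPF program, and ``n_proto`` is kept in host byte
order for simplicity. It handles IPv4 only, advances ``thoff`` past the IP
header and reports the L4 protocol.

.. code:: c

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical return codes for this sketch (not the kernel's values). */
  enum { SKETCH_OK = 0, SKETCH_DROP = 1 };

  /* Hypothetical, reduced model of the fields discussed in the text. */
  struct sketch_flow_keys {
      uint16_t nhoff;    /* input: offset of the network header */
      uint16_t thoff;    /* input: starts equal to nhoff; output: L4 offset */
      uint16_t n_proto;  /* input: L3 protocol (host byte order here) */
      uint8_t ip_proto;  /* output: L4 protocol number */
  };

  /* Parse one IPv4 header located at data + keys->nhoff. */
  static int sketch_dissect_ipv4(const uint8_t *data, const uint8_t *data_end,
                                 struct sketch_flow_keys *keys)
  {
      size_t len = (size_t)(data_end - data);
      const uint8_t *ip;
      size_t ihl;

      if (keys->n_proto != 0x0800)       /* only IPv4 in this sketch */
          return SKETCH_DROP;
      /* Every access must stay between data and data_end. */
      if (keys->nhoff > len || len - keys->nhoff < 20)
          return SKETCH_DROP;

      ip = data + keys->nhoff;
      ihl = (size_t)(ip[0] & 0x0f) * 4;  /* header length in bytes */
      if ((ip[0] >> 4) != 4 || ihl < 20 || len - keys->nhoff < ihl)
          return SKETCH_DROP;

      keys->ip_proto = ip[9];                       /* fill an output field */
      keys->thoff = (uint16_t)(keys->nhoff + ihl);  /* adjust an input field */
      return SKETCH_OK;
  }

A caller would set ``nhoff``, ``thoff`` and ``n_proto`` before the call and
read ``thoff`` and ``ip_proto`` afterwards, which mirrors how the kernel
prepares and consumes ``flow_keys``.
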

__sk_buff->data
===============

In the VLAN-less case, this is what the initial state of the BPF flow
dissector looks like::

  +------+------+------------+-----------+
  | DMAC | SMAC | ETHER_TYPE | L3_HEADER |
  +------+------+------------+-----------+
                              ^
                              |
                              +-- flow dissector starts here


.. code:: c

  skb->data + flow_keys->nhoff points to the first byte of L3_HEADER
  flow_keys->thoff = nhoff
  flow_keys->n_proto = ETHER_TYPE

In the VLAN case, the flow dissector can be called in one of two different
states.

Pre-VLAN parsing::

  +------+------+------+-----+-----------+-----------+
  | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
  +------+------+------+-----+-----------+-----------+
                        ^
                        |
                        +-- flow dissector starts here

.. code:: c

  skb->data + flow_keys->nhoff points to the first byte of TCI
  flow_keys->thoff = nhoff
  flow_keys->n_proto = TPID

Note that TPID can be 802.1AD, in which case the BPF program has to parse
the VLAN information twice for double-tagged packets.


Post-VLAN parsing::

  +------+------+------+-----+-----------+-----------+
  | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
  +------+------+------+-----+-----------+-----------+
                                          ^
                                          |
                                          +-- flow dissector starts here

.. code:: c

  skb->data + flow_keys->nhoff points to the first byte of L3_HEADER
  flow_keys->thoff = nhoff
  flow_keys->n_proto = ETHER_TYPE

In this case the VLAN information has been processed before the flow dissector
runs, and the BPF flow dissector is not required to handle it.


The takeaway here is as follows: a BPF flow dissector program can be called
with an optional VLAN header and should gracefully handle both cases: when a
single or double VLAN tag is present and when none is present. The same
program can be called in both cases and has to be written carefully to handle
them.
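
One way to cover all of these states with a single code path is sketched below
as a userspace model. The names ``struct sketch_vlan_keys``,
``sketch_is_vlan`` and ``sketch_skip_vlan`` are hypothetical, ``n_proto`` is
kept in host byte order, and the loop is used only for readability in
userspace. The sketch is also looser than a careful dissector might be: it
accepts any combination of up to two 802.1Q/802.1AD tags.

.. code:: c

  #include <stddef.h>
  #include <stdint.h>

  #define SKETCH_ETH_P_8021Q  0x8100  /* single VLAN tag TPID */
  #define SKETCH_ETH_P_8021AD 0x88A8  /* outer tag TPID of a double tag */
  #define SKETCH_VLAN_HLEN    4       /* TCI (2) + encapsulated EtherType (2) */

  /* Hypothetical, reduced model of the input fields. */
  struct sketch_vlan_keys {
      uint16_t nhoff;
      uint16_t thoff;
      uint16_t n_proto;  /* host byte order in this sketch */
  };

  static int sketch_is_vlan(uint16_t proto)
  {
      return proto == SKETCH_ETH_P_8021Q || proto == SKETCH_ETH_P_8021AD;
  }

  /*
   * Skip zero, one or two VLAN tags. On entry in the pre-VLAN state, nhoff
   * points at the TCI and n_proto holds the TPID. Returns 0 on success and
   * -1 on a truncated packet or more than two stacked tags.
   */
  static int sketch_skip_vlan(const uint8_t *data, size_t len,
                              struct sketch_vlan_keys *keys)
  {
      int depth;

      for (depth = 0; depth < 2 && sketch_is_vlan(keys->n_proto); depth++) {
          const uint8_t *vlan;

          if ((size_t)keys->nhoff + SKETCH_VLAN_HLEN > len)
              return -1;
          vlan = data + keys->nhoff;
          /* The encapsulated EtherType follows the 2-byte TCI. */
          keys->n_proto = (uint16_t)((vlan[2] << 8) | vlan[3]);
          keys->nhoff += SKETCH_VLAN_HLEN;
          keys->thoff = keys->nhoff;
      }

      return sketch_is_vlan(keys->n_proto) ? -1 : 0;
  }

In the VLAN-less and post-VLAN states ``n_proto`` is not a VLAN TPID, so the
function returns immediately and leaves the keys untouched.
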


Flags
=====

``flow_keys->flags`` might contain optional input flags that work as follows:

* ``BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG`` - tells the BPF flow dissector to
  continue parsing the first fragment; the default expected behavior is that
  the flow dissector returns as soon as it finds out that the packet is
  fragmented; used by ``eth_get_headlen`` to estimate the length of all
  headers for GRO.
* ``BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL`` - tells the BPF flow dissector
  to stop parsing as soon as it reaches the IPv6 flow label; used by
  ``___skb_get_hash`` to get the flow hash.
* ``BPF_FLOW_DISSECTOR_F_STOP_AT_ENCAP`` - tells the BPF flow dissector to
  stop parsing as soon as it reaches encapsulated headers; used by the routing
  infrastructure.
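
As an illustration of the first flag, the decision of whether to continue to
the L4 header of an IPv4 packet might look like the userspace sketch below.
``SKETCH_F_PARSE_1ST_FRAG``, ``struct sketch_frag_keys`` and
``sketch_ipv4_continue_to_l4`` are hypothetical names; the flag's bit value
here is a placeholder for ``BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG``, and
``frag_off`` is assumed to be already converted to host byte order.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Placeholder bit for this sketch; real programs use the uapi definition. */
  #define SKETCH_F_PARSE_1ST_FRAG (1U << 0)

  /* Hypothetical, reduced model of the fragment-related keys. */
  struct sketch_frag_keys {
      uint32_t flags;      /* input flags */
      bool is_frag;        /* output: packet is a fragment */
      bool is_first_frag;  /* output: packet is the first fragment */
  };

  /*
   * frag_off is the IPv4 flags/fragment-offset field in host byte order.
   * Returns true if parsing should continue to the L4 header.
   */
  static bool sketch_ipv4_continue_to_l4(uint16_t frag_off,
                                         struct sketch_frag_keys *keys)
  {
      const uint16_t more_frags = 0x2000;   /* "more fragments" bit */
      const uint16_t offset_mask = 0x1fff;  /* fragment offset bits */

      if (!(frag_off & (more_frags | offset_mask)))
          return true;                      /* not fragmented */

      keys->is_frag = true;
      if (frag_off & offset_mask)
          return false;                     /* later fragment: no L4 header */

      keys->is_first_frag = true;
      /* Default is to stop; the flag asks to keep parsing the first fragment. */
      return (keys->flags & SKETCH_F_PARSE_1ST_FRAG) != 0;
  }
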


Reference Implementation
========================

See ``tools/testing/selftests/bpf/progs/bpf_flow.c`` for the reference
implementation and ``tools/testing/selftests/bpf/flow_dissector_load.[hc]``
for the loader. bpftool can be used to load a BPF flow dissector program as
well.

The reference implementation is organized as follows:
  * ``jmp_table`` map that contains sub-programs for each supported L3 protocol
  * ``_dissect`` routine - entry point; it does input ``n_proto`` parsing and
    does ``bpf_tail_call`` to the appropriate L3 handler

Since BPF at this point doesn't support looping (or any jumping back),
``jmp_table`` is used instead to handle multiple levels of encapsulation (and
IPv6 options).
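
The dispatch pattern can be pictured with a userspace analogy. The sketch
below is not the reference implementation: every ``sketch_`` name is
hypothetical, function pointers stand in for the sub-programs stored in
``jmp_table``, and the bounded loop in ``sketch_dissect`` stands in for the
chain of ``bpf_tail_call`` invocations (in BPF each handler would tail-call
the next one itself). It follows IPv4-in-IPv4 encapsulation only.

.. code:: c

  #include <stddef.h>
  #include <stdint.h>

  #define SKETCH_MAX_HOPS 8  /* arbitrary bound chosen for this sketch */

  enum { SKETCH_SLOT_IPV4, SKETCH_SLOT_COUNT,
         SKETCH_DONE = -1, SKETCH_ERROR = -2 };

  struct sketch_ctx {
      const uint8_t *data;
      size_t len;
      size_t off;        /* offset of the header to parse next */
      uint8_t ip_proto;  /* protocol field of the last parsed header */
      int encap_levels;  /* number of encapsulation levels seen */
  };

  /* A handler parses one header and names the next slot to jump to. */
  typedef int (*sketch_handler)(struct sketch_ctx *ctx);

  static int sketch_parse_ipv4(struct sketch_ctx *ctx)
  {
      const uint8_t *ip;
      size_t ihl;

      if (ctx->off > ctx->len || ctx->len - ctx->off < 20)
          return SKETCH_ERROR;
      ip = ctx->data + ctx->off;
      ihl = (size_t)(ip[0] & 0x0f) * 4;
      if (ihl < 20 || ctx->len - ctx->off < ihl)
          return SKETCH_ERROR;

      ctx->off += ihl;
      ctx->ip_proto = ip[9];
      if (ip[9] == 4) {             /* protocol 4: IPv4 encapsulated in IPv4 */
          ctx->encap_levels++;
          return SKETCH_SLOT_IPV4;  /* jump to the same handler again */
      }
      return SKETCH_DONE;
  }

  static const sketch_handler sketch_jmp_table[SKETCH_SLOT_COUNT] = {
      [SKETCH_SLOT_IPV4] = sketch_parse_ipv4,
  };

  /* Entry point: returns 0 on success, -1 on error or too many hops. */
  static int sketch_dissect(struct sketch_ctx *ctx, int first_slot)
  {
      int slot = first_slot;
      int hop;

      for (hop = 0; hop < SKETCH_MAX_HOPS; hop++) {
          if (slot < 0 || slot >= SKETCH_SLOT_COUNT)
              return -1;
          slot = sketch_jmp_table[slot](ctx);
          if (slot == SKETCH_DONE)
              return 0;
          if (slot == SKETCH_ERROR)
              return -1;
      }
      return -1;
  }

Re-entering a handler through the table is what lets each level of
encapsulation be parsed without a backward jump inside any single handler.
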


Current Limitations
===================

The BPF flow dissector doesn't support exporting all the metadata that the
in-kernel C-based implementation can export. A notable example is single VLAN
(802.1Q) and double VLAN (802.1AD) tags. Please refer to ``struct
bpf_flow_keys`` for the set of information that can currently be exported from
the BPF context.

When a BPF flow dissector is attached to the root network namespace
(machine-wide policy), users can't override it in their child network
namespaces.
