TOMOYO Linux Cross Reference
Linux/Documentation/bpf/standardization/instruction-set.rst

Diff markup

Differences between /Documentation/bpf/standardization/instruction-set.rst (Version linux-6.12-rc7) and /Documentation/bpf/standardization/instruction-set.rst (Version linux-6.9.12)


  1 .. contents::                                       1 .. contents::
  2 .. sectnum::                                        2 .. sectnum::
  3                                                     3 
  4 ======================================              4 ======================================
  5 BPF Instruction Set Architecture (ISA)              5 BPF Instruction Set Architecture (ISA)
  6 ======================================              6 ======================================
  7                                                     7 
  8 eBPF, also commonly                            !!   8 This document specifies the BPF instruction set architecture (ISA).
  9 referred to as BPF, is a technology with origi << 
 10 that can run untrusted programs in a privilege << 
 11 operating system kernel. This document specifi << 
 12 set architecture (ISA).                        << 
 13                                                << 
 14 As a historical note, BPF originally stood for << 
 15 but now that it can do so much more than packe << 
 16 no longer makes sense. BPF is now considered a << 
 17 does not stand for anything.  The original BPF << 
 18 as cBPF (classic BPF) to distinguish it from t << 
 19 eBPF (extended BPF).                           << 
 20                                                     9 
 21 Documentation conventions                          10 Documentation conventions
 22 =========================                          11 =========================
 23                                                    12 
 24 The key words "MUST", "MUST NOT", "REQUIRED",  << 
 25 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RE << 
 26 "OPTIONAL" in this document are to be interpre << 
 27 BCP 14 `<https://www.rfc-editor.org/info/rfc21 << 
 28 `<https://www.rfc-editor.org/info/rfc8174>`_   << 
 29 when, and only when, they appear in all capita << 
 30                                                << 
 31 For brevity and consistency, this document ref     13 For brevity and consistency, this document refers to families
 32 of types using a shorthand syntax and refers t     14 of types using a shorthand syntax and refers to several expository,
 33 mnemonic functions when describing the semanti     15 mnemonic functions when describing the semantics of instructions.
 34 The range of valid values for those types and      16 The range of valid values for those types and the semantics of those
 35 functions are defined in the following subsect     17 functions are defined in the following subsections.
 36                                                    18 
 37 Types                                              19 Types
 38 -----                                              20 -----
 39 This document refers to integer types with the     21 This document refers to integer types with the notation `SN` to specify
 40 a type's signedness (`S`) and bit width (`N`),     22 a type's signedness (`S`) and bit width (`N`), respectively.
 41                                                    23 
 42 .. table:: Meaning of signedness notation      !!  24 .. table:: Meaning of signedness notation.
 43                                                    25 
 44   ==== =========                                   26   ==== =========
 45   S    Meaning                                     27   S    Meaning
 46   ==== =========                                   28   ==== =========
 47   u    unsigned                                    29   u    unsigned
 48   s    signed                                      30   s    signed
 49   ==== =========                                   31   ==== =========
 50                                                    32 
 51 .. table:: Meaning of bit-width notation       !!  33 .. table:: Meaning of bit-width notation.
 52                                                    34 
 53   ===== =========                                  35   ===== =========
 54   N     Bit width                                  36   N     Bit width
 55   ===== =========                                  37   ===== =========
 56   8     8 bits                                     38   8     8 bits
 57   16    16 bits                                    39   16    16 bits
 58   32    32 bits                                    40   32    32 bits
 59   64    64 bits                                    41   64    64 bits
 60   128   128 bits                                   42   128   128 bits
 61   ===== =========                                  43   ===== =========
 62                                                    44 
 63 For example, `u32` is a type whose valid value     45 For example, `u32` is a type whose valid values are all the 32-bit unsigned
 64 numbers and `s16` is a type whose valid values !!  46 numbers and `s16` is a types whose valid values are all the 16-bit signed
 65 numbers.                                           47 numbers.
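
As a rough illustration of the `SN` notation, the Python sketch below computes the range of valid values for a type name. The helper name `type_range` is hypothetical and is not part of the specification.

```python
def type_range(name: str) -> tuple[int, int]:
    """Return (min, max) for an SN type name such as "u32" or "s16"."""
    signedness, bits = name[0], int(name[1:])
    if signedness not in ("u", "s") or bits not in (8, 16, 32, 64, 128):
        raise ValueError(f"not an SN type name: {name!r}")
    if signedness == "u":
        # Unsigned: all N-bit patterns read as non-negative numbers.
        return 0, (1 << bits) - 1
    # Signed: two's-complement range for N bits.
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


print(type_range("u32"))  # (0, 4294967295)
print(type_range("s16"))  # (-32768, 32767)
```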
 66                                                    48 
 67 Functions                                          49 Functions
 68 ---------                                          50 ---------
 69                                                !!  51 * htobe16: Takes an unsigned 16-bit number in host-endian format and
 70 The following byteswap functions are direction !!  52   returns the equivalent number as an unsigned 16-bit number in big-endian
 71 the same function is used for conversion in ei !!  53   format.
 72 below.                                         !!  54 * htobe32: Takes an unsigned 32-bit number in host-endian format and
 73                                                !!  55   returns the equivalent number as an unsigned 32-bit number in big-endian
 74 * be16: Takes an unsigned 16-bit number and co !!  56   format.
 75   host byte order and big-endian               !!  57 * htobe64: Takes an unsigned 64-bit number in host-endian format and
 76   (`IEN137 <https://www.rfc-editor.org/ien/ien !!  58   returns the equivalent number as an unsigned 64-bit number in big-endian
 77 * be32: Takes an unsigned 32-bit number and co !!  59   format.
 78   host byte order and big-endian byte order.   !!  60 * htole16: Takes an unsigned 16-bit number in host-endian format and
 79 * be64: Takes an unsigned 64-bit number and co !!  61   returns the equivalent number as an unsigned 16-bit number in little-endian
 80   host byte order and big-endian byte order.   !!  62   format.
                                                   >>  63 * htole32: Takes an unsigned 32-bit number in host-endian format and
                                                   >>  64   returns the equivalent number as an unsigned 32-bit number in little-endian
                                                   >>  65   format.
                                                   >>  66 * htole64: Takes an unsigned 64-bit number in host-endian format and
                                                   >>  67   returns the equivalent number as an unsigned 64-bit number in little-endian
                                                   >>  68   format.
 81 * bswap16: Takes an unsigned 16-bit number in      69 * bswap16: Takes an unsigned 16-bit number in either big- or little-endian
 82   format and returns the equivalent number wit     70   format and returns the equivalent number with the same bit width but
 83   opposite endianness.                             71   opposite endianness.
 84 * bswap32: Takes an unsigned 32-bit number in      72 * bswap32: Takes an unsigned 32-bit number in either big- or little-endian
 85   format and returns the equivalent number wit     73   format and returns the equivalent number with the same bit width but
 86   opposite endianness.                             74   opposite endianness.
 87 * bswap64: Takes an unsigned 64-bit number in      75 * bswap64: Takes an unsigned 64-bit number in either big- or little-endian
 88   format and returns the equivalent number wit     76   format and returns the equivalent number with the same bit width but
 89   opposite endianness.                             77   opposite endianness.
 90 * le16: Takes an unsigned 16-bit number and co !!  78 
 91   host byte order and little-endian byte order << 
 92 * le32: Takes an unsigned 32-bit number and co << 
 93   host byte order and little-endian byte order << 
 94 * le64: Takes an unsigned 64-bit number and co << 
 95   host byte order and little-endian byte order << 
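
The directionless conversion functions of the newer version can be modelled in a few lines of Python. This is only a sketch: the generic helpers `bswap`, `be` and `le` (taking the bit width as a parameter) are illustrative names, and `sys.byteorder` stands in for "host byte order".

```python
import sys


def bswap(value: int, bits: int) -> int:
    """Reverse the byte order of an unsigned `bits`-wide number."""
    return int.from_bytes(value.to_bytes(bits // 8, "little"), "big")


def be(value: int, bits: int) -> int:
    """Convert between host byte order and big-endian (either direction)."""
    return value if sys.byteorder == "big" else bswap(value, bits)


def le(value: int, bits: int) -> int:
    """Convert between host byte order and little-endian (either direction)."""
    return value if sys.byteorder == "little" else bswap(value, bits)


print(hex(bswap(0x1234, 16)))  # 0x3412
```

Because each conversion is its own inverse, applying `be` (or `le`) twice returns the original value, which is why one function serves both directions.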
 96                                                    79 
 97 Definitions                                        80 Definitions
 98 -----------                                        81 -----------
 99                                                    82 
100 .. glossary::                                      83 .. glossary::
101                                                    84 
102   Sign Extend                                      85   Sign Extend
103     To `sign extend an` ``X`` `-bit number, A,     86     To `sign extend an` ``X`` `-bit number, A, to a` ``Y`` `-bit number, B  ,` means to
104                                                    87 
105     #. Copy all ``X`` bits from `A` to the low     88     #. Copy all ``X`` bits from `A` to the lower ``X`` bits of `B`.
106     #. Set the value of the remaining ``Y`` -      89     #. Set the value of the remaining ``Y`` - ``X`` bits of `B` to the value of
107        the  most-significant bit of `A`.           90        the  most-significant bit of `A`.
108                                                    91 
109 .. admonition:: Example                            92 .. admonition:: Example
110                                                    93 
111   Sign extend an 8-bit number ``A`` to a 16-bi     94   Sign extend an 8-bit number ``A`` to a 16-bit number ``B`` on a big-endian platform:
112   ::                                               95   ::
113                                                    96 
114     A:          10000110                           97     A:          10000110
115     B: 11111111 10000110                           98     B: 11111111 10000110
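
The two steps of the definition can be sketched in Python as follows. The function name `sign_extend` and its parameter order are illustrative assumptions, not something the specification defines.

```python
def sign_extend(a: int, x: int, y: int) -> int:
    """Sign extend the X-bit number `a` to a Y-bit number."""
    if not 0 < x <= y:
        raise ValueError("require 0 < X <= Y")
    a &= (1 << x) - 1                    # step 1: copy the low X bits
    msb = (a >> (x - 1)) & 1             # most-significant bit of A
    if msb:
        a |= ((1 << (y - x)) - 1) << x   # step 2: fill the upper Y - X bits
    return a


print(f"{sign_extend(0b10000110, 8, 16):016b}")  # 1111111110000110
```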
116                                                    99 
117 Conformance groups                                100 Conformance groups
118 ------------------                                101 ------------------
119                                                   102 
120 An implementation does not need to support all    103 An implementation does not need to support all instructions specified in this
121 document (e.g., deprecated instructions).  Ins    104 document (e.g., deprecated instructions).  Instead, a number of conformance
122 groups are specified.  An implementation MUST  !! 105 groups are specified.  An implementation must support the base32 conformance
123 group and MAY support additional conformance g !! 106 group and may support additional conformance groups, where supporting a
124 conformance group means it MUST support all in !! 107 conformance group means it must support all instructions in that conformance
125 group.                                            108 group.
126                                                   109 
127 The use of named conformance groups enables in    110 The use of named conformance groups enables interoperability between a runtime
128 that executes instructions, and tools such as  !! 111 that executes instructions, and tools as such compilers that generate
129 instructions for the runtime.  Thus, capabilit    112 instructions for the runtime.  Thus, capability discovery in terms of
130 conformance groups might be done manually by u    113 conformance groups might be done manually by users or automatically by tools.
131                                                   114 
132 Each conformance group has a short ASCII label    115 Each conformance group has a short ASCII label (e.g., "base32") that
133 corresponds to a set of instructions that are     116 corresponds to a set of instructions that are mandatory.  That is, each
134 instruction has one or more conformance groups    117 instruction has one or more conformance groups of which it is a member.
135                                                   118 
136 This document defines the following conformanc    119 This document defines the following conformance groups:
137                                                   120 
138 * base32: includes all instructions defined in    121 * base32: includes all instructions defined in this
139   specification unless otherwise noted.           122   specification unless otherwise noted.
140 * base64: includes base32, plus instructions e    123 * base64: includes base32, plus instructions explicitly noted
141   as being in the base64 conformance group.       124   as being in the base64 conformance group.
142 * atomic32: includes 32-bit atomic operation i    125 * atomic32: includes 32-bit atomic operation instructions (see `Atomic operations`_).
143 * atomic64: includes atomic32, plus 64-bit ato    126 * atomic64: includes atomic32, plus 64-bit atomic operation instructions.
144 * divmul32: includes 32-bit division, multipli    127 * divmul32: includes 32-bit division, multiplication, and modulo instructions.
145 * divmul64: includes divmul32, plus 64-bit div    128 * divmul64: includes divmul32, plus 64-bit division, multiplication,
146   and modulo instructions.                        129   and modulo instructions.
147 * packet: deprecated packet access instruction    130 * packet: deprecated packet access instructions.
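
One way a tool might reason about the groups listed above is to expand a claimed set of groups with the groups they include, then check that base32 is present. The following Python sketch assumes that model; the names `INCLUDES`, `expand` and `is_valid_implementation` are hypothetical.

```python
# Groups that include another group, as listed above.
INCLUDES = {
    "base64": "base32",
    "atomic64": "atomic32",
    "divmul64": "divmul32",
}


def expand(groups):
    """Add every group included by the groups named in `groups`."""
    result = set(groups)
    for group in groups:
        parent = INCLUDES.get(group)
        if parent:
            result.add(parent)
    return result


def is_valid_implementation(groups):
    """An implementation must support at least the base32 group."""
    return "base32" in expand(groups)


print(sorted(expand({"base64", "divmul64"})))
print(is_valid_implementation({"atomic64"}))  # False: base32 is missing
```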
148                                                   131 
149 Instruction encoding                              132 Instruction encoding
150 ====================                              133 ====================
151                                                   134 
152 BPF has two instruction encodings:                135 BPF has two instruction encodings:
153                                                   136 
154 * the basic instruction encoding, which uses 6    137 * the basic instruction encoding, which uses 64 bits to encode an instruction
155 * the wide instruction encoding, which appends    138 * the wide instruction encoding, which appends a second 64 bits
156   after the basic instruction for a total of 1    139   after the basic instruction for a total of 128 bits.
157                                                   140 
158 Basic instruction encoding                        141 Basic instruction encoding
159 --------------------------                        142 --------------------------
160                                                   143 
161 A basic instruction is encoded as follows::       144 A basic instruction is encoded as follows::
162                                                   145 
163   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-    146   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
164   |    opcode     |     regs      |               147   |    opcode     |     regs      |            offset             |
165   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-    148   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
166   |                              imm              149   |                              imm                              |
167   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-    150   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
168                                                   151 
169 **opcode**                                        152 **opcode**
170   operation to perform, encoded as follows::      153   operation to perform, encoded as follows::
171                                                   154 
172     +-+-+-+-+-+-+-+-+                             155     +-+-+-+-+-+-+-+-+
173     |specific |class|                             156     |specific |class|
174     +-+-+-+-+-+-+-+-+                             157     +-+-+-+-+-+-+-+-+
175                                                   158 
176   **specific**                                    159   **specific**
177     The format of these bits varies by instruc    160     The format of these bits varies by instruction class
178                                                   161 
179   **class**                                       162   **class**
180     The instruction class (see `Instruction cl    163     The instruction class (see `Instruction classes`_)
181                                                   164 
182 **regs**                                          165 **regs**
183   The source and destination register numbers,    166   The source and destination register numbers, encoded as follows
184   on a little-endian host::                       167   on a little-endian host::
185                                                   168 
186     +-+-+-+-+-+-+-+-+                             169     +-+-+-+-+-+-+-+-+
187     |src_reg|dst_reg|                             170     |src_reg|dst_reg|
188     +-+-+-+-+-+-+-+-+                             171     +-+-+-+-+-+-+-+-+
189                                                   172 
190   and as follows on a big-endian host::           173   and as follows on a big-endian host::
191                                                   174 
192     +-+-+-+-+-+-+-+-+                             175     +-+-+-+-+-+-+-+-+
193     |dst_reg|src_reg|                             176     |dst_reg|src_reg|
194     +-+-+-+-+-+-+-+-+                             177     +-+-+-+-+-+-+-+-+
195                                                   178 
196   **src_reg**                                     179   **src_reg**
197     the source register number (0-10), except     180     the source register number (0-10), except where otherwise specified
198     (`64-bit immediate instructions`_ reuse th    181     (`64-bit immediate instructions`_ reuse this field for other purposes)
199                                                   182 
200   **dst_reg**                                     183   **dst_reg**
201     destination register number (0-10), unless !! 184     destination register number (0-10)
202     (future instructions might reuse this fiel << 
203                                                   185 
204 **offset**                                        186 **offset**
205   signed integer offset used with pointer arit !! 187   signed integer offset used with pointer arithmetic
206   otherwise specified (some arithmetic instruc << 
207   for other purposes)                          << 
208                                                   188 
209 **imm**                                           189 **imm**
210   signed integer immediate value                  190   signed integer immediate value
211                                                   191 
212 Note that the contents of multi-byte fields ('    192 Note that the contents of multi-byte fields ('offset' and 'imm') are
213 stored using big-endian byte ordering on big-e    193 stored using big-endian byte ordering on big-endian hosts and
214 little-endian byte ordering on little-endian h    194 little-endian byte ordering on little-endian hosts.
215                                                   195 
216 For example::                                     196 For example::
217                                                   197 
218   opcode                  offset imm              198   opcode                  offset imm          assembly
219          src_reg dst_reg                          199          src_reg dst_reg
220   07     0       1        00 00  44 33 22 11      200   07     0       1        00 00  44 33 22 11  r1 += 0x11223344 // little
221          dst_reg src_reg                          201          dst_reg src_reg
222   07     1       0        00 00  11 22 33 44      202   07     1       0        00 00  11 22 33 44  r1 += 0x11223344 // big
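
The two example rows can be reproduced with Python's `struct` module. This is a minimal sketch: `encode_basic` is a hypothetical helper, and the placement of each register number in the upper or lower nibble of the 'regs' byte is read off the two diagrams above.

```python
import struct


def encode_basic(opcode, dst_reg, src_reg, offset, imm, byteorder="little"):
    """Pack one 64-bit basic instruction for the given host byte order."""
    if byteorder == "little":
        regs = (src_reg << 4) | dst_reg   # |src_reg|dst_reg|
        fmt = "<BBhi"
    else:
        regs = (dst_reg << 4) | src_reg   # |dst_reg|src_reg|
        fmt = ">BBhi"
    # B: opcode, B: regs, h: signed 16-bit offset, i: signed 32-bit imm
    return struct.pack(fmt, opcode, regs, offset, imm)


# r1 += 0x11223344
print(encode_basic(0x07, 1, 0, 0, 0x11223344, "little").hex(" "))
# 07 01 00 00 44 33 22 11
print(encode_basic(0x07, 1, 0, 0, 0x11223344, "big").hex(" "))
# 07 10 00 00 11 22 33 44
```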
223                                                   203 
224 Note that most instructions do not use all of     204 Note that most instructions do not use all of the fields.
225 Unused fields SHALL be cleared to zero.        !! 205 Unused fields shall be cleared to zero.
226                                                   206 
227 Wide instruction encoding                         207 Wide instruction encoding
228 --------------------------                        208 --------------------------
229                                                   209 
230 Some instructions are defined to use the wide     210 Some instructions are defined to use the wide instruction encoding,
231 which uses two 32-bit immediate values.  The 6    211 which uses two 32-bit immediate values.  The 64 bits following
232 the basic instruction format contain a pseudo     212 the basic instruction format contain a pseudo instruction
233 with 'opcode', 'dst_reg', 'src_reg', and 'offs    213 with 'opcode', 'dst_reg', 'src_reg', and 'offset' all set to zero.
234                                                   214 
235 This is depicted in the following figure::        215 This is depicted in the following figure::
236                                                   216 
237   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-    217   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
238   |    opcode     |     regs      |               218   |    opcode     |     regs      |            offset             |
239   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-    219   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
240   |                              imm              220   |                              imm                              |
241   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-    221   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
242   |                           reserved            222   |                           reserved                            |
243   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-    223   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
244   |                           next_imm            224   |                           next_imm                            |
245   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-    225   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
246                                                   226 
247 **opcode**                                        227 **opcode**
248   operation to perform, encoded as explained a    228   operation to perform, encoded as explained above
249                                                   229 
250 **regs**                                          230 **regs**
251   The source and destination register numbers  !! 231   The source and destination register numbers, encoded as explained above
252   specified), encoded as explained above       << 
253                                                   232 
254 **offset**                                        233 **offset**
255   signed integer offset used with pointer arit !! 234   signed integer offset used with pointer arithmetic
256   otherwise specified                          << 
257                                                   235 
258 **imm**                                           236 **imm**
259   signed integer immediate value                  237   signed integer immediate value
260                                                   238 
261 **reserved**                                      239 **reserved**
262   unused, set to zero                             240   unused, set to zero
263                                                   241 
264 **next_imm**                                      242 **next_imm**
265   second signed integer immediate value           243   second signed integer immediate value
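
The 128-bit layout can be sketched the same way. The example below assumes a little-endian host; `encode_wide` is a hypothetical helper, and the opcode value 0x18 is only a placeholder, since the instructions that use the wide encoding are defined elsewhere in the specification.

```python
import struct


def encode_wide(opcode, dst_reg, src_reg, offset, imm, next_imm):
    """Pack a 128-bit wide instruction (little-endian host assumed)."""
    regs = (src_reg << 4) | dst_reg
    reserved = 0  # pseudo instruction: opcode, regs and offset all zero
    # First 64 bits: basic instruction. Second 64 bits: reserved + next_imm.
    return struct.pack("<BBhiIi", opcode, regs, offset, imm, reserved, next_imm)


insn = encode_wide(0x18, 1, 0, 0, 0x55667788, 0x11223344)
print(len(insn))            # 16
print(insn[:8].hex(" "))    # 18 01 00 00 88 77 66 55
print(insn[8:].hex(" "))    # 00 00 00 00 44 33 22 11
```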
266                                                   244 
267 Instruction classes                               245 Instruction classes
268 -------------------                               246 -------------------
269                                                   247 
270 The three least significant bits of the 'opcod    248 The three least significant bits of the 'opcode' field store the instruction class:
271                                                   249 
272 .. table:: Instruction class                   !! 250 =====  =====  ===============================  ===================================
273                                                !! 251 class  value  description                      reference
274   =====  =====  ============================== !! 252 =====  =====  ===============================  ===================================
275   class  value  description                    !! 253 LD     0x0    non-standard load operations     `Load and store instructions`_
276   =====  =====  ============================== !! 254 LDX    0x1    load into register operations    `Load and store instructions`_
277   LD     0x0    non-standard load operations   !! 255 ST     0x2    store from immediate operations  `Load and store instructions`_
278   LDX    0x1    load into register operations  !! 256 STX    0x3    store from register operations   `Load and store instructions`_
279   ST     0x2    store from immediate operation !! 257 ALU    0x4    32-bit arithmetic operations     `Arithmetic and jump instructions`_
280   STX    0x3    store from register operations !! 258 JMP    0x5    64-bit jump operations           `Arithmetic and jump instructions`_
281   ALU    0x4    32-bit arithmetic operations   !! 259 JMP32  0x6    32-bit jump operations           `Arithmetic and jump instructions`_
282   JMP    0x5    64-bit jump operations         !! 260 ALU64  0x7    64-bit arithmetic operations     `Arithmetic and jump instructions`_
283   JMP32  0x6    32-bit jump operations         !! 261 =====  =====  ===============================  ===================================
284   ALU64  0x7    64-bit arithmetic operations   << 
285   =====  =====  ============================== << 
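
Extracting the class from an opcode is a single mask operation. The sketch below mirrors the table above; the dictionary `CLASS_NAMES` and the function `instruction_class` are illustrative names.

```python
CLASS_NAMES = {
    0x0: "LD", 0x1: "LDX", 0x2: "ST", 0x3: "STX",
    0x4: "ALU", 0x5: "JMP", 0x6: "JMP32", 0x7: "ALU64",
}


def instruction_class(opcode: int) -> str:
    """Return the class name stored in the three least significant bits."""
    return CLASS_NAMES[opcode & 0x07]


print(instruction_class(0x07))  # ALU64
print(instruction_class(0x04))  # ALU
```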
286                                                   262 
287 Arithmetic and jump instructions                  263 Arithmetic and jump instructions
288 ================================                  264 ================================
289                                                   265 
290 For arithmetic and jump instructions (``ALU``,    266 For arithmetic and jump instructions (``ALU``, ``ALU64``, ``JMP`` and
291 ``JMP32``), the 8-bit 'opcode' field is divide    267 ``JMP32``), the 8-bit 'opcode' field is divided into three parts::
292                                                   268 
293   +-+-+-+-+-+-+-+-+                               269   +-+-+-+-+-+-+-+-+
294   |  code |s|class|                               270   |  code |s|class|
295   +-+-+-+-+-+-+-+-+                               271   +-+-+-+-+-+-+-+-+
296                                                   272 
297 **code**                                          273 **code**
298   the operation code, whose meaning varies by     274   the operation code, whose meaning varies by instruction class
299                                                   275 
300 **s (source)**                                    276 **s (source)**
301   the source operand location, which unless ot    277   the source operand location, which unless otherwise specified is one of:
302                                                   278 
303   .. table:: Source operand location           !! 279   ======  =====  ==============================================
304                                                !! 280   source  value  description
305     ======  =====  =========================== !! 281   ======  =====  ==============================================
306     source  value  description                 !! 282   K       0      use 32-bit 'imm' value as source operand
307     ======  =====  =========================== !! 283   X       1      use 'src_reg' register value as source operand
308     K       0      use 32-bit 'imm' value as s !! 284   ======  =====  ==============================================
309     X       1      use 'src_reg' register valu << 
310     ======  =====  =========================== << 
311                                                   285 
312 **instruction class**                             286 **instruction class**
313   the instruction class (see `Instruction clas    287   the instruction class (see `Instruction classes`_)
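
For the arithmetic and jump classes, the three parts of the opcode can be separated with shifts and masks. A minimal sketch, with `split_opcode` as a hypothetical helper name:

```python
def split_opcode(opcode: int) -> tuple[int, int, int]:
    """Split an 8-bit ALU/JMP opcode into (code, source, class)."""
    code = (opcode >> 4) & 0x0F    # upper four bits
    source = (opcode >> 3) & 0x01  # 0 = K (imm), 1 = X (src_reg)
    cls = opcode & 0x07            # lower three bits
    return code, source, cls


# 0x07 from the encoding example: code 0x0, source K, class 0x7
print(split_opcode(0x07))  # (0, 0, 7)
print(split_opcode(0x0F))  # (0, 1, 7)
```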
314                                                   288 
315 Arithmetic instructions                           289 Arithmetic instructions
316 -----------------------                           290 -----------------------
317                                                   291 
318 ``ALU`` uses 32-bit wide operands while ``ALU6    292 ``ALU`` uses 32-bit wide operands while ``ALU64`` uses 64-bit wide operands for
319 otherwise identical operations. ``ALU64`` inst    293 otherwise identical operations. ``ALU64`` instructions belong to the
320 base64 conformance group unless noted otherwis    294 base64 conformance group unless noted otherwise.
321 The 'code' field encodes the operation as belo !! 295 The 'code' field encodes the operation as below, where 'src' and 'dst' refer
322 the source operand and 'dst' refers to the val !! 296 to the values of the source and destination registers, respectively.
323 register.                                      !! 297 
324                                                !! 298 =====  =====  =======  ==========================================================
325 .. table:: Arithmetic instructions             !! 299 name   code   offset   description
326                                                !! 300 =====  =====  =======  ==========================================================
327   =====  =====  =======  ===================== !! 301 ADD    0x0    0        dst += src
328   name   code   offset   description           !! 302 SUB    0x1    0        dst -= src
329   =====  =====  =======  ===================== !! 303 MUL    0x2    0        dst \*= src
330   ADD    0x0    0        dst += src            !! 304 DIV    0x3    0        dst = (src != 0) ? (dst / src) : 0
331   SUB    0x1    0        dst -= src            !! 305 SDIV   0x3    1        dst = (src != 0) ? (dst s/ src) : 0
332   MUL    0x2    0        dst \*= src           !! 306 OR     0x4    0        dst \|= src
333   DIV    0x3    0        dst = (src != 0) ? (d !! 307 AND    0x5    0        dst &= src
334   SDIV   0x3    1        dst = (src != 0) ? (d !! 308 LSH    0x6    0        dst <<= (src & mask)
335   OR     0x4    0        dst \|= src           !! 309 RSH    0x7    0        dst >>= (src & mask)
336   AND    0x5    0        dst &= src            !! 310 NEG    0x8    0        dst = -dst
337   LSH    0x6    0        dst <<= (src & mask)  !! 311 MOD    0x9    0        dst = (src != 0) ? (dst % src) : dst
338   RSH    0x7    0        dst >>= (src & mask)  !! 312 SMOD   0x9    1        dst = (src != 0) ? (dst s% src) : dst
339   NEG    0x8    0        dst = -dst            !! 313 XOR    0xa    0        dst ^= src
340   MOD    0x9    0        dst = (src != 0) ? (d !! 314 MOV    0xb    0        dst = src
341   SMOD   0x9    1        dst = (src != 0) ? (d !! 315 MOVSX  0xb    8/16/32  dst = (s8,s16,s32)src
342   XOR    0xa    0        dst ^= src            !! 316 ARSH   0xc    0        :term:`sign extending<Sign Extend>` dst >>= (src & mask)
343   MOV    0xb    0        dst = src             !! 317 END    0xd    0        byte swap operations (see `Byte swap instructions`_ below)
344   MOVSX  0xb    8/16/32  dst = (s8,s16,s32)src !! 318 =====  =====  =======  ==========================================================
345   ARSH   0xc    0        :term:`sign extending << 
346   END    0xd    0        byte swap operations  << 
347   =====  =====  =======  ===================== << 
348                                                   319 
349 Underflow and overflow are allowed during arit    320 Underflow and overflow are allowed during arithmetic operations, meaning
350 the 64-bit or 32-bit value will wrap. If BPF p    321 the 64-bit or 32-bit value will wrap. If BPF program execution would
351 result in division by zero, the destination re    322 result in division by zero, the destination register is instead set to zero.
352 If execution would result in modulo by zero, f    323 If execution would result in modulo by zero, for ``ALU64`` the value of
353 the destination register is unchanged whereas     324 the destination register is unchanged whereas for ``ALU`` the upper
354 32 bits of the destination register are zeroed    325 32 bits of the destination register are zeroed.
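As an informal illustration (not part of the specification text), the divide-by-zero and modulo-by-zero rules above can be sketched in C. The function names are hypothetical, and the register is modelled as a plain ``uint64_t``:

```c
#include <stdint.h>

/* Hypothetical helper names; each models one row of the table above. */

/* {DIV, X, ALU64}: a zero divisor yields 0 instead of trapping. */
static uint64_t bpf_div64(uint64_t dst, uint64_t src)
{
    return src != 0 ? dst / src : 0;
}

/* {MOD, X, ALU64}: a zero divisor leaves dst unchanged. */
static uint64_t bpf_mod64(uint64_t dst, uint64_t src)
{
    return src != 0 ? dst % src : dst;
}

/* {DIV, X, ALU}: only the low 32 bits take part; the result is zero extended. */
static uint64_t bpf_div32(uint64_t dst, uint64_t src)
{
    uint32_t d = (uint32_t)dst, s = (uint32_t)src;

    return s != 0 ? (uint64_t)(d / s) : 0;
}

/* {MOD, X, ALU}: a zero divisor keeps the low 32 bits and zeroes the upper 32. */
static uint64_t bpf_mod32(uint64_t dst, uint64_t src)
{
    uint32_t d = (uint32_t)dst, s = (uint32_t)src;

    return (uint64_t)(s != 0 ? d % s : d);
}
```

Note that for ``ALU`` the divisor is tested after truncation to 32 bits, so a source register holding 0x100000000 counts as zero in this sketch.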
355                                                   326 
356 ``{ADD, X, ALU}``, where 'code' = ``ADD``, 'so    327 ``{ADD, X, ALU}``, where 'code' = ``ADD``, 'source' = ``X``, and 'class' = ``ALU``, means::
357                                                   328 
358   dst = (u32) ((u32) dst + (u32) src)             329   dst = (u32) ((u32) dst + (u32) src)
359                                                   330 
360 where '(u32)' indicates that the upper 32 bits    331 where '(u32)' indicates that the upper 32 bits are zeroed.
361                                                   332 
362 ``{ADD, X, ALU64}`` means::                       333 ``{ADD, X, ALU64}`` means::
363                                                   334 
364   dst = dst + src                                 335   dst = dst + src
365                                                   336 
366 ``{XOR, K, ALU}`` means::                         337 ``{XOR, K, ALU}`` means::
367                                                   338 
368   dst = (u32) dst ^ (u32) imm                     339   dst = (u32) dst ^ (u32) imm
369                                                   340 
370 ``{XOR, K, ALU64}`` means::                       341 ``{XOR, K, ALU64}`` means::
371                                                   342 
372   dst = dst ^ imm                                 343   dst = dst ^ imm
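The four examples above can be written out as a minimal C sketch. The function names are hypothetical. Sign extending 'imm' to 64 bits in the ``ALU64`` case is an assumption here, chosen to mirror the rule this document states for ``ALU64`` division and modulo:

```c
#include <stdint.h>

/* {ADD, X, ALU}: 32-bit add that wraps modulo 2^32; upper 32 bits are zeroed. */
static uint64_t alu_add_x(uint64_t dst, uint64_t src)
{
    return (uint32_t)((uint32_t)dst + (uint32_t)src);
}

/* {ADD, X, ALU64}: 64-bit add that wraps modulo 2^64. */
static uint64_t alu64_add_x(uint64_t dst, uint64_t src)
{
    return dst + src;
}

/* {XOR, K, ALU}: both operands are treated as 32-bit values. */
static uint64_t alu_xor_k(uint64_t dst, int32_t imm)
{
    return (uint32_t)dst ^ (uint32_t)imm;
}

/* {XOR, K, ALU64}: assumption: imm is sign extended from 32 to 64 bits. */
static uint64_t alu64_xor_k(uint64_t dst, int32_t imm)
{
    return dst ^ (uint64_t)(int64_t)imm;
}
```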
373                                                   344 
374 Note that most arithmetic instructions have 'o !! 345 Note that most instructions have instruction offset of 0. Only three instructions
375 (``SDIV``, ``SMOD``, ``MOVSX``) have a non-zer !! 346 (``SDIV``, ``SMOD``, ``MOVSX``) have a non-zero offset.
376                                                   347 
377 Division, multiplication, and modulo operation    348 Division, multiplication, and modulo operations for ``ALU`` are part
378 of the "divmul32" conformance group, and divis    349 of the "divmul32" conformance group, and division, multiplication, and
379 modulo operations for ``ALU64`` are part of th    350 modulo operations for ``ALU64`` are part of the "divmul64" conformance
380 group.                                            351 group.
381 The division and modulo operations support bot    352 The division and modulo operations support both unsigned and signed flavors.
382                                                   353 
383 For unsigned operations (``DIV`` and ``MOD``),    354 For unsigned operations (``DIV`` and ``MOD``), for ``ALU``,
384 'imm' is interpreted as a 32-bit unsigned valu    355 'imm' is interpreted as a 32-bit unsigned value. For ``ALU64``,
385 'imm' is first :term:`sign extended<Sign Exten    356 'imm' is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
386 interpreted as a 64-bit unsigned value.           357 interpreted as a 64-bit unsigned value.
387                                                   358 
388 For signed operations (``SDIV`` and ``SMOD``),    359 For signed operations (``SDIV`` and ``SMOD``), for ``ALU``,
389 'imm' is interpreted as a 32-bit signed value.    360 'imm' is interpreted as a 32-bit signed value. For ``ALU64``, 'imm'
390 is first :term:`sign extended<Sign Extend>` fr    361 is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
391 interpreted as a 64-bit signed value.             362 interpreted as a 64-bit signed value.
392                                                   363 
393 Note that there are varying definitions of the    364 Note that there are varying definitions of the signed modulo operation
394 when the dividend or divisor is negative, wher    365 when the dividend or divisor is negative, where implementations often
395 vary by language such that Python, Ruby, etc.     366 vary by language such that Python, Ruby, etc.  differ from C, Go, Java,
396 etc. This specification requires that signed m !! 367 etc. This specification requires that signed modulo use truncated division
397 (where -13 % 3 == -1) as implemented in C, Go, !! 368 (where -13 % 3 == -1) as implemented in C, Go, etc.:
398                                                   369 
399    a % n = a - n * trunc(a / n)                   370    a % n = a - n * trunc(a / n)
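A short C sketch of the truncated-division definition, for illustration only. The function names are hypothetical. C99 and later define integer division as truncating toward zero, so ``a / n`` plays the role of ``trunc(a / n)``:

```c
#include <stdint.h>

/*
 * a % n = a - n * trunc(a / n)
 * The caller must ensure n != 0 and must avoid INT64_MIN / -1, which
 * overflows in C; this sketch does not model that corner case.
 */
static int64_t smod_trunc(int64_t a, int64_t n)
{
    return a - n * (a / n);
}

/* {SMOD, X, ALU64}: a zero divisor leaves dst unchanged. */
static int64_t bpf_smod64(int64_t dst, int64_t src)
{
    return src != 0 ? smod_trunc(dst, src) : dst;
}
```

With this definition the result takes the sign of the dividend, which is why -13 % 3 is -1 rather than 2.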
400                                                   371 
401 The ``MOVSX`` instruction does a move operatio    372 The ``MOVSX`` instruction does a move operation with sign extension.
402 ``{MOVSX, X, ALU}`` :term:`sign extends<Sign E !! 373 ``{MOVSX, X, ALU}`` :term:`sign extends<Sign Extend>` 8-bit and 16-bit operands into 32
403 32-bit operands, and zeroes the remaining uppe !! 374 bit operands, and zeroes the remaining upper 32 bits.
404 ``{MOVSX, X, ALU64}`` :term:`sign extends<Sign    375 ``{MOVSX, X, ALU64}`` :term:`sign extends<Sign Extend>` 8-bit, 16-bit, and 32-bit
405 operands into 64-bit operands.  Unlike other a !! 376 operands into 64 bit operands.  Unlike other arithmetic instructions,
406 ``MOVSX`` is only defined for register source     377 ``MOVSX`` is only defined for register source operands (``X``).
407                                                   378 
408 ``{MOV, K, ALU64}`` means::                    << 
409                                                << 
410   dst = (s64)imm                               << 
411                                                << 
412 ``{MOV, X, ALU}`` means::                      << 
413                                                << 
414   dst = (u32)src                               << 
415                                                << 
416 ``{MOVSX, X, ALU}`` with 'offset' 8 means::    << 
417                                                << 
418   dst = (u32)(s32)(s8)src                      << 
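The ``MOVSX`` behaviour described above can be sketched in C as follows. The function names and the ``width`` parameter (standing in for the 'offset' field) are hypothetical:

```c
#include <stdint.h>

/*
 * Narrowing an out-of-range value to a signed type is implementation-defined
 * in C before C23; common compilers wrap modulo 2^N, which this sketch assumes.
 */

/* {MOVSX, X, ALU}: width is 8 or 16; the upper 32 bits end up zero. */
static uint64_t movsx32(uint64_t src, int width)
{
    int32_t v = (width == 8) ? (int32_t)(int8_t)src : (int32_t)(int16_t)src;

    return (uint32_t)v;
}

/* {MOVSX, X, ALU64}: width is 8, 16, or 32; sign extends to all 64 bits. */
static uint64_t movsx64(uint64_t src, int width)
{
    int64_t v;

    switch (width) {
    case 8:
        v = (int8_t)src;
        break;
    case 16:
        v = (int16_t)src;
        break;
    default:
        v = (int32_t)src;
        break;
    }
    return (uint64_t)v;
}
```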
419                                                << 
420                                                << 
421 The ``NEG`` instruction is only defined when t    379 The ``NEG`` instruction is only defined when the source bit is clear
422 (``K``).                                          380 (``K``).
423                                                   381 
424 Shift operations use a mask of 0x3F (63) for 6    382 Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
425 for 32-bit operations.                            383 for 32-bit operations.
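To illustrate the effect of the shift mask, here is a minimal C sketch with hypothetical function names. Because right-shifting a negative signed value is implementation-defined in C, ``ARSH`` is modelled by filling the vacated bits explicitly:

```c
#include <stdint.h>

/* {LSH, X, ALU64} and {RSH, X, ALU64}: shift amount is masked with 0x3F. */
static uint64_t lsh64(uint64_t dst, uint64_t src)
{
    return dst << (src & 0x3F);
}

static uint64_t rsh64(uint64_t dst, uint64_t src)
{
    return dst >> (src & 0x3F);
}

/* {LSH, X, ALU}: shift amount is masked with 0x1F; upper 32 bits are zeroed. */
static uint64_t lsh32(uint64_t dst, uint64_t src)
{
    return (uint32_t)((uint32_t)dst << (src & 0x1F));
}

/* {ARSH, X, ALU64}: arithmetic shift, copies of the sign bit fill from the left. */
static uint64_t arsh64(uint64_t dst, uint64_t src)
{
    unsigned int s = (unsigned int)(src & 0x3F);
    uint64_t result = dst >> s;

    if ((dst >> 63) && s != 0)
        result |= ~(UINT64_MAX >> s);
    return result;
}
```

A consequence of the mask is that a 64-bit shift by 64 behaves like a shift by 0, and a 32-bit shift by 32 likewise leaves the low 32 bits unchanged.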
426                                                   384 
427 Byte swap instructions                            385 Byte swap instructions
428 ----------------------                            386 ----------------------
429                                                   387 
430 The byte swap instructions use instruction cla    388 The byte swap instructions use instruction classes of ``ALU`` and ``ALU64``
431 and a 4-bit 'code' field of ``END``.              389 and a 4-bit 'code' field of ``END``.
432                                                   390 
433 The byte swap instructions operate on the dest    391 The byte swap instructions operate on the destination register
434 only and do not use a separate source register    392 only and do not use a separate source register or immediate value.
435                                                   393 
436 For ``ALU``, the 1-bit source operand field in    394 For ``ALU``, the 1-bit source operand field in the opcode is used to
437 select what byte order the operation converts     395 select what byte order the operation converts from or to. For
438 ``ALU64``, the 1-bit source operand field in t    396 ``ALU64``, the 1-bit source operand field in the opcode is reserved
439 and MUST be set to 0.                          !! 397 and must be set to 0.
440                                                << 
441 .. table:: Byte swap instructions              << 
442                                                   398 
443   =====  ========  =====  ==================== !! 399 =====  ========  =====  =================================================
444   class  source    value  description          !! 400 class  source    value  description
445   =====  ========  =====  ==================== !! 401 =====  ========  =====  =================================================
446   ALU    LE        0      convert between host !! 402 ALU    TO_LE     0      convert between host byte order and little endian
447   ALU    BE        1      convert between host !! 403 ALU    TO_BE     1      convert between host byte order and big endian
448   ALU64  Reserved  0      do byte swap uncondi !! 404 ALU64  Reserved  0      do byte swap unconditionally
449   =====  ========  =====  ==================== !! 405 =====  ========  =====  =================================================
450                                                   406 
451 The 'imm' field encodes the width of the swap     407 The 'imm' field encodes the width of the swap operations.  The following widths
452 are supported: 16, 32 and 64.  Width 64 operat    408 are supported: 16, 32 and 64.  Width 64 operations belong to the base64
453 conformance group and other swap operations be    409 conformance group and other swap operations belong to the base32
454 conformance group.                                410 conformance group.
455                                                   411 
456 Examples:                                         412 Examples:
457                                                   413 
458 ``{END, LE, ALU}`` with 'imm' = 16/32/64 means !! 414 ``{END, TO_LE, ALU}`` with imm = 16/32/64 means::
459                                                   415 
460   dst = le16(dst)                              !! 416   dst = htole16(dst)
461   dst = le32(dst)                              !! 417   dst = htole32(dst)
462   dst = le64(dst)                              !! 418   dst = htole64(dst)
463                                                   419 
464 ``{END, BE, ALU}`` with 'imm' = 16/32/64 means !! 420 ``{END, TO_BE, ALU}`` with imm = 16/32/64 means::
465                                                   421 
466   dst = be16(dst)                              !! 422   dst = htobe16(dst)
467   dst = be32(dst)                              !! 423   dst = htobe32(dst)
468   dst = be64(dst)                              !! 424   dst = htobe64(dst)
469                                                   425 
470 ``{END, TO, ALU64}`` with 'imm' = 16/32/64 mea !! 426 ``{END, TO_LE, ALU64}`` with imm = 16/32/64 means::
471                                                   427 
472   dst = bswap16(dst)                              428   dst = bswap16(dst)
473   dst = bswap32(dst)                              429   dst = bswap32(dst)
474   dst = bswap64(dst)                              430   dst = bswap64(dst)
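The byte swap forms above can be approximated in portable C as below. All names are hypothetical. Truncating the result to the selected width (so that the remaining upper bits are zero) is an assumption of this sketch and is not stated in the text above:

```c
#include <stdint.h>

/* Unconditional swap of the low `width` bits (16, 32, or 64), as for ALU64. */
static uint64_t bswap_width(uint64_t v, int width)
{
    uint64_t out = 0;

    for (int i = 0; i < width / 8; i++) {
        out = (out << 8) | (v & 0xFF);
        v >>= 8;
    }
    return out;
}

/* Runtime probe of the host byte order. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;

    return *(const uint8_t *)&probe == 1;
}

/* Keep only the low `width` bits (assumption: upper bits are cleared). */
static uint64_t low_bits(uint64_t v, int width)
{
    return width == 64 ? v : v & ((1ULL << width) - 1);
}

/* ALU class, source bit 0: convert between host order and little endian. */
static uint64_t end_le(uint64_t dst, int width)
{
    return host_is_little_endian() ? low_bits(dst, width)
                                   : bswap_width(dst, width);
}

/* ALU class, source bit 1: convert between host order and big endian. */
static uint64_t end_be(uint64_t dst, int width)
{
    return host_is_little_endian() ? bswap_width(dst, width)
                                   : low_bits(dst, width);
}
```

On any given host exactly one of ``end_le`` and ``end_be`` swaps bytes, whereas the ``ALU64`` form always swaps.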
475                                                   431 
476 Jump instructions                                 432 Jump instructions
477 -----------------                                 433 -----------------
478                                                   434 
479 ``JMP32`` uses 32-bit wide operands and indica    435 ``JMP32`` uses 32-bit wide operands and indicates the base32
480 conformance group, while ``JMP`` uses 64-bit w    436 conformance group, while ``JMP`` uses 64-bit wide operands for
481 otherwise identical operations, and indicates     437 otherwise identical operations, and indicates the base64 conformance
482 group unless otherwise specified.                 438 group unless otherwise specified.
483 The 'code' field encodes the operation as belo    439 The 'code' field encodes the operation as below:
484                                                   440 
485 .. table:: Jump instructions                   !! 441 ========  =====  =======  ===============================  ===================================================
                                                   >> 442 code      value  src_reg  description                      notes
                                                   >> 443 ========  =====  =======  ===============================  ===================================================
                                                   >> 444 JA        0x0    0x0      PC += offset                     {JA, K, JMP} only
                                                   >> 445 JA        0x0    0x0      PC += imm                        {JA, K, JMP32} only
                                                   >> 446 JEQ       0x1    any      PC += offset if dst == src
                                                   >> 447 JGT       0x2    any      PC += offset if dst > src        unsigned
                                                   >> 448 JGE       0x3    any      PC += offset if dst >= src       unsigned
                                                   >> 449 JSET      0x4    any      PC += offset if dst & src
                                                   >> 450 JNE       0x5    any      PC += offset if dst != src
                                                   >> 451 JSGT      0x6    any      PC += offset if dst > src        signed
                                                   >> 452 JSGE      0x7    any      PC += offset if dst >= src       signed
                                                   >> 453 CALL      0x8    0x0      call helper function by address  {CALL, K, JMP} only, see `Helper functions`_
                                                   >> 454 CALL      0x8    0x1      call PC += imm                   {CALL, K, JMP} only, see `Program-local functions`_
                                                   >> 455 CALL      0x8    0x2      call helper function by BTF ID   {CALL, K, JMP} only, see `Helper functions`_
                                                   >> 456 EXIT      0x9    0x0      return                           {CALL, K, JMP} only
                                                   >> 457 JLT       0xa    any      PC += offset if dst < src        unsigned
                                                   >> 458 JLE       0xb    any      PC += offset if dst <= src       unsigned
                                                   >> 459 JSLT      0xc    any      PC += offset if dst < src        signed
                                                   >> 460 JSLE      0xd    any      PC += offset if dst <= src       signed
                                                   >> 461 ========  =====  =======  ===============================  ===================================================
486                                                   462 
487   ========  =====  =======  ================== !! 463 The BPF program needs to store the return value into register R0 before doing an
488   code      value  src_reg  description        !! 464 ``EXIT``.
489   ========  =====  =======  ================== << 
490   JA        0x0    0x0      PC += offset       << 
491   JA        0x0    0x0      PC += imm          << 
492   JEQ       0x1    any      PC += offset if ds << 
493   JGT       0x2    any      PC += offset if ds << 
494   JGE       0x3    any      PC += offset if ds << 
495   JSET      0x4    any      PC += offset if ds << 
496   JNE       0x5    any      PC += offset if ds << 
497   JSGT      0x6    any      PC += offset if ds << 
498   JSGE      0x7    any      PC += offset if ds << 
499   CALL      0x8    0x0      call helper functi << 
500   CALL      0x8    0x1      call PC += imm     << 
501   CALL      0x8    0x2      call helper functi << 
502   EXIT      0x9    0x0      return             << 
503   JLT       0xa    any      PC += offset if ds << 
504   JLE       0xb    any      PC += offset if ds << 
505   JSLT      0xc    any      PC += offset if ds << 
506   JSLE      0xd    any      PC += offset if ds << 
507   ========  =====  =======  ================== << 
508                                                << 
509 where 'PC' denotes the program counter, and th << 
510 is in units of 64-bit instructions relative to << 
511 the jump instruction.  Thus 'PC += 1' skips ex << 
512 instruction if it's a basic instruction or res << 
513 if the next instruction is a 128-bit wide inst << 
514                                                   465 
515 Example:                                          466 Example:
516                                                   467 
517 ``{JSGE, X, JMP32}`` means::                      468 ``{JSGE, X, JMP32}`` means::
518                                                   469 
519   if (s32)dst s>= (s32)src goto +offset           470   if (s32)dst s>= (s32)src goto +offset
520                                                   471 
521 where 's>=' indicates a signed '>=' comparison    472 where 's>=' indicates a signed '>=' comparison.
522                                                   473 
523 ``{JLE, K, JMP}`` means::                      << 
524                                                << 
525   if dst <= (u64)(s64)imm goto +offset         << 
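As an illustration, the two conditions above can be expressed as C predicates that report whether the branch is taken. The function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* {JSGE, X, JMP32}: signed comparison of the low 32 bits of each register. */
static bool jsge_x_jmp32(uint64_t dst, uint64_t src)
{
    return (int32_t)dst >= (int32_t)src;
}

/* {JLE, K, JMP}: imm is sign extended to 64 bits, then compared unsigned. */
static bool jle_k_jmp(uint64_t dst, int32_t imm)
{
    return dst <= (uint64_t)(int64_t)imm;
}
```

For example, with 'imm' = -1 the ``JLE`` comparison is against 0xFFFFFFFFFFFFFFFF, so the branch is taken for every value of 'dst'.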
526                                                << 
527 ``{JA, K, JMP32}`` means::                        474 ``{JA, K, JMP32}`` means::
528                                                   475 
529   gotol +imm                                      476   gotol +imm
530                                                   477 
531 where 'imm' means the branch offset comes from !! 478 where 'imm' means the branch offset comes from insn 'imm' field.
532                                                   479 
533 Note that there are two flavors of ``JA`` inst    480 Note that there are two flavors of ``JA`` instructions. The
534 ``JMP`` class permits a 16-bit jump offset spe    481 ``JMP`` class permits a 16-bit jump offset specified by the 'offset'
535 field, whereas the ``JMP32`` class permits a 3    482 field, whereas the ``JMP32`` class permits a 32-bit jump offset
536 specified by the 'imm' field. A > 16-bit condi    483 specified by the 'imm' field. A > 16-bit conditional jump may be
537 converted to a < 16-bit conditional jump plus     484 converted to a < 16-bit conditional jump plus a 32-bit unconditional
538 jump.                                             485 jump.
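A rough C sketch of how the two ``JA`` flavors compute their target. The function names are hypothetical. It assumes that the program counter is counted in 64-bit instruction slots and that the displacement is relative to the instruction following the jump:

```c
#include <stdint.h>

/* {JA, K, JMP}: 16-bit signed displacement taken from the 'offset' field. */
static int64_t ja_jmp_target(int64_t pc, int16_t offset)
{
    return pc + 1 + offset;
}

/* {JA, K, JMP32}: 32-bit signed displacement taken from the 'imm' field. */
static int64_t ja_jmp32_target(int64_t pc, int32_t imm)
{
    return pc + 1 + imm;
}
```

Under these assumptions a displacement of 0 falls through to the next instruction, and a displacement that does not fit in 16 bits can only be expressed with the ``JMP32`` form.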
539                                                   486 
540 All ``CALL`` and ``JA`` instructions belong to    487 All ``CALL`` and ``JA`` instructions belong to the
541 base32 conformance group.                         488 base32 conformance group.
542                                                   489 
543 Helper functions                                  490 Helper functions
544 ~~~~~~~~~~~~~~~~                                  491 ~~~~~~~~~~~~~~~~
545                                                   492 
546 Helper functions are a concept whereby BPF pro    493 Helper functions are a concept whereby BPF programs can call into a
547 set of function calls exposed by the underlyin    494 set of function calls exposed by the underlying platform.
548                                                   495 
549 Historically, each helper function was identif !! 496 Historically, each helper function was identified by an address
550 encoded in the 'imm' field.  Further documenta !! 497 encoded in the imm field.  The available helper functions may differ
551 is outside the scope of this document and stan !! 498 for each program type, but address values are unique across all program types.
552 future work, but use is widely deployed and mo << 
553 found in platform-specific documentation (e.g. << 
554                                                   499 
555 Platforms that support the BPF Type Format (BT    500 Platforms that support the BPF Type Format (BTF) support identifying
556 a helper function by a BTF ID encoded in the ' !! 501 a helper function by a BTF ID encoded in the imm field, where the BTF ID
557 identifies the helper name and type.  Further  !! 502 identifies the helper name and type.
558 is outside the scope of this document and stan << 
559 future work, but use is widely deployed and mo << 
560 found in platform-specific documentation (e.g. << 
561                                                   503 
562 Program-local functions                           504 Program-local functions
563 ~~~~~~~~~~~~~~~~~~~~~~~                           505 ~~~~~~~~~~~~~~~~~~~~~~~
564 Program-local functions are functions exposed     506 Program-local functions are functions exposed by the same BPF program as the
565 caller, and are referenced by offset from the  !! 507 caller, and are referenced by offset from the call instruction, similar to
566 instruction, similar to ``JA``.  The offset is !! 508 ``JA``.  The offset is encoded in the imm field of the call instruction.
567 the call instruction. An ``EXIT`` within the p !! 509 A ``EXIT`` within the program-local function will return to the caller.
568 return to the caller.                          << 
569                                                   510 
570 Load and store instructions                       511 Load and store instructions
571 ===========================                       512 ===========================
572                                                   513 
573 For load and store instructions (``LD``, ``LDX    514 For load and store instructions (``LD``, ``LDX``, ``ST``, and ``STX``), the
574 8-bit 'opcode' field is divided as follows::   !! 515 8-bit 'opcode' field is divided as::
575                                                   516 
576   +-+-+-+-+-+-+-+-+                               517   +-+-+-+-+-+-+-+-+
577   |mode |sz |class|                               518   |mode |sz |class|
578   +-+-+-+-+-+-+-+-+                               519   +-+-+-+-+-+-+-+-+
579                                                   520 
580 **mode**                                          521 **mode**
581   The mode modifier is one of:                    522   The mode modifier is one of:
582                                                   523 
583   .. table:: Mode modifier                     << 
584                                                << 
585     =============  =====  ====================    524     =============  =====  ====================================  =============
586     mode modifier  value  description             525     mode modifier  value  description                           reference
587     =============  =====  ====================    526     =============  =====  ====================================  =============
588     IMM            0      64-bit immediate ins    527     IMM            0      64-bit immediate instructions         `64-bit immediate instructions`_
589     ABS            1      legacy BPF packet ac    528     ABS            1      legacy BPF packet access (absolute)   `Legacy BPF Packet access instructions`_
590     IND            2      legacy BPF packet ac    529     IND            2      legacy BPF packet access (indirect)   `Legacy BPF Packet access instructions`_
591     MEM            3      regular load and sto    530     MEM            3      regular load and store operations     `Regular load and store operations`_
592     MEMSX          4      sign-extension load     531     MEMSX          4      sign-extension load operations        `Sign-extension load operations`_
593     ATOMIC         6      atomic operations       532     ATOMIC         6      atomic operations                     `Atomic operations`_
594     =============  =====  ====================    533     =============  =====  ====================================  =============
595                                                   534 
596 **sz (size)**                                     535 **sz (size)**
597   The size modifier is one of:                    536   The size modifier is one of:
598                                                   537 
599   .. table:: Size modifier                     << 
600                                                << 
601     ====  =====  =====================            538     ====  =====  =====================
602     size  value  description                      539     size  value  description
603     ====  =====  =====================            540     ====  =====  =====================
604     W     0      word        (4 bytes)            541     W     0      word        (4 bytes)
605     H     1      half word   (2 bytes)            542     H     1      half word   (2 bytes)
606     B     2      byte                             543     B     2      byte
607     DW    3      double word (8 bytes)            544     DW    3      double word (8 bytes)
608     ====  =====  =====================            545     ====  =====  =====================
609                                                   546 
610   Instructions using ``DW`` belong to the base    547   Instructions using ``DW`` belong to the base64 conformance group.
611                                                   548 
612 **class**                                         549 **class**
613   The instruction class (see `Instruction clas    550   The instruction class (see `Instruction classes`_)
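The field layout above can be decoded with a few shifts and masks. This is a sketch with hypothetical type and function names, and it assumes the leftmost field in the diagram occupies the most significant bits of the opcode byte:

```c
#include <stdint.h>

/* Hypothetical container for the three load/store opcode fields. */
struct ldst_opcode {
    uint8_t mode; /* 3 bits: IMM, ABS, IND, MEM, MEMSX, ATOMIC */
    uint8_t sz;   /* 2 bits: W, H, B, DW */
    uint8_t cls;  /* 3 bits: instruction class */
};

static struct ldst_opcode decode_ldst_opcode(uint8_t opcode)
{
    struct ldst_opcode f;

    f.mode = (opcode >> 5) & 0x07;
    f.sz = (opcode >> 3) & 0x03;
    f.cls = opcode & 0x07;
    return f;
}
```

For instance, the opcode byte 0x7b decodes to mode 3 (``MEM``) and size 3 (``DW``) with class value 3.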
614                                                   551 
615 Regular load and store operations                 552 Regular load and store operations
616 ---------------------------------                 553 ---------------------------------
617                                                   554 
618 The ``MEM`` mode modifier is used to encode re    555 The ``MEM`` mode modifier is used to encode regular load and store
619 instructions that transfer data between a regi    556 instructions that transfer data between a register and memory.
620                                                   557 
621 ``{MEM, <size>, STX}`` means::                    558 ``{MEM, <size>, STX}`` means::
622                                                   559 
623   *(size *) (dst + offset) = src                  560   *(size *) (dst + offset) = src
624                                                   561 
625 ``{MEM, <size>, ST}`` means::                     562 ``{MEM, <size>, ST}`` means::
626                                                   563 
627   *(size *) (dst + offset) = imm                  564   *(size *) (dst + offset) = imm
628                                                   565 
629 ``{MEM, <size>, LDX}`` means::                    566 ``{MEM, <size>, LDX}`` means::
630                                                   567 
631   dst = *(unsigned size *) (src + offset)         568   dst = *(unsigned size *) (src + offset)
632                                                   569 
633 Where '<size>' is one of: ``B``, ``H``, ``W``,    570 Where '<size>' is one of: ``B``, ``H``, ``W``, or ``DW``, and
634 'unsigned size' is one of: u8, u16, u32, or u6    571 'unsigned size' is one of: u8, u16, u32, or u64.
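For illustration, the half-word (``H``) forms of the store and load can be modelled in C over a byte buffer. The function names are hypothetical, and ``memcpy`` is used so that the sketch does not depend on pointer alignment:

```c
#include <stdint.h>
#include <string.h>

/* {MEM, H, STX}: *(u16 *)(dst + offset) = src, keeping only the low 16 bits. */
static void stx_mem_h(uint8_t *dst, int16_t offset, uint64_t src)
{
    uint16_t v = (uint16_t)src;

    memcpy(dst + offset, &v, sizeof(v));
}

/* {MEM, H, LDX}: dst = *(u16 *)(src + offset), zero extended to 64 bits. */
static uint64_t ldx_mem_h(const uint8_t *src, int16_t offset)
{
    uint16_t v;

    memcpy(&v, src + offset, sizeof(v));
    return v;
}
```

The other sizes follow the same pattern with ``uint8_t``, ``uint32_t``, or ``uint64_t`` in place of ``uint16_t``.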
635                                                   572 
636 Sign-extension load operations                    573 Sign-extension load operations
637 ------------------------------                    574 ------------------------------
638                                                   575 
639 The ``MEMSX`` mode modifier is used to encode     576 The ``MEMSX`` mode modifier is used to encode :term:`sign-extension<Sign Extend>` load
640 instructions that transfer data between a regi    577 instructions that transfer data between a register and memory.
641                                                   578 
642 ``{MEMSX, <size>, LDX}`` means::                  579 ``{MEMSX, <size>, LDX}`` means::
643                                                   580 
644   dst = *(signed size *) (src + offset)           581   dst = *(signed size *) (src + offset)
645                                                   582 
646 Where '<size>' is one of: ``B``, ``H``, or ``W !! 583 Where size is one of: ``B``, ``H``, or ``W``, and
647 'signed size' is one of: s8, s16, or s32.         584 'signed size' is one of: s8, s16, or s32.
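A sign-extending load differs from a regular load only in how the loaded value is widened. A minimal C sketch for the ``B`` and ``W`` sizes, with hypothetical function names:

```c
#include <stdint.h>
#include <string.h>

/* {MEMSX, B, LDX}: dst = *(s8 *)(src + offset), sign extended to 64 bits. */
static uint64_t ldx_memsx_b(const uint8_t *src, int16_t offset)
{
    int8_t v;

    memcpy(&v, src + offset, sizeof(v));
    return (uint64_t)(int64_t)v;
}

/* {MEMSX, W, LDX}: dst = *(s32 *)(src + offset), sign extended to 64 bits. */
static uint64_t ldx_memsx_w(const uint8_t *src, int16_t offset)
{
    int32_t v;

    memcpy(&v, src + offset, sizeof(v));
    return (uint64_t)(int64_t)v;
}
```

Loading the byte 0x80 this way gives 0xFFFFFFFFFFFFFF80, whereas a regular ``MEM`` load of the same byte gives 0x80.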
648                                                   585 
649 Atomic operations                                 586 Atomic operations
650 -----------------                                 587 -----------------
651                                                   588 
652 Atomic operations are operations that operate     589 Atomic operations are operations that operate on memory and cannot be
653 interrupted or corrupted by other access to th    590 interrupted or corrupted by other access to the same memory region
654 by other BPF programs or means outside of this    591 by other BPF programs or means outside of this specification.
655                                                   592 
656 All atomic operations supported by BPF are enc    593 All atomic operations supported by BPF are encoded as store operations
657 that use the ``ATOMIC`` mode modifier as follo    594 that use the ``ATOMIC`` mode modifier as follows:
658                                                   595 
659 * ``{ATOMIC, W, STX}`` for 32-bit operations,     596 * ``{ATOMIC, W, STX}`` for 32-bit operations, which are
660   part of the "atomic32" conformance group.       597   part of the "atomic32" conformance group.
661 * ``{ATOMIC, DW, STX}`` for 64-bit operations,    598 * ``{ATOMIC, DW, STX}`` for 64-bit operations, which are
662   part of the "atomic64" conformance group.       599   part of the "atomic64" conformance group.
663 * 8-bit and 16-bit wide atomic operations are     600 * 8-bit and 16-bit wide atomic operations are not supported.
664                                                   601 
665 The 'imm' field is used to encode the actual a    602 The 'imm' field is used to encode the actual atomic operation.
666 Simple atomic operations use a subset of the v    603 Simple atomic operations use a subset of the values defined to encode
667 arithmetic operations in the 'imm' field to en    604 arithmetic operations in the 'imm' field to encode the atomic operation:
668                                                   605 
669 .. table:: Simple atomic operations            !! 606 ========  =====  ===========
670                                                !! 607 imm       value  description
671   ========  =====  ===========                 !! 608 ========  =====  ===========
672   imm       value  description                 !! 609 ADD       0x00   atomic add
673   ========  =====  ===========                 !! 610 OR        0x40   atomic or
674   ADD       0x00   atomic add                  !! 611 AND       0x50   atomic and
675   OR        0x40   atomic or                   !! 612 XOR       0xa0   atomic xor
676   AND       0x50   atomic and                  !! 613 ========  =====  ===========
677   XOR       0xa0   atomic xor                  << 
678   ========  =====  ===========                 << 
679                                                   614 
680                                                   615 
681 ``{ATOMIC, W, STX}`` with 'imm' = ADD means::     616 ``{ATOMIC, W, STX}`` with 'imm' = ADD means::
682                                                   617 
683   *(u32 *)(dst + offset) += src                   618   *(u32 *)(dst + offset) += src
684                                                   619 
685 ``{ATOMIC, DW, STX}`` with 'imm' = ADD means::    620 ``{ATOMIC, DW, STX}`` with 'imm' = ADD means::
686                                                   621 
687   *(u64 *)(dst + offset) += src                   622   *(u64 *)(dst + offset) += src
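
The encoding and the 32-bit ``ADD`` semantics described above can be sketched in C
as follows. This is only an illustrative model, not part of the specification: the
names ``ENC_*``, ``atomic_opcode`` and ``atomic_add32`` are hypothetical, and the
numeric values of the ``STX`` class, the ``W``/``DW`` sizes and the ``ATOMIC`` mode
are assumed from the encoding tables in other sections of this document.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed from the encoding tables elsewhere in this document. */
enum { ENC_CLASS_STX = 0x03, ENC_SIZE_W = 0x00, ENC_SIZE_DW = 0x18,
       ENC_MODE_ATOMIC = 0xc0 };

/* 'imm' values of the simple atomic operations, from the table above. */
enum { ENC_ADD = 0x00, ENC_OR = 0x40, ENC_AND = 0x50, ENC_XOR = 0xa0 };

/* Hypothetical helper: build the opcode byte of an atomic instruction. */
static inline uint8_t atomic_opcode(int is_64bit)
{
    return (uint8_t)(ENC_MODE_ATOMIC |
                     (is_64bit ? ENC_SIZE_DW : ENC_SIZE_W) |
                     ENC_CLASS_STX);
}

/*
 * Hypothetical model of {ATOMIC, W, STX} with 'imm' = ADD:
 *   *(u32 *)(dst + offset) += src
 * 'mem' stands for the location dst + offset. Only the low 32 bits of
 * the source register take part in the operation.
 */
static inline void atomic_add32(_Atomic uint32_t *mem, uint64_t src)
{
    atomic_fetch_add(mem, (uint32_t)src);
}
```

Under these assumptions, ``atomic_opcode(0)`` yields ``0xc3`` and ``atomic_opcode(1)``
yields ``0xdb``.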
688                                                   623 
689 In addition to the simple atomic operations, t    624 In addition to the simple atomic operations, there is also a modifier and
690 two complex atomic operations:                    625 two complex atomic operations:
691                                                   626 
692 .. table:: Complex atomic operations           !! 627 ===========  ================  ===========================
693                                                !! 628 imm          value             description
694   ===========  ================  ============= !! 629 ===========  ================  ===========================
695   imm          value             description   !! 630 FETCH        0x01              modifier: return old value
696   ===========  ================  ============= !! 631 XCHG         0xe0 | FETCH      atomic exchange
697   FETCH        0x01              modifier: ret !! 632 CMPXCHG      0xf0 | FETCH      atomic compare and exchange
698   XCHG         0xe0 | FETCH      atomic exchan !! 633 ===========  ================  ===========================
699   CMPXCHG      0xf0 | FETCH      atomic compar << 
700   ===========  ================  ============= << 
701                                                   634 
702 The ``FETCH`` modifier is optional for simple     635 The ``FETCH`` modifier is optional for simple atomic operations, and
703 always set for the complex atomic operations.     636 always set for the complex atomic operations.  If the ``FETCH`` flag
704 is set, then the operation also overwrites ``s    637 is set, then the operation also overwrites ``src`` with the value that
705 was in memory before it was modified.             638 was in memory before it was modified.
706                                                   639 
707 The ``XCHG`` operation atomically exchanges ``    640 The ``XCHG`` operation atomically exchanges ``src`` with the value
708 addressed by ``dst + offset``.                    641 addressed by ``dst + offset``.
709                                                   642 
710 The ``CMPXCHG`` operation atomically compares     643 The ``CMPXCHG`` operation atomically compares the value addressed by
711 ``dst + offset`` with ``R0``. If they match, t    644 ``dst + offset`` with ``R0``. If they match, the value addressed by
712 ``dst + offset`` is replaced with ``src``. In     645 ``dst + offset`` is replaced with ``src``. In either case, the
713 value that was at ``dst + offset`` before the     646 value that was at ``dst + offset`` before the operation is zero-extended
714 and loaded back to ``R0``.                        647 and loaded back to ``R0``.
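
The ``FETCH``, ``XCHG`` and ``CMPXCHG`` behaviour described above can be modelled for
the 64-bit case with C11 atomics. This is a minimal sketch under stated assumptions,
not an implementation of any particular runtime: ``atomic64_step`` and the ``OP_*``
names are hypothetical, ``mem`` stands for the location ``dst + offset``, and ``src``
and ``r0`` stand for the source register and ``R0``.

```c
#include <stdatomic.h>
#include <stdint.h>

/* 'imm' values from the two tables above. */
enum { OP_ADD = 0x00, OP_OR = 0x40, OP_AND = 0x50, OP_XOR = 0xa0,
       OP_FETCH = 0x01,
       OP_XCHG = 0xe0 | OP_FETCH, OP_CMPXCHG = 0xf0 | OP_FETCH };

/* Hypothetical helper: returns 0 on success, -1 for an unknown 'imm'. */
static int atomic64_step(_Atomic uint64_t *mem, uint64_t *src,
                         uint64_t *r0, int32_t imm)
{
    uint64_t old;

    switch (imm) {
    case OP_XCHG:
        /* Exchange src with the value in memory. */
        *src = atomic_exchange(mem, *src);
        return 0;
    case OP_CMPXCHG: {
        uint64_t expected = *r0;
        /* Store src only if memory equals R0. */
        atomic_compare_exchange_strong(mem, &expected, *src);
        /* In either case 'expected' now holds the previous value. */
        *r0 = expected;
        return 0;
    }
    }

    /* Simple operations, with or without the FETCH modifier. */
    switch (imm & ~OP_FETCH) {
    case OP_ADD: old = atomic_fetch_add(mem, *src); break;
    case OP_OR:  old = atomic_fetch_or(mem, *src);  break;
    case OP_AND: old = atomic_fetch_and(mem, *src); break;
    case OP_XOR: old = atomic_fetch_xor(mem, *src); break;
    default:     return -1;
    }
    if (imm & OP_FETCH)
        *src = old;     /* FETCH: src receives the old memory value */
    return 0;
}
```

The 32-bit variants would additionally zero-extend the fetched value, which this
64-bit sketch does not need to model.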
715                                                   648 
716 64-bit immediate instructions                     649 64-bit immediate instructions
717 -----------------------------                     650 -----------------------------
718                                                   651 
719 Instructions with the ``IMM`` 'mode' modifier     652 Instructions with the ``IMM`` 'mode' modifier use the wide instruction
720 encoding defined in `Instruction encoding`_, a    653 encoding defined in `Instruction encoding`_, and use the 'src_reg' field of the
721 basic instruction to hold an opcode subtype.      654 basic instruction to hold an opcode subtype.
722                                                   655 
723 The following table defines a set of ``{IMM, D    656 The following table defines a set of ``{IMM, DW, LD}`` instructions
724 with opcode subtypes in the 'src_reg' field, u    657 with opcode subtypes in the 'src_reg' field, using new terms such as "map"
725 defined further below:                            658 defined further below:
726                                                   659 
727 .. table:: 64-bit immediate instructions       !! 660 =======  =========================================  ===========  ==============
728                                                !! 661 src_reg  pseudocode                                 imm type     dst type
729   =======  =================================== !! 662 =======  =========================================  ===========  ==============
730   src_reg  pseudocode                          !! 663 0x0      dst = (next_imm << 32) | imm               integer      integer
731   =======  =================================== !! 664 0x1      dst = map_by_fd(imm)                       map fd       map
732   0x0      dst = (next_imm << 32) | imm        !! 665 0x2      dst = map_val(map_by_fd(imm)) + next_imm   map fd       data pointer
733   0x1      dst = map_by_fd(imm)                !! 666 0x3      dst = var_addr(imm)                        variable id  data pointer
734   0x2      dst = map_val(map_by_fd(imm)) + nex !! 667 0x4      dst = code_addr(imm)                       integer      code pointer
735   0x3      dst = var_addr(imm)                 !! 668 0x5      dst = map_by_idx(imm)                      map index    map
736   0x4      dst = code_addr(imm)                !! 669 0x6      dst = map_val(map_by_idx(imm)) + next_imm  map index    data pointer
737   0x5      dst = map_by_idx(imm)               !! 670 =======  =========================================  ===========  ==============
738   0x6      dst = map_val(map_by_idx(imm)) + ne << 
739   =======  =================================== << 
740                                                   671 
741 where                                             672 where
742                                                   673 
743 * map_by_fd(imm) means to convert a 32-bit fil    674 * map_by_fd(imm) means to convert a 32-bit file descriptor into an address of a map (see `Maps`_)
744 * map_by_idx(imm) means to convert a 32-bit in    675 * map_by_idx(imm) means to convert a 32-bit index into an address of a map
745 * map_val(map) gets the address of the first v    676 * map_val(map) gets the address of the first value in a given map
746 * var_addr(imm) gets the address of a platform    677 * var_addr(imm) gets the address of a platform variable (see `Platform Variables`_) with a given id
747 * code_addr(imm) gets the address of the instr    678 * code_addr(imm) gets the address of the instruction at a specified relative offset in number of (64-bit) instructions
748 * the 'imm type' can be used by disassemblers     679 * the 'imm type' can be used by disassemblers for display
749 * the 'dst type' can be used for verification     680 * the 'dst type' can be used for verification and JIT compilation purposes
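
For the ``src_reg`` = 0x0 row, the pseudocode ``dst = (next_imm << 32) | imm`` can be
sketched in C as below. The helper name ``ld_imm64`` is hypothetical, and treating
both 32-bit fields as unsigned before combining them is an assumption of this
sketch: without it, a negative ``imm`` would be sign-extended and overwrite the
upper half taken from ``next_imm``.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine the 'imm' field of the basic instruction
 * with the 'next_imm' field of the following 64-bit slot.
 * Assumption: both halves are zero-extended before being combined.
 */
static inline uint64_t ld_imm64(int32_t imm, int32_t next_imm)
{
    return ((uint64_t)(uint32_t)next_imm << 32) | (uint32_t)imm;
}
```

For example, ``ld_imm64(0x55667788, 0x11223344)`` evaluates to ``0x1122334455667788``.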
750                                                   681 
751 Maps                                              682 Maps
752 ~~~~                                              683 ~~~~
753                                                   684 
754 Maps are shared memory regions accessible by B    685 Maps are shared memory regions accessible by BPF programs on some platforms.
755 A map can have various semantics as defined in    686 A map can have various semantics as defined in a separate document, and may or
756 may not have a single contiguous memory region    687 may not have a single contiguous memory region, but the 'map_val(map)' is
757 currently only defined for maps that do have a    688 currently only defined for maps that do have a single contiguous memory region.
758                                                   689 
759 Each map can have a file descriptor (fd) if su    690 Each map can have a file descriptor (fd) if supported by the platform, where
760 'map_by_fd(imm)' means to get the map with the    691 'map_by_fd(imm)' means to get the map with the specified file descriptor. Each
761 BPF program can also be defined to use a set o    692 BPF program can also be defined to use a set of maps associated with the
762 program at load time, and 'map_by_idx(imm)' me    693 program at load time, and 'map_by_idx(imm)' means to get the map with the given
763 index in the set associated with the BPF progr    694 index in the set associated with the BPF program containing the instruction.
764                                                   695 
765 Platform Variables                                696 Platform Variables
766 ~~~~~~~~~~~~~~~~~~                                697 ~~~~~~~~~~~~~~~~~~
767                                                   698 
768 Platform variables are memory regions, identif    699 Platform variables are memory regions, identified by integer ids, exposed by
769 the runtime and accessible by BPF programs on     700 the runtime and accessible by BPF programs on some platforms.  The
770 'var_addr(imm)' operation means to get the add    701 'var_addr(imm)' operation means to get the address of the memory region
771 identified by the given id.                       702 identified by the given id.
772                                                   703 
773 Legacy BPF Packet access instructions             704 Legacy BPF Packet access instructions
774 -------------------------------------             705 -------------------------------------
775                                                   706 
776 BPF previously introduced special instructions    707 BPF previously introduced special instructions for access to packet data that were
777 carried over from classic BPF. These instructi    708 carried over from classic BPF. These instructions used an instruction
778 class of ``LD``, a size modifier of ``W``, ``H    709 class of ``LD``, a size modifier of ``W``, ``H``, or ``B``, and a
779 mode modifier of ``ABS`` or ``IND``.  The 'dst    710 mode modifier of ``ABS`` or ``IND``.  The 'dst_reg' and 'offset' fields were
780 set to zero, and 'src_reg' was set to zero for    711 set to zero, and 'src_reg' was set to zero for ``ABS``.  However, these
781 instructions are deprecated and SHOULD no long !! 712 instructions are deprecated and should no longer be used.  All legacy packet
782 access instructions belong to the "packet" con    713 access instructions belong to the "packet" conformance group.
                                                      
