bpf, docs: Document BPF insn encoding in terms of stored bytes

[Changes from V4:
- s/regs:16/regs:8 in figure.]

[Changes from V3:
- Back to src_reg and dst_reg, since they denote register numbers
as opposed to the values stored in these registers.]

[Changes from V2:
- Use src and dst consistently in the document.
- Use a more graphical depiction of the 128-bit instruction.
- Remove `Where:' fragment.
- Clarify that unused bits are reserved and shall be zeroed.]

[Changes from V1:
- Use rst literal blocks for figures.
- Avoid using | in the basic instruction/pseudo instruction figure.
- Rebased to today's bpf-next master branch.]

This patch modifies instruction-set.rst so it documents the encoding
of BPF instructions in terms of how the bytes are stored (be it in an
ELF file or as bytes in a memory buffer to be loaded into the kernel
or some other BPF consumer) as opposed to what the instruction looks
like once loaded.

This is hopefully easier to understand for implementors looking to
generate and/or consume the bytes that make up BPF instructions.

The patch also clarifies that the unused bytes in a pseudo-instruction
shall be cleared with zeros.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/87h6v6i0da.fsf_-_@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Authored by Jose E. Marchesi and committed by Alexei Starovoitov 3 years ago
ae256f95 30a2d832

+24 -22

1 changed file

Documentation/bpf/instruction-set.rst

···
 * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
   constant) value after the basic instruction for a total of 128 bits.

-The basic instruction encoding looks as follows for a little-endian processor,
-where MSB and LSB mean the most significant bits and least significant bits,
-respectively:
+The fields conforming an encoded basic instruction are stored in the
+following order::

-=============  =======  =======  =======  ============
-32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
-=============  =======  =======  =======  ============
-imm            offset   src_reg  dst_reg  opcode
-=============  =======  =======  =======  ============
+  opcode:8 src_reg:4 dst_reg:4 offset:16 imm:32 // In little-endian BPF.
+  opcode:8 dst_reg:4 src_reg:4 offset:16 imm:32 // In big-endian BPF.

 **imm**
   signed integer immediate value
···
 **opcode**
   operation to perform

-and as follows for a big-endian processor:
+Note that the contents of multi-byte fields ('imm' and 'offset') are
+stored using big-endian byte ordering in big-endian BPF and
+little-endian byte ordering in little-endian BPF.

-=============  =======  =======  =======  ============
-32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
-=============  =======  =======  =======  ============
-imm            offset   dst_reg  src_reg  opcode
-=============  =======  =======  =======  ============
+For example::

-Multi-byte fields ('imm' and 'offset') are similarly stored in
-the byte order of the processor.
+  opcode                  offset imm          assembly
+         src_reg dst_reg
+  07     0       1        00 00  44 33 22 11  r1 += 0x11223344 // little
+         dst_reg src_reg
+  07     1       0        00 00  11 22 33 44  r1 += 0x11223344 // big

 Note that most instructions do not use all of the fields.
 Unused fields shall be cleared to zero.
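The stored byte order described in this part of the patch can be sketched in C. This is only an illustration under stated assumptions: `bpf_encode_basic` and its `big_endian` flag are hypothetical names chosen here, not part of the kernel or of any BPF library. The nibble placement assumes that the first register field listed in the figure occupies the high four bits of the second byte, which is what the `r1 += 0x11223344` example bytes imply.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name, not a kernel or libbpf API):
 * serialize one basic BPF instruction into its 8 stored bytes.
 */
void bpf_encode_basic(uint8_t out[8], int big_endian, uint8_t opcode,
                      uint8_t dst_reg, uint8_t src_reg,
                      int16_t offset, int32_t imm)
{
    uint16_t off = (uint16_t)offset;  /* raw bits of the signed offset */
    uint32_t val = (uint32_t)imm;     /* raw bits of the signed immediate */
    int i;

    assert(dst_reg < 16 && src_reg < 16);  /* register fields are 4 bits wide */

    out[0] = opcode;
    /* Assumption: the first register field listed takes the high nibble. */
    out[1] = big_endian ? (uint8_t)((dst_reg << 4) | src_reg)
                        : (uint8_t)((src_reg << 4) | dst_reg);

    /* Multi-byte fields follow the byte order of the BPF target. */
    for (i = 0; i < 2; i++)
        out[2 + i] = (uint8_t)(off >> (big_endian ? 8 * (1 - i) : 8 * i));
    for (i = 0; i < 4; i++)
        out[4 + i] = (uint8_t)(val >> (big_endian ? 8 * (3 - i) : 8 * i));
}
```

Called with opcode 0x07, dst_reg 1, src_reg 0, offset 0 and imm 0x11223344, this sketch produces the byte sequences shown for the little-endian and big-endian cases in the example figure.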
···
 using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
 and imm containing the high 32 bits of the immediate value.

-=================  ==================
-64 bits (MSB)      64 bits (LSB)
-=================  ==================
-basic instruction  pseudo instruction
-=================  ==================
+This is depicted in the following figure::
+
+        basic_instruction
+  .-----------------------------.
+  |                             |
+  code:8 regs:8 offset:16 imm:32 unused:32 imm:32
+                                 |              |
+                                 '--------------'
+                                pseudo instruction

 Thus the 64-bit immediate value is constructed as follows:

   imm64 = (next_imm << 32) | imm

 where 'next_imm' refers to the imm value of the pseudo instruction
-following the basic instruction.
+following the basic instruction. The unused bytes in the pseudo
+instruction are reserved and shall be cleared to zero.

 Instruction classes
 -------------------
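A note for implementors on the `imm64` formula in this hunk: since `imm` is a signed 32-bit field, a literal C transcription would sign-extend a negative low half and overwrite the high 32 bits. The sketch below converts both halves to unsigned before combining them. The names `bpf_imm64` and `bpf_imm64_selfcheck` are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): build the 64-bit immediate of a
 * wide instruction from the imm of the basic instruction (low half) and
 * the imm of the pseudo instruction that follows it (high half).
 */
uint64_t bpf_imm64(int32_t imm, int32_t next_imm)
{
    /* Cast through uint32_t first so a negative imm is not sign-extended. */
    return ((uint64_t)(uint32_t)next_imm << 32) | (uint32_t)imm;
}

/* Small usage example; aborts if the combination is wrong. */
void bpf_imm64_selfcheck(void)
{
    assert(bpf_imm64(0x55667788, 0x11223344) == 0x1122334455667788ULL);
    /* imm = -1 must only fill the low 32 bits. */
    assert(bpf_imm64(-1, 0) == 0x00000000ffffffffULL);
}
```

Without the intermediate `uint32_t` casts, the second check would fail because `-1` would be widened to `0xffffffffffffffff` before the OR.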
