xdrgen - Linux Kernel XDR code generator

Introduction
------------

SunRPC programs are typically specified using a language defined by
RFC 4506. In fact, all IETF-published NFS specifications provide a
description of the specified protocol using this language.

Since the 1990s, user space consumers of SunRPC have had access to
a tool that could read such XDR specifications and then generate C
code that implements the RPC portions of that protocol. This tool is
called rpcgen.

This RPC-level code handles input directly from the network, and
thus a high degree of memory safety and sanity checking is needed to
help ensure proper levels of security. Bugs in this code can have
significant impact on security and performance.

However, it is code that is repetitive and tedious to write by hand.

The C code generated by rpcgen makes extensive use of the facilities
of the user space TI-RPC library and libc. Furthermore, the dialect
of the generated code is very traditional K&R C.

The Linux kernel's implementations of SunRPC-based protocols
hand-roll their XDR code. There are two main reasons for this:

1. libtirpc (and its predecessors) operates only in user space. The
   kernel's RPC implementation and its API are significantly
   different from libtirpc's.

2. rpcgen-generated code is believed to be less efficient than code
   that is hand-written.

These days, gcc and its kin are capable of optimizing code better
than human authors. There are only a few instances where writing
XDR code by hand will make a measurable performance difference.

In addition, the current hand-written code in the Linux kernel is
difficult to audit, and it is hard to prove that it implements
exactly what is in the protocol specification.

In order to accrue the benefits of machine-generated XDR code in the
kernel, a tool is needed that will output C code that works against
the kernel's SunRPC implementation rather than libtirpc.

Enter xdrgen.


Dependencies
------------

These dependencies are typically packaged by Linux distributions:

- python3
- python3-lark
- python3-jinja2

These dependencies are available via PyPI:

- pip install 'lark[interegular]'


XDR Specifications
------------------

When adding a new protocol implementation to the kernel, the XDR
specification can be derived by feeding a .txt copy of the RFC to
the script located in tools/net/sunrpc/extract.sh.

   $ extract.sh < rfc0001.txt > new2.x


Operation
---------

Once a .x file is available, use xdrgen to generate source and
header files containing an implementation of XDR encoding and
decoding functions for the specified protocol.

   $ ./xdrgen definitions new2.x > include/linux/sunrpc/xdrgen/new2.h
   $ ./xdrgen declarations new2.x > new2xdr_gen.h

and

   $ ./xdrgen source new2.x > new2xdr_gen.c

The files are ready to use for a server-side protocol implementation,
or may be used as a guide for implementing these routines by hand.

By default, the only comments added to this code are kdoc comments
that appear directly in front of the public per-procedure APIs. For
deeper introspection, specifying the "--annotate" flag will insert
additional comments in the generated code to help readers match the
generated code to specific parts of the XDR specification.

Because the generated code is targeted for the Linux kernel, it
is tagged with a GPLv2-only license.

The xdrgen tool can also provide lexical and syntax checking of
an XDR specification:

   $ ./xdrgen lint xdr/new.x


How It Works
------------

xdrgen does not use machine learning to generate source code. The
translation is entirely deterministic.

RFC 4506 Section 6 contains a BNF grammar of the XDR specification
language. The grammar has been adapted for use by the Python Lark
module.

The xdr.ebnf file in this directory contains the grammar used to
parse XDR specifications. xdrgen configures Lark using the grammar
in xdr.ebnf. Lark parses the target XDR specification using this
grammar, creating a parse tree.

xdrgen then transforms the parse tree into an abstract syntax tree.
This tree is passed to a series of code generators.

The generators are implemented as Python classes residing in the
generators/ directory. Each generator emits code created from Jinja2
templates stored in the templates/ directory.

Source code for the definitions is generated in the same order in
which they appear in the specification to ensure the generated code
compiles. This conforms with the behavior of rpcgen.
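
The pipeline described above (parse, transform to an abstract syntax
tree, emit code from templates in specification order) can be
illustrated with a small sketch. This is not xdrgen's code. So that
it runs with only the Python standard library, it substitutes a
regular expression for the Lark grammar in xdr.ebnf and
string.Template for Jinja2, and it handles only enum definitions. The
names EnumNode, EnumDeclGenerator, parse, to_ast, and generate are
hypothetical. The emitted prototypes follow the shape shown for
"pragma public" elsewhere in this README.

```python
import re
from dataclasses import dataclass
from string import Template

# Stand-in for the grammar in xdr.ebnf: matches only "enum NAME { ... };".
ENUM_RE = re.compile(r"enum\s+(\w+)\s*\{([^}]*)\}\s*;")


@dataclass
class EnumNode:
    """AST node for one XDR enum definition (hypothetical shape)."""
    name: str
    members: list  # list of (identifier, value) pairs


def parse(spec):
    """Return raw (name, body) matches in the order they appear."""
    return ENUM_RE.findall(spec)


def to_ast(parse_tree):
    """Transform raw matches into EnumNode objects."""
    nodes = []
    for name, body in parse_tree:
        members = []
        for item in body.split(","):
            if item.strip():
                ident, value = item.split("=")
                # Base 0 accepts decimal and 0x-prefixed constants.
                members.append((ident.strip(), int(value, 0)))
        nodes.append(EnumNode(name, members))
    return nodes


class EnumDeclGenerator:
    """Emits C declarations for an EnumNode from a text template."""

    # Stand-in for a Jinja2 template stored in templates/.
    template = Template(
        "enum $name {\n$members\n};\n"
        "bool xdrgen_decode_$name(struct xdr_stream *xdr, enum $name *ptr);\n"
        "bool xdrgen_encode_$name(struct xdr_stream *xdr, enum $name value);\n"
    )

    def emit(self, node):
        members = ",\n".join(f"\t{i} = {v}" for i, v in node.members)
        return self.template.substitute(name=node.name, members=members)


def generate(spec):
    """Run parse -> AST -> generator, preserving specification order."""
    generator = EnumDeclGenerator()
    return "\n".join(generator.emit(node) for node in to_ast(parse(spec)))


SPEC = """
enum color { RED = 0, GREEN = 1 };
enum shape { ROUND = 7 };
"""

if __name__ == "__main__":
    print(generate(SPEC))
```

Because parse() returns definitions in the order they occur in the
input and nothing downstream reorders them, the declarations for
"color" are emitted before those for "shape", mirroring the ordering
rule stated above.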

xdrgen assumes that the generated source code is further compiled by
a compiler that can optimize in a number of ways, including:

 - Unused functions are discarded (i.e., not added to the executable)

 - Aggressive function inlining removes unnecessary stack frames

 - Single-arm switch statements are replaced by a single conditional
   branch

And so on.


Pragmas
-------

Pragma directives specify exceptions to the normal generation of
encoding and decoding functions. Currently three directives are
implemented: "exclude", "header", and "public".

Pragma exclude
------ -------

   pragma exclude <RPC procedure> ;

In some cases, a procedure encoder or decoder function might need
special processing that cannot be automatically generated. The
automatically-generated functions might conflict or interfere with
the hand-rolled function. To avoid editing the generated source code
by hand, a pragma can specify that the procedure's encoder and
decoder functions are not included in the generated header and
source.

For example:

   pragma exclude NFSPROC3_READDIRPLUS;

Excludes the decoder function for the READDIRPLUS argument and the
encoder function for the READDIRPLUS result.

Note that because data item encoder and decoder functions are
defined "static __maybe_unused", subsequent compilation
automatically excludes data item encoder and decoder functions that
are used only by excluded procedures.

Pragma header
------ ------

   pragma header <string> ;

Provide a name to use for the header file. For example:

   pragma header nlm4;

Adds

   #include "nlm4xdr_gen.h"

to the generated source file.

Pragma public
------ ------

   pragma public <XDR data item> ;

Normally XDR encoder and decoder functions are "static". If an
implementer wants to call these functions from other source code,
they can add a public pragma to the input .x file. Each data item
named this way gets prototypes in the generated header, and its
function definitions are not declared static.

For example:

   pragma public nfsstat3;

Adds these prototypes in the generated header:

   bool xdrgen_decode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 *ptr);
   bool xdrgen_encode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 value);

And, in the generated source code, both of these functions appear
without the "static __maybe_unused" modifiers.


Future Work
-----------

Finish implementing XDR pointer and list types.

Generate client-side procedure functions

Expand the README into a user guide similar to rpcgen(1)

Add more pragma directives:

  * @pages -- use xdr_read/write_pages() for the specified opaque
              field
  * @skip -- do not decode, but rather skip, the specified argument
             field

Enable something like a #include to dynamically insert the content
of other specification files

Properly support line-by-line pass-through via the "%" decorator

Build a unit test suite for verifying translation of XDR language
into compilable code

Add a command-line option to insert trace_printk call sites in the
generated source code, for improved (temporary) observability

Generate kernel Rust code as well as C code