This file has moved to process/coding-style.rs

		Linux kernel coding style

This is a short document describing the preferred coding style for the
linux kernel. Coding style is very personal, and I won't _force_ my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too. Please
at least consider the points made here.

First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it. Burn them, it's a great symbolic gesture.

Anyway, here goes:


		Chapter 1: Indentation

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends. Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on an
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.

The preferred way to ease multiple indentation levels in a switch statement is
to align the "switch" and its subordinate "case" labels in the same column
instead of "double-indenting" the "case" labels.
E.g.:

	switch (suffix) {
	case 'G':
	case 'g':
		mem <<= 30;
		break;
	case 'M':
	case 'm':
		mem <<= 20;
		break;
	case 'K':
	case 'k':
		mem <<= 10;
		/* fall through */
	default:
		break;
	}

Don't put multiple statements on a single line unless you have
something to hide:

	if (condition) do_this;
	  do_something_everytime;

Don't put multiple assignments on a single line either. Kernel coding style
is super simple. Avoid tricky expressions.

Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don't leave whitespace at the end of lines.


		Chapter 2: Breaking long lines and strings

Coding style is all about readability and maintainability using commonly
available tools.

The limit on the length of lines is 80 columns and this is a strongly
preferred limit.

Statements longer than 80 columns will be broken into sensible chunks, unless
exceeding 80 columns significantly increases readability and does not hide
information. Descendants are always substantially shorter than the parent and
are placed substantially to the right. The same applies to function headers
with a long argument list. However, never break user-visible strings such as
printk messages, because that breaks the ability to grep for them.


		Chapter 3: Placing Braces and Spaces

The other issue that always comes up in C styling is the placement of
braces.
Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

	if (x is true) {
		we do y
	}

This applies to all non-function statement blocks (if, switch, for,
while, do). E.g.:

	switch (action) {
	case KOBJ_ADD:
		return "add";
	case KOBJ_REMOVE:
		return "remove";
	case KOBJ_CHANGE:
		return "change";
	default:
		return NULL;
	}

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

	int function(int x)
	{
		body of function
	}

Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right. Besides, functions are
special anyway (you can't nest them in C).

Note that the closing brace is empty on a line of its own, _except_ in
the cases where it is followed by a continuation of the same statement,
i.e. a "while" in a do-statement or an "else" in an if-statement, like
this:

	do {
		body of do-loop
	} while (condition);

and

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability. Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.

Do not unnecessarily use braces where a single statement will do.

	if (condition)
		action();

and

	if (condition)
		do_this();
	else
		do_that();

This does not apply if only one branch of a conditional statement is a single
statement; in the latter case use braces in both branches:

	if (condition) {
		do_this();
		do_that();
	} else {
		otherwise();
	}

		3.1: Spaces

Linux kernel style for use of spaces depends (mostly) on
function-versus-keyword usage. Use a space after (most) keywords. The
notable exceptions are sizeof, typeof, alignof, and __attribute__, which look
somewhat like functions (and are usually used with parentheses in Linux,
although they are not required in the language, as in: "sizeof info" after
"struct fileinfo info;" is declared).

So use a space after these keywords:

	if, switch, case, for, do, while

but not with sizeof, typeof, alignof, or __attribute__. E.g.,

	s = sizeof(struct file);

Do not add spaces around (inside) parenthesized expressions. This example is
*bad*:

	s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the
preferred use of '*' is adjacent to the data name or function name and not
adjacent to the type name. Examples:

	char *linux_banner;
	unsigned long long memparse(char *ptr, char **retptr);
	char *match_strdup(substring_t *s);

Use one space around (on each side of) most binary and ternary operators,
such as any of these:

	= + - < > * / % | & ^ <= >= == != ? :

but no space after unary operators:

	& * + - ~ !
	sizeof typeof alignof __attribute__ defined

no space before the postfix increment & decrement unary operators:

	++ --

no space after the prefix increment & decrement unary operators:

	++ --

and no space around the '.' and "->" structure member operators.

Do not leave trailing whitespace at the ends of lines. Some editors with
"smart" indentation will insert whitespace at the beginning of new lines as
appropriate, so you can start typing the next line of code right away.
However, some such editors do not remove the whitespace if you end up not
putting a line of code there, such as if you leave a blank line. As a result,
you end up with lines containing trailing whitespace.

Git will warn you about patches that introduce trailing whitespace, and can
optionally strip the trailing whitespace for you; however, if applying a series
of patches, this may make later patches in the series fail by changing their
context lines.


		Chapter 4: Naming

C is a Spartan language, and so should your naming be. Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter. A C programmer would call that
variable "tmp", which is much easier to write, and not the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must. To call a global function "foo" is a
shooting offense.

GLOBAL variables (to be used only if you _really_ need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar, you should _not_ call it "cntusr()".
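
As a small illustration of these naming rules, the sketch below pairs a
descriptive global function name with terse local names. The struct, its
field and the function body are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Hypothetical record type, invented for this example. */
struct user {
	int active;
};

/* Global function: the name says what it does. */
size_t count_active_users(const struct user *users, size_t nr_users)
{
	size_t count = 0;
	size_t i;	/* a plain loop counter is just "i" */

	for (i = 0; i < nr_users; i++) {
		if (users[i].active)
			count++;
	}

	return count;
}
```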

Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged - the compiler knows the types anyway and can
check those, and it only confuses the programmer. No wonder MicroSoft
makes buggy programs.

LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being mis-understood. Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See chapter 6 (Functions).


		Chapter 5: Typedefs

Please don't use things like "vps_t".
It's a _mistake_ to use typedef for structures and pointers. When you see a

	vps_t a;

in the source, what does it mean?
In contrast, if it says

	struct virtual_container *a;

you can actually tell what "a" is.

Lots of people think that typedefs "help readability". Not so. They are
useful only for:

 (a) totally opaque objects (where the typedef is actively used to _hide_
     what the object is).

     Example: "pte_t" etc. opaque objects that you can only access using
     the proper accessor functions.

     NOTE! Opaqueness and "accessor functions" are not good in themselves.
     The reason we have them for things like pte_t etc. is that there
     really is absolutely _zero_ portably accessible information there.

 (b) Clear integer types, where the abstraction _helps_ avoid confusion
     whether it is "int" or "long".

     u8/u16/u32 are perfectly fine typedefs, although they fit into
     category (d) better than here.

     NOTE! Again - there needs to be a _reason_ for this. If something is
     "unsigned long", then there's no reason to do

	typedef unsigned long myflags_t;

     but if there is a clear reason for why it under certain circumstances
     might be an "unsigned int" and under other configurations might be
     "unsigned long", then by all means go ahead and use a typedef.

 (c) when you use sparse to literally create a _new_ type for
     type-checking.

 (d) New types which are identical to standard C99 types, in certain
     exceptional circumstances.

     Although it would only take a short amount of time for the eyes and
     brain to become accustomed to the standard types like 'uint32_t',
     some people object to their use anyway.

     Therefore, the Linux-specific 'u8/u16/u32/u64' types and their
     signed equivalents which are identical to standard types are
     permitted -- although they are not mandatory in new code of your
     own.

     When editing existing code which already uses one or the other set
     of types, you should conform to the existing choices in that code.

 (e) Types safe for use in userspace.

     In certain structures which are visible to userspace, we cannot
     require C99 types and cannot use the 'u32' form above. Thus, we
     use __u32 and similar types in all structures which are shared
     with userspace.

Maybe there are other cases too, but the rule should basically be to NEVER
EVER use a typedef unless you can clearly match one of those rules.

In general, a pointer, or a struct that has elements that can reasonably
be directly accessed should _never_ be a typedef.
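
A minimal sketch of rule (b) next to the general rule for structs might look
like this. CONFIG_WIDE_COUNTERS, hw_counter_t and container_count_event() are
made-up names for illustration, not existing kernel symbols.

```c
/*
 * Hypothetical example: the integer type really does differ between
 * configurations, so a typedef helps here.
 */
#ifdef CONFIG_WIDE_COUNTERS
typedef unsigned long hw_counter_t;
#else
typedef unsigned int hw_counter_t;
#endif

/* A struct with directly accessible members stays a plain struct. */
struct virtual_container {
	hw_counter_t nr_events;
	int id;
};

void container_count_event(struct virtual_container *vc)
{
	vc->nr_events++;
}
```

Anyone reading "struct virtual_container *vc" can tell what "vc" is, while
the width of the counter stays a configuration detail.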


		Chapter 6: Functions

Functions should be short and sweet, and do just one thing. They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function. So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely. Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).

Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused. You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.

In source files, separate functions with one blank line. If the function is
exported, the EXPORT* macro for it should follow immediately after the closing
function brace line. E.g.:

	int system_is_up(void)
	{
		return system_state == SYSTEM_RUNNING;
	}
	EXPORT_SYMBOL(system_is_up);

In function prototypes, include parameter names with their data types.
Although this is not required by the C language, it is preferred in Linux
because it is a simple way to add valuable information for the reader.


		Chapter 7: Centralized exiting of functions

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in form of the unconditional jump instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done. If there is no
cleanup needed then just return directly.

Choose label names which say what the goto does or why the goto exists. An
example of a good name could be "out_buffer:" if the goto frees "buffer". Avoid
using GW-BASIC names like "err1:" and "err2:". Also don't name them after the
goto location like "err_kmalloc_failed:"

The rationale for using gotos is:

- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
    modifications are prevented
- saves the compiler work to optimize redundant code away ;)

	int fun(int a)
	{
		int result = 0;
		char *buffer;

		buffer = kmalloc(SIZE, GFP_KERNEL);
		if (!buffer)
			return -ENOMEM;

		if (condition1) {
			while (loop1) {
				...
			}
			result = 1;
			goto out_buffer;
		}
		...
	out_buffer:
		kfree(buffer);
		return result;
	}

A common type of bug to be aware of is "one err bugs" which look like this:

	err:
		kfree(foo->bar);
		kfree(foo);
		return ret;

The bug in this code is that on some exit paths "foo" is NULL. Normally the
fix for this is to split it up into two error labels "err_bar:" and "err_foo:".
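
The split into two labels can be sketched as follows. This is a userspace
approximation, not kernel code: malloc() and free() stand in for kmalloc()
and kfree(), and struct foo, foo_create() and the fail_late parameter are
hypothetical. Each label undoes only what had been set up before the jump,
so no exit path dereferences a NULL "foo".

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical object that owns a second allocation. */
struct foo {
	char *bar;
};

/* Returns 0 and stores the new object in *out, or a negative errno. */
int foo_create(struct foo **out, size_t bar_len, int fail_late)
{
	struct foo *foo;
	int ret;

	foo = malloc(sizeof(*foo));
	if (!foo)
		return -ENOMEM;		/* nothing to clean up yet */

	foo->bar = malloc(bar_len);
	if (!foo->bar) {
		ret = -ENOMEM;
		goto err_foo;		/* only "foo" exists */
	}

	if (fail_late) {		/* stand-in for a later failing step */
		ret = -EINVAL;
		goto err_bar;		/* both allocations exist */
	}

	*out = foo;
	return 0;

err_bar:
	free(foo->bar);
err_foo:
	free(foo);
	return ret;
}
```

The labels are ordered so that a later failure falls through the cleanup
for everything allocated before it.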


		Chapter 8: Commenting

Comments are good, but there is also a danger of over-commenting. NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the _working_ is obvious, and it's a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 6 for a while. You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess. Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.

When commenting the kernel API functions, please use the kernel-doc format.
See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc
for details.

Linux style for comments is the C89 "/* ... */" style.
Don't use C99-style "// ..." comments.

The preferred style for long (multi-line) comments is:

	/*
	 * This is the preferred style for multi-line
	 * comments in the Linux kernel source code.
	 * Please use it consistently.
	 *
	 * Description: A column of asterisks on the left side,
	 * with beginning and ending almost-blank lines.
	 */

For files in net/ and drivers/net/ the preferred style for long (multi-line)
comments is a little different.

	/* The preferred comment style for files in net/ and drivers/net
	 * looks like this.
	 *
	 * It is nearly the same as the generally preferred comment style,
	 * but there is no initial almost-blank line.
	 */

It's also important to comment data, whether they are basic types or derived
types. To this end, use just one data declaration per line (no commas for
multiple data declarations). This leaves you room for a small comment on each
item, explaining its use.


		Chapter 9: You've made a mess of it

That's OK, we all do. You've probably been told by your long-time Unix
user helper that "GNU emacs" automatically formats the C sources for
you, and you've noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing - an infinite number of monkeys typing into GNU emacs would never
make a good program).

So, you can either get rid of GNU emacs, or change it to use saner
values. To do the latter, you can stick the following in your .emacs file:

(defun c-lineup-arglist-tabs-only (ignored)
  "Line up argument lists by tabs, not spaces"
  (let* ((anchor (c-langelem-pos c-syntactic-element))
         (column (c-langelem-2nd-pos c-syntactic-element))
         (offset (- (1+ column) anchor))
         (steps (floor offset c-basic-offset)))
    (* (max steps 1)
       c-basic-offset)))

(add-hook 'c-mode-common-hook
          (lambda ()
            ;; Add kernel style
            (c-add-style
             "linux-tabs-only"
             '("linux" (c-offsets-alist
                        (arglist-cont-nonempty
                         c-lineup-gcc-asm-reg
                         c-lineup-arglist-tabs-only))))))

(add-hook 'c-mode-hook
          (lambda ()
            (let ((filename (buffer-file-name)))
              ;; Enable kernel mode for the appropriate files
              (when (and filename
                         (string-match (expand-file-name "~/src/linux-trees")
                                       filename))
                (setq indent-tabs-mode t)
                (setq show-trailing-whitespace t)
                (c-set-style "linux-tabs-only")))))

This will make emacs go better with the
kernel coding style for C
files below ~/src/linux-trees.

But even if you fail in getting emacs to do sane formatting, not
everything is lost: use "indent".

Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren't evil, they are
just severely misguided in this matter), so you just give indent the
options "-kr -i8" (stands for "K&R, 8 character indents"), or use
"scripts/Lindent", which indents in the latest style.

"indent" has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page. But
remember: "indent" is not a fix for bad programming.


		Chapter 10: Kconfig configuration files

For all of the Kconfig* configuration files throughout the source tree,
the indentation is somewhat different. Lines under a "config" definition
are indented with one tab, while help text is indented an additional two
spaces. Example:

config AUDIT
	bool "Auditing support"
	depends on NET
	help
	  Enable auditing infrastructure that can be used with another
	  kernel subsystem, such as SELinux (which requires this for
	  logging of avc messages output). Does not do system-call
	  auditing without CONFIG_AUDITSYSCALL.

Seriously dangerous features (such as write support for certain
filesystems) should advertise this prominently in their prompt string:

config ADFS_FS_RW
	bool "ADFS write support (DANGEROUS)"
	depends on ADFS_FS
	...

For full documentation on the configuration files, see the file
Documentation/kbuild/kconfig-language.txt.


		Chapter 11: Data structures

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts. In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely _have_ to reference count all your uses.

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is _not_ a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique. Usually both are needed, and
they are not to be confused with each other.

Many data structures can indeed have two levels of reference counting,
when there are users of different "classes". The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of "multi-level-reference-counting" can be found in
memory management ("struct mm_struct": mm_users and mm_count), and in
filesystem code ("struct super_block": s_count and s_active).

Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.


		Chapter 12: Macros, Enums and RTL

Names of macros defining constants and labels in enums are capitalized.

	#define CONSTANT 0x12345

Enums are preferred when defining several related constants.
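
For instance, a group of related states might be declared as one enum rather
than as a run of separate #define lines. The type, its labels and the helper
below are hypothetical names chosen for this sketch.

```c
/* Hypothetical link states; the labels are capitalized like macro constants. */
enum link_state {
	LINK_DOWN,
	LINK_NEGOTIATING,
	LINK_UP,
};

const char *link_state_name(enum link_state state)
{
	switch (state) {
	case LINK_DOWN:
		return "down";
	case LINK_NEGOTIATING:
		return "negotiating";
	case LINK_UP:
		return "up";
	default:
		return "unknown";
	}
}
```

Grouping the values in one type also lets the compiler warn about a switch
that forgets one of them, which separate #defines cannot offer.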

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

	#define macrofun(a, b, c)			\
		do {					\
			if (a == 5)			\
				do_this(b, c);		\
		} while (0)

Things to avoid when using macros:

1) macros that affect control flow:

	#define FOO(x)					\
		do {					\
			if (blah(x) < 0)		\
				return -EBUGGERED;	\
		} while (0)

is a _very_ bad idea. It looks like a function call but exits the "calling"
function; don't break the internal parsers of those who will read the code.

2) macros that depend on having a local variable with a magic name:

	#define FOO(val) bar(index, val)

might look like a good thing, but it's confusing as hell when one reads the
code and it's prone to breakage from seemingly innocent changes.

3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.

4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.

	#define CONSTANT 0x4000
	#define CONSTEXP (CONSTANT | 3)

5) namespace collisions when defining local variables in macros resembling
functions:

	#define FOO(x)				\
	({					\
		typeof(x) ret;			\
		ret = calc_ret(x);		\
		(ret);				\
	})

ret is a common name for a local variable - __foo_ret is less likely
to collide with an existing variable.

The cpp manual deals with macros exhaustively.
The gcc internals manual also
covers RTL which is used frequently with assembly language in the kernel.


		Chapter 13: Printing kernel messages

Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use crippled
words like "dont"; use "do not" or "don't" instead. Make the messages
concise, clear, and unambiguous.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.

There are a number of driver model diagnostic macros in <linux/device.h>
which you should use to make sure messages are matched to the right device
and driver, and are tagged with the right level: dev_err(), dev_warn(),
dev_info(), and so forth. For messages that aren't associated with a
particular device, <linux/printk.h> defines pr_notice(), pr_info(),
pr_warn(), pr_err(), etc.

Coming up with good debugging messages can be quite a challenge; and once
you have them, they can be a huge help for remote troubleshooting. However
debug message printing is handled differently than printing other non-debug
messages. While the other pr_XXX() functions print unconditionally,
pr_debug() does not; it is compiled out by default, unless either DEBUG is
defined or CONFIG_DYNAMIC_DEBUG is set. That is true for dev_dbg() also,
and a related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to
the ones already enabled by DEBUG.

Many subsystems have Kconfig debug options to turn on -DDEBUG in the
corresponding Makefile; in other cases specific files #define DEBUG. And
when a debug message should be unconditionally printed, such as if it is
already inside a debug-related #ifdef section, printk(KERN_DEBUG ...) can be
used.
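
The difference between unconditional and DEBUG-only messages can be
approximated in userspace as below. This is only a sketch of the idea:
my_pr_info(), my_pr_debug() and probe_widget() are hypothetical stand-ins
built on fprintf(), not the kernel's pr_info() and pr_debug(), and nothing
here models CONFIG_DYNAMIC_DEBUG. The ##__VA_ARGS__ form is a GNU extension
and assumes gcc or a compatible compiler.

```c
#include <errno.h>
#include <stdio.h>

/* Always printed, like the non-debug pr_XXX() helpers. */
#define my_pr_info(fmt, ...)	fprintf(stderr, fmt, ##__VA_ARGS__)

#ifdef DEBUG
#define my_pr_debug(fmt, ...)	fprintf(stderr, fmt, ##__VA_ARGS__)
#else
/* Compiled out, but the format and arguments are still type-checked. */
#define my_pr_debug(fmt, ...)					\
	do {							\
		if (0)						\
			fprintf(stderr, fmt, ##__VA_ARGS__);	\
	} while (0)
#endif

/* Hypothetical caller: returns 0 on success or a negative errno. */
int probe_widget(int id)
{
	my_pr_debug("probing widget %d\n", id);

	if (id < 0) {
		my_pr_info("widget: invalid id %d\n", id);
		return -EINVAL;
	}

	return 0;
}
```

Building the file with -DDEBUG, or adding "#define DEBUG" above the macros,
turns the debug message on without touching the caller.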


		Chapter 14: Allocating memory

The kernel provides the following general purpose memory allocators:
kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and
vzalloc(). Please refer to the API documentation for further information
about them.

The preferred form for passing a size of a struct is the following:

	p = kmalloc(sizeof(*p), ...);

The alternative form where struct name is spelled out hurts readability and
introduces an opportunity for a bug when the pointer variable type is changed
but the corresponding sizeof that is passed to a memory allocator is not.

Casting the return value which is a void pointer is redundant. The conversion
from void pointer to any other pointer type is guaranteed by the C programming
language.

The preferred form for allocating an array is the following:

	p = kmalloc_array(n, sizeof(...), ...);

The preferred form for allocating a zeroed array is the following:

	p = kcalloc(n, sizeof(...), ...);

Both forms check for overflow on the allocation size n * sizeof(...),
and return NULL if that occurred.


		Chapter 15: The inline disease

There appears to be a common misperception that gcc has a magic "make me
faster" speedup option called "inline". While the use of inlines can be
appropriate (for example as a means of replacing macros, see Chapter 12), it
very often is not. Abundant use of the inline keyword leads to a much bigger
kernel, which in turn slows the system as a whole down, due to a bigger
icache footprint for the CPU and simply because there is less memory
available for the pagecache. Just think about it; a pagecache miss causes a
disk seek, which easily takes 5 milliseconds.
There are a LOT of cpu cycles
that can go into these 5 milliseconds.

A reasonable rule of thumb is to not put inline at functions that have more
than 3 lines of code in them. An exception to this rule are the cases where
a parameter is known to be a compile-time constant, and as a result of this
constantness you *know* the compiler will be able to optimize most of your
function away at compile time. For a good example of this latter case, see
the kmalloc() inline function.

Often people argue that adding inline to functions that are static and used
only once is always a win since there is no space tradeoff. While this is
technically correct, gcc is capable of inlining these automatically without
help, and the maintenance issue of removing the inline when a second user
appears outweighs the potential value of the hint that tells gcc to do
something it would have done anyway.


		Chapter 16: Function return values and names

Functions can return values of many different kinds, and one of the
most common is a value indicating whether the function succeeded or
failed. Such a value can be represented as an error-code integer
(-Exxx = failure, 0 = success) or a "succeeded" boolean (0 = failure,
non-zero = success).

Mixing up these two sorts of representations is a fertile source of
difficult-to-find bugs. If the C language included a strong distinction
between integers and booleans then the compiler would find these mistakes
for us... but it doesn't. To help prevent such bugs, always follow this
convention:

	If the name of a function is an action or an imperative command,
	the function should return an error-code integer. If the name
	is a predicate, the function should return a "succeeded" boolean.
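
The two halves of the convention can be sketched side by side. The fixed-size
job table and the names queue_job() and job_is_queued() are hypothetical,
invented for this illustration.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_JOBS 4	/* hypothetical limit, chosen for the example */

static int jobs[MAX_JOBS];
static int nr_jobs;

/* Command name: returns 0 on success or a negative error code. */
int queue_job(int id)
{
	if (nr_jobs == MAX_JOBS)
		return -EBUSY;

	jobs[nr_jobs++] = id;
	return 0;
}

/* Predicate name: returns true if the job is queued, false otherwise. */
bool job_is_queued(int id)
{
	int i;

	for (i = 0; i < nr_jobs; i++) {
		if (jobs[i] == id)
			return true;
	}

	return false;
}
```

With names like these, "if (queue_job(id))" reads as the error path and
"if (job_is_queued(id))" reads as the success path, which is exactly what
each function returns.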

For example, "add work" is a command, and the add_work() function returns 0
for success or -EBUSY for failure. In the same way, "PCI device present" is
a predicate, and the pci_dev_present() function returns 1 if it succeeds in
finding a matching device or 0 if it doesn't.

All EXPORTed functions must respect this convention, and so should all
public functions. Private (static) functions need not, but it is
recommended that they do.

Functions whose return value is the actual result of a computation, rather
than an indication of whether the computation succeeded, are not subject to
this rule. Generally they indicate failure by returning some out-of-range
result. Typical examples would be functions that return pointers; they use
NULL or the ERR_PTR mechanism to report failure.


		Chapter 17: Don't re-invent the kernel macros

The header file include/linux/kernel.h contains a number of macros that
you should use, rather than explicitly coding some variant of them yourself.
For example, if you need to calculate the length of an array, take advantage
of the macro

	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

Similarly, if you need to calculate the size of some structure member, use

	#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))

There are also min() and max() macros that do strict type checking if you
need them. Feel free to peruse that header file to see what else is already
defined that you shouldn't reproduce in your code.


		Chapter 18: Editor modelines and other cruft

Some editors can interpret configuration information embedded in source files,
indicated with special markers.
For example, emacs interprets lines marked
like this:

	-*- mode: c -*-

Or like this:

	/*
	Local Variables:
	compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
	End:
	*/

Vim interprets markers that look like this:

	/* vim:set sw=8 noet */

Do not include any of these in source files. People have their own personal
editor configurations, and your source files should not override them. This
includes markers for indentation and mode configuration. People may use their
own custom mode, or may have some other magic method for making indentation
work correctly.


		Chapter 19: Inline assembly

In architecture-specific code, you may need to use inline assembly to interface
with CPU or platform functionality. Don't hesitate to do so when necessary.
However, don't use inline assembly gratuitously when C can do the job. You can
and should poke hardware from C when possible.

Consider writing simple helper functions that wrap common bits of inline
assembly, rather than repeatedly writing them with slight variations. Remember
that inline assembly can use C parameters.

Large, non-trivial assembly functions should go in .S files, with corresponding
C prototypes defined in C header files. The C prototypes for assembly
functions should use "asmlinkage".

You may need to mark your asm statement as volatile, to prevent GCC from
removing it if GCC doesn't notice any side effects. You don't always need to
do so, though, and doing so unnecessarily can limit optimization.
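
One possible shape for such a helper is sketched below. The names
opaque_int() and opaque_int_selftest() are hypothetical, and the instruction
template is deliberately empty so that the sketch builds with GCC or Clang
on any architecture; a real helper would contain actual instructions. The
statement takes a C parameter as an in/out register operand and is marked
volatile, so the compiler keeps it even when a caller ignores the result.

```c
#include <assert.h>

/*
 * Hypothetical helper wrapping a single asm statement.
 * __asm__ __volatile__ is the alternate spelling of "asm volatile"
 * that is also accepted in strict ISO C modes.
 */
static inline int opaque_int(int v)
{
	/* "+r" makes v both an input and an output held in a register. */
	__asm__ __volatile__("" : "+r" (v));
	return v;
}

/* The empty template leaves the register untouched, so v passes through. */
static void opaque_int_selftest(void)
{
	assert(opaque_int(42) == 42);
}
```

Dropping the volatile qualifier here would let the compiler discard the
statement whenever the returned value goes unused, which is harmless for an
empty template but not for an instruction with real side effects.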

When writing a single inline assembly statement containing multiple
instructions, put each instruction on a separate line in a separate quoted
string, and end each string except the last with \n\t to properly indent the
next instruction in the assembly output:

	asm ("magic %reg1, #42\n\t"
	     "more_magic %reg2, %reg3"
	     : /* outputs */ : /* inputs */ : /* clobbers */);


		Chapter 20: Conditional Compilation

Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in .c
files; doing so makes code harder to read and logic harder to follow. Instead,
use such conditionals in a header file defining functions for use in those .c
files, providing no-op stub versions in the #else case, and then call those
functions unconditionally from .c files. The compiler will avoid generating
any code for the stub calls, producing identical results, but the logic will
remain easy to follow.

Prefer to compile out entire functions, rather than portions of functions or
portions of expressions. Rather than putting an ifdef in an expression, factor
out part or all of the expression into a separate helper function and apply the
conditional to that function.

If you have a function or variable which may potentially go unused in a
particular configuration, and the compiler would warn about its definition
going unused, mark the definition as __maybe_unused rather than wrapping it in
a preprocessor conditional. (However, if a function or variable *always* goes
unused, delete it.)

Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
symbol into a C boolean expression, and use it in a normal C conditional:

	if (IS_ENABLED(CONFIG_SOMETHING)) {
		...
	}

The compiler will constant-fold the conditional away, and include or exclude
the block of code just as with an #ifdef, so this will not add any runtime
overhead. However, this approach still allows the C compiler to see the code
inside the block, and check it for correctness (syntax, types, symbol
references, etc). Thus, you still have to use an #ifdef if the code inside the
block references symbols that will not exist if the condition is not met.

At the end of any non-trivial #if or #ifdef block (more than a few lines),
place a comment after the #endif on the same line, noting the conditional
expression used. For instance:

	#ifdef CONFIG_SOMETHING
	...
	#endif /* CONFIG_SOMETHING */


		Appendix I: References

The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.

GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from http://www.gnu.org/manual/

WG14 is the international standardization working group for the programming
language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Kernel CodingStyle, by greg@kroah.com at OLS 2002:
http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/