This file has moved to process/coding-style.rs

		Linux kernel coding style

This is a short document describing the preferred coding style for the
linux kernel. Coding style is very personal, and I won't _force_ my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too. Please
at least consider the points made here.

First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it. Burn them, it's a great symbolic gesture.

Anyway, here goes:


		Chapter 1: Indentation

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends. Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on an
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.

The preferred way to ease multiple indentation levels in a switch statement is
to align the "switch" and its subordinate "case" labels in the same column
instead of "double-indenting" the "case" labels.
E.g.:

	switch (suffix) {
	case 'G':
	case 'g':
		mem <<= 30;
		break;
	case 'M':
	case 'm':
		mem <<= 20;
		break;
	case 'K':
	case 'k':
		mem <<= 10;
		/* fall through */
	default:
		break;
	}


Don't put multiple statements on a single line unless you have
something to hide:

	if (condition) do_this;
	  do_something_everytime;

Don't put multiple assignments on a single line either. Kernel coding style
is super simple. Avoid tricky expressions.

Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don't leave whitespace at the end of lines.


		Chapter 2: Breaking long lines and strings

Coding style is all about readability and maintainability using commonly
available tools.

The limit on the length of lines is 80 columns and this is a strongly
preferred limit.

Statements longer than 80 columns will be broken into sensible chunks, unless
exceeding 80 columns significantly increases readability and does not hide
information. Descendants are always substantially shorter than the parent and
are placed substantially to the right. The same applies to function headers
with a long argument list. However, never break user-visible strings such as
printk messages, because that breaks the ability to grep for them.


		Chapter 3: Placing Braces and Spaces

The other issue that always comes up in C styling is the placement of
braces.
Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

	if (x is true) {
		we do y
	}

This applies to all non-function statement blocks (if, switch, for,
while, do). E.g.:

	switch (action) {
	case KOBJ_ADD:
		return "add";
	case KOBJ_REMOVE:
		return "remove";
	case KOBJ_CHANGE:
		return "change";
	default:
		return NULL;
	}

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

	int function(int x)
	{
		body of function
	}

Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right. Besides, functions are
special anyway (you can't nest them in C).

Note that the closing brace is empty on a line of its own, _except_ in
the cases where it is followed by a continuation of the same statement,
i.e. a "while" in a do-statement or an "else" in an if-statement, like
this:

	do {
		body of do-loop
	} while (condition);

and

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability. Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.

Do not unnecessarily use braces where a single statement will do.

	if (condition)
		action();

and

	if (condition)
		do_this();
	else
		do_that();

This does not apply if only one branch of a conditional statement is a single
statement; in the latter case use braces in both branches:

	if (condition) {
		do_this();
		do_that();
	} else {
		otherwise();
	}

		3.1: Spaces

Linux kernel style for use of spaces depends (mostly) on
function-versus-keyword usage. Use a space after (most) keywords. The
notable exceptions are sizeof, typeof, alignof, and __attribute__, which look
somewhat like functions (and are usually used with parentheses in Linux,
although they are not required in the language, as in: "sizeof info" after
"struct fileinfo info;" is declared).

So use a space after these keywords:
	if, switch, case, for, do, while
but not with sizeof, typeof, alignof, or __attribute__. E.g.,
	s = sizeof(struct file);

Do not add spaces around (inside) parenthesized expressions. This example is
*bad*:

	s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the
preferred use of '*' is adjacent to the data name or function name and not
adjacent to the type name. Examples:

	char *linux_banner;
	unsigned long long memparse(char *ptr, char **retptr);
	char *match_strdup(substring_t *s);

Use one space around (on each side of) most binary and ternary operators,
such as any of these:

	=  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :

but no space after unary operators:
	&  *  +  -  ~  !
	sizeof  typeof  alignof  __attribute__  defined

no space before the postfix increment & decrement unary operators:
	++  --

no space after the prefix increment & decrement unary operators:
	++  --

and no space around the '.' and "->" structure member operators.

Do not leave trailing whitespace at the ends of lines. Some editors with
"smart" indentation will insert whitespace at the beginning of new lines as
appropriate, so you can start typing the next line of code right away.
However, some such editors do not remove the whitespace if you end up not
putting a line of code there, such as if you leave a blank line. As a result,
you end up with lines containing trailing whitespace.

Git will warn you about patches that introduce trailing whitespace, and can
optionally strip the trailing whitespace for you; however, if applying a series
of patches, this may make later patches in the series fail by changing their
context lines.


		Chapter 4: Naming

C is a Spartan language, and so should your naming be. Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter. A C programmer would call that
variable "tmp", which is much easier to write, and not the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must. To call a global function "foo" is a
shooting offense.

GLOBAL variables (to be used only if you _really_ need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar, you should _not_ call it "cntusr()".
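
As a rough sketch of these naming rules, the function below carries a
descriptive global name while its loop counter is simply "i". The
"struct user" type and its fields are hypothetical, invented only for this
illustration, and the code is ordinary userspace C rather than kernel code.

```c
#include <stddef.h>

/* Hypothetical record type, invented for this example. */
struct user {
	const char *name;
	int active;
};

/*
 * Global function: the name says what it does.
 * Calling it cntusr() would save a few keystrokes and cost clarity.
 */
size_t count_active_users(const struct user *users, size_t nr_users)
{
	size_t count = 0;
	size_t i;	/* a plain loop counter needs no longer name */

	for (i = 0; i < nr_users; i++) {
		if (users[i].active)
			count++;
	}

	return count;
}
```

A caller would pass an array of "struct user" and its length; the short
local names stay readable because the function itself is short.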

Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged - the compiler knows the types anyway and can
check those, and it only confuses the programmer. No wonder MicroSoft
makes buggy programs.

LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being mis-understood. Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See chapter 6 (Functions).


		Chapter 5: Typedefs

Please don't use things like "vps_t".

It's a _mistake_ to use typedef for structures and pointers. When you see a

	vps_t a;

in the source, what does it mean?

In contrast, if it says

	struct virtual_container *a;

you can actually tell what "a" is.

Lots of people think that typedefs "help readability". Not so. They are
useful only for:

 (a) totally opaque objects (where the typedef is actively used to _hide_
     what the object is).

     Example: "pte_t" etc. opaque objects that you can only access using
     the proper accessor functions.

     NOTE! Opaqueness and "accessor functions" are not good in themselves.
     The reason we have them for things like pte_t etc. is that there
     really is absolutely _zero_ portably accessible information there.

 (b) Clear integer types, where the abstraction _helps_ avoid confusion
     whether it is "int" or "long".

     u8/u16/u32 are perfectly fine typedefs, although they fit into
     category (d) better than here.

     NOTE! Again - there needs to be a _reason_ for this. If something is
     "unsigned long", then there's no reason to do

	typedef unsigned long myflags_t;

     but if there is a clear reason for why it under certain circumstances
     might be an "unsigned int" and under other configurations might be
     "unsigned long", then by all means go ahead and use a typedef.

 (c) when you use sparse to literally create a _new_ type for
     type-checking.

 (d) New types which are identical to standard C99 types, in certain
     exceptional circumstances.

     Although it would only take a short amount of time for the eyes and
     brain to become accustomed to the standard types like 'uint32_t',
     some people object to their use anyway.

     Therefore, the Linux-specific 'u8/u16/u32/u64' types and their
     signed equivalents which are identical to standard types are
     permitted -- although they are not mandatory in new code of your
     own.

     When editing existing code which already uses one or the other set
     of types, you should conform to the existing choices in that code.

 (e) Types safe for use in userspace.

     In certain structures which are visible to userspace, we cannot
     require C99 types and cannot use the 'u32' form above. Thus, we
     use __u32 and similar types in all structures which are shared
     with userspace.

Maybe there are other cases too, but the rule should basically be to NEVER
EVER use a typedef unless you can clearly match one of those rules.

In general, a pointer, or a struct that has elements that can reasonably
be directly accessed should _never_ be a typedef.
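
To make the contrast concrete, here is a small userspace sketch. It keeps
"struct virtual_container" as a plain struct and uses a typedef only for a
fixed-width integer, in the spirit of rule (d). The struct members and the
container_bytes() helper are hypothetical, and "u32" is redefined locally
only so the sketch compiles outside the kernel.

```c
#include <stdint.h>

/*
 * Rule (d)-style typedef: a short name for a standard fixed-width type.
 * The kernel already provides u32; it is redefined here only so the
 * sketch compiles in userspace.
 */
typedef uint32_t u32;

/* A struct whose members are accessed directly stays a plain struct. */
struct virtual_container {
	u32 id;
	u32 nr_pages;
};

/* The "struct" keyword at the use site tells the reader what "vc" is. */
u32 container_bytes(const struct virtual_container *vc, u32 page_size)
{
	return vc->nr_pages * page_size;
}
```

Nothing is hidden from the reader here: the parameter is visibly a pointer
to a struct, and the only typedef stands for a plain integer.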


		Chapter 6: Functions

Functions should be short and sweet, and do just one thing. They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function. So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely. Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).

Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused. You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.

In source files, separate functions with one blank line. If the function is
exported, the EXPORT* macro for it should follow immediately after the closing
function brace line. E.g.:

	int system_is_up(void)
	{
		return system_state == SYSTEM_RUNNING;
	}
	EXPORT_SYMBOL(system_is_up);

In function prototypes, include parameter names with their data types.
Although this is not required by the C language, it is preferred in Linux
because it is a simple way to add valuable information for the reader.


		Chapter 7: Centralized exiting of functions

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in the form of the unconditional jump instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done.

The rationale is:

- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
    modifications are prevented
- saves the compiler work to optimize redundant code away ;)

	int fun(int a)
	{
		int result = 0;
		char *buffer = kmalloc(SIZE, GFP_KERNEL);

		if (buffer == NULL)
			return -ENOMEM;

		if (condition1) {
			while (loop1) {
				...
			}
			result = 1;
			goto out;
		}
		...
	out:
		kfree(buffer);
		return result;
	}

		Chapter 8: Commenting

Comments are good, but there is also a danger of over-commenting. NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the _working_ is obvious, and it's a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 6 for a while. You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess.
Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.

When commenting the kernel API functions, please use the kernel-doc format.
See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc
for details.

Linux style for comments is the C89 "/* ... */" style.
Don't use C99-style "// ..." comments.

The preferred style for long (multi-line) comments is:

	/*
	 * This is the preferred style for multi-line
	 * comments in the Linux kernel source code.
	 * Please use it consistently.
	 *
	 * Description: A column of asterisks on the left side,
	 * with beginning and ending almost-blank lines.
	 */

For files in net/ and drivers/net/ the preferred style for long (multi-line)
comments is a little different.

	/* The preferred comment style for files in net/ and drivers/net
	 * looks like this.
	 *
	 * It is nearly the same as the generally preferred comment style,
	 * but there is no initial almost-blank line.
	 */

It's also important to comment data, whether they are basic types or derived
types. To this end, use just one data declaration per line (no commas for
multiple data declarations). This leaves you room for a small comment on each
item, explaining its use.


		Chapter 9: You've made a mess of it

That's OK, we all do. You've probably been told by your long-time Unix
user helper that "GNU emacs" automatically formats the C sources for
you, and you've noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing - an infinite number of monkeys typing into GNU emacs would never
make a good program).

So, you can either get rid of GNU emacs, or change it to use saner
values. To do the latter, you can stick the following in your .emacs file:

(defun c-lineup-arglist-tabs-only (ignored)
  "Line up argument lists by tabs, not spaces"
  (let* ((anchor (c-langelem-pos c-syntactic-element))
         (column (c-langelem-2nd-pos c-syntactic-element))
         (offset (- (1+ column) anchor))
         (steps (floor offset c-basic-offset)))
    (* (max steps 1)
       c-basic-offset)))

(add-hook 'c-mode-common-hook
          (lambda ()
            ;; Add kernel style
            (c-add-style
             "linux-tabs-only"
             '("linux" (c-offsets-alist
                        (arglist-cont-nonempty
                         c-lineup-gcc-asm-reg
                         c-lineup-arglist-tabs-only))))))

(add-hook 'c-mode-hook
          (lambda ()
            (let ((filename (buffer-file-name)))
              ;; Enable kernel mode for the appropriate files
              (when (and filename
                         (string-match (expand-file-name "~/src/linux-trees")
                                       filename))
                (setq indent-tabs-mode t)
                (c-set-style "linux-tabs-only")))))

This will make emacs go better with the kernel coding style for C
files below ~/src/linux-trees.

But even if you fail in getting emacs to do sane formatting, not
everything is lost: use "indent".

Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren't evil, they are
just severely misguided in this matter), so you just give indent the
options "-kr -i8" (stands for "K&R, 8 character indents"), or use
"scripts/Lindent", which indents in the latest style.

"indent" has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page. But
remember: "indent" is not a fix for bad programming.


		Chapter 10: Kconfig configuration files

For all of the Kconfig* configuration files throughout the source tree,
the indentation is somewhat different. Lines under a "config" definition
are indented with one tab, while help text is indented an additional two
spaces. Example:

config AUDIT
	bool "Auditing support"
	depends on NET
	help
	  Enable auditing infrastructure that can be used with another
	  kernel subsystem, such as SELinux (which requires this for
	  logging of avc messages output). Does not do system-call
	  auditing without CONFIG_AUDITSYSCALL.

Seriously dangerous features (such as write support for certain
filesystems) should advertise this prominently in their prompt string:

config ADFS_FS_RW
	bool "ADFS write support (DANGEROUS)"
	depends on ADFS_FS
	...

For full documentation on the configuration files, see the file
Documentation/kbuild/kconfig-language.txt.


		Chapter 11: Data structures

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts. In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely _have_ to reference count all your uses.
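
A minimal userspace sketch of the idea, assuming C11 atomics are available.
The "struct session" type and the session_create(), session_get() and
session_put() helpers are hypothetical names chosen for illustration; kernel
code would use the kernel's own reference counting helpers instead of
rolling its own.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object shared between several users. */
struct session {
	atomic_int refcount;
	int id;
};

/* Allocate a session; the caller holds the first reference. */
struct session *session_create(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (!s)
		return NULL;

	atomic_init(&s->refcount, 1);
	s->id = id;
	return s;
}

/* Take an extra reference before handing the pointer to another user. */
struct session *session_get(struct session *s)
{
	atomic_fetch_add(&s->refcount, 1);
	return s;
}

/*
 * Drop a reference. Returns 1 if this was the last one and the
 * object was freed, 0 otherwise.
 */
int session_put(struct session *s)
{
	if (atomic_fetch_sub(&s->refcount, 1) == 1) {
		free(s);
		return 1;
	}
	return 0;
}
```

Every user that keeps the pointer around calls session_get() first and
session_put() when done, so the object is freed exactly once, by whoever
drops the last reference.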

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is _not_ a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique. Usually both are needed, and
they are not to be confused with each other.

Many data structures can indeed have two levels of reference counting,
when there are users of different "classes". The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of "multi-level-reference-counting" can be found in
memory management ("struct mm_struct": mm_users and mm_count), and in
filesystem code ("struct super_block": s_count and s_active).

Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.


		Chapter 12: Macros, Enums and RTL

Names of macros defining constants and labels in enums are capitalized.

#define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.
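
One reason for that preference, sketched below: a function-like macro pastes
its argument in textually, so an argument with side effects may be evaluated
more than once, while an inline function evaluates it exactly once. The
names SQUARE_MACRO, square(), next_value() and nr_calls are hypothetical and
exist only for this illustration.

```c
/* Function-like macro: each use of x is pasted in textually. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline function: the argument is evaluated exactly once. */
static inline int square(int x)
{
	return x * x;
}

/* Hypothetical helper with a visible side effect: counts its calls. */
static int nr_calls;

static int next_value(void)
{
	return ++nr_calls;
}
```

Starting from nr_calls == 0, SQUARE_MACRO(next_value()) calls next_value()
twice and multiplies two different values, whereas square(next_value())
calls it once and squares the single result.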

Macros with multiple statements should be enclosed in a do - while block:

#define macrofun(a, b, c)	\
	do {	\
		if (a == 5)	\
			do_this(b, c);	\
	} while (0)

Things to avoid when using macros:

1) macros that affect control flow:

#define FOO(x)	\
	do {	\
		if (blah(x) < 0)	\
			return -EBUGGERED;	\
	} while (0)

is a _very_ bad idea. It looks like a function call but exits the "calling"
function; don't break the internal parsers of those who will read the code.

2) macros that depend on having a local variable with a magic name:

#define FOO(val) bar(index, val)

might look like a good thing, but it's confusing as hell when one reads the
code and it's prone to breakage from seemingly innocent changes.

3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.

4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.

#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)

The cpp manual deals with macros exhaustively. The gcc internals manual also
covers RTL which is used frequently with assembly language in the kernel.


		Chapter 13: Printing kernel messages

Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use crippled
words like "dont"; use "do not" or "don't" instead. Make the messages
concise, clear, and unambiguous.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.

There are a number of driver model diagnostic macros in <linux/device.h>
which you should use to make sure messages are matched to the right device
and driver, and are tagged with the right level: dev_err(), dev_warn(),
dev_info(), and so forth. For messages that aren't associated with a
particular device, <linux/printk.h> defines pr_debug() and pr_info().

Coming up with good debugging messages can be quite a challenge; and once
you have them, they can be a huge help for remote troubleshooting. Such
messages should be compiled out when the DEBUG symbol is not defined (that
is, by default they are not included). When you use dev_dbg() or pr_debug(),
that's automatic. Many subsystems have Kconfig options to turn on -DDEBUG.
A related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to the
ones already enabled by DEBUG.


		Chapter 14: Allocating memory

The kernel provides the following general purpose memory allocators:
kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and
vzalloc(). Please refer to the API documentation for further information
about them.

The preferred form for passing a size of a struct is the following:

	p = kmalloc(sizeof(*p), ...);

The alternative form where struct name is spelled out hurts readability and
introduces an opportunity for a bug when the pointer variable type is changed
but the corresponding sizeof that is passed to a memory allocator is not.

Casting the return value which is a void pointer is redundant. The conversion
from void pointer to any other pointer type is guaranteed by the C programming
language.
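
The same idiom can be tried out in userspace. The sketch below assumes
malloc() as a stand-in for kmalloc(), which is not available outside the
kernel; the members of "struct fileinfo" and the fileinfo_alloc() helper are
hypothetical.

```c
#include <stdlib.h>

/* Hypothetical structure; only its size matters here. */
struct fileinfo {
	long size;
	int mode;
};

/*
 * Userspace stand-in for the kernel idiom: malloc() plays the role of
 * kmalloc(), which is not available outside the kernel.
 */
struct fileinfo *fileinfo_alloc(void)
{
	struct fileinfo *p;

	/*
	 * sizeof(*p) follows the type of p automatically, and the void
	 * pointer returned by malloc() needs no cast.
	 */
	p = malloc(sizeof(*p));
	if (!p)
		return NULL;

	p->size = 0;
	p->mode = 0;
	return p;
}
```

If the type of p is later changed, the allocation size changes with it, and
there is no spelled-out struct name left behind to fall out of date.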

The preferred form for allocating an array is the following:

	p = kmalloc_array(n, sizeof(...), ...);

The preferred form for allocating a zeroed array is the following:

	p = kcalloc(n, sizeof(...), ...);

Both forms check for overflow on the allocation size n * sizeof(...),
and return NULL if that occurred.


		Chapter 15: The inline disease

There appears to be a common misperception that gcc has a magic "make me
faster" speedup option called "inline". While the use of inlines can be
appropriate (for example as a means of replacing macros, see Chapter 12), it
very often is not. Abundant use of the inline keyword leads to a much bigger
kernel, which in turn slows the system as a whole down, due to a bigger
icache footprint for the CPU and simply because there is less memory
available for the pagecache. Just think about it; a pagecache miss causes a
disk seek, which easily takes 5 milliseconds. There are a LOT of CPU cycles
that can go into these 5 milliseconds.

A reasonable rule of thumb is to not put inline at functions that have more
than 3 lines of code in them. An exception to this rule are the cases where
a parameter is known to be a compile-time constant, and as a result of this
constantness you *know* the compiler will be able to optimize most of your
function away at compile time. For a good example of this latter case, see
the kmalloc() inline function.

Often people argue that adding inline to functions that are static and used
only once is always a win since there is no space tradeoff.
While this is
technically correct, gcc is capable of inlining these automatically without
help, and the maintenance issue of removing the inline when a second user
appears outweighs the potential value of the hint that tells gcc to do
something it would have done anyway.


		Chapter 16: Function return values and names

Functions can return values of many different kinds, and one of the
most common is a value indicating whether the function succeeded or
failed. Such a value can be represented as an error-code integer
(-Exxx = failure, 0 = success) or a "succeeded" boolean (0 = failure,
non-zero = success).

Mixing up these two sorts of representations is a fertile source of
difficult-to-find bugs. If the C language included a strong distinction
between integers and booleans then the compiler would find these mistakes
for us... but it doesn't. To help prevent such bugs, always follow this
convention:

	If the name of a function is an action or an imperative command,
	the function should return an error-code integer. If the name
	is a predicate, the function should return a "succeeded" boolean.

For example, "add work" is a command, and the add_work() function returns 0
for success or -EBUSY for failure. In the same way, "PCI device present" is
a predicate, and the pci_dev_present() function returns 1 if it succeeds in
finding a matching device or 0 if it doesn't.

All EXPORTed functions must respect this convention, and so should all
public functions. Private (static) functions need not, but it is
recommended that they do.

Functions whose return value is the actual result of a computation, rather
than an indication of whether the computation succeeded, are not subject to
this rule.
Generally they indicate failure by returning some out-of-range
result. Typical examples would be functions that return pointers; they use
NULL or the ERR_PTR mechanism to report failure.


		Chapter 17: Don't re-invent the kernel macros

The header file include/linux/kernel.h contains a number of macros that
you should use, rather than explicitly coding some variant of them yourself.
For example, if you need to calculate the length of an array, take advantage
of the macro

	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

Similarly, if you need to calculate the size of some structure member, use

	#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))

There are also min() and max() macros that do strict type checking if you
need them. Feel free to peruse that header file to see what else is already
defined that you shouldn't reproduce in your code.


		Chapter 18: Editor modelines and other cruft

Some editors can interpret configuration information embedded in source files,
indicated with special markers. For example, emacs interprets lines marked
like this:

	-*- mode: c -*-

Or like this:

	/*
	Local Variables:
	compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
	End:
	*/

Vim interprets markers that look like this:

	/* vim:set sw=8 noet */

Do not include any of these in source files. People have their own personal
editor configurations, and your source files should not override them. This
includes markers for indentation and mode configuration. People may use their
own custom mode, or may have some other magic method for making indentation
work correctly.
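
Coming back to the Chapter 17 macros for a moment: the following is a minimal
userspace sketch of how ARRAY_SIZE() and FIELD_SIZEOF() might be used. The two
macros are copied from the text above only so that the fragment compiles
outside the kernel; in kernel code you would get them from the kernel headers.
The struct unit type, the units[] table and the unit_shift() helper are all
hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace copies of the two macros quoted in Chapter 17. This is an
 * assumption made for the sketch; kernel code should not redefine them.
 */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))

/* Hypothetical record type, used only for this illustration. */
struct unit {
	char suffix;
	unsigned int shift;
};

/*
 * Hypothetical lookup table. ARRAY_SIZE() only gives the right answer
 * for a true array like this one, not for a pointer to its first element.
 */
static const struct unit units[] = {
	{ 'k', 10 },
	{ 'm', 20 },
	{ 'g', 30 },
};

/* FIELD_SIZEOF() needs no object of the type, so it works at compile time. */
static_assert(FIELD_SIZEOF(struct unit, shift) == sizeof(unsigned int),
	      "shift member has the size of an unsigned int");

/*
 * Return the shift for a suffix. The result is a computed value rather
 * than a success indication, so failure is reported with the
 * out-of-range result 0.
 */
unsigned int unit_shift(char suffix)
{
	size_t i;

	/* The loop bound follows the table if entries are added or removed. */
	for (i = 0; i < ARRAY_SIZE(units); i++) {
		if (units[i].suffix == suffix)
			return units[i].shift;
	}
	return 0;
}
```

Because the loop bound is derived from the array itself, adding a fourth entry
to units[] needs no other change, which is the point of using the macro
instead of a hand-written constant.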


		Chapter 19: Inline assembly

In architecture-specific code, you may need to use inline assembly to interface
with CPU or platform functionality. Don't hesitate to do so when necessary.
However, don't use inline assembly gratuitously when C can do the job. You can
and should poke hardware from C when possible.

Consider writing simple helper functions that wrap common bits of inline
assembly, rather than repeatedly writing them with slight variations. Remember
that inline assembly can use C parameters.

Large, non-trivial assembly functions should go in .S files, with corresponding
C prototypes defined in C header files. The C prototypes for assembly
functions should use "asmlinkage".

You may need to mark your asm statement as volatile, to prevent GCC from
removing it if GCC doesn't notice any side effects. You don't always need to
do so, though, and doing so unnecessarily can limit optimization.

When writing a single inline assembly statement containing multiple
instructions, put each instruction on a separate line in a separate quoted
string, and end each string except the last with \n\t to properly indent the
next instruction in the assembly output:

	asm ("magic %reg1, #42\n\t"
	     "more_magic %reg2, %reg3"
	     : /* outputs */ : /* inputs */ : /* clobbers */);



		Appendix I: References

The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
URL: http://cm.bell-labs.com/cm/cs/cbook/

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.
URL: http://cm.bell-labs.com/cm/cs/tpop/

GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from http://www.gnu.org/manual/

WG14 is the international standardization working group for the programming
language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Kernel CodingStyle, by greg@kroah.com at OLS 2002:
http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/