.. SPDX-License-Identifier: GPL-2.0

===================================
Backporting and conflict resolution
===================================

:Author: Vegard Nossum <vegard.nossum@oracle.com>

.. contents::
    :local:
    :depth: 3
    :backlinks: none

Introduction
============

Some developers may never really have to deal with backporting patches,
merging branches, or resolving conflicts in their day-to-day work, so
when a merge conflict does pop up, it can be daunting. Luckily,
resolving conflicts is a skill like any other, and there are many useful
techniques you can use to make the process smoother and increase your
confidence in the result.

This document aims to be a comprehensive, step-by-step guide to
backporting and conflict resolution.

Applying the patch to a tree
============================

Sometimes the patch you are backporting already exists as a git commit,
in which case you just cherry-pick it directly using
``git cherry-pick``. However, if the patch comes from an email, as it
often does for the Linux kernel, you will need to apply it to a tree
using ``git am``.

If you've ever used ``git am``, you probably already know that it is
quite picky about the patch applying perfectly to your source tree. In
fact, you've probably had nightmares about ``.rej`` files and trying to
edit the patch to make it apply.

It is strongly recommended to instead find an appropriate base version
where the patch applies cleanly and *then* cherry-pick it over to your
destination tree, as this will make git output conflict markers and let
you resolve conflicts with the help of git and any other conflict
resolution tools you might prefer to use. For example, if you want to
apply a patch that just arrived on LKML to an older stable kernel, you
can apply it to the most recent mainline kernel and then cherry-pick it
to your older stable branch.

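
As a rough sketch of that workflow, the commands below rehearse it in a
throwaway repository. Everything named here is made up for illustration:
the branches ``mainline``, ``stable`` and ``tmp-base``, the file
``demo.c``, and the patch itself. In a real kernel tree you would start
from your actual mainline and stable branches and the mbox you saved
from the list.

```shell
#!/bin/sh
# Illustrative sandbox only; all names below are hypothetical.
set -e
repo=$(mktemp -d)
patches=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'int a;\nint b;\nint c;\n' > demo.c
git add demo.c
git commit -q -m "initial version"
git branch stable                 # the older branch forks off here

# Stand-in for a patch received by email: commit a fix, export it with
# format-patch, then drop it from the branch again.
printf 'int a;\nlong b;\nint c;\n' > demo.c
git commit -q -a -m "demo: widen b"
git format-patch -q -1 -o "$patches" HEAD
git reset -q --hard HEAD~1

# Step 1: apply the email patch on a base where it applies cleanly.
git checkout -q -b tmp-base mainline
git am -q "$patches"/0001-*.patch

# Step 2: cherry-pick the resulting commit onto the older branch.
# -x appends a "(cherry picked from commit ...)" line to the changelog.
git checkout -q stable
git cherry-pick -x tmp-base >/dev/null
git log -1 --format=%B
```

In this toy case the cherry-pick is clean because ``stable`` has not
diverged; the point is only the order of operations (``git am`` on a
matching base first, ``git cherry-pick`` second).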

It's generally better to use the exact same base as the one the patch
was generated from, but it doesn't really matter that much as long as it
applies cleanly and isn't too far from the original base. The only
problem with applying the patch to the "wrong" base is that it may pull
in more unrelated changes in the context of the diff when cherry-picking
it to the older branch.

A good reason to prefer ``git cherry-pick`` over ``git am`` is that git
knows the precise history of an existing commit, so it will know when
code has moved around and changed the line numbers; this in turn makes
it less likely to apply the patch to the wrong place (which can result
in silent mistakes or messy conflicts).

If you are using `b4`_ and you are applying the patch directly from an
email, you can use ``b4 am`` with the options ``-g``/``--guess-base``
and ``-3``/``--prep-3way`` to do some of this automatically (see the
`b4 presentation`_ for more information). However, the rest of this
article will assume that you are doing a plain ``git cherry-pick``.

.. _b4: https://people.kernel.org/monsieuricon/introducing-b4-and-patch-attestation
.. _b4 presentation: https://youtu.be/mF10hgVIx9o?t=2996

Once you have the patch in git, you can go ahead and cherry-pick it into
your source tree. Don't forget to cherry-pick with ``-x`` if you want a
written record of where the patch came from!

Note that if you are submitting a patch for stable, the format is
slightly different; the first line after the subject line needs to be
either::

    commit <upstream commit> upstream

or::

    [ Upstream commit <upstream commit> ]

Resolving conflicts
===================

Uh-oh; the cherry-pick failed with a vaguely threatening message::

    CONFLICT (content): Merge conflict

What to do now?


In general, conflicts appear when the context of the patch (i.e., the
lines being changed and/or the lines surrounding the changes) doesn't
match what's in the tree you are trying to apply the patch *to*.

For backports, what likely happened was that the branch you are
backporting from contains patches not in the branch you are backporting
to. However, the reverse is also possible. In any case, the result is a
conflict that needs to be resolved.

If your attempted cherry-pick fails with a conflict, git automatically
edits the files to include so-called conflict markers showing you where
the conflict is and how the two branches have diverged. Resolving the
conflict typically means editing the end result in such a way that it
takes into account these other commits.

Resolving the conflict can be done either by hand in a regular text
editor or using a dedicated conflict resolution tool.

Many people prefer to use their regular text editor and edit the
conflict directly, as it may be easier to understand what you're doing
and to control the final result. There are definitely pros and cons to
each method, and sometimes there's value in using both.


We will not cover using dedicated merge tools here beyond providing some
pointers to various tools that you could use:

-  `Emacs Ediff mode <https://www.emacswiki.org/emacs/EdiffMode>`__
-  `vimdiff/gvimdiff <https://linux.die.net/man/1/vimdiff>`__
-  `KDiff3 <http://kdiff3.sourceforge.net/>`__
-  `TortoiseMerge <https://tortoisesvn.net/TortoiseMerge.html>`__
-  `Meld <https://meldmerge.org/help/>`__
-  `P4Merge <https://www.perforce.com/products/helix-core-apps/merge-diff-tool-p4merge>`__
-  `Beyond Compare <https://www.scootersoftware.com/>`__
-  `IntelliJ <https://www.jetbrains.com/help/idea/resolve-conflicts.html>`__
-  `VSCode <https://code.visualstudio.com/docs/editor/versioncontrol>`__

To configure git to work with these, see ``git mergetool --help`` or
the official `git-mergetool documentation`_.

.. _git-mergetool documentation: https://git-scm.com/docs/git-mergetool

Prerequisite patches
--------------------

Most conflicts happen because the branch you are backporting to is
missing some patches compared to the branch you are backporting *from*.
In the more general case (such as merging two independent branches),
development could have happened on either branch, or the branches have
simply diverged -- perhaps your older branch had some other backports
applied to it that themselves needed conflict resolutions, causing a
divergence.

It's important to always identify the commit or commits that caused the
conflict, as otherwise you cannot be confident in the correctness of
your resolution. As an added bonus, especially if the patch is in an
area you're not that familiar with, the changelogs of these commits will
often give you the context to understand the code and potential problems
or pitfalls with your conflict resolution.


git log
~~~~~~~

A good first step is to look at ``git log`` for the file that has the
conflict -- this is usually sufficient when there aren't a lot of
patches to the file, but may get confusing if the file is big and
frequently patched. You should run ``git log`` on the range of commits
between your currently checked-out branch (``HEAD``) and the parent of
the patch you are picking (``<commit>``), i.e.::

    git log HEAD..<commit>^ -- <path>

Even better, if you want to restrict this output to a single function
(because that's where the conflict appears), you can use the following
syntax::

    git log -L:'\<function\>':<path> HEAD..<commit>^

.. note::
     The ``\<`` and ``\>`` around the function name ensure that the
     matches are anchored on a word boundary. This is important, as this
     part is actually a regex and git only follows the first match, so
     if you use ``-L:thread_stack:kernel/fork.c`` it may only give you
     results for the function ``try_release_thread_stack_to_cache`` even
     though there are many other functions in that file containing the
     string ``thread_stack`` in their names.

Another useful option for ``git log`` is ``-G``, which allows you to
filter on certain strings appearing in the diffs of the commits you are
listing::

    git log -G'regex' HEAD..<commit>^ -- <path>

This can also be a handy way to quickly find when something (e.g. a
function call or a variable) was changed, added, or removed. The search
string is a regular expression, which means you can potentially search
for more specific things like assignments to a specific struct member::

    git log -G'\->index\>.*='

git blame
~~~~~~~~~

Another way to find prerequisite commits (albeit only the most recent
one for a given conflict) is to run ``git blame``.
In this case, you
need to run it against the parent commit of the patch you are
cherry-picking and the file where the conflict appeared, i.e.::

    git blame <commit>^ -- <path>

This command also accepts the ``-L`` argument (for restricting the
output to a single function), but in this case you specify the filename
at the end of the command as usual::

    git blame -L:'\<function\>' <commit>^ -- <path>

Navigate to the place where the conflict occurred. The first column of
the blame output is the commit ID of the patch that added a given line
of code.

It might be a good idea to ``git show`` these commits and see if they
look like they might be the source of the conflict. Sometimes there will
be more than one of these commits, either because multiple commits
changed different lines of the same conflict area *or* because multiple
subsequent patches changed the same line (or lines) multiple times. In
the latter case, you may have to run ``git blame`` again and specify the
older version of the file to look at in order to dig further back in
the history of the file.

Prerequisite vs. incidental patches
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Having found the patch that caused the conflict, you need to determine
whether it is a prerequisite for the patch you are backporting or
whether it is just incidental and can be skipped. An incidental patch
would be one that touches the same code as the patch you are
backporting, but does not change the semantics of the code in any
material way. For example, a whitespace cleanup patch is completely
incidental -- likewise, a patch that simply renames a function or a
variable would be incidental as well.
On the other hand, if the function
being changed does not even exist in your current branch then this would
not be incidental at all and you need to carefully consider whether the
patch adding the function should be cherry-picked first.

If you find that there is a necessary prerequisite patch, then you need
to stop and cherry-pick that instead. If you've already resolved some
conflicts in a different file and don't want to do it again, you can
create a temporary copy of that file.

To abort the current cherry-pick, go ahead and run
``git cherry-pick --abort``, then restart the cherry-picking process
with the commit ID of the prerequisite patch instead.

Understanding conflict markers
------------------------------

Combined diffs
~~~~~~~~~~~~~~

Let's say you've decided against picking (or reverting) additional
patches and you just want to resolve the conflict. Git will have
inserted conflict markers into your file. Out of the box, this will look
something like::

    <<<<<<< HEAD
    this is what's in your current tree before cherry-picking
    =======
    this is what the patch wants it to be after cherry-picking
    >>>>>>> <commit>... title

This is what you would see if you opened the file in your editor.
However, if you were to run ``git diff`` without any arguments, the
output would look something like this::

    $ git diff
    [...]
    ++<<<<<<< HEAD
     +this is what's in your current tree before cherry-picking
    ++=======
    + this is what the patch wants it to be after cherry-picking
    ++>>>>>>> <commit>... title

When you are resolving a conflict, the behavior of ``git diff`` differs
from its normal behavior. Notice the two columns of diff markers
instead of the usual one; this is a so-called "`combined diff`_", here
showing the 3-way diff (or diff-of-diffs) between

#. the current branch (before cherry-picking) and the current working
   directory, and
#. the current branch (before cherry-picking) and the file as it looks
   after the original patch has been applied.

.. _combined diff: https://git-scm.com/docs/diff-format#_combined_diff_format


Better diffs
~~~~~~~~~~~~

3-way combined diffs include all the other changes that happened to the
file between your current branch and the branch you are cherry-picking
from. While this is useful for spotting other changes that you need to
take into account, this also makes the output of ``git diff`` somewhat
intimidating and difficult to read. You may instead prefer to run
``git diff HEAD`` (or ``git diff --ours``) which shows only the diff
between the current branch before cherry-picking and the current working
directory. It looks like this::

    $ git diff HEAD
    [...]
    +<<<<<<< HEAD
     this is what's in your current tree before cherry-picking
    +=======
    +this is what the patch wants it to be after cherry-picking
    +>>>>>>> <commit>... title

As you can see, this reads just like any other diff and makes it clear
which lines are in the current branch and which lines are being added
because they are part of the merge conflict or the patch being
cherry-picked.

Merge styles and diff3
~~~~~~~~~~~~~~~~~~~~~~

The default conflict marker style shown above is known as the ``merge``
style. There is also another style available, known as the ``diff3``
style, which looks like this::

    <<<<<<< HEAD
    this is what is in your current tree before cherry-picking
    ||||||| parent of <commit> (title)
    this is what the patch expected to find there
    =======
    this is what the patch wants it to be after being applied
    >>>>>>> <commit> (title)

As you can see, this has 3 parts instead of 2, and includes what git
expected to find there but didn't.
It is *highly recommended* to use
this conflict style as it makes it much clearer what the patch actually
changed; i.e., it allows you to compare the before-and-after versions
of the file for the commit you are cherry-picking. This allows you to
make better decisions about how to resolve the conflict.

To change conflict marker styles, you can use the following command::

    git config merge.conflictStyle diff3

There is a third option, ``zdiff3``, introduced in `Git 2.35`_,
which has the same 3 sections as ``diff3``, but where common lines have
been trimmed off, making the conflict area smaller in some cases.

.. _Git 2.35: https://github.blog/2022-01-24-highlights-from-git-2-35/

Iterating on conflict resolutions
---------------------------------

The first step in any conflict resolution process is to understand the
patch you are backporting. For the Linux kernel this is especially
important, since an incorrect change can lead to the whole system
crashing -- or worse, an undetected security vulnerability.

Understanding the patch can be easy or difficult depending on the patch
itself, the changelog, and your familiarity with the code being changed.
However, a good question for every change (or every hunk of the patch)
might be: "Why is this hunk in the patch?" The answers to these
questions will inform your conflict resolution.

Resolution process
~~~~~~~~~~~~~~~~~~

Sometimes the easiest thing to do is to just remove all but the first
part of the conflict, leaving the file essentially unchanged, and apply
the changes by hand.
Perhaps the patch is changing a function call
argument from ``0`` to ``1`` while a conflicting change added an
entirely new (and insignificant) parameter to the end of the parameter
list; in that case, it's easy enough to change the argument from ``0``
to ``1`` by hand and leave the rest of the arguments alone. This
technique of manually applying changes is mostly useful if the conflict
pulled in a lot of unrelated context that you don't really need to care
about.

For particularly nasty conflicts with many conflict markers, you can use
``git add`` or ``git add -i`` to selectively stage your resolutions to
get them out of the way; this also lets you use ``git diff HEAD`` to
always see what remains to be resolved or ``git diff --cached`` to see
what your patch looks like so far.

Dealing with file renames
~~~~~~~~~~~~~~~~~~~~~~~~~

One of the most annoying things that can happen while backporting a
patch is discovering that one of the files being patched has been
renamed, as that typically means git won't even put in conflict markers,
but will just throw up its hands and say (paraphrased): "Unmerged path!
You do the work..."

There are generally a few ways to deal with this. If the patch to the
renamed file is small, like a one-line change, the easiest thing is to
just go ahead and apply the change by hand and be done with it. On the
other hand, if the change is big or complicated, you definitely don't
want to do it by hand.


As a first pass, you can try something like this, which will lower the
rename detection threshold to 30% (by default, git uses 50%, meaning
that two files need to have at least 50% in common for it to consider
an add-delete pair to be a potential rename)::

    git cherry-pick --strategy=recursive -Xrename-threshold=30 <commit>

Sometimes the right thing to do will be to also backport the patch that
did the rename, but that's definitely not the most common case. Instead,
what you can do is to temporarily rename the file in the branch you're
backporting to (using ``git mv`` and committing the result), restart the
attempt to cherry-pick the patch, rename the file back (``git mv`` and
committing again), and finally squash the result using ``git rebase -i``
(see the `rebase tutorial`_) so it appears as a single commit when you
are done.

.. _rebase tutorial: https://medium.com/@slamflipstrom/a-beginners-guide-to-squashing-commits-with-git-rebase-8185cf6e62ec

Gotchas
-------

Function arguments
~~~~~~~~~~~~~~~~~~

Pay attention to changing function arguments! It's easy to gloss over
details and think that two lines are the same when they actually differ
in some small detail like which variable was passed as an argument
(especially if the two variables are both single characters that look
alike, like i and j).

Error handling
~~~~~~~~~~~~~~

If you cherry-pick a patch that includes a ``goto`` statement (typically
for error handling), it is absolutely imperative to double check that
the target label is still correct in the branch you are backporting to.
The same goes for added ``return``, ``break``, and ``continue``
statements.

Error handling is typically located at the bottom of the function, so it
may not be part of the conflict even though it could have been changed
by other patches.


A good way to ensure that you review the error paths is to always use
``git diff -W`` and ``git show -W`` (AKA ``--function-context``) when
inspecting your changes. For C code, this will show you the whole
function that's being changed in a patch. One of the things that often
go wrong during backports is that something else in the function changed
on either of the branches that you're backporting from or to. By
including the whole function in the diff you get more context and can
more easily spot problems that might otherwise go unnoticed.

Refactored code
~~~~~~~~~~~~~~~

Something that happens quite often is that code gets refactored by
"factoring out" a common code sequence or pattern into a helper
function. When backporting patches to an area where such a refactoring
has taken place, you effectively need to do the reverse when
backporting: a patch to a single location may need to be applied to
multiple locations in the backported version. (One giveaway for this
scenario is that a function was renamed -- but that's not always the
case.)

To avoid incomplete backports, it's worth trying to figure out if the
patch fixes a bug that appears in more than one place. One way to do
this would be to use ``git grep``. (This is actually a good idea to do
in general, not just for backports.) If you do find that the same kind
of fix would apply to other places, it's also worth seeing if those
places exist upstream -- if they don't, it's likely the patch may need
to be adjusted. ``git log`` is your friend to figure out what happened
to these areas as ``git blame`` won't show you code that has been
removed.

If you do find other instances of the same pattern in the upstream tree
and you're not sure whether it's also a bug, it may be worth asking the
patch author. It's not uncommon to find new bugs during backporting!

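
To make the ``git grep`` step concrete, here is a small sketch run
against a throwaway repository. The helper name ``foo_alloc`` and the
files under ``drivers/`` are invented for the example; in practice you
would search for whatever call or pattern the patch is fixing, in both
your stable branch and the upstream tree.

```shell
#!/bin/sh
# Illustrative sandbox only; foo_alloc and the file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir drivers
# One caller is missing the NULL check, the other already has it.
printf 'p = foo_alloc(n);\nuse(p);\n' > drivers/a.c
printf 'q = foo_alloc(m);\nif (!q)\n\treturn -ENOMEM;\n' > drivers/b.c
git add drivers
git commit -q -m "two callers of foo_alloc"

# -n prints line numbers, -w matches whole words only, and the pathspec
# after "--" limits the search to C files.
hits=$(git grep -n -w -e foo_alloc -- '*.c')
echo "$hits"

# git grep also accepts a revision, so the same search can be repeated
# against another branch or tag without checking it out.
git grep -c -w -e foo_alloc HEAD -- '*.c'
```

Each hit is a place to check by hand: does the fix being backported
also apply there, and does that code exist upstream at all?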

Verifying the result
====================

colordiff
---------

Having committed a conflict-free new patch, you can now compare your
patch to the original patch. It is highly recommended that you use a
tool such as `colordiff`_ that can show two files side by side and color
them according to the changes between them::

    colordiff -yw -W 200 <(git diff -W <upstream commit>^-) <(git diff -W HEAD^-) | less -SR

.. _colordiff: https://www.colordiff.org/

Here, ``-y`` means to do a side-by-side comparison; ``-w`` ignores
whitespace, and ``-W 200`` sets the width of the output (as otherwise it
will use 130 by default, which is often a bit too little).

The ``rev^-`` syntax is a handy shorthand for ``rev^..rev``, essentially
giving you just the diff for that single commit; also see
the official `git rev-parse documentation`_.

.. _git rev-parse documentation: https://git-scm.com/docs/git-rev-parse#_other_rev_parent_shorthand_notations

Again, note the inclusion of ``-W`` for ``git diff``; this ensures that
you will see the full function for any function that has changed.

One incredibly important thing that colordiff does is to highlight lines
that are different. For example, if an error-handling ``goto`` has
changed labels between the original and backported patch, colordiff will
show these side-by-side but highlighted in a different color. Thus, it
is easy to see that the two ``goto`` statements are jumping to different
labels. Likewise, lines that were not modified by either patch but
differ in the context will also be highlighted and thus stand out during
a manual inspection.

Of course, this is just a visual inspection; the real test is building
and running the patched kernel (or program).

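
If colordiff is not installed, or your shell lacks the ``<(...)``
process substitution used above, the same diff-of-diffs idea can be
approximated with plain ``diff -u`` and temporary files. The sketch
below is only an illustration in a throwaway repository: the branches
``upstream`` and ``backport``, the file ``f.c`` and its contents are all
made up, and the two commits deliberately differ in their ``goto``
label.

```shell
#!/bin/sh
# Illustrative sandbox only; branch names and f.c are hypothetical.
set -e
work=$(mktemp -d)
out=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/upstream
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'int f(void)\n{\n\tint err = setup();\n\n\treturn err;\n}\n' > f.c
git add f.c
git commit -q -m "base"
git branch backport

# The "upstream" version of the fix jumps to out_free.
printf 'int f(void)\n{\n\tint err = setup();\n\n\tif (err)\n\t\tgoto out_free;\n\treturn err;\n}\n' > f.c
git commit -q -a -m "fix: bail out on setup failure"

# The "backported" version jumps to a different label.
git checkout -q backport
printf 'int f(void)\n{\n\tint err = setup();\n\n\tif (err)\n\t\tgoto out;\n\treturn err;\n}\n' > f.c
git commit -q -a -m "fix: bail out on setup failure"

# Dump both single-commit diffs with full function context. The "index"
# lines always differ between the two commits, so drop them as noise.
git diff -W upstream^- | sed '/^index /d' > "$out/upstream.diff"
git diff -W backport^- | sed '/^index /d' > "$out/backport.diff"

# diff exits with status 1 when the inputs differ, hence the "|| true".
delta=$(diff -u "$out/upstream.diff" "$out/backport.diff" || true)
echo "$delta"
```

The output is less readable than the side-by-side colordiff view, but a
changed ``goto`` label still shows up as a pair of differing lines.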

Build testing
-------------

We won't cover runtime testing here, but it can be a good idea to build
just the files touched by the patch as a quick sanity check. For the
Linux kernel you can build single files like this, assuming you have the
``.config`` and build environment set up correctly::

    make path/to/file.o

Note that this won't discover linker errors, so you should still do a
full build after verifying that the single file compiles. By compiling
the single file first you can avoid having to wait for a full build *in
case* there are compiler errors in any of the files you've changed.

Runtime testing
---------------

Even a successful build or boot test is not necessarily enough to rule
out a missing dependency somewhere. Even though the chances are small,
there could be code changes where two independent changes to the same
file result in no conflicts, no compile-time errors, and runtime errors
only in exceptional cases.

One concrete example of this was a pair of patches to the system call
entry code where the first patch saved/restored a register and a later
patch made use of the same register somewhere in the middle of this
sequence. Since there was no overlap between the changes, one could
cherry-pick the second patch, have no conflicts, and believe that
everything was fine, when in fact the code was now scribbling over an
unsaved register.

Although the vast majority of errors will be caught during compilation
or by superficially exercising the code, the only way to *really* verify
a backport is to review the final patch with the same level of scrutiny
as you would (or should) give to any other patch. Having unit tests and
regression tests or other types of automatic testing can help increase
the confidence in the correctness of a backport.


Submitting backports to stable
==============================

As the stable maintainers try to cherry-pick mainline fixes onto their
stable kernels, they may send out emails asking for backports when
encountering conflicts, see e.g.
<https://lore.kernel.org/stable/2023101528-jawed-shelving-071a@gregkh/>.
These emails typically include the exact steps you need to cherry-pick
the patch to the correct tree and submit the patch.

One thing to make sure is that your changelog conforms to the expected
format::

    <original patch title>

    [ Upstream commit <mainline rev> ]

    <rest of the original changelog>
    [ <summary of the conflicts and their resolutions> ]
    Signed-off-by: <your name and email>

The "Upstream commit" line is sometimes slightly different depending on
the stable version. Older versions used this format::

    commit <mainline rev> upstream.

It is most common to indicate the kernel version the patch applies to
in the email subject line (using e.g.
``git send-email --subject-prefix='PATCH 6.1.y'``), but you can also put
it in the ``Signed-off-by:`` area or below the ``---`` line.

The stable maintainers expect separate submissions for each active
stable version, and each submission should also be tested separately.

A few final words of advice
===========================

1) Approach the backporting process with humility.
2) Understand the patch you are backporting; this means reading both
   the changelog and the code.
3) Be honest about your confidence in the result when submitting the
   patch.
4) Ask relevant maintainers for explicit acks.

Examples
========

The above shows roughly the idealized process of backporting a patch.
For a more concrete example, see this video tutorial where two patches
are backported from mainline to stable:
`Backporting Linux Kernel Patches`_.

.. _Backporting Linux Kernel Patches: https://youtu.be/sBR7R1V2FeA
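
For a hands-on feel without a kernel tree, the following sketch walks
through a tiny conflict from start to finish in a throwaway repository.
It is an illustration under stated assumptions, not a recipe: the
branches ``mainline`` and ``stable``, the file ``demo.c`` and the
``start()`` call are invented, and the conflict mirrors the
``0``-to-``1`` argument change used as an example earlier in this
document.

```shell
#!/bin/sh
# Illustrative sandbox only; all names below are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Demo User"
git config user.email "demo@example.com"
git config merge.conflictStyle diff3     # show the expected base too

printf 'start(0);\n' > demo.c
git add demo.c
git commit -q -m "initial version"
git branch stable

# mainline gains an incidental change, then the fix we want to backport.
printf 'start(0, NULL);\n' > demo.c
git commit -q -a -m "add a second argument to start()"
printf 'start(1, NULL);\n' > demo.c
git commit -q -a -m "fix: pass 1 to start()"

# Backport the fix alone; this is expected to stop with a conflict.
git checkout -q stable
if git cherry-pick -x mainline >/dev/null 2>&1; then
    echo "unexpected: cherry-pick applied cleanly" >&2
    exit 1
fi
conflicted=$(git diff --name-only --diff-filter=U)   # unmerged paths
cat demo.c                                           # conflict markers

# Resolve by hand: keep the one-argument call, apply only the 0 -> 1 fix.
printf 'start(1);\n' > demo.c
git add demo.c
GIT_EDITOR=true git cherry-pick --continue >/dev/null
git show -W --stat HEAD
```

With ``diff3`` markers the middle section shows ``start(0, NULL);``,
which makes it clear that the patch itself only changed the first
argument and that the second argument comes from the incidental commit.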