Resolving merge conflicts when introducing formatting to an existing codebase
On September 22, 2022 by Sosthène Guédon
If a project doesn't use any formatting tool, introducing one can be a headache, and is almost guaranteed to cause merge conflicts with any ongoing PR. Here's how to fix them.
Context
At some point in the life of a project, you might want to introduce code formatting. If a formatter has not been used from the beginning, the existing code likely follows no consistent convention, which makes it harder to read and to maintain: developers have to remember to turn off their format-on-save and manually format their modifications.
While this applies to almost all programming languages and formatting tools, the examples assume you're using Rust and cargo fmt. For your use case, replace the cargo fmt command with your tool of choice.
Let's assume:
- You're using git.
- This repository has many active branches (pending Pull Requests for example).
- Running cargo fmt on the latest commit on main changes almost every file in the project, causing merge conflicts with every active PR.
- You created a commit on top of main that runs cargo fmt for the first time. This commit must not add any modification to the code. This commit may add a couple of configuration files.
- All branches branch off the commit just before cargo fmt. If this isn't the case, you should use git rebase <commit before cargo fmt> to make it work.
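The last assumption is easy to check before going further. Here is a minimal sketch that is not part of the original workflow: the function name forks_before_fmt is hypothetical, and it relies only on git merge-base and git rev-parse.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if BRANCH forks from the parent of FMT_COMMIT.
# Usage: forks_before_fmt <branch> <fmt-commit>
forks_before_fmt() {
    branch=$1
    fmt_commit=$2
    # Most recent commit shared by the branch and the formatting commit
    base=$(git merge-base "$branch" "$fmt_commit") || return 2
    # Commit just before the formatting commit
    parent=$(git rev-parse --verify --quiet "${fmt_commit}~1") || return 2
    [ "$base" = "$parent" ]
}
```

For example, forks_before_fmt feature-branch 393fd90 returns success when feature-branch starts exactly at the commit before the formatting commit, and fails when it starts earlier or already contains 393fd90.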
This post presents two solutions to the merge conflicts: one that's easy but not ideal, using merge commits, and one that's way too complex and abuses rebase to get a perfect git history.
The easy solution with git merge
How-to
With the previous assumptions, this is what the tree looks like:
                               Head of the branch to merge into main (called feature-branch)
                               │
main branch                   ...  The many commits of the branch you want to merge
   │                           │
  ...                          │
   │                           │
393fd90 Cargo fmt              │
   │                           │
0467320 common ancestor ───────┘
   │
The easy way is to simply merge main into feature-branch. This will cause a ton of conflicts, but they can be resolved easily with git's merge strategies. From feature-branch, run:
git merge -s ours 393fd90
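What the ours strategy does is record the merge while keeping the content of feature-branch exactly as it is: nothing is taken from the other side, so no conflict can occur. This essentially reverts all the formatting done in 393fd90 (the cargo fmt commit). It also means that any file added by 393fd90, such as a formatter configuration file, is absent from the merge result and has to be restored by hand.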
You can then run cargo fmt again, stage the changes, and use git commit --amend to fold the results into the merge commit.
This gets you the following tree:
Head of main                   Head of feature-branch
   │                           │
  ...                          │
   │                           │
   ├────────────────────────── Merge branch 'main' into 'feature-branch'
   │                           │
   │                           │
   │                          ...  The many commits of the branch you want to merge
   │                           │
   │                           │
   │                           │
393fd90 Cargo fmt              │
   │                           │
0467320 common ancestor ───────┘
   │
Normally, git should then allow you to merge feature-branch into main without issues.
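The three steps above (merge with the ours strategy, reformat, amend) can be chained in a small helper. This is only a sketch: merge_formatting_commit and the FORMATTER variable are names made up for illustration, and FORMATTER defaults to cargo fmt. It does not restore files that the formatting commit added.

```shell
#!/bin/sh
# Hypothetical helper, meant to be run from feature-branch.
# $1 is the commit that introduced formatting (393fd90 in this post).
# FORMATTER is an assumed variable naming the formatting command.
merge_formatting_commit() {
    fmt_commit=$1
    formatter=${FORMATTER:-cargo fmt}
    # Record a merge with the formatting commit while keeping our tree unchanged
    git merge -s ours --no-edit "$fmt_commit" &&
    # Re-apply the formatting on top of the feature branch's code
    # (left unquoted on purpose so that "cargo fmt" splits into two words)
    $formatter &&
    # Stage the reformatted tracked files and fold them into the merge commit
    git commit -a --amend --no-edit
}
```

Usage, from feature-branch: merge_formatting_commit 393fd90. With another tool, set FORMATTER first, for example FORMATTER="prettier --write ." as a purely illustrative value.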
Why you should avoid this
There are some issues with this strategy:
- This assumes that the feature-branch's ancestor is the commit just before 393fd90 in main. If this is not the case, the -s ours strategy risks discarding any work done in the commits between that ancestor and 393fd90.
- Some repositories do not accept merge commits that resolve conflicts and expect branches to be rebased instead. This is the case for the Rust project, for example. Having a linear history makes git blame and git bisect more efficient.
The over-engineered rebase solution
The objective
The ideal solution would be to use git rebase 393fd90 from feature-branch before merging. This gives us the following initial tree (left) and our objective (right):
                            Head of feature-branch                                  Head of feature-branch
                            │                                                       │
                           ...                                                     ...
                            │                                                       │
                          <commit-id> Many commits                                <commit-id> Many commits
                            │                                                       │
393fd90 Cargo fmt           │                           393fd90 Cargo fmt ──────────┘
   │                        │                              │
0467320 common ancestor ────┘                           0467320 common ancestor
   │                                                       │
This would require us to rewrite every commit of the feature branch, as if it had been written from the beginning with formatting at each commit. This is annoying but can be easily automated.
How-to
Here's a magical shell script:
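#!/bin/sh
# $1 is the hash of the original commit to replay with formatting applied
git rm $(git ls-tree --full-tree -r --name-only "$1"\~1) && \
git checkout "$1" -- . && \
cargo fmt && \
git add $(git ls-tree --full-tree -r --name-only "$1") && \
git commit -C "$1"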
Store this script somewhere, make it executable and run git rebase -i 393fd90 (replacing the commit ID with that of your own commit applying cargo fmt).
This should open your editor for an interactive rebase. On every line, remove the commit message and replace pick with exec followed by the script. This should give you:
exec rebase-script.sh 1fc6c90
exec rebase-script.sh 6b24810
exec rebase-script.sh dd14750
exec rebase-script.sh c619260
exec rebase-script.sh fa39180
exec rebase-script.sh 4ca2ac0
exec rebase-script.sh 7b36970
exec rebase-script.sh 1952fb0
Save, quit, wait for git to work its magic... and voilà! You now have a rebased branch that looks just as if it had always been built with a formatting tool!
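Editing the todo list by hand gets tedious on long branches. One possible way to automate it is to let git call sed as the sequence editor through the GIT_SEQUENCE_EDITOR environment variable. The sketch below is an assumption on my part rather than the method used above: rebase_with_script is a hypothetical name, the script path should be absolute and contain neither spaces nor the | character, and the todo lines are assumed to start with the full word pick.

```shell
#!/bin/sh
# Hypothetical helper: rebase the current branch onto the formatting commit,
# turning every "pick <hash> <subject>" todo line into "exec <script> <hash>".
# $1: formatting commit (393fd90 in this post)
# $2: absolute path to the rebase script
rebase_with_script() {
    fmt_commit=$1
    script=$2
    # git runs the sequence editor with the path of the todo file appended;
    # -i.bak edits that file in place with both GNU and BSD sed
    GIT_SEQUENCE_EDITOR="sed -i.bak -E 's|^pick ([0-9a-f]+).*|exec $script \\1|'" \
        git rebase -i "$fmt_commit"
}
```

Usage, from feature-branch: rebase_with_script 393fd90 /path/to/rebase-script.sh, where the path is a placeholder for wherever you stored the script.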
Why it works
Normally, when performing the rebase with the default pick option, git tries to apply the commit to the current branch.
In our case, this would fail due to conflicts.
Here are the steps taken by the script instead:
- git rm $(git ls-tree --full-tree -r --name-only "$1"\~1) removes all files tracked by git before the application of the current commit. To understand it, let's expand it. When $1 is replaced by the hash given to the script, it looks like this: git rm $(git ls-tree --full-tree -r --name-only 1fc6c90~1). 1fc6c90~1 means 1 commit before 1fc6c90, so this command lists all the files tracked in 1fc6c90's parent and deletes them.
- git checkout $1 -- . then tells git to load every file from 1fc6c90 into the current directory.
- cargo fmt runs the formatting.
- git add $(git ls-tree --full-tree -r --name-only "$1") adds every file that might have been modified by the previous step. We use git ls-tree (without ~1) again to avoid adding files that aren't meant to be tracked (for example the script itself).
- Finally, git commit -C "$1" creates a new commit with the same metadata (message, date and author) as the commit it is replacing. Because all previously tracked files were removed in the first step, files deleted by 1fc6c90 are not brought back into the new commit. This is why the first step is important.
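Once the rebase has finished, one way to confirm that every rewritten commit really is formatted is to replay the branch with git rebase --exec, which runs a command after each commit and stops at the first failure. This is a sketch under assumptions: check_each_commit is a hypothetical name, and the default check command cargo fmt -- --check should be swapped for whatever check mode your own formatter provides.

```shell
#!/bin/sh
# Hypothetical helper: run a formatting check after every commit
# between the formatting commit and the tip of the current branch.
# $1: formatting commit (393fd90 in this post)
# $2: optional check command, assumed to exit non-zero on unformatted code
check_each_commit() {
    fmt_commit=$1
    check_cmd=${2:-cargo fmt -- --check}
    # --exec appends the command after each commit in the todo list;
    # the rebase stops on the first commit for which the command fails
    git rebase --exec "$check_cmd" "$fmt_commit"
}
```

If the check fails, git stops on the offending commit; git rebase --abort brings the branch back to where it was.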
Wrap-up
This is something I've had to deal with, and I hope this can help someone in the future.
If you don't understand something or believe something should be added, please email me at sosthene@guedon.gdn or reach me on Mastodon: @sgued@pouet.chapril.org