Resolving merge conflicts when introducing formatting to an existing codebase
On September 22, 2022 by Sosthène Guédon
If a project doesn't use any formatting tool, introducing one can be a headache, and is almost guaranteed to cause merge conflicts with any ongoing PR. Here's how to fix them.
At some point in the life of a project, you might want to introduce code formatting. If not used from the beginning, it is likely that the overall formatting will not follow any convention, which makes code harder to read and to maintain: developers will have to remember to turn off their format-on-save and manually format their modifications.
While this applies to almost all programming languages and formatting tooling, the examples assume you're using Rust and cargo fmt. For your use case, replace the cargo fmt command with your tool of choice.
- You're using git.
- This repository has many active branches (pending Pull Requests for example).
- Running cargo fmt on the latest commit on main changes almost every file in the project, causing merge conflicts with every active PR.
- You created a commit on top of main by running cargo fmt for the first time. This commit must not add any other modification to the code. It may add a couple of configuration files.
- All branches branch off the commit just before cargo fmt. If this isn't the case, you should use git rebase <commit before cargo fmt> to make it work.
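The last assumption can be checked before going further. Here is a minimal sketch, assuming it is run from inside the repository; forks_before_fmt is a hypothetical helper name, not something git provides. It compares the merge base of the branch and the formatting commit with the parent of the formatting commit.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when BRANCH forks from the commit
# right before FMT_COMMIT (the commit that applied the formatter).
forks_before_fmt() {
    fmt_commit=$1
    branch=$2
    # Parent of the formatting commit.
    expected=$(git rev-parse "$fmt_commit~1") || return 2
    # Most recent commit shared by the formatting commit and the branch.
    actual=$(git merge-base "$fmt_commit" "$branch") || return 2
    [ "$expected" = "$actual" ]
}

# Example call (hash and branch name taken from this post):
#   forks_before_fmt 393fd90 feature-branch || echo "rebase feature-branch first"
```

If the check fails, the branch either forks from an older commit or already contains the formatting commit.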
This post presents two solutions to the merge conflicts: one that's easy but not ideal, using merge commits, and one that's way too complex and abuses rebase to get a perfect git history.
The easy solution with merge
With the previous assumptions, this is what the tree looks like:
                          Head of the branch to merge into main (called feature-branch)
                          │
main branch               ... The many commits of the branch you want to merge
│                         │
...                       │
│                         │
393fd90 Cargo fmt         │
│                         │
0467320 common ancestor ──┘
│
The easy way is to simply merge main into feature-branch. This will cause a ton of conflicts, but they can be avoided easily with git's merge strategies: git merge -s ours 393fd90 called from the feature-branch. The ours strategy ignores the changes of the other side entirely and records the current tree of feature-branch as the result of the merge. This essentially reverts all the formatting done in 393fd90 (the cargo fmt commit).
You can then run cargo fmt again, stage the changes, and use git commit --amend to apply the results to the merge commit.
This gets you the following tree:
Head of main              Head of feature-branch
│                         │
...                       │
│                         │
│                         ├─── Merge branch 'main' into 'feature-branch'
│                         │
│                         ... The many commits of the branch you want to merge
│                         │
393fd90 Cargo fmt         │
│                         │
0467320 common ancestor ──┘
│
git should then allow you to merge feature-branch into main without issues.
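The whole sequence can be sketched as a small shell function. This is only an illustration under assumptions: merge_fmt_commit and the FMT_CMD variable are hypothetical names, FMT_CMD defaults to cargo fmt, and the function is meant to be called from feature-branch with a clean working tree.

```shell
#!/bin/sh
# Sketch: merge the formatting commit into the current branch while keeping
# the branch's own tree, then re-run the formatter and fold the result into
# the merge commit.
# FMT_CMD is an assumed variable; it defaults to `cargo fmt`.
merge_fmt_commit() {
    fmt_commit=$1
    # -s ours records the merge but keeps our tree exactly as it is.
    git merge -q -s ours --no-edit "$fmt_commit" &&
    # Re-apply the formatting on the branch's code.
    ${FMT_CMD:-cargo fmt} &&
    # Stage the tracked files changed by the formatter and amend the merge.
    git commit -q -a --amend --no-edit
}

# Example call (hash taken from this post):
#   merge_fmt_commit 393fd90
```

Because -s ours keeps the branch's tree untouched, a configuration file added by the formatting commit is not brought in by the merge. It would have to be restored separately, for example with git checkout 393fd90 -- <file>, before running the formatter.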
Why you should avoid this
There are some issues with this strategy:
- This assumes that the feature-branch's ancestor is the commit just before the cargo fmt commit on main. If this is not the case, -s ours risks deleting any work done in the commits in between.
- Some repositories do not accept merge commits that resolve conflicts and expect branches to be rebased instead. It is the case of the Rust project for example. Having a linear history makes git bisect more efficient.
The over engineered solution with rebase
The ideal solution would be to use git rebase 393fd90 from feature-branch before merging. This gives us the following initial tree (left) and our objective (right):
                          Head of feature-branch                            Head of feature-branch
                          │                                                 │
                          ...                                               ...
                          │                                                 │
                          <commit-id> Many commits                          <commit-id> Many commits
                          │                                                 │
393fd90 Cargo fmt         │                       393fd90 Cargo fmt ────────┘
│                         │                       │
0467320 common ancestor ──┘                       0467320 common ancestor
│                                                 │
This would require us to rewrite every commit of the feature branch, as if it had been written from the beginning with formatting at each commit. This is annoying but can be easily automated.
Here's a magical shell script:
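#!/bin/sh
# $1 is the hash of the commit to recreate on top of the formatted history.
git rm $(git ls-tree --full-tree -r --name-only "$1"\~1) && \
git checkout $1 -- . && \
cargo fmt && \
git add $(git ls-tree --full-tree -r --name-only "$1") && \
git commit -C "$1"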
Store this script somewhere, make it executable and run git rebase -i 393fd90 (obviously replacing the commit ID with that of your own commit applying cargo fmt).
This should open your editor for an interactive rebase, with one pick line per commit.
Remove all the commit messages and, instead of using pick, use the script. This should give you:
exec rebase-script.sh 1fc6c90
exec rebase-script.sh 6b24810
exec rebase-script.sh dd14750
exec rebase-script.sh c619260
exec rebase-script.sh fa39180
exec rebase-script.sh 4ca2ac0
exec rebase-script.sh 7b36970
exec rebase-script.sh 1952fb0
Save, quit, wait for git to work its magic... and voilà! You now have a rebased branch that looks just as if it had always been built with a formatting tool!
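Editing the list by hand gets tedious on long branches. As a sketch, the rewrite of pick lines into exec lines could be delegated to a small script passed to git through the GIT_SEQUENCE_EDITOR environment variable, which git calls with the path of the todo file. The file name pick-to-exec.sh and the function name pick_to_exec are hypothetical, and rebase-script.sh is assumed to be reachable from your PATH.

```shell
#!/bin/sh
# pick-to-exec.sh (hypothetical name): turn every `pick <hash> ...` line of a
# rebase todo file into `exec rebase-script.sh <hash>`. Comments, blank lines
# and other commands are left untouched.
pick_to_exec() {
    sed -E 's/^pick ([0-9a-f]+).*$/exec rebase-script.sh \1/' "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# When used as GIT_SEQUENCE_EDITOR, git passes the todo file path as "$1".
if [ "$#" -gt 0 ]; then
    pick_to_exec "$1"
fi

# Example call (hash taken from this post):
#   GIT_SEQUENCE_EDITOR=/path/to/pick-to-exec.sh git rebase -i 393fd90
```

With this in place the editor never opens, so it is worth trying it on a copy of the branch first.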
Why it works
Normally, when running the rebase with the default pick command, git tries to apply the commit to the current branch.
In our case, this would fail due to conflicts.
Here are the steps taken by the script instead:
1. git rm $(git ls-tree --full-tree -r --name-only "$1"\~1) removes all files tracked by git before the application of the current commit. To understand it, let's expand it. When $1 is replaced by the hash given to the script, it looks like this: git rm $(git ls-tree --full-tree -r --name-only 1fc6c90~1). 1fc6c90~1 means 1 commit before 1fc6c90, so this command lists all the files tracked in 1fc6c90's parent and deletes them.
2. git checkout $1 -- . then tells git to load every file from 1fc6c90 into the current directory.
3. cargo fmt runs the formatting.
4. git add $(git ls-tree --full-tree -r --name-only "$1") adds every file that might have been modified by the previous step. We use git ls-tree again (this time without ~1) to avoid adding files that aren't meant to be tracked (for example the script itself).
5. git commit -C "$1" creates a new commit with the same parameters (message, date and author) as the commit it is replacing. Because every file was deleted in step 1, the new commit doesn't contain the files deleted by 1fc6c90. This is why step 1 is important.
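To experiment with these steps on a throwaway clone without going through the interactive editor, the same five commands can be wrapped in a function and driven by a loop. This is a sketch under assumptions, not the script from this post: reformat_commit, replay_formatted and the FMT_CMD variable are hypothetical names, FMT_CMD defaults to cargo fmt, file names are assumed to contain no whitespace, and the branch is assumed to have a linear history.

```shell
#!/bin/sh
# Recreate one commit on top of HEAD with formatting applied.
# $1 is the hash of the original commit.
reformat_commit() {
    # 1. Remove every file tracked by the parent of the original commit.
    git rm -q $(git ls-tree --full-tree -r --name-only "$1~1") &&
    # 2. Load every file of the original commit.
    git checkout "$1" -- . &&
    # 3. Format (FMT_CMD is an assumed variable, defaulting to cargo fmt).
    ${FMT_CMD:-cargo fmt} &&
    # 4. Stage only the files tracked by the original commit.
    git add $(git ls-tree --full-tree -r --name-only "$1") &&
    # 5. Commit with the message, date and author of the original commit.
    git commit -q -C "$1"
}

# Hypothetical driver: replay, oldest first, every commit of BRANCH that is
# not reachable from FMT_COMMIT. Leaves HEAD detached at the rewritten tip.
replay_formatted() {
    fmt_commit=$1
    branch=$2
    commits=$(git rev-list --reverse "$fmt_commit..$branch") || return 1
    git checkout -q --detach "$fmt_commit" || return 1
    for commit in $commits; do
        reformat_commit "$commit" || return 1
    done
}

# Example call (hash and branch name taken from this post):
#   replay_formatted 393fd90 feature-branch
```

Like the original script, this stops at a commit whose only effect was formatting, since git commit refuses to create an empty commit by default.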
This is something I've had to deal with, and I hope this can help someone in the future.
If you don't understand something or believe something should be added, please email me or reach me on Mastodon: @firstname.lastname@example.org