Git rebasing seems to have a marmite effect on people – either they love it or hate it. It’s true that rebasing opens up a whole range of new and exciting ways to screw yourself over. Even better, it offers a range of wonderful opportunities to screw over your teammates as well. Yet despite that, I am most definitely in the pro-rebase camp. After all, the most powerful tools will burn you if you don’t use them properly. Having developed a fairly extensive set of rebase scars over the years, here are my handy tips for dodging the pitfalls and making the most of rebase.
This assumes that you are already familiar with the basic operations of git rebase. If not, the git rebase documentation is probably a good place to start.
Git has two main forms of rebase – interactive and non-interactive. Essentially they do the same thing – take your branch and reapply the code changes on top of another branch. Interactive rebasing offers a bunch more options such as re-ordering and re-wording commits but for the safety-conscious rebaser, the main thing it offers is that it shows you a list of commits that will be re-applied before the rebase starts. In my experience, the vast majority of rebase disasters can be averted by simply reviewing this list and aborting the rebase before it even starts. If the list is not what you expect then run away, figure out why and then try again.
Example interactive rebase commit list
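If you want to see that list without even starting a rebase, git log can show roughly the same set of commits. The sketch below builds a throwaway repository (the branch names, file names and commit messages are all invented for the demo) and prints the commits on feature that are not yet on main, oldest first, which is the order the interactive list uses.

```shell
set -eu

# Throwaway repository; every branch and file name here is made up for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two commits on a feature branch.
git checkout -q -b feature
echo one > one.txt
git add one.txt
git commit -q -m "feature: one"
echo two > two.txt
git add two.txt
git commit -q -m "feature: two"

# Meanwhile main moves on.
git checkout -q main
echo upstream > upstream.txt
git add upstream.txt
git commit -q -m "upstream change"
git checkout -q feature

# Commits reachable from feature but not from main, oldest first.
# In this simple case it is the set `git rebase -i main` would offer to replay.
preview=$(git log --reverse --format=%s main..feature)
echo "$preview"
```

If this preview contains commits you don’t recognise, that is the same warning sign as an unexpected interactive list.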
A corollary of always rebasing interactively is that you should never use git pull --rebase. I would go a step further and say that the only version of git pull you should use is git pull --ff-only. This is because both the rebase and the default case may result in the code you are working on changing without you realising it. They will both attempt to transparently incorporate any changes made upstream without advising you what they are or letting you review them. A classic case of where this is undesirable is when you and a colleague both add the same method to a class but in a different place in the file. Git will quite happily combine these changes resulting in a duplicate method definition. Maybe you or your colleague will catch it in code review but it is better to avoid it in the first place by being mindful of when the code you are basing your changes on may have changed.
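To make the difference concrete, here is a small sketch of git pull --ff-only refusing to touch a branch that has diverged. The two repositories (upstream and mine) and their contents are hypothetical stand-ins for a remote and your local clone.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# "upstream" plays the part of the remote; all names are made up for the demo.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/main
git config user.name "Colleague"
git config user.email "colleague@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"
cd ..

git clone -q upstream mine

# The colleague adds a commit upstream...
cd upstream
echo remote > remote.txt
git add remote.txt
git commit -q -m "upstream work"

# ...while we add one locally, so the two histories have diverged.
cd ../mine
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo local > local.txt
git add local.txt
git commit -q -m "local work"

before=$(git rev-parse HEAD)
if git pull -q --ff-only 2>/dev/null; then
  pull_result="fast-forwarded"
else
  # A fast-forward is impossible, so git stops instead of merging or rebasing.
  pull_result="refused"
fi
after=$(git rev-parse HEAD)
echo "pull $pull_result; HEAD moved: $([ "$before" = "$after" ] && echo no || echo yes)"
```

The fetch still happens, so the upstream commits are available to review, but your branch is left exactly where it was until you decide what to do.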
It’s very easy to get excited when you start an interactive rebase. But instead of just diving in and throwing commits here, there and everywhere, taking it easy and spreading the rebase over a few steps will save you a lot of pain and misery in the long run.
The first thing I do is an interactive rebase without changing anything. This way you can review the commits and deal with any upstream conflicts up front without having to try and figure out whether the conflicts are a result of something that has changed in the branch you are rebasing onto or whether it is a problem you have created with all your commit juggling.
Then as you proceed keep moving, squashing and editing commits in small chunks. That way if you do get any unexpected conflicts you can abort the rebase and figure out the problem without losing any other changes you may have made. Nothing’s worse than going through a marathon rebase to get within sight of the end and having to abort the whole thing because you’ve swapped the wrong two commits around and now one of them won’t apply.
Rebasing shouldn’t be hard. If it is there is probably an underlying problem with what you are trying to do. Perhaps something has changed upstream that you weren’t aware of. Perhaps you have files in a commit that you weren’t expecting to be there. Don’t be afraid to abort the rebase so that you can stop and rethink what you are trying to do without the pressure of being in rebase limbo.
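As a rough illustration of how cheap aborting is, the sketch below forces a conflict in a throwaway repository, looks at which file is conflicted, and then aborts. The file and branch names are invented for the example.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "original" > shared.txt
git add shared.txt
git commit -q -m "base"

# The feature branch and main both change the same line.
git checkout -q -b feature
echo "feature version" > shared.txt
git commit -q -a -m "feature edit"
git checkout -q main
echo "upstream version" > shared.txt
git commit -q -a -m "upstream edit"
git checkout -q feature

before=$(git rev-parse HEAD)
conflicted=""
if ! git rebase main >/dev/null 2>&1; then
  # The rebase has stopped part-way: list the files with unresolved conflicts...
  conflicted=$(git diff --name-only --diff-filter=U)
  # ...then back out completely rather than fixing things up under pressure.
  git rebase --abort
fi
after=$(git rev-parse HEAD)
branch=$(git symbolic-ref --short HEAD)
echo "conflict in: $conflicted; back on $branch"
```

After the abort the branch points at exactly the commit it started on, so nothing is lost by stopping to think.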
It is inevitable that you will sometimes run into conflicts caused by changes your colleagues have made. One way or another you will have to deal with them, whether it is via rebasing, when you merge or by shouting at your colleagues until they revert their changes so that you can merge first and thereby force them to deal with the conflicts.
Conflicts caused by moving commits around within your own branch are more avoidable however. Keep your commits small and focused. The fewer files and changes in a commit, the less likely it is to conflict and so the easier it is to reorder. If you are TDDing then a good place to start is committing each time you add a test and get the associated code passing. That way you are only ever committing two files at a time and each file will contain a small change.
More commits mean less pain when rebasing. If you have ignored the previous suggestion of keeping commits small and committing often, all is not lost. Interactive rebasing allows you to edit a commit. If you have added too many files to a commit and so now can’t move it around, you can edit the commit, remove some of the changes from it and create a new commit for them.
This sounds complicated but in fact is as simple as marking the commit for editing in the interactive rebase commit list. Then, when the rebase pauses, run:
    git reset HEAD~ some/file/for/another/commit
    git commit --amend
    git add .
    git commit -m '- message for the separate commit'
    git rebase --continue
This will leave all the changes to that file in a separate commit.
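Here is the same sequence as a self-contained sketch you can run in a scratch directory. To keep it non-interactive it makes a few assumptions that are not part of the steps above: GIT_SEQUENCE_EDITOR with sed stands in for marking the commit by hand, the amend is given a message with -m so no editor opens, and the files a.txt and b.txt are invented.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# One commit that accidentally contains two unrelated files.
git checkout -q -b feature
echo a > a.txt
echo b > b.txt
git add a.txt b.txt
git commit -q -m "add a (and, by accident, b)"

# Stand-in for editing the list by hand: change the first "pick" to "edit".
# The rebase pauses at that commit; "|| true" keeps the script going
# whatever status git reports for the pause.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" \
  git rebase -i main >/dev/null 2>&1 || true

# Take b.txt out of the index, then amend so the commit only holds a.txt.
git reset -q HEAD~ -- b.txt
git commit -q --amend -m "add a"

# b.txt is still in the working tree; give it a commit of its own.
git add .
git commit -q -m "add b"
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

subjects=$(git log --reverse --format=%s main..feature)
first_files=$(git diff-tree --no-commit-id --name-only -r HEAD~1)
second_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "$subjects"
```

The branch ends up with two commits, one per file, which can now be reordered independently.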
Another variation on this is for when you have multiple changes to the same file that have accidentally ended up in the same commit. In this case git add -p . is your friend. This will allow you to choose which of the changes to the file are merged back into your original commit and which stay unstaged ready to add to another commit.
One of the beauties of git is that it is actually quite hard to completely lose anything once it has been committed. If you do get to the end of your rebase and realise you’ve made a pig’s ear of it, your previous commits still exist and you can easily revert back.
First you need the SHA of the tip of your branch before the rebase. Assuming you don’t have this to hand, you can find it using git reflog. You will be able to see where your rebase started easily enough as the line will contain rebase -i (start). You just need to find the SHA of the commit before this (the next line down, since the reflog lists the newest entry first) and reset to that commit with git reset --hard SHA.
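A minimal sketch of the recovery, in a throwaway repository with invented names. Note that it takes a shortcut rather than scanning the HEAD reflog for the start line: the branch’s own reflog gets a single entry when a rebase finishes, so feature@{1} is the tip from before the rebase. The exact wording of reflog messages may also differ between Git versions, which is another reason the script does not match on them.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature
echo one > one.txt
git add one.txt
git commit -q -m "feature: one"
echo two > two.txt
git add two.txt
git commit -q -m "feature: two"

git checkout -q main
echo upstream > upstream.txt
git add upstream.txt
git commit -q -m "upstream change"
git checkout -q feature

# Remembered here only so the result can be checked at the end.
original_tip=$(git rev-parse HEAD)

# The rebase we are about to regret.
git rebase -q main

# The branch reflog: entry 0 is the finished rebase, entry 1 the old tip.
git reflog show feature
previous_tip=$(git rev-parse 'feature@{1}')

# Put the branch back where it was before the rebase.
git reset -q --hard "$previous_tip"
restored=$(git rev-parse HEAD)
```

Bear in mind that git reset --hard also throws away uncommitted changes, so make sure the working tree holds nothing you want to keep.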
One of the oft-cited reasons for avoiding rebasing is that in order to push your changes back upstream you will need to force push, with the potential to overwrite other users’ changes. This is of course completely possible and a great way to antagonise your colleagues. However, just because you can screw things up by force pushing doesn’t mean you have to.
First of all, you only use rebasing before you merge. Once you have merged something into one of your “main” branches (typically master but perhaps you have other special release branches) then it’s there for good. If you find a problem afterwards you have to live with it or add another commit.
You can rebase other branches that you share with colleagues if you choose to. You simply need to communicate as to who is doing what. We do this on a daily basis without issue. It helps if people are responsive. So if one of my colleagues tells me that they have force pushed a branch we are sharing then I will update my local copy straight away. It helps avoid conflicts and trying to figure out whether a force push should or shouldn’t be necessary.
You should also know before you attempt to push whether you are expecting a force push to be necessary. If you get the classic force push error message (along the lines of error: failed to push some refs...) when you weren’t expecting it then don’t just automatically force push. Stop and see what commits you have locally compared to the remote version and then it should become clear what you need to do to fix it.
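One way to do that comparison is to fetch and then list the commits on each side. The sketch below is only an illustration: the shared repository stands in for the remote, and the commit messages are made up.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# "shared" stands in for the remote; all names are invented for the demo.
git init -q shared
cd shared
git symbolic-ref HEAD refs/heads/main
git config user.name "Colleague"
git config user.email "colleague@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"
cd ..

git clone -q shared mine

# A colleague pushes something we have not seen yet.
cd shared
echo theirs > theirs.txt
git add theirs.txt
git commit -q -m "colleague: new commit"

# Meanwhile we commit locally.
cd ../mine
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo mine > mine.txt
git add mine.txt
git commit -q -m "me: local commit"

# Update our view of the remote, then compare both directions.
git fetch -q
ahead=$(git rev-list --count '@{upstream}..HEAD')
behind=$(git rev-list --count 'HEAD..@{upstream}')
only_local=$(git log --format=%s '@{upstream}..HEAD')
only_remote=$(git log --format=%s 'HEAD..@{upstream}')

echo "ahead by $ahead: $only_local"
echo "behind by $behind: $only_remote"
```

Anything listed as only on the remote is exactly what a blind force push would wipe out, so if that list contains a colleague’s commit you didn’t know about, stop there.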
My personal rule of thumb around shared branches is that if a colleague and I are working on the same feature at the same time, I will create my own branch from the feature branch, work on that and then rebase and merge back into the shared feature branch. It is the safest option and since you both may be pushing frequently it saves constant back and forth. If we are working independently in more of a handover type scenario then we will probably both just work on the same branch as the chances of treading on each other’s toes are much smaller.
The other approach, which is probably what we do most commonly, is to save the rebase until the end. So we both just keep adding commits to the tip of the branch and then one or both of us will perform the rebase tidy-up once we are happy that the feature is ready to go.
The aim of rebasing is not to avoid merge commits or maintain a linear history. Merge commits can be very helpful in identifying a set of commits that have some degree of cohesion. However at the other extreme where every single commit has a matching merge commit they are unhelpful and clutter up your log.
The answer is simply to always merge with the correct flag set. If I am merging a single commit or if the branch contains multiple unrelated commits then I will do a git merge --ff-only. If not I will do a git merge --no-ff. Simple as that.
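A quick sketch of the two cases in a throwaway repository (branch names and commit messages are invented). It counts the parents of the resulting tip commit to show whether a merge commit was created.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Case 1: a branch with a single commit. Fast-forward, no merge commit.
git checkout -q -b typo-fix
echo fix > fix.txt
git add fix.txt
git commit -q -m "fix typo"
git checkout -q main
git merge -q --ff-only typo-fix
ff_parents=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')

# Case 2: a branch of related commits. --no-ff groups them under a merge commit
# even though a fast-forward would have been possible.
git checkout -q -b new-feature
echo part1 > part1.txt
git add part1.txt
git commit -q -m "feature: part 1"
echo part2 > part2.txt
git add part2.txt
git commit -q -m "feature: part 2"
git checkout -q main
git merge -q --no-ff -m "Merge branch 'new-feature'" new-feature
noff_parents=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')

echo "parents after --ff-only: $ff_parents, after --no-ff: $noff_parents"
```

A tip with one parent is an ordinary commit, and a tip with two parents is a merge commit that marks where the feature’s commits begin and end.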
There are definite benefits to rebasing your code once you have completed all the tasks as discussed above and in The poetry of pull requests. However sometimes you will add a commit just to fix a typo in an earlier commit or something equally trivial. It can be beneficial to just squash these changes straight into the correct commit immediately in order to reduce the number of commits you are working with in your final pre-PR rebase.
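One possible way to do this, which is my assumption rather than anything prescribed above, is git commit --fixup followed by an autosquash rebase. The sketch uses invented files and messages, and sets GIT_SEQUENCE_EDITOR=true so the generated list is accepted without opening an editor.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature
echo "helo" > greeting.txt          # note the typo
git add greeting.txt
git commit -q -m "add greeting"
echo "bye" > farewell.txt
git add farewell.txt
git commit -q -m "add farewell"

# Fix the typo and record it as a fixup of the commit that introduced it.
echo "hello" > greeting.txt
git commit -q -a --fixup=HEAD~1

# --autosquash moves the fixup directly under its target in the commit list;
# GIT_SEQUENCE_EDITOR=true accepts that list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

subjects=$(git log --reverse --format=%s main..feature)
fixed=$(git show HEAD~1:greeting.txt)
echo "$subjects"
```

The typo fix disappears into the commit it belongs to, leaving the same two commits you started with and one less thing to juggle in the final rebase.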
So there we have it, 10 handy tips for smooth and efficient rebasing. However knowing techniques for juggling commits is only part of the picture. You also need to know what kind of end result you are looking to achieve. For more on that, check out The poetry of pull requests.